Article - How long is Too Long? Data Retention by Search Engines and Australian Privacy Law (August 2007)


Published in Privacy Law Bulletin, volume 4, number 3, 2007.


The powerful search engines that respond to individuals’ online queries have become an established presence in the online marketplace, offering an indispensable service to millions of users every day. Access to the records of individuals’ past searches can reveal untold amounts of information about them, including the most private details of their interests, habits, and preferences.

Recent months have seen Google, the world’s most popular search engine, at the centre of controversy concerning the length of time it retains records of its users’ search queries. This is not the first time the practices of search engines have hit the headlines or attracted privacy watchdogs’ attention. Nor is Google alone among search engines in retaining users’ search queries for extended periods. The Google case does, however, provide a very useful illustration of the significance of data protection concerns raised by the retention of data on past searches and the importance for search engines of implementing robust privacy policies and practices.

This article examines these issues from the perspective of Australian privacy law and suggests steps that search engines can take to ensure compliance with the law and promote trust among users.

The Google Controversy

In March 2007, Google announced a change to its data retention policy whereby users’ search logs would be anonymised within 18 to 24 months. Previously, search logs, including Internet Protocol (IP) addresses and cookie data, were retained by Google indefinitely, making it simple to link a specific search back to an individual computer or user.
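To see concretely what such a log links together, and what anonymisation removes, consider the minimal sketch below. The field names and values are invented for illustration, and the technique shown (zeroing the final octet of the IP address and dropping the cookie identifier) is one common style of log anonymisation, not a description of Google’s actual systems:

```python
import ipaddress

# Hypothetical search log record -- illustrative field names only.
record = {
    "ip": "203.0.113.87",
    "cookie_id": "a1b2c3d4",  # persistent browser identifier
    "query": "back pain specialists near carlton",
    "timestamp": "2007-03-14T10:22:31Z",
}

def anonymise(rec: dict) -> dict:
    """Zero the final octet of the IP and drop the cookie ID, so the
    remaining record can no longer be tied to one machine or browser."""
    ip = ipaddress.IPv4Address(rec["ip"])
    truncated = ipaddress.IPv4Address(int(ip) & 0xFFFFFF00)  # keep /24 only
    out = dict(rec, ip=str(truncated))
    del out["cookie_id"]
    return out

print(anonymise(record)["ip"])  # "203.0.113.0"
```

Note that the query text itself is retained; as discussed below, the query alone can sometimes identify the searcher, which is why anonymisation of the identifiers is a minimum rather than a complete answer.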

While the new policy was ostensibly aimed at providing users with greater privacy protection, Google’s announcement backfired by drawing attention to the fact that such logs were maintained in identifiable form at all. Privacy advocates were quick to denounce the new policy as inadequate, and the European Commission’s Article 29 Data Protection Working Party began an investigation into the company’s data retention practices. In May 2007, the Working Party wrote to Google stating that the retention of search data in an identifiable form for 18-24 months ‘does not seem to meet the requirements of the European legal data protection framework’ and requested more detailed information on the company’s practices for further consideration.[1]

Google’s privacy woes intensified further with the release, in June 2007, of a report by Privacy International describing the results of a six-month investigation into the ‘best and worst’ privacy practices of a number of major Internet service companies.[2] The report gave Google its lowest ranking, describing the company as presenting an ‘endemic threat to privacy.’ Among the reasons for this ranking, the report noted the 18-24 month retention period, which it described as ‘unacceptable, and possibly unlawful in many parts of the world.’

Google has since taken some measures to address the concerns relating to retention of search data. In June 2007 it agreed to anonymise search logs after 18 months[3] and in July 2007 it announced that it would reduce the lifetime of the cookies it places on users’ computers to a renewable 2-year period.

Whether these actions will be sufficient to satisfy the concerns of the EC Article 29 Working Party remains to be seen. The investigation, which was expanded in June 2007 to cover all search engines,[4] will centre on whether the retention period is ‘no longer than necessary’ to achieve a valid purpose in accordance with Article 6(1) of the EU Data Protection Directive 1995.[5]

One interesting question to have arisen in this regard is whether search engines are subject to the requirements increasingly imposed on communications providers to retain traffic data. Such requirements are designed to assist law enforcement by preserving information that may be useful in the prevention, investigation, detection, and prosecution of criminal offences, in particular organised crime and terrorism. Google has cited the need to comply with the EU Data Retention Directive 2006,[6] and possible new requirements in the US, as one of its main reasons for retaining search records. EC officials have already indicated that such requirements do not apply, because search engines are not communications providers and because search data qualifies as content rather than traffic data.[7] Google and other search engines will therefore likely need to rely on different grounds to justify this kind of data retention.

Risks Associated with Search Data Retention

In seeking to understand the reaction to Google’s privacy policy, and the sensitivity surrounding this issue generally, it is worth considering the risks associated with retention of users’ search records.

As with any storage of electronic data, there is always a risk of a security breach involving users’ search logs. In 2006, this risk became a reality for over 650,000 users when AOL published three months of their search records on its website. Although no personal data was included in the logs, each user was assigned a unique number, making it possible to uncover his or her true identity. To demonstrate how easily this could be done, within days a New York Times reporter identified one user by following the detailed record of her searches included in the logs.[8] Predictably, the incident provoked public outrage, prompting the resignation of AOL’s Chief Technology Officer and a host of legal actions.
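The AOL episode also illustrates why replacing identities with persistent pseudonyms is not true de-identification: grouping records by the pseudonym reassembles each person’s complete search history, and it is the history as a whole that identifies. The sketch below is purely illustrative (the record layout is invented; the first three queries echo those attributed to user 4417749 in the New York Times report cited above):

```python
from collections import defaultdict

# Illustrative records in the style of the AOL release: no names,
# just a persistent per-user number attached to each query.
log = [
    (4417749, "landscapers in lilburn ga"),
    (4417749, "homes sold in shadow lake subdivision"),
    (4417749, "60 single men"),
    (911423, "cheap flights to sydney"),
]

# Grouping by the pseudonym reassembles each user's full search
# history -- exactly what made re-identification possible.
histories = defaultdict(list)
for user_id, query in log:
    histories[user_id].append(query)

print(len(histories[4417749]))  # 3 queries attributable to one person
```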

A further risk is that search data will be requested by government authorities in legal proceedings. A recent case in the United States clearly illustrates this risk. In August 2005, the US Department of Justice served four search engines (Google, AOL, Yahoo and MSN) with a subpoena seeking the delivery of two months’ worth of users’ search queries. All of the search engines, except Google, which successfully challenged the subpoena, complied with the request and delivered the records. Although the request was tailored to exclude any personal information associated with the queries, the incident highlighted the possibility that full search logs - including personal information - would in future be requested by and delivered to law enforcement authorities without users’ consent or without compliance with the procedural safeguards (i.e. a court-issued warrant) that would normally be expected for the production of this kind of private material.

There is also the possibility that individuals’ search records will be requested through discovery by private parties in civil litigation. There is already a growing body of civil cases in the United States in which information relating to individual users has been requested, with mixed success, from Internet Service Providers or website operators. In one recent copyright-related case, a US court ordered a website operator to preserve and produce ‘extremely relevant’ server log data even though the website did not retain such data in the ordinary course of its business.[9] Developments such as these have significant implications for the privacy of individuals’ search queries, making it more likely that, if retained, they will be routinely requested by third parties in legal proceedings.

Although presented from the perspective of an individual user, the above scenarios can also present risks to the search engine company retaining the data. Most obviously, there is a risk of damage to the company’s reputation if it is seen as inadequately protecting its customers’ privacy. In addition, increased reliance on search data by third parties in legal proceedings may impose serious time and financial burdens not only in terms of recovering the data but also in terms of determining the legal obligation to comply with such requests.

Australian Regulation of Search Data Retention

In Australia, the handling of personal information is regulated at the federal level by the Privacy Act 1988 (Cth) (as amended). The Act defines personal information as ‘information or an opinion... about an individual whose identity is apparent, or can reasonably be ascertained, from the information or opinion.’[10] The adequacy of this definition to cover the kind of information routinely collected on individuals in the electronic age, including search data, is currently being considered by the Australian Law Reform Commission (ALRC) in its inquiry into the Privacy Act. In a 2006 Issues Paper, the ALRC notes that in some circumstances information, such as an IP address, which enables individuals to be contacted, tracked, or profiled, may not be considered personal information because it does not allow the identity of the individual to be ‘reasonably ascertained’.[11] As such, there is not yet an authoritative answer to the question of whether search data, which will contain an IP address but not necessarily information directly identifying the individual searcher, constitutes personal information within the scope of the Privacy Act.

It is clear, however, that search data can be, and already has been, used to reveal the identity of individual searchers. While any individual element contained in the record may not reveal an individual’s identity, the record taken as a whole can lead to identity being ‘reasonably ascertained’. There is therefore a strong argument that the circumstances in which this data is collected and stored mean that it is personal information. At least one leading commentator on Australian Internet law has argued that this information is protected under the Privacy Act.[12] In addition, there is a definite trend elsewhere towards regarding search data as protected personal information. The Resolution on Privacy Protection and Search Engines, adopted by International Privacy Commissioners in November 2006, states that ‘search histories stored by providers of search engines now in many cases may constitute personally identifiable data.’[13] As demonstrated by its ongoing investigation, the EC Article 29 Working Party also considers search logs that can be linked to individuals to be personal information.

Given these influential developments, and in the absence of legal certainty domestically, it would be advisable for any search engine service operating within Australia that is not covered by the small business exception[14] to handle search data in accordance with the Privacy Act and the National Privacy Principles (NPPs) contained therein. The extra-territorial scope of the Privacy Act[15] means that search engines outside Australia should also comply with these principles if they are handling personal information relating to an Australian citizen or resident and there is some organisational or other link with Australia.

The following NPPs stand out as of particular relevance to the terms of collection and retention of individuals’ search records by search engines.

NPP 10: Collection of sensitive information

NPP 10 provides heightened protection for ‘sensitive information’ which is defined in the Act to include, among other things, personal information or opinion about an individual’s religious beliefs and affiliations, political opinions or associations, and sexual preferences. The NPPs do not distinguish between the incidental collection of sensitive data (i.e. in the course of collection of other data) and the intended collection of sensitive data. As search data will very often reveal information about an individual user’s beliefs, affiliations and preferences, it can plausibly be argued that search data is subject to this higher standard.

Unlike regular personal information, which under NPP 1 requires simple notice of the collection, sensitive data may be collected by a private organisation only where strict preconditions are met. These include where the subject consents, where the collection is required by law, or where the collection is necessary to prevent or lessen a threat to the life or health of an individual.

With regard to search data, there are no legal requirements for routine collection (see below) nor is it likely that the collection of such data will ever be necessary to safeguard the life or health of any individual. Therefore, assuming that the higher standard for sensitive data applies, search engines should obtain consent before maintaining any record of individuals’ search queries in an identifiable form.

The difficulty of this requirement, in practical terms, is that a search engine does not know in advance what information a user will enter and consequently cannot determine whether consent is necessary. Guidance from the Federal Office of the Privacy Commissioner (OPC) puts forward one option for overcoming this problem. It suggests that by providing detailed notice to individuals with respect to the possible uses and disclosures of any information collected (sensitive or not), an organisation can assume it has consent for such activity.[16]

NPP 2: Secondary uses of personal information

The general rule under NPP 2 is that personal information should not be used or disclosed for a purpose other than that for which it was collected. The principle does provide exceptions for certain secondary uses, most relevantly where: (1) the secondary use relates (or, in the case of sensitive information, directly relates) to the primary purpose of collection and could be reasonably expected by the individual; (2) the individual consents; or (3) the secondary use is direct marketing, it is impracticable to seek express consent, and the individual has the opportunity to opt out.

With respect to the secondary use of search queries by search engines, it is highly unlikely that this would fall within the first exception. Even if the use does relate to the provision of a search service (e.g. improving the service) it would be difficult to demonstrate that the average user would reasonably expect such use of his/her information in identifiable form. As such, search engines intending to retain search queries in identifiable form would be advised to obtain the individual’s consent.

NPP 4: Security of personal information

NPP 4.1 requires an organisation to ‘take reasonable steps to protect the personal information it holds from misuse and loss and from unauthorised access, modification or disclosure.’ Thus an incident such as the above-mentioned leak of users’ search records by AOL would likely breach this provision, as would a disclosure resulting from a malicious attack if adequate security measures were not in place.

Even more pertinent to the issue of search data retention is the requirement in NPP 4.2 for an organisation to ‘take reasonable steps to destroy or permanently de-identify personal information’ once it is no longer needed for any lawful secondary purpose. The OPC Guidance on this principle notes that ‘reasonable steps’ would include ‘ensuring that the de-identified information cannot be re-identified in the hands of an organisation receiving the data.’ In the context of search engines, application of this principle would mean stripping the search logs of IP addresses or any other kind of unique number that could be used to uncover the identity of the individual user.
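One practical trap when applying NPP 4.2 is substituting a hash of the IP address for the address itself. Because the IPv4 address space is small, an unsalted hash can be reversed by exhaustive search, so the data remains re-identifiable rather than permanently de-identified. A hypothetical sketch of the problem (narrowed to one /24 block for brevity; a determined attacker could search the whole space):

```python
import hashlib

# A tempting but weak "de-identification": hashing the IP address.
# The IPv4 space is only 2**32 values, so an unsalted hash is a
# pseudonym, not permanent de-identification in the NPP 4.2 sense.
def weak_deidentify(ip: str) -> str:
    return hashlib.sha256(ip.encode()).hexdigest()

hashed = weak_deidentify("203.0.113.87")

# An attacker holding the hash recovers the IP by trying candidates:
recovered = None
for last in range(256):
    candidate = f"203.0.113.{last}"
    if weak_deidentify(candidate) == hashed:
        recovered = candidate
        break

print(recovered)  # "203.0.113.87" -- hashing alone did not de-identify
```

Outright removal or truncation of the identifier, as contemplated in the OPC Guidance, avoids this reversal entirely.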

Data retention requirements

As noted earlier in this article, the idea that search engines may have a duty under new data retention laws to retain individuals’ search records has been dismissed within the European context. In Australia, communications providers are currently not subject to data retention requirements. Even if such requirements were to be introduced in the future, it is unlikely that they would be so broad as to apply to the retention of search queries by search engines. Consequently, while search engines in Australia may be obliged in certain cases to reveal search data pursuant to a valid warrant or court order, they do not have any requirement to pre-emptively keep this data nor is it likely that they will be subject to any such requirements in the future.

Steps to Minimise Legal Liability and Promote User Trust

In today’s environment, there is a clear opportunity for companies to use privacy as a market differentiator. There are already signs of this in the search engine market, as Google’s competitors have rushed to take advantage of its recent negative publicity and lure away disillusioned users with offers of more privacy-friendly services.[17] This opportunity, as well as the need to comply with differing data protection laws, provides strong incentives for search engines to re-examine and strengthen their privacy policies and practices. The following sets out some simple steps that search engines can take to promote trust and loyalty among users and avoid legal difficulties.

Provide fair and transparent notice

Despite recent media attention to the issue, most users would still be shocked to learn the extent of the data retention performed by search engines. To ensure users are better informed, search engines should prominently display privacy policies that state in clear language what information is collected, for what purpose, for how long it will be kept, and under what circumstances it will be disclosed to third parties.

Request consent for retention of data

Recognising the sensitive nature of search data, search engines should request users to expressly consent, or at a minimum offer them an opt-out, before retaining this data in an identifiable form. Express consent is the standard recommended by the International Privacy Commissioners, in the Resolution on Privacy Protection and Search Engines 2006.

Retain data only for as long as is necessary

In any case where search data is stored in identifiable form, it should be retained only for as long as is necessary to fulfil a lawful business objective. After this time, it should be destroyed or, at a minimum, de-identified. As there are no bright-line rules to determine at what point data will be deemed ‘no longer necessary’, this will always involve an element of subjectivity. A good test is for the company to ask whether it could meet its lawful business objective without this data, and/or whether that objective could be met by maintaining the data in an anonymous form.
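A retention policy of this kind can be enforced mechanically. The sketch below assumes an invented record layout and an arbitrary 18-month window; choosing the appropriate period is a legal and business judgement, not a technical one:

```python
from datetime import datetime, timedelta, timezone

# Illustrative policy window -- the period itself is a legal/business call.
RETENTION = timedelta(days=18 * 30)

def apply_retention(records, now):
    """Keep identifiable records inside the retention window; strip the
    identifying fields (here: IP and cookie ID) from anything older,
    rather than continuing to hold it in identifiable form."""
    out = []
    for rec in records:
        if now - rec["ts"] > RETENTION:
            rec = {k: v for k, v in rec.items() if k not in ("ip", "cookie_id")}
        out.append(rec)
    return out

now = datetime(2007, 8, 1, tzinfo=timezone.utc)
records = [
    {"ts": datetime(2005, 6, 1, tzinfo=timezone.utc),
     "ip": "198.51.100.7", "cookie_id": "x9", "query": "flu symptoms"},
    {"ts": datetime(2007, 7, 1, tzinfo=timezone.utc),
     "ip": "198.51.100.8", "cookie_id": "y1", "query": "train timetable"},
]
result = apply_retention(records, now)
print("ip" in result[0], "ip" in result[1])  # False True
```

Run as a scheduled job, a routine of this shape makes the de-identification obligation in NPP 4.2 a default behaviour of the system rather than an ad hoc decision.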

Ensure proper security

Finally, in all cases where personally identifiable information is retained it should be subject to strict security measures - both technical and managerial - to ensure against unauthorised access, disclosure, or other misuse.


[1] Letter from Peter Schaar, Chairman of the Article 29 Data Protection Working Party, to Peter Fleischer, Google Privacy Counsel, 16 May 2007, <>.

[2] Privacy International, A Race to the Bottom: Privacy Ranking of Internet Service Companies, 9 June 2007, <>.

[3] Letter from Peter Fleischer, Google Privacy Counsel, to Peter Schaar, Chairman of the Article 29 Data Protection Working Party, 10 June 2007, <>.

[4] Article 29 Data Protection Working Party, Press Release, 21 June 2007, <>.

[5] Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data, Official Journal of the European Communities L281/31, 23 November 1995, <>.

[6] Directive 2006/24/EC of the European Parliament and of the Council of 15 March 2006 on the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks and amending Directive 2002/58/EC, Official Journal of the European Union L105/54, 13 April 2006, <>.

[7] OUT-LAW News, Data retention laws do not cover Google searches, says Europe, 13 June 2007, <>.

[8] M Barbaro and T Zeller, A Face Is Exposed for AOL Searcher No. 4417749, New York Times, 9 August 2006, <>.

[9] Columbia Pictures Indus. v. Bunnell, CV 06-1093 FMC(JCx), US District Court Central District of California, 5 May 2007.

[10] Section 6(1), Privacy Act 1988 (Cth).

[11] Australian Law Reform Commission, Review of Privacy: Issues Paper 31, October 2006, para 11.117, <>.

[12] See generally, Y F Lim, Cyberspace Law: Commentaries and Materials, 2nd edn, Oxford University Press, 2007, pp127-136.

[13] International Data Protection and Privacy Commissioners’ Conference, Resolution on Privacy Protection and Search Engines, 2-3 November 2006, <>.

[14] Section 6C of the Privacy Act creates an exception for most small businesses, which are defined in section 6D as businesses with an annual turnover of $3,000,000 or less.

[15] The extraterritorial provisions are set out in Section 5B of the Privacy Act.

[16] Office of the Privacy Commissioner, Guidelines to the National Privacy Principles, September 2001, <>.

[17] In July 2007, one major search engine announced it would make a new anonymous search feature available by the end of 2007. Yahoo and Microsoft also announced improvements to their privacy policies.