Jul 05

One Month is Enough

Tag: Ideas | Oren Hurvitz @ 8:31 pm
No Photos! (photo by vinnie bezoomny)

Dear friends,

The first rule of crisis management is to get ahead of the story. Since my shameful secret is about to be revealed, I decided to break it here first. I’d rather you heard it from me than from the media:

In March 2008 I watched Rick Astley’s music video Never Gonna Give You Up on YouTube. It’s widely considered to be the corniest music video ever created. I have no excuse; I can’t even claim to have been RickRolled. I heard about the video, and willingly went and viewed it. It was me, just me, officer!


The reason for this confession is that Google is about to hand over to Viacom a complete list of every video watched by YouTube users:

[...] the judge granted a Viacom motion that records of every video watched by YouTube users, including their login names and IP addresses, be turned over to the entertainment giant.

The order prevents Viacom from using this information to target lawsuits at users. But it makes no sense to give this information to Viacom in the first place: Google could easily make this data anonymous, and they’ve asked Viacom to do just that. Viacom have said that they won’t use any personally identifiable data, but they haven’t replied to Google’s request directly. These mixed signals make me lunge for my tin foil hat: what could explain Viacom’s behavior? Perhaps, once they have the logs in their possession, they intend to ask the judge to allow them greater use of the data. Or perhaps the data will be “accidentally” leaked — after all, that sort of thing happens all the time.

But criticizing a media company like Viacom for ignoring users’ privacy is like berating a toddler for getting food all over themselves: it’s in their nature, and they’re going to keep doing it. Let’s beat up on Google instead; that never gets old. Google shouldn’t have kept this data around for Viacom to subpoena. Google deletes personally identifiable user data after 18 months, which isn’t enough to hide my Rick Astley obsession. Google’s track record on privacy is spotty in general. For example, after a lot of pressure they finally added a link to their privacy policy on the Google homepage in July 2008, but not before bitching and moaning like a teenager whose parents have forced him to clean his room.

Google has some of the most sensitive data in the world; in particular, they know every search that a user makes. In their Privacy FAQ they list several good reasons why they need to keep this data:

  • To improve search results
  • To maintain the security of their systems
  • To prevent fraud and other abuses

It’s true that in order to achieve these goals Google needs to save the search logs. However, the problem isn’t that they keep the search logs; it’s that they keep personally identifiable information in the logs, which lets them (or anyone else, such as Viacom) associate searches and clicks with real people. Google keeps this information for 18 months, and that’s far too long. They could erase the personal information much sooner and still achieve all of the goals described above.

For example, Google use the search logs to find common spelling mistakes made by users, so that they can offer automatic suggestions for the correct spelling. This doesn’t require any personally identifiable information. Another use for the search logs is to detect click fraud. For this purpose it is indeed useful to look at the search and click history of individual users. However, the benefit of this personal data quickly diminishes with time. Data about click fraud that is over a month old should be considered prehistoric; the perpetrators are long gone from whatever IP they had been using.

Private Property (photo by Zervas)

Google’s privacy policy doesn’t say how long they keep search logs; probably forever. The only promise they make is to scrub out personally identifiable information after 18 months. Google are very vague about where this figure of “18 months” comes from; perhaps it has some religious significance. From Google’s Privacy FAQ:

Why are logs kept for 18 months before being anonymized?

We strike a reasonable balance between the competing pressures we face, such as the privacy of our users, the security of our systems and the need for innovation. We believe 18 months strikes the right balance.

It’s time we told Google: 18 months is too long. One month would strike the right balance between privacy, security and the need for innovation. With one month of personally identifiable information, Google will be able to catch all the fraud they are ever likely to catch. After that, it’s time to anonymize the data. The anonymized data is still useful for improving their search engine.
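To make the proposal concrete, here is a minimal sketch of what "anonymize after one month" could look like for a search log. Everything in it is an assumption made for illustration: the `LogEntry` fields, the `anonymize_old` helper, and the choice to treat one month as 30 days are hypothetical, and none of it describes how Google actually stores or scrubs its logs.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import List, Optional

# Assumption: "one month" is taken to mean 30 days.
RETENTION = timedelta(days=30)


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical search-log record; the field names are illustrative only."""
    timestamp: datetime
    query: str
    ip: Optional[str]    # personally identifiable
    user: Optional[str]  # personally identifiable


def anonymize_old(entries: List[LogEntry], now: datetime,
                  retention: timedelta = RETENTION) -> List[LogEntry]:
    """Return a copy of the log with ip and user removed from every entry
    older than the retention period. Recent entries are left untouched so
    they can still be used for things like fraud detection."""
    cutoff = now - retention
    result = []
    for entry in entries:
        if entry.timestamp < cutoff:
            # Keep the query and timestamp, drop the identifying fields.
            entry = replace(entry, ip=None, user=None)
        result.append(entry)
    return result


if __name__ == "__main__":
    now = datetime(2008, 7, 5)
    logs = [
        LogEntry(datetime(2008, 3, 15), "rick astley", "203.0.113.7", "oren"),
        LogEntry(datetime(2008, 7, 1), "viacom lawsuit", "203.0.113.7", "oren"),
    ]
    for entry in anonymize_old(logs, now):
        print(entry)
```

The old entry keeps its query, so it stays useful for improving search results, while the recent entry keeps its IP and login name for fraud detection. Dropping two fields is only the simplest form of anonymization; a real system would also have to consider whether the queries themselves can identify a person.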

Go to Google’s Privacy Feedback page and ask them to reduce the amount of time they keep personally identifiable data in their logs. You could use a message such as this one:

Dear Google,

I’m concerned about your data retention policy: you keep user identifiable information in your search logs for 18 months, and that’s too long. As we have seen with the recent lawsuit by Viacom, this information can easily fall into the hands of third parties. To protect my privacy and the privacy of the rest of your users, please reduce the amount of time you keep personally identifiable data to one month. Thank you.

Google isn’t alone in this. Microsoft also anonymizes its logs after 18 months. Yahoo makes do with just 13 months (how did they come up with that number? Perhaps it also holds occult significance). Ask.com, the fourth-largest search provider, gives its users the option of making completely anonymous searches. But we should focus on Google: where the market leader goes, the rest will surely follow.



30 Responses to “One Month is Enough”

  1. Seth says:

    Letter sent to Google. Thanks for the great idea.

  2. Kiall says:

    Sent…

  3. Robin H Johnson says:

    See what my boss @ IsoHunt wrote on it :-)

  4. William Rea says:

    Sent.

  5. John Hasler says:

    It seems that all of your problems with lack of Google privacy follow from your having a Google account. Why do you need one? I have no difficulty using Google’s search service with no account and without even allowing them to set cookies.

  6. humm says:

    I have a very simple answer for this, since the topic is the length of time Google holds info: I don’t use Google directly. Nor do I use Yahoo. I use Scroogle, a proxied search engine that uses Google’s database without the ads in the middle. Google receives an IP but not the one I use. It gets Scroogle’s IP. Since Scroogle is a lot more sensitive about their users’ data, I can live with their idea far better than Google’s.

    Not to mention that I don’t see 4 pages with ads mixed in the middle. I detest ads with a passion. Ever since I quit watching TV, I find that I am even more sensitive to the intrusion of unwanted advertisement. Every time I shop, I take it out on companies that have intruded on my privacy. When I look at a product, the first thing that comes to my mind is: have I heard of these people by ad? If I have, next item. Ad budgets come back to you in hidden prices you pay. If a company’s products are good, they don’t have to advertise. It’s those that do that make me think something is wrong with the product.

    I would not have cared about Google’s retention policy had it not been for the fact that this isn’t the first time it’s come up. It’s the first time the results have had an impact on its users, and that is what is coming home to roost.

    I care about my privacy. I have also found if you won’t protect your privacy, no one will do it for you. They will however be quite willing to take advantage of that. They will get a database to sell from your data and you in return will get to pay an extra cost of seeing more ads, taking your valuable time to wade through. I’m sure you have heard that time is money.

  7. Michael says:

    Sent, great idea.

  8. Dallas says:

    Letter sent. Thanks for the heads up.

  9. omar says:

    I sent this. I think they just have to anonymize it to get rid of the problem.

    Dear Google,

    I’m concerned about your data retention policy: you keep user identifiable information in your search logs for 18 months, and that’s too long. As we have seen with the recent lawsuit by Viacom, this information can easily fall into the hands of third parties. To protect my privacy and the privacy of the rest of your users, please reduce the amount of time you keep personally identifiable data to one month or anonymize it. Thank you.

  10. Torsten says:

    Why one month? Why keep personally identifiable data at all? Why not make it anonymous before saving it?

  11. Rodrigo says:

    Sent

    Q: How long will my email address (which I’ve just entered above) be kept on the servers?

  12. Frank says:

    If I start getting email that can be traced back to Viacom, I’ll know why now, won’t I?

  13. Z says:

    Hm, how about people that search the web for “how to kill a wife using an X-type gun” and then kill their wife with an X-type gun? For such evidence, I guess it’s reasonable to keep personal data for more than just a month…

  14. Dj says:

    Aww, boohoo. They have my IP address and are going to come after me for watching Family Guy on YouTube. There was not much of a problem before but now that this happens, we should get involved because it’s a big dog company in a tight spot. If you really have a problem with it then use an anonymizer or just don’t use those websites. Oh and for the guy with ad problems, how bout getting Firefox with “Adblock Plus” and “Remove It Permanently”. I personally see virtually no ads, and the ones I do see are gone almost forever with 2 clicks of my mouse.

  15. SL says:

    Below was my response:

    ____________________________________
    Well … most of my friends are no longer laughing at my refusal to allow Google, or the Google children of Gmail and YouTube, to set ANY cookies whatsoever.

    And I don’t blame Verizon. I blame Google for this latest privacy fiasco.

    The data Google retains about its users is an ‘attractive nuisance’*, and as such it was Google’s responsibility to limit personally identifying information from their logs.

    Shame on you Google.

    Oh, and I’ve also switched my search to the privacy protecting PrivacyFinder.org. A little delay in getting results, yes, but keeping my data from your hands is synonymous with keeping my data from Verizon’s hands.

    *
    http://en.wikipedia.org/wiki/Attractive_nuisance_doctrine
    “Under the attractive nuisance doctrine of the law of torts, a landowner may be held liable for injuries to children trespassing on the land if the injury is caused by a hazardous object or condition on the land that is likely to attract children, who are unable to appreciate the risk posed by the object or condition. The doctrine has been applied to hold landowners liable for injuries caused by abandoned cars, piles of lumber or sand, trampolines, and swimming pools. However, it can be applied to virtually anything on the property of the landowner.” (WIKI)

  16. Anonymous says:

    Shout it from the rooftops!
    Tell everyone you know to Copy&Paste the above copypasta into an email to Google.
    Post it on every message board you frequent.
    Put it on your Myspace, Facebook, Bebo, what have you profile.
    Think this doesn’t affect you? Well you are wrong.
    This is only the beginning.
    If this problem isn’t corrected now it will only evolve into something far worse.
    Do Your Part!

  17. IonOtter says:

    Of course, you could always add googleanalytics.com, googleadservices.com and googlesyndication.com to your AdBlock and NoScript blacklist? It’s not perfect, but it does help throw things off a bit.

  18. Anon says:

    Google only partially anonymizes your IP after 18 months. Google is the internet’s Biggest Brother.

  19. Oren Hurvitz says:

    @John Hasler: most people do find Google’s services useful, and not just search: Gmail, YouTube, etc. These services (other than search) do require a login in order to gain the full benefit (e.g., uploading videos to YouTube). If you want complete privacy then limiting your internet usage to 1995 levels will certainly help. But for most of us, the solution has to come from Google limiting its use of our data, not from giving up on all modern web services.

  20. Oren Hurvitz says:

    @omar: after 18 months, Google anonymizes the data. They don’t delete the full logs; they only remove the personally identifiable information. What we’re asking for is that they do this after 1 month instead of 18 months.

  21. Oren Hurvitz says:

    @Torsten: there are valid uses for the personal data, such as detecting fraud. Asking Google to eliminate it altogether is not realistic. But they could definitely keep it for a much shorter period of time than 18 months.

  22. Oren Hurvitz says:

    @Rodrigo: Yes, now Google has the email addresses of the most vocal critics of their privacy policy… Let’s see if they even reply; I promise to follow up.

  23. David V says:

    sent

  24. Jason says:

    Sent y0

  25. Neil says:

    Sent :)

  26. Frank Mashraqi says:

    Sent!

  27. foo says:

    I don’t really see the problem with Google keeping the data. If you don’t want people to know what you’re searching for, don’t enter it on their website. I actually like Google keeping a record of my search history so that I can search my own search history. There is nothing in the history that I wouldn’t share with the world.

  28. Deryck says:

    sent

  29. Ray R says:

    I sent one also. Thanks for the link and others please do the same now.

  30. jistanidiot says:

    Sending a letter to Google won’t help fix this horrible mess. Yes, they should anonymize their data immediately (not just after the one month you’re asking for). However, we must stop the transfer of records now. Therefore someone needs to file an injunction against this action. Unfortunately that lets them know you’ve been up to something, so most people will just pray no one connects their IP address or username with whatever “bad” content they were downloading. I hope someone will have the courage (and the money) to take the correct action.
