<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/atom10full.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" xml:lang="en" xml:base="http://hurvitz.org/blog/wp-atom.php">
	<title type="text">Cookies are for Closers: Oren Hurvitz's Blog</title>
	<subtitle type="text">Not a baking blog, but possibly half-baked</subtitle>

	<updated>2008-09-12T08:51:46Z</updated>
	<generator uri="http://wordpress.org/" version="2.6.3">WordPress</generator>

	<link rel="alternate" type="text/html" href="http://hurvitz.org/blog" />
	<id>http://hurvitz.org/blog/feed/atom</id>
	

			<link rel="self" href="http://feeds.feedburner.com/CookiesAreForClosers" type="application/atom+xml" /><entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[Google Improves Privacy, Petulantly]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/388979326/google-improves-privacy-petulantly" />
		<id>http://hurvitz.org/blog/?p=61</id>
		<updated>2008-09-12T07:14:12Z</updated>
		<published>2008-09-10T20:19:14Z</published>
		<category scheme="http://hurvitz.org/blog" term="Uncategorized" />		<summary type="html"><![CDATA[Google have announced that they&#8217;ll reduce the amount of time that they keep individually-identifiable information about searches from 18 months to 9 months. I would like to think that my previous post on this topic played a small part in this decision, but it looks like it was mostly due to pressure from the European [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/09/google-improves-privacy-petulantly">&lt;p&gt;Google &lt;a href="http://googleblog.blogspot.com/2008/09/another-step-to-protect-user-privacy.html"&gt;have announced&lt;/a&gt; that they&amp;#8217;ll reduce the amount of time that they keep individually-identifiable information about searches from 18 months to 9 months. I would like to think that &lt;a href="http://hurvitz.org/blog/2008/07/one-month-is-enough"&gt;my previous post&lt;/a&gt; on this topic played a small part in this decision, but it looks like it was mostly due to pressure from the European Union.&lt;/p&gt;
&lt;p&gt;In that post, I showed a request that I had sent to Google, asking them to reduce the amount of time they keep private data to just one month. I asked everyone who read the post to send a similar request to Google, and judging from the comments some people did so. For the record, I did receive a reply from Google, but it was just a standard email that didn&amp;#8217;t actually address my points: it just reiterated their position on data retention. In fine Google tradition, it would appear that no human was involved in sending that response.&lt;/p&gt;
&lt;p&gt;Although I would like to see Google reduce the retention period further, to one month, this is a big step in the right direction. Google deserves credit for listening to the public and changing their practices. It is therefore unfortunate that they chose to pepper this announcement with vague threats:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/09/petulant_ballerina_2522191203_2281175cfe.jpg"&gt;&lt;img class="size-medium wp-image-69 alignright" title="Petulant Ballerina" src="http://hurvitz.org/blog/wp-content/uploads/2008/09/petulant_ballerina_2522191203_2281175cfe-199x300.jpg" alt="" width="199" height="300" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;#8220;Back in March 2007, Google became the first leading search engine to announce a policy to anonymize our search server logs in the interests of privacy. [...] Although that was good for privacy, it was a difficult decision because the routine server log data we collect has always been a critical ingredient of innovation.&amp;#8221;&lt;/li&gt;
&lt;li&gt;&amp;#8220;When we began anonymizing after 18 months, we knew it meant sacrifices in future innovations in all of these areas [search quality, security, fighting fraud and reducing spam]. We believed further reducing the period before anonymizing would degrade the utility of the data too much and outweigh the incremental privacy benefit for users.&amp;#8221;&lt;/li&gt;
&lt;li&gt;&amp;#8220;While we&amp;#8217;re glad that this will bring some additional improvement in privacy, we&amp;#8217;re also concerned about the potential loss of security, quality, and innovation that may result from having less data.&amp;#8221;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What on earth could they mean?&lt;/p&gt;
&lt;p&gt;Translation #1: &amp;#8220;Ok world, you win, we&amp;#8217;ll keep the data for less time. But you&amp;#8217;re going to be sorry!&amp;#8221;&lt;/p&gt;
&lt;p&gt;Translation #2: &amp;#8220;We&amp;#8217;ve now reduced our retention period as far as humanly possible, and then some. Please don&amp;#8217;t make us reduce it any more!&amp;#8221;&lt;/p&gt;
&lt;p&gt;Google are keeping this discussion (of how long to keep the data) at a superficial level: they throw a number (&amp;#8220;18 months&amp;#8221;), the European Union throws a number (&amp;#8220;6 months&amp;#8221;?), I throw a number (&amp;#8220;1 month&amp;#8221;). You, too, can become a highly respected privacy advocate by coming up with your own number (that no one else has claimed yet) and writing about it!&lt;/p&gt;
&lt;p&gt;A more substantive discussion would require Google to reveal some of their cards: how much of a benefit to fraud protection do they derive from keeping this data for 9 months (vs. a shorter length of time)? How do 9 months of individually-identifiable information help them improve their algorithms vs. 1 month of such information, especially given that they will always have an unlimited amount of anonymized data?&lt;/p&gt;
&lt;p&gt;Of course, Google will never reveal this information because it would hurt their competitive position. But an experienced programmer, well-versed in the art, can make some reasonable guesses.&lt;/p&gt;
&lt;p&gt;Individually-identifiable information is most important for security, fraud prevention, and fighting spam. But since these are time-sensitive tasks, the information quickly loses its value. I believe that the residual value of this information is close to zero after a few weeks have passed.&lt;/p&gt;
&lt;p&gt;The other use for this data, improving search quality, can be handled with anonymized data for the most part. One example that Google commonly give is their automatic spell checker. But they don&amp;#8217;t need individually-identifiable information in order to figure out that people who search for &amp;#8220;brittaney&amp;#8221; really mean &amp;#8220;britney&amp;#8221;. Yes, I can envision some types of search quality improvements that would benefit from studying individually-identifiable information, but they are a minority, and Google can learn how to do that while keeping data for a shorter period of time. I therefore stand by my position that 1 month of private data would strike the right balance between privacy and security/fraud prevention/spam detection/search quality.&lt;/p&gt;
&lt;p&gt;(Photo by &lt;a href="http://www.flickr.com/photos/26610383@N04/"&gt;lesprit_descalier&lt;/a&gt;)&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=ds4WL"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=ds4WL" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=v0Etl"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=v0Etl" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=juBwL"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=juBwL" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=f9pqL"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=f9pqL" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/CookiesAreForClosers/~4/388979326" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/09/google-improves-privacy-petulantly#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/09/google-improves-privacy-petulantly/feed/atom" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/09/google-improves-privacy-petulantly</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[Lookin&#8217; For Tall Kids]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/369368151/lookin-for-tall-kids" />
		<id>http://hurvitz.org/blog/?p=45</id>
		<updated>2008-08-19T21:14:34Z</updated>
		<published>2008-08-19T21:14:34Z</published>
		<category scheme="http://hurvitz.org/blog" term="Sports" />		<summary type="html"><![CDATA[In the towns, in fields, in basketball courts, the talent scouts &#8212; looking for runners, looking for tall kids who can become championship sprinters.

Hey mister, that&#8217;s my kid over there! See him? He&#8217;s the one you&#8217;re looking for. Look how fast he is, he&#8217;ll break the world record.
No way, too short. Looks like he won&#8217;t [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/08/lookin-for-tall-kids">&lt;p&gt;In the towns, in fields, in basketball courts, the talent scouts &amp;#8212; looking for runners, looking for tall kids who can become championship sprinters.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/puck90/2376860644/in/pool-outdoorbasketball"&gt;&lt;img class="size-medium wp-image-48 alignleft" title="Kids playing basketball" src="http://hurvitz.org/blog/wp-content/uploads/2008/08/kids-basketball-bw-300x199.jpg" alt="Kids playing basketball" width="300" height="199" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Hey mister, that&amp;#8217;s my kid over there! See him? He&amp;#8217;s the one you&amp;#8217;re looking for. Look how fast he is, he&amp;#8217;ll break the world record.&lt;/p&gt;
&lt;p&gt;No way, too short. Looks like he won&amp;#8217;t be more&amp;#8217;n five-ten, five-eleven. That doesn&amp;#8217;t cut it anymore. Usain Bolt is six-five. That&amp;#8217;s what I want. Give me ten tall kids, and I&amp;#8217;ll find a champion.&lt;/p&gt;
&lt;p&gt;Wait, wait! Tyson Gay is five-eleven. Maurice Greene is just five-nine. And they won plenty of races!&lt;/p&gt;
&lt;p&gt;That&amp;#8217;s ancient history. We seen Bolt at the 2008 Olympics. Ran the 100 meters, set a new world record, 9.69 seconds. First man under 9.70, and he didn&amp;#8217;t even try hard. With those big strides, he overtook ever&amp;#8217;one else, then slowed down at the end to wave at the crowd and beat on his chest. Now that&amp;#8217;s a champion! All those stubby runners behind him looked like children. No, short runners are the past; tall ones are hot. I&amp;#8217;m looking here today, tomorrow I&amp;#8217;ll be in Oklahoma City.&lt;/p&gt;
&lt;p&gt;God, if I could only get a hundred tall kids. I don&amp;#8217;t care if they run or not. We&amp;#8217;ll take &amp;#8216;em, train &amp;#8216;em, see how fast they can go. Ninety-nine kids out of a hundred won&amp;#8217;t be good enough. Tall runners always have problems: their reaction time is slow, they can&amp;#8217;t move their feet fast enough. That&amp;#8217;s why everyone used to think that sprinters can&amp;#8217;t be tall. But that all changed after we saw Bolt run in the Olympics. Now everyone wants tall runners, runners with the potential to be the next Usain Bolt.&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s like we always thought automobiles should be big, as big as you can get &amp;#8216;em, and then one day we&amp;#8217;re told that small automobiles are better. Everyone&amp;#8217;s dumping their big trucks and gettin&amp;#8217; small cars. Jus&amp;#8217; like that, the world turned upside down. Well, that&amp;#8217;s how it is with runners now. We don&amp;#8217;t want them short; short runners are finished. Only tall ones. Can&amp;#8217;t find tall adult runners; we would have washed them out long ago. Gotta get &amp;#8216;em young, train a whole new generation of runners. When we tell our grandchildren we saw runners under six feet win gold medals they ain&amp;#8217;t gonna believe us. Christ, if I could only get five hundred tall kids! Find &amp;#8216;em, sign &amp;#8216;em up, get them to a trainer, collect the finder&amp;#8217;s fee! If I had enough tall kids I&amp;#8217;d retire in six months.&lt;/p&gt;
&lt;p&gt;What do you mean, you don&amp;#8217;t want to run? Now, look here. I&amp;#8217;m tryin&amp;#8217; to help you become a star, and you took all this time. I might a signed three kids while I been talkin&amp;#8217; to you. I&amp;#8217;m disgusted. Yeah, sign right here, and get your parents to sign over there. I&amp;#8217;ll send you the details later.&lt;/p&gt;
&lt;p&gt;Jesus, I wisht I had a thousand tall kids!&lt;/p&gt;
&lt;p&gt;(With apologies to &lt;a href="http://www.amazon.com/Grapes-Wrath-Centennial-John-Steinbeck/dp/0142000663/"&gt;John Steinbeck&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;(Photo by &lt;a href="http://www.flickr.com/photos/puck90/"&gt;puck90&lt;/a&gt;)&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=aDf76K"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=aDf76K" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=8ZRgUk"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=8ZRgUk" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=B8qlvK"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=B8qlvK" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=2PU5WK"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=2PU5WK" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/CookiesAreForClosers/~4/369368151" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/08/lookin-for-tall-kids#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/08/lookin-for-tall-kids/feed/atom" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/08/lookin-for-tall-kids</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[Man Being Eaten By Alligator]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/331090393/man-being-eaten-by-alligator" />
		<id>http://hurvitz.org/blog/?p=40</id>
		<updated>2008-07-09T21:35:22Z</updated>
		<published>2008-07-09T20:27:18Z</published>
		<category scheme="http://hurvitz.org/blog" term="Fun" />		<summary type="html"><![CDATA[The Wilhelm Scream (originally called &#8220;Man being eaten by alligator&#8221;) is an inside joke in Hollywood: it&#8217;s a short but distinctive scream that has been used in hundreds of movies. Someone with too much time on his hands compiled this video with clips of many of the movies where the scream appears:

The more you watch [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/07/man-being-eaten-by-alligator">&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/Wilhelm_scream"&gt;Wilhelm Scream&lt;/a&gt; (originally called &amp;#8220;Man being eaten by alligator&amp;#8221;) is an inside joke in Hollywood: it&amp;#8217;s a short but distinctive scream that has been used in hundreds of movies. Someone with too much time on his hands compiled this video with clips of many of the movies where the scream appears:&lt;/p&gt;
&lt;p&gt;&lt;object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"&gt;&lt;param name="wmode" value="transparent" /&gt;&lt;param name="src" value="http://www.youtube.com/v/4YDpuA90KEY&amp;amp;hl=en" /&gt;&lt;embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/4YDpuA90KEY&amp;amp;hl=en" wmode="transparent"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/p&gt;
&lt;p&gt;The more you watch this video the funnier it gets. Villains scream it, usually while flying through the air. Animated characters scream it. Every Star Wars movie features the scream, as do Star Wars spoofs. Sometimes even heroes scream it. It gets so ridiculous you want to scream it yourself. Oiaaaaargh! The next time you host a party, have a Wilhelm Scream Competition. The winner gets a Lacoste shirt.&lt;/p&gt;
&lt;p&gt;Compare the Wilhelm Scream to Howard Dean&amp;#8217;s famous scream during the 2004 Democratic presidential primaries, the one that &lt;a href="http://www.pbs.org/newshour/extra/features/jan-june04/dean_2-18.html"&gt;ended his chances&lt;/a&gt;. He really nailed it! (Turn up the volume for full effect.)&lt;/p&gt;
&lt;p&gt;&lt;object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="344" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"&gt;&lt;param name="allowFullScreen" value="true" /&gt;&lt;param name="src" value="http://www.youtube.com/v/KDwODbl3muE&amp;amp;hl=en&amp;amp;fs=1" /&gt;&lt;embed type="application/x-shockwave-flash" width="425" height="344" src="http://www.youtube.com/v/KDwODbl3muE&amp;amp;hl=en&amp;amp;fs=1" allowfullscreen="true"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/p&gt;
&lt;p&gt;(Via &lt;a href="http://freakonomics.blogs.nytimes.com/2008/07/08/whats-your-wilhelm-scream/"&gt;Freakonomics&lt;/a&gt;)&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=9Bh7CJ"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=9Bh7CJ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=EfTTPj"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=EfTTPj" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=16tDkJ"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=16tDkJ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=aEneAJ"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=aEneAJ" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/CookiesAreForClosers/~4/331090393" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/07/man-being-eaten-by-alligator#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/07/man-being-eaten-by-alligator/feed/atom" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/07/man-being-eaten-by-alligator</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[One Month is Enough]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/327498786/one-month-is-enough" />
		<id>http://hurvitz.org/blog/?p=37</id>
		<updated>2008-09-12T08:51:46Z</updated>
		<published>2008-07-05T17:31:55Z</published>
		<category scheme="http://hurvitz.org/blog" term="Ideas" />		<summary type="html"><![CDATA[
Dear friends,
The first rule of crisis management is to get ahead of the story. Since my shameful secret is about to be revealed, I decided to break it here first. I&#8217;d rather you heard it from me than from the media:
In March 2008 I watched Rick Astley&#8217;s music video Never Gonna Give You Up on [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/07/one-month-is-enough">&lt;div id="attachment_38" class="wp-caption alignnone" style="width: 310px"&gt;&lt;a href="http://www.flickr.com/photos/willi-r-7/"&gt;&lt;img class="size-medium wp-image-38" title="No Photos!" src="http://hurvitz.org/blog/wp-content/uploads/2008/07/190650889_c22d065d59-300x241.jpg" alt="No Photos! (photo by vinnie bezoomny)" width="300" height="241" /&gt;&lt;/a&gt;&lt;p class="wp-caption-text"&gt;No Photos! (photo by vinnie bezoomny)&lt;/p&gt;&lt;/div&gt;
&lt;p&gt;Dear friends,&lt;/p&gt;
&lt;p&gt;The first rule of crisis management is to get ahead of the story. Since my shameful secret is about to be revealed, I decided to break it here first. I&amp;#8217;d rather you heard it from me than from the media:&lt;/p&gt;
&lt;p&gt;In March 2008 I watched Rick Astley&amp;#8217;s music video &lt;a href="http://youtube.com/watch?v=Yu_moia-oVI"&gt;Never Gonna Give You Up&lt;/a&gt; on YouTube. It&amp;#8217;s widely considered to be the corniest music video ever created. I have no excuse; I can&amp;#8217;t even claim to have been &lt;a href="http://en.wikipedia.org/wiki/Rickroll"&gt;RickRolled&lt;/a&gt;. I heard about the video, and willingly went and viewed it. It was me, just me, officer!&lt;/p&gt;
&lt;p&gt;The reason for this confession is that Google is about to hand over to Viacom a complete list of &lt;a href="http://news.cnet.com/8301-10784_3-9983511-7.html"&gt;every video watched by YouTube users&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;[...] the judge granted a Viacom motion that records of every video watched by YouTube users, &lt;strong&gt;including their login names and IP addresses&lt;/strong&gt;, be turned over to the entertainment giant.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;The order prevents Viacom from using this information to target lawsuits at users. But it makes no sense to give this information to Viacom in the first place: Google could easily make this data anonymous, and &lt;a href="http://news.cnet.com/8301-10784_3-9983702-7.html"&gt;they&amp;#8217;ve asked Viacom to do just that&lt;/a&gt;. Viacom have said that they won&amp;#8217;t use any personally identifiable data, but they haven&amp;#8217;t replied to Google&amp;#8217;s request directly. These mixed signals make me lunge for my tin foil hat: what could explain Viacom&amp;#8217;s behavior? Perhaps, once they have the logs in their possession, they intend to ask the judge to allow them greater use of the data. Or perhaps the data will be &amp;#8220;accidentally&amp;#8221; leaked &amp;#8212; after all, that sort of thing happens &lt;a href="http://www.privacyrights.org/ar/ChronDataBreaches.htm#CP"&gt;all the time&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But criticizing a media company like Viacom for ignoring users&amp;#8217; privacy is like berating a toddler for getting food all over themselves: it&amp;#8217;s in their nature, and they&amp;#8217;re going to keep doing it. Let&amp;#8217;s beat up on Google instead; that never gets old. Google shouldn&amp;#8217;t have kept this data around for Viacom to subpoena. Google deletes personally identifiable user data &lt;a href="http://www.google.com/intl/en/privacy_faq.html"&gt;after 18 months&lt;/a&gt;, which isn&amp;#8217;t soon enough to hide my Rick Astley obsession. Google&amp;#8217;s track record on privacy is spotty in general. For example, after a lot of pressure they finally added a &lt;a href="http://news.cnet.com/Google-adds-privacy-policy-link-to-home-page/2100-1030_3-6243162.html"&gt;link to their privacy policy&lt;/a&gt; on the Google homepage in July 2008, but not before bitching and moaning like a teenager whose parents have forced him to clean his room.&lt;/p&gt;
&lt;p&gt;Google has some of the &lt;a href="http://xkcd.com/155/"&gt;most sensitive data in the world&lt;/a&gt;; in particular, they know every search that a user makes. In their &lt;a href="http://www.google.com/intl/en/privacy_faq.html"&gt;Privacy FAQ&lt;/a&gt; they list several good reasons why they need to keep this data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;To improve search results&lt;/li&gt;
&lt;li&gt;To maintain the security of their systems&lt;/li&gt;
&lt;li&gt;To prevent fraud and other abuses&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&amp;#8217;s true that in order to achieve these goals Google needs to save the search logs. However, the problem isn&amp;#8217;t that they keep the search logs; it&amp;#8217;s that they keep personally identifiable information in the logs, which lets them (or anyone else, such as Viacom) associate searches and clicks with real people. Google keeps this information for 18 months, and that&amp;#8217;s far too long. They could erase the personal information much sooner and still achieve all of the goals described above.&lt;/p&gt;
&lt;p&gt;For example, Google use the search logs to find common spelling mistakes made by users, so that they can offer automatic suggestions for the correct spelling. This doesn&amp;#8217;t require any personally identifiable information. Another use for the search logs is to detect click fraud. For this purpose it is indeed useful to look at the search and click history of individual users. However, the benefit of this personal data quickly diminishes with time. Data about click fraud that is over a month old should be considered prehistoric; the perpetrators are long gone from whatever IP they had been using.&lt;/p&gt;
&lt;div id="attachment_39" class="wp-caption alignnone" style="width: 310px"&gt;&lt;a href="http://www.flickr.com/photos/zervas/"&gt;&lt;img class="size-medium wp-image-39" title="Private Property" src="http://hurvitz.org/blog/wp-content/uploads/2008/07/423810290_373d65c278-300x199.jpg" alt="Private Property (photo by Zervas)" width="300" height="199" /&gt;&lt;/a&gt;&lt;p class="wp-caption-text"&gt;Private Property (photo by Zervas)&lt;/p&gt;&lt;/div&gt;
&lt;p&gt;Google&amp;#8217;s privacy policy doesn&amp;#8217;t say how long they keep search logs; probably forever. The only promise they make is to scrub out personally identifiable information after 18 months. Google are very vague about where this figure of &amp;#8220;18 months&amp;#8221; comes from; perhaps it has some &lt;a href="http://judaism.about.com/cs/judaismbasics/f/number18_why.htm"&gt;religious significance&lt;/a&gt;. From Google&amp;#8217;s Privacy FAQ:&lt;/p&gt;
&lt;blockquote&gt;
&lt;h4&gt;Why are logs kept for 18 months before being anonymized?&lt;/h4&gt;
&lt;p&gt;We strike a reasonable balance between the competing pressures we face, such as the privacy of our users, the security of our systems and the need for innovation. &lt;strong&gt;We believe 18 months strikes the right balance. &lt;/strong&gt;&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;It&amp;#8217;s time we told Google: 18 months is too long. &lt;span style="color: #ff0000;"&gt;&lt;strong&gt;&lt;span style="text-decoration: underline;"&gt;One&lt;/span&gt; month would strike the right balance&lt;/strong&gt;&lt;/span&gt; between privacy, security and the need for innovation. With one month of personally identifiable information, Google will be able to catch all the fraud they are ever likely to catch. After that, it&amp;#8217;s time to anonymize the data. The anonymized data is still useful for improving their search engine.&lt;/p&gt;
&lt;p&gt;Go to Google&amp;#8217;s &lt;a href="http://www.google.com/support/bin/request.py?form_type=user&amp;amp;stage=fm&amp;amp;user_type=user&amp;amp;contact_type=privacy&amp;amp;hl=en"&gt;Privacy Feedback&lt;/a&gt; page and ask them to reduce the amount of time they keep personally identifiable data in their logs. You could use a message such as this one:&lt;/p&gt;
&lt;div style="margin-left: 3em; margin-right: 3em"&gt;Dear Google,I&amp;#8217;m concerned about your data retention policy: you keep user identifiable information in your search logs for 18 months, and that&amp;#8217;s too long. As we have seen with the recent lawsuit by Viacom, this information can easily fall into the hands of third parties. To protect my privacy and the privacy of the rest of your users, please reduce the amount of time you keep personally identifiable data to one month. Thank you.&lt;/div&gt;
&lt;p&gt;Google isn&amp;#8217;t alone in this. &lt;a href="http://www.microsoft.com/info/privacy/search.mspx"&gt;Microsoft&lt;/a&gt; also anonymizes its logs after 18 months. &lt;a href="http://www.nytimes.com/2007/07/23/technology/23microsoftweb.html"&gt;Yahoo&lt;/a&gt; makes do with just 13 months (how did they come up with &lt;em&gt;that&lt;/em&gt; number? Perhaps it also holds &lt;a href="http://www.thevesselofgod.com/thirteen.html"&gt;occult significance&lt;/a&gt;). Ask.com, the fourth-largest search provider, gives its users the option of making &lt;a href="http://www.nytimes.com/2007/07/23/technology/23microsoftweb.html"&gt;completely anonymous searches&lt;/a&gt;. But we should focus on Google: where the market leader goes, the rest will surely follow.&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=whqOdJ"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=whqOdJ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=EuXeYj"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=EuXeYj" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=lStnBJ"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=lStnBJ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=UUCt2J"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=UUCt2J" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/CookiesAreForClosers/~4/327498786" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/07/one-month-is-enough#comments" thr:count="30" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/07/one-month-is-enough/feed/atom" thr:count="30" />
		<thr:total>30</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/07/one-month-is-enough</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[Apple&#8217;s Engineers: an Unexpected Profit Center]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/314024466/apples-engineers-profit-center" />
		<id>http://hurvitz.org/blog/?p=29</id>
		<updated>2008-06-17T20:07:29Z</updated>
		<published>2008-06-17T19:19:43Z</published>
		<category scheme="http://hurvitz.org/blog" term="Ideas" />		<summary type="html"><![CDATA[According to salary information collected by new startup Glassdoor, Apple pays its engineers significantly less than competing companies in Silicon Valley. Apple engineers make $89,000 a year, whereas Google engineers can buy four more Segways a year (pre-tax) with their $112,573 paycheck. Microsoft and Yahoo are closer to Google: both companies pay their engineers $105,000 [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/06/apples-engineers-profit-center">&lt;p&gt;According to salary information collected by new startup &lt;a href="http://www.glassdoor.com/index.htm"&gt;Glassdoor&lt;/a&gt;, Apple pays its engineers significantly less than competing companies in Silicon Valley. Apple engineers make $89,000 a year, whereas Google engineers can buy four more Segways a year (pre-tax) with their $112,573 paycheck. Microsoft and Yahoo are closer to Google: both companies pay their engineers $105,000 a year. See &lt;a href="http://www.techcrunch.com/2008/06/10/at-glassdoor-find-out-how-much-people-really-make-at-google-microsoft-yahoo-and-everywhere-else/"&gt;TechCrunch&amp;#8217;s review of Glassdoor&lt;/a&gt; for the data.&lt;/p&gt;
&lt;p&gt;I wondered how much of a difference this salary disparity made to Apple&amp;#8217;s bottom line, so I took a look at its &lt;a href="http://phx.corporate-ir.net/phoenix.zhtml?c=107357&amp;amp;p=irol-sec"&gt;annual 10-K filings&lt;/a&gt; from 2003 to 2007. Each of these reports includes, buried among its 170 pages, Apple&amp;#8217;s net income and how much it spent on R&amp;amp;D. For simplicity I assumed that the R&amp;#038;D budget was entirely spent on salaries; this isn&amp;#8217;t far off the mark in a hi-tech company like Apple.&lt;/p&gt;
&lt;p&gt;If Apple were to pay its engineers the same salaries as Google then its R&amp;amp;D budget would increase by 26%. This amount (26% of the R&amp;#038;D budget) is how much Apple saves each year by paying below-market salaries. I calculated what Apple&amp;#8217;s net income would have been if it had paid its engineers the same as Google, and these are the results:&lt;/p&gt;
&lt;p&gt;&lt;img src="http://hurvitz.org/blog/wp-content/uploads/2008/06/apple_net_income_increase_table.png" alt="Apple&amp;#039;s Increase in Net Income - Table" title="Apple&amp;#039;s Increase in Net Income - Table" width="487" height="120" class="alignnone size-full wp-image-35 noborder" /&gt;&lt;br clear="left"/&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Explanation:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;All dollar values are in millions.&lt;/li&gt;
&lt;li&gt;# Employees - from Apple&amp;#8217;s 10-K.&lt;/li&gt;
&lt;li&gt;R&amp;amp;D Budget - from Apple&amp;#8217;s 10-K.&lt;/li&gt;
&lt;li&gt;Adjusted R&amp;amp;D Budget - had Apple paid its engineers at the same level as Google, this would have been its R&amp;amp;D Budget.&lt;/li&gt;
&lt;li&gt;Net Income - from Apple&amp;#8217;s 10-K.&lt;/li&gt;
&lt;li&gt;Adjusted Net Income - had Apple paid its engineers at the same level as Google, this would have been its Net Income.&lt;/li&gt;
&lt;li&gt;Increase in Net Income - the magnitude by which Apple&amp;#8217;s net income was higher that year compared to what it would have been had it paid salaries at the same level as Google.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Adjusted Net Income is a good estimate, but it&amp;#8217;s not completely accurate. For example, the increase in Apple&amp;#8217;s R&amp;#038;D Budget would have meant higher expenses, so it would have paid less in taxes. But the overall trend is clear.&lt;/p&gt;
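&lt;p&gt;The arithmetic behind the table can be sketched in a few lines. The 26% figure follows from the two average salaries quoted above, and the adjusted figures come from applying it to the R&amp;amp;D budget. The function names and the sample inputs are hypothetical and only illustrate the method; they are not figures from the 10-K filings.&lt;/p&gt;
&lt;pre&gt;
```python
APPLE_SALARY = 89_000      # average Apple engineer salary quoted above
GOOGLE_SALARY = 112_573    # average Google engineer salary quoted above


def salary_gap():
    """Fractional raise that would bring Apple salaries up to Google's."""
    return GOOGLE_SALARY / APPLE_SALARY - 1


def adjust(rd_budget, net_income):
    """Return (adjusted budget, adjusted net income), both in millions.

    Assumes the whole research budget is salaries and ignores taxes.
    """
    extra_cost = rd_budget * salary_gap()
    return rd_budget + extra_cost, net_income - extra_cost


print(f"salary gap: {salary_gap():.0%}")

# Hypothetical inputs, NOT Apple's real figures: budget 500, net income 100.
adjusted_budget, adjusted_income = adjust(500, 100)
print(round(adjusted_budget), round(adjusted_income))
```
&lt;/pre&gt;
&lt;p&gt;With these made-up inputs a profit of 100 turns into a loss of about 32, which shows how much a 26% raise matters when profits are small relative to the R&amp;amp;D budget.&lt;/p&gt;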
&lt;p&gt;Here&amp;#8217;s the Increase in Net Income in chart form:&lt;/p&gt;
&lt;p&gt;&lt;img src="http://hurvitz.org/blog/wp-content/uploads/2008/06/apple_net_income_increase.jpg" alt="Apple&amp;#039;s Increase in Net Income" title="Apple&amp;#039;s Increase in Net Income" width="414" height="418" class="alignnone size-full wp-image-36 noborder" /&gt;&lt;br clear="left"/&gt;&lt;/p&gt;
&lt;p&gt;In 2003 and 2004, the effect of underpaying its engineers made a huge difference to Apple&amp;#8217;s bottom line. In 2003, these savings turned around Apple&amp;#8217;s year: from a loss to a small profit. In 2004, they doubled the profit. However, once Apple&amp;#8217;s earnings began to skyrocket in 2005, the effect of the R&amp;amp;D savings became much smaller: just 6% of the net income in 2007, for example.&lt;/p&gt;
&lt;p&gt;Paying low salaries to its engineers was a lifesaver for Apple during its difficult times. But now that Apple is immensely profitable there&amp;#8217;s no more excuse for this practice. In the &lt;a href="http://www.techcrunch.com/2008/06/10/at-glassdoor-find-out-how-much-people-really-make-at-google-microsoft-yahoo-and-everywhere-else/"&gt;TechCrunch article&lt;/a&gt; mentioned previously, the site&amp;#8217;s owner Michael Arrington says: &amp;#8220;Apple software engineers make only about $89,000, on average, but they get to create some of the most loved products on Earth.&amp;#8221; I&amp;#8217;m sure this warms their hearts. But an extra $20,000 a year would make their hearts downright toasty, and their spouses&amp;#8217; as well.&lt;/p&gt;
</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/06/apples-engineers-profit-center#comments" thr:count="35" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/06/apples-engineers-profit-center/feed/atom" thr:count="35" />
		<thr:total>35</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/06/apples-engineers-profit-center</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[The Yahoo Effect]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/309144978/the-yahoo-effect" />
		<id>http://hurvitz.org/blog/?p=27</id>
		<updated>2008-09-12T08:36:06Z</updated>
		<published>2008-06-10T22:20:50Z</published>
		<category scheme="http://hurvitz.org/blog" term="Scalability" />		<summary type="html"><![CDATA[Lukas Biewald and Chris Van Pelt of Dolores Labs wrote a fun application called FaceStat. This application lets its users evaluate each other based on their photos. Unlike its famous spiritual ancestor Hot or Not, in FaceStat each person can choose which criteria he or she wants to be evaluated on, e.g. &#8220;am I liberal [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/06/the-yahoo-effect">&lt;p&gt;Lukas Biewald and Chris Van Pelt of &lt;a href="http://doloreslabs.com/"&gt;Dolores Labs&lt;/a&gt; wrote a fun application called &lt;a href="http://facestat.com/"&gt;FaceStat&lt;/a&gt;. This application lets its users evaluate each other based on their photos. Unlike its famous spiritual ancestor &lt;a href="http://www.hotornot.com/"&gt;Hot or Not&lt;/a&gt;, in FaceStat each person can choose which criteria he or she wants to be evaluated on, e.g. &amp;#8220;am I liberal or conservative&amp;#8221;, &amp;#8220;do I seem trustworthy&amp;#8221;, etc.&lt;/p&gt;
&lt;p&gt;Everything was sunshine and puppies until the day Yahoo decided to link to FaceStat from their front page, sending masses of new visitors to the site. The FaceStat server gave a small whimper, rolled on its back and played dead. Incensed Yahoos took the site&amp;#8217;s downtime personally and resorted to stalking tactics: they found the email and phone number of the site&amp;#8217;s registered owner (Chris Van Pelt), and left him angry emails and phone messages. It&amp;#8217;s a tough racket, the web business.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/06/tv_2412063362_bc2bb2a56c.jpg"&gt;&lt;img class="size-medium wp-image-28 alignright" title="tv_2412063362_bc2bb2a56c" src="http://hurvitz.org/blog/wp-content/uploads/2008/06/tv_2412063362_bc2bb2a56c-300x225.jpg" alt="Defunct TV" width="300" height="225" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;After &lt;a href="http://www.lukasbiewald.com/?p=153"&gt;some frantic work over the weekend&lt;/a&gt; to add hardware and streamline the software, FaceStat was back online and able to handle the load. And what was that load? According to their amazing &lt;a href="http://blog.doloreslabs.com/2008/06/facestat-scales/"&gt;Google Analytics chart&lt;/a&gt;, they jumped from 10,000 pageviews per day to 800,000! That&amp;#8217;s not a hockey stick, that&amp;#8217;s a space elevator.&lt;/p&gt;
&lt;p&gt;So what happened? They fell victim to one of the classic dangers of the web. The most famous is the &lt;a href="http://en.wikipedia.org/wiki/Slashdot_effect"&gt;Slashdot Effect&lt;/a&gt;, which happens when a website is linked to from &lt;a href="http://www.slashdot.org/"&gt;Slashdot&lt;/a&gt;. But only slightly less well-known (despite being more potent) is the Yahoo Effect. Although they managed to recover fairly quickly, they lost valuable visitors during the time that their site was still on the front page of Yahoo, but inaccessible.&lt;/p&gt;
&lt;p&gt;Unfortunately, building an immunity to this kind of problem is usually not cost-effective. There are two options, and both of them have drawbacks.&lt;/p&gt;
&lt;p&gt;First, you can buy enough hardware in advance to survive the Yahoo Effect. But if you never get that link from the front page of Yahoo then you will have wasted a lot of money.&lt;/p&gt;
&lt;p&gt;Second, you can use Cloud Computing to enable your application to use additional servers when needed. In Cloud Computing, your application runs on a variable number of servers that are owned by someone else; you can add or remove servers at a moment&amp;#8217;s notice. The poster boy for this kind of service is &lt;a href="http://www.amazon.com/gp/browse.html?node=201590011"&gt;Amazon&amp;#8217;s Elastic Compute Cloud (EC2)&lt;/a&gt;. Since you can add resources almost instantly, your application can handle vastly increased loads when needed, and you pay only for the resources you actually require at any given moment. This is a very attractive proposition, and indeed a representative of cloud computing management company &lt;a href="http://www.rightscale.com/"&gt;RightScale&lt;/a&gt; was quick to leave a comment on Lukas Biewald&amp;#8217;s blog suggesting their services (thus demonstrating that ambulance chasing isn&amp;#8217;t just for lawyers anymore).&lt;/p&gt;
&lt;p&gt;Although cloud computing is cost-effective from a hardware point of view, it has a different cost: you must design your application in advance to use these resources. This requires additional development time, and that&amp;#8217;s also an up-front cost. Given the relative costs of programmers and hardware, it might be cheaper to buy additional servers than to rearchitect the application.&lt;/p&gt;
&lt;p&gt;So what&amp;#8217;s an internet entrepreneur to do? If you&amp;#8217;re starting a new application then definitely look into cloud computing to help your application withstand traffic spikes. Designing a new application to use cloud computing is easier than retrofitting it into an existing application. Another option is to use &lt;a href="http://code.google.com/appengine/"&gt;Google App Engine&lt;/a&gt;, which is Google&amp;#8217;s entry in the scalable web applications space. But that requires a significant commitment to do things the Google Way &amp;#8482;.&lt;/p&gt;
&lt;p&gt;Or just do what most of us (including FaceStat) do: build your application as quickly as possible, and worry about the traffic when you get it. It&amp;#8217;s the time-honored way: people won&amp;#8217;t respect you unless you&amp;#8217;ve got war stories about overcoming vast amounts of traffic with nothing but a screwdriver and a SCSI differential cable.&lt;/p&gt;
&lt;h4&gt;Update - June 14, 2008&lt;/h4&gt;
&lt;p&gt;Eran Hammer-Lahav spent two years &lt;a href="http://www.hueniverse.com/hueniverse/2008/04/the-last-announ.html"&gt;building Nouncer&lt;/a&gt;, a Twitter-like service, before deciding to shut down the project. One of his lessons from this experience is:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Many people criticize the typical path Web 2.0 applications take in their development: putting together a poorly executed site, gauging the market, and only upon success building the service to actually scale and accommodate the market. However, the cost of building scalability ahead of time is extremely high, and for most startup is cost prohibitive.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;(Photo by &lt;a href="http://www.flickr.com/photos/53317685@N00/"&gt;Robbt&lt;/a&gt;)&lt;/p&gt;
</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/06/the-yahoo-effect#comments" thr:count="0" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/06/the-yahoo-effect/feed/atom" thr:count="0" />
		<thr:total>0</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/06/the-yahoo-effect</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[Anatomy of a Con]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/308193294/anatomy-of-a-con" />
		<id>http://hurvitz.org/blog/?p=3</id>
		<updated>2008-09-12T08:34:07Z</updated>
		<published>2008-06-05T19:10:19Z</published>
		<category scheme="http://hurvitz.org/blog" term="Conferences" /><category scheme="http://hurvitz.org/blog" term="Reminisces" />		<summary type="html"><![CDATA[This is the tale of how I was conned at a conference. (As far as alliterative woes are concerned, I could have done worse: I could have been shafted at a shindig. Hoodwinked at a hootenanny. Mauled at a meal. You get the picture.)
Amsterdam, June 2000. The conference was about WAP. Do you remember WAP? [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/06/anatomy-of-a-con">&lt;p&gt;This is the tale of how I was conned at a conference. (As far as &lt;a href="http://en.wikipedia.org/wiki/Alliteration"&gt;alliterative&lt;/a&gt; woes are concerned, I could have done worse: I could have been shafted at a shindig. Hoodwinked at a hootenanny. Mauled at a meal. You get the picture.)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Amsterdam, June 2000.&lt;/strong&gt; The conference was about WAP. Do you remember &lt;a href="http://en.wikipedia.org/wiki/Wireless_Application_Protocol"&gt;WAP&lt;/a&gt;? It was an attempt to rewrite the entire web infrastructure from scratch for mobile phones. Instead of HTML we were supposed to use WML: a markup language which is almost, but not quite, entirely unlike HTML. WAP flopped, but not before dumping a sediment of useless software on every mobile phone, and an &lt;a href="http://www.amazon.com/Professional-Wap-Programmer-Charles-Arehart/dp/1861004044/ref=pd_bbs_6?ie=UTF8&amp;amp;s=books&amp;amp;qid=1212157143&amp;amp;sr=8-6"&gt;800-page tome&lt;/a&gt; in my suitcase (it was given away at the conference).&lt;/p&gt;
&lt;p&gt;But I didn&amp;#8217;t care about any of that in 2000. This was the dot-com era before the bubble burst, the weather was sunny and Amsterdam beautiful. After the conference ended I had some time to walk around Amsterdam and take in the canals, the bikes, and the coffee shops. The next day I took a train to the airport, and that&amp;#8217;s when I was conned and relieved of my briefcase, passport, plane ticket, camera, and various other items (but sadly, not the &lt;a href="http://www.amazon.com/Professional-Wap-Programmer-Charles-Arehart/dp/1861004044/ref=pd_bbs_6?ie=UTF8&amp;amp;s=books&amp;amp;qid=1212157143&amp;amp;sr=8-6"&gt;huge book&lt;/a&gt;).&lt;/p&gt;
&lt;div id="attachment_24" class="wp-caption alignright" style="width: 310px"&gt;&lt;a href="http://lost.about.com/od/sawyer/p/sawyer.htm"&gt;&lt;img class="size-full wp-image-24" title="Con Man" src="http://hurvitz.org/blog/wp-content/uploads/2008/06/josh_holloway_lost.jpg" alt="Con Man" width="300" height="375" /&gt;&lt;/a&gt;&lt;p class="wp-caption-text"&gt;Con Man&lt;/p&gt;&lt;/div&gt;
&lt;p&gt;It was mid-morning, and the train was almost empty. I had an entire car to myself at first. After a few stops one other guy came in and sat across the aisle from me. He seemed quite ordinary: in his 30&amp;#8217;s, some stubble, no distinguishing characteristics. He asked me something trivial about the stops that the train would make, but mostly just looked out the window and fiddled with his prepaid phone cards. (A note to my younger readers: in Ye Olden Days, before everyone had cellphones, people made calls using &lt;a href="http://en.wikipedia.org/wiki/TARDIS"&gt;public phone booths&lt;/a&gt;. Phone cards were used to pay for these calls.)&lt;/p&gt;
&lt;p&gt;A couple of stops before the airport Phone Card Guy jumped up as if he&amp;#8217;d just noticed that this was his stop, and hurried out, dropping a few of his phone cards in his haste. I looked at the cards on the floor, and then around the train. There was no one else there. So I picked up the cards, went to the door of the train and shouted after him, &amp;#8220;you dropped your phone cards!&amp;#8221; Phone Card Guy was already some distance away from the train, but he came back and took the cards, thanked me, and walked away. While this was happening, a passenger that I hadn&amp;#8217;t seen before came up behind me and left the train through the doorway I was standing in. He looked like a businessman: he wore a suit, and was in his 50&amp;#8217;s.&lt;/p&gt;
&lt;p&gt;I returned to my seat, and the train started moving again. It was then that I noticed that my briefcase and camera were gone from the seat where I&amp;#8217;d left them, and in a flash I realized what had happened.&lt;/p&gt;
&lt;p&gt;In con movies, at this point we would see a quick succession of scenes from earlier in the movie, explaining how the con was put together and making us see everything in a different light. This is how it worked: Phone Card Guy established rapport with me, so that I&amp;#8217;d be motivated to go to the door of the train and tell him that he dropped his phone cards. Suit Guy was his accomplice: his job was to lurk one car over and watch to see when I had left my seat and had my back turned. At that point Suit Guy came into the car, grabbed what he could, and left through the same door I was standing at! Phone Card Guy had gone one way and Suit Guy the opposite way, so I was looking in the wrong direction and didn&amp;#8217;t notice that Suit Guy was holding my briefcase. This was all timed so that the train started moving just as I realized what had happened, so I couldn&amp;#8217;t run after them or call for help.&lt;/p&gt;
&lt;p&gt;I was so full of admiration for their smooth technique that I almost didn&amp;#8217;t mind losing my stuff. Fortunately there was enough time for me to get replacement travel documents at the airport. They didn&amp;#8217;t issue me a new passport on the spot, of course: instead they had me travel with the sort of papers that are normally used to transport pets. Wuf!&lt;/p&gt;
&lt;p&gt;What I regret most is the loss of my camera, with its photos of Amsterdam. I hope the con men liked them.&lt;/p&gt;
</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/06/anatomy-of-a-con#comments" thr:count="1" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/06/anatomy-of-a-con/feed/atom" thr:count="1" />
		<thr:total>1</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/06/anatomy-of-a-con</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[LinkedIn Architecture]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/308193295/linkedin-architecture" />
		<id>http://hurvitz.org/blog/?p=23</id>
		<updated>2008-06-09T21:34:15Z</updated>
		<published>2008-06-04T21:20:53Z</published>
		<category scheme="http://hurvitz.org/blog" term="Scalability" />		<summary type="html"><![CDATA[At JavaOne 2008, LinkedIn employees presented two sessions about the LinkedIn architecture. The slides are available online:

LinkedIn - A Professional Social Network Built with Java™ Technologies and Agile Practices
LinkedIn Communication Architecture

These slides are hosted at SlideShare. If you register then you can download them as PDF&#8217;s.
This post summarizes the key parts of the LinkedIn architecture. [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/06/linkedin-architecture">&lt;p&gt;At &lt;a href="http://hurvitz.org/blog/2008/05/javaone-2008"&gt;JavaOne 2008&lt;/a&gt;, LinkedIn employees presented two sessions about the LinkedIn architecture. The slides are available online:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.slideshare.net/linkedin/linkedins-communication-architecture"&gt;LinkedIn - A Professional Social Network Built with Java™ Technologies and Agile Practices&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.slideshare.net/linkedin/linked-in-javaone-2008-tech-session-comm"&gt;LinkedIn Communication Architecture&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These slides are hosted at SlideShare. If you register then you can download them as PDF&amp;#8217;s.&lt;/p&gt;
&lt;p&gt;This post summarizes the key parts of the LinkedIn architecture. It&amp;#8217;s based on the presentations above, and on additional comments made during the presentation at JavaOne.&lt;/p&gt;
&lt;h3&gt;Site Statistics&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;22 million members&lt;/li&gt;
&lt;li&gt;4+ million unique visitors/month&lt;/li&gt;
&lt;li&gt;40 million page views/day&lt;/li&gt;
&lt;li&gt;2 million searches/day&lt;/li&gt;
&lt;li&gt;250K invitations sent/day&lt;/li&gt;
&lt;li&gt;1 million answers posted&lt;/li&gt;
&lt;li&gt;2 million email messages/day&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Software&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Solaris (running on Sun x86 platform and Sparc)&lt;/li&gt;
&lt;li&gt;Tomcat and Jetty as application servers&lt;/li&gt;
&lt;li&gt;Oracle and MySQL as DBs&lt;/li&gt;
&lt;li&gt;No ORM (such as Hibernate); they use straight JDBC&lt;/li&gt;
&lt;li&gt;ActiveMQ for JMS. (It&amp;#8217;s partitioned by type of messages. Backed by MySQL.)&lt;/li&gt;
&lt;li&gt;Lucene as a foundation for search&lt;/li&gt;
&lt;li&gt;Spring as glue&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Server Architecture&lt;/h2&gt;
&lt;h3&gt;2003-2005&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;One monolithic web application&lt;/li&gt;
&lt;li&gt;One database: the &lt;strong&gt;Core Database&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;The network graph is cached in memory in &lt;strong&gt;The Cloud&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Members &lt;strong&gt;Search&lt;/strong&gt; implemented using Lucene. It runs on the same server as The Cloud, because member searches must be filtered according to the searching user&amp;#8217;s network, so it&amp;#8217;s convenient to have Lucene on the same machine as The Cloud.&lt;/li&gt;
&lt;li&gt;WebApp updates the Core Database directly. The Core Database updates The Cloud.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;2006&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Added &lt;strong&gt;Replica DB&amp;#8217;s&lt;/strong&gt;, to reduce the load on the Core Database. They contain read-only data. A &lt;strong&gt;RepDB&lt;/strong&gt; server manages updates of the Replica DB&amp;#8217;s.&lt;/li&gt;
&lt;li&gt;Moved Search out of The Cloud and into its own server.&lt;/li&gt;
&lt;li&gt;Changed the way updates are handled, by adding the &lt;strong&gt;Databus&lt;/strong&gt;. This is a central component that distributes updates to any component that needs them. This is the new updates flow:
&lt;ul&gt;
&lt;li&gt;Changes originate in the WebApp&lt;/li&gt;
&lt;li&gt;The WebApp updates the Core Database&lt;/li&gt;
&lt;li&gt;The Core Database sends updates to the Databus&lt;/li&gt;
&lt;li&gt;The Databus sends the updates to: the Replica DB&amp;#8217;s, The Cloud, and Search&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
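&lt;p&gt;The 2006 update flow is essentially publish/subscribe. Here is a minimal sketch of that idea; the class names, the dict-based stores and the sample key are hypothetical stand-ins, since the slides describe the flow but not the implementation.&lt;/p&gt;
&lt;pre&gt;
```python
class Databus:
    """Fans each update out to every subscribed component."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, update):
        for handler in self.subscribers:
            handler(update)


class CoreDatabase:
    """Stand-in for the Core Database: stores a row, then notifies the Databus."""

    def __init__(self, databus):
        self.rows = {}
        self.databus = databus

    def write(self, key, value):
        self.rows[key] = value
        self.databus.publish((key, value))


# Three downstream components, each modelled as a plain dict.
replica_db, cloud, search_index = {}, {}, {}

bus = Databus()
for store in (replica_db, cloud, search_index):
    # Bind the current store as a default argument so each handler keeps its own.
    bus.subscribe(lambda update, store=store: store.update([update]))

core = CoreDatabase(bus)
core.write("member:42", "connected to member:7")   # the only write the WebApp makes
```
&lt;/pre&gt;
&lt;p&gt;The point of the design is that the WebApp only knows about the Core Database; adding another consumer means adding a subscriber, not changing the WebApp.&lt;/p&gt;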
&lt;h3&gt;2008&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;The WebApp doesn&amp;#8217;t do everything itself anymore: they split parts of its business logic into &lt;strong&gt;Services&lt;/strong&gt;.&lt;br /&gt;
The WebApp still presents the GUI to the user, but now it calls Services to manipulate the Profile, Groups, etc.&lt;/li&gt;
&lt;li&gt;Each Service has its own domain-specific database (i.e., vertical partitioning).&lt;/li&gt;
&lt;li&gt;This architecture allows &lt;em&gt;other&lt;/em&gt; applications (besides the main WebApp) to access LinkedIn. They&amp;#8217;ve added applications for Recruiters, Ads, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;The Cloud&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The Cloud is a server that caches the entire LinkedIn network graph in memory.&lt;/li&gt;
&lt;li&gt;Network size: 22M nodes, 120M edges.&lt;/li&gt;
&lt;li&gt;Requires &lt;strong&gt;12 GB RAM&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;There are &lt;strong&gt;40 instances&lt;/strong&gt; in production&lt;/li&gt;
&lt;li&gt;Rebuilding an instance of The Cloud from disk takes &lt;strong&gt;8 hours&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The Cloud is updated in real-time using the Databus.&lt;/li&gt;
&lt;li&gt;Persisted to disk on shutdown.&lt;/li&gt;
&lt;li&gt;The cache is implemented in C++, accessed via JNI. They chose C++ instead of Java for two reasons:
&lt;ul&gt;
&lt;li&gt;To use as little RAM as possible.&lt;/li&gt;
&lt;li&gt;Garbage Collection pauses were killing them. [LinkedIn said they were using advanced GC's, but GC's have improved since 2003; is this still a problem today?]&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Having to keep everything in RAM is a limitation, but as LinkedIn have pointed out, partitioning graphs is hard.&lt;/li&gt;
&lt;li&gt;[Sun offers servers with up to 2 TB of RAM (&lt;a href="http://www.sun.com/servers/highend/m9000/"&gt;Sun SPARC Enterprise M9000 Server&lt;/a&gt;), so LinkedIn could support up to 1.1 billion users before they run out of memory. (This calculation is based only on the number of nodes, not edges). Price is another matter: Sun say only "contact us for price", which is ominous considering that the prices they &lt;em&gt;do&lt;/em&gt; list go up to $30,000.]&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Cloud caches the entire LinkedIn Network, but each user needs to see the network from his &lt;em&gt;own&lt;/em&gt; point of view. It&amp;#8217;s computationally expensive to calculate that, so they do it just once when a user session begins, and keep it cached. That takes up to 2 MB of RAM per user. This cached network is &lt;strong&gt;not updated&lt;/strong&gt; during the session. (It &lt;strong&gt;is&lt;/strong&gt; updated if the user himself adds/removes a link, but not if any of the user&amp;#8217;s contacts make changes. LinkedIn says users won&amp;#8217;t notice this.)&lt;/p&gt;
&lt;p&gt;As an aside, they use &lt;a href="http://ehcache.sourceforge.net/"&gt;Ehcache&lt;/a&gt; to cache members&amp;#8217; profiles. They cache up to 2 million profiles (out of 22 million members). They tried caching using the LFU algorithm (Least Frequently Used), but found that Ehcache would sometimes block for 30 seconds while recalculating LFU, so they switched to LRU (Least Recently Used).&lt;/p&gt;
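&lt;p&gt;To see why LRU eviction is cheap, here is a minimal sketch of the policy. It is not how Ehcache implements it; the class and the sample keys are hypothetical. Every access moves the entry to the &amp;quot;recent&amp;quot; end, and eviction removes whatever sits at the other end, so there are no usage counts to recalculate.&lt;/p&gt;
&lt;pre&gt;
```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)        # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used


profiles = LRUCache(capacity=2)
profiles.put("alice", "profile A")
profiles.put("bob", "profile B")
profiles.get("alice")                # alice is now the most recently used
profiles.put("carol", "profile C")   # evicts bob, not alice
```
&lt;/pre&gt;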
&lt;h2&gt;Communication Architecture&lt;/h2&gt;
&lt;h3&gt;Communication Service&lt;/h3&gt;
&lt;p&gt;The Communication Service is responsible for &lt;strong&gt;permanent messages&lt;/strong&gt;, e.g. InBox messages and emails.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The entire system is asynchronous and uses JMS heavily&lt;/li&gt;
&lt;li&gt;Clients post messages via JMS&lt;/li&gt;
&lt;li&gt;Messages are then routed via a routing service to the appropriate mailbox or directly for email processing&lt;/li&gt;
&lt;li&gt;Message delivery: either Pull (clients request their messages), or Push (e.g., sending emails)&lt;/li&gt;
&lt;li&gt;They use Spring, with proprietary LinkedIn Spring extensions. Use HTTP-RPC.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;Scaling Techniques&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Functional partitioning: sent, received, archived, etc. [a.k.a. vertical partitioning]&lt;/li&gt;
&lt;li&gt;Class partitioning: Member mailboxes, guest mailboxes, corporate mailboxes&lt;/li&gt;
&lt;li&gt;Range partitioning: Member ID range; Email lexicographical range. [a.k.a. horizontal partitioning]&lt;/li&gt;
&lt;li&gt;Everything is asynchronous&lt;/li&gt;
&lt;/ul&gt;
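&lt;p&gt;Range partitioning by Member ID can be sketched as a lookup in a sorted list of range boundaries. The boundaries and database names below are invented for illustration; the presentation did not say how LinkedIn divides its ranges.&lt;/p&gt;
&lt;pre&gt;
```python
from bisect import bisect_right

# Hypothetical shard boundaries: each entry is the first member ID of a shard.
SHARD_STARTS = [0, 5_000_000, 10_000_000, 15_000_000]
SHARD_NAMES = ["mailbox-db-0", "mailbox-db-1", "mailbox-db-2", "mailbox-db-3"]


def shard_for_member(member_id):
    """Pick the mailbox database whose ID range contains member_id.

    Assumes member IDs are non-negative integers.
    """
    index = bisect_right(SHARD_STARTS, member_id) - 1
    return SHARD_NAMES[index]


print(shard_for_member(4_999_999))
print(shard_for_member(5_000_000))
print(shard_for_member(21_000_000))
```
&lt;/pre&gt;
&lt;p&gt;Because each lookup touches only the boundary list, routing stays cheap no matter how many mailboxes exist; the cost is that the ranges have to be rebalanced by hand if one of them grows faster than the others.&lt;/p&gt;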
&lt;h3&gt;Network Updates Service&lt;/h3&gt;
&lt;p&gt;The Network Updates Service is responsible for &lt;strong&gt;short-lived notifications&lt;/strong&gt;, e.g. status updates from your contacts.&lt;/p&gt;
&lt;h4&gt;Initial Architecture (up to 2007)&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;There are many services that can contain updates.&lt;/li&gt;
&lt;li&gt;Clients make separate requests to each service that can have updates: Questions, Profile Updates, etc.&lt;/li&gt;
&lt;li&gt;It took a long time to gather all the data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In 2008 they created the Network Updates Service. The implementation went through several iterations:&lt;/p&gt;
&lt;h4&gt;Iteration 1&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Client makes just one request, to the NetworkUpdateService.&lt;/li&gt;
&lt;li&gt;NetworkUpdateService makes multiple requests to gather the data from all the services. These requests are made in parallel.&lt;/li&gt;
&lt;li&gt;The results are aggregated and returned to the client together.&lt;/li&gt;
&lt;li&gt;Pull-based architecture.&lt;/li&gt;
&lt;li&gt;They rolled out this new system to everyone at LinkedIn, which caused problems while the system was stabilizing. In hindsight, they should have tried it out on a small subset of users first.&lt;/li&gt;
&lt;/ul&gt;
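&lt;p&gt;The pull-based aggregation in Iteration 1 looks roughly like the sketch below. The two service functions are hypothetical placeholders for the real Questions and Profile Updates services; the structure is what matters: one request in, several parallel requests out, one merged answer back.&lt;/p&gt;
&lt;pre&gt;
```python
from concurrent.futures import ThreadPoolExecutor


def fetch_questions(member_id):
    """Placeholder for a call to the Questions service."""
    return [f"question update for {member_id}"]


def fetch_profile_updates(member_id):
    """Placeholder for a call to the Profile service."""
    return [f"profile update for {member_id}"]


SERVICES = [fetch_questions, fetch_profile_updates]


def network_updates(member_id):
    """Query every service in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = [pool.submit(service, member_id) for service in SERVICES]
        updates = []
        for future in futures:          # collected in submission order
            updates.extend(future.result())
    return updates


print(network_updates(7))
```
&lt;/pre&gt;
&lt;p&gt;The response time is bounded by the slowest service rather than by the sum of all of them, but every read still pays for a full fan-out.&lt;/p&gt;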
&lt;h4&gt;Iteration 2&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Push-based architecture: whenever events occur in the system, add them to the user&amp;#8217;s &amp;quot;mailbox&amp;quot;. When a client asks for updates, return the data that&amp;#8217;s already waiting in the mailbox.&lt;/li&gt;
&lt;li&gt;Pros: reads are much quicker since the data is already available.&lt;/li&gt;
&lt;li&gt;Cons: might waste effort on moving around update data that will never be read. Requires more storage space.&lt;/li&gt;
&lt;li&gt;There is still post-processing of updates before returning them to the user. E.g.: collapse 10 updates from a user to 1.&lt;/li&gt;
&lt;li&gt;The updates are stored in CLOBs: 1 CLOB per update type per user (for a total of 15 CLOBs per user).&lt;/li&gt;
&lt;li&gt;Incoming updates must be added to the CLOB. Use optimistic locking to avoid lock contention.&lt;/li&gt;
&lt;li&gt;They had set the CLOB size to 8 KB, which was too large and led to a lot of wasted space.&lt;/li&gt;
&lt;li&gt;Design note: instead of CLOBs, LinkedIn could have created additional tables, one for each type of update. They said they didn&amp;#8217;t do this because of update expiry: with separate tables they would have had to delete rows whenever updates expired, and that&amp;#8217;s very expensive.&lt;/li&gt;
&lt;li&gt;They used JMX to monitor and change the configuration in real-time. This was very helpful.&lt;/li&gt;
&lt;/ul&gt;
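&lt;p&gt;[A minimal in-memory sketch of the push model with optimistic locking. The class and function names are made up for illustration, and the version counter is an assumed way of detecting conflicting writes; the presentation did not say how LinkedIn implemented it.]&lt;/p&gt;
&lt;pre&gt;
```python
import threading


class MailboxStore:
    """In-memory stand-in for the per-user, per-update-type storage."""

    def __init__(self):
        self._rows = {}  # (member_id, update_type) -> (version, updates)
        # Models the atomicity of a single database statement.
        self._guard = threading.Lock()

    def read(self, key):
        with self._guard:
            return self._rows.get(key, (0, ()))

    def write_if_version(self, key, expected_version, updates):
        """Store updates only if nobody wrote since expected_version was read."""
        with self._guard:
            current_version, _ = self._rows.get(key, (0, ()))
            if current_version != expected_version:
                return False
            self._rows[key] = (expected_version + 1, tuple(updates))
            return True


def push_update(store, member_id, update_type, update, max_retries=5):
    """Append an update to the mailbox, retrying on a version conflict."""
    key = (member_id, update_type)
    for _ in range(max_retries):
        version, updates = store.read(key)
        if store.write_if_version(key, version, updates + (update,)):
            return True
    return False


def read_updates(store, member_id, update_type):
    """Reads are cheap: the data is already waiting in the mailbox."""
    return list(store.read((member_id, update_type))[1])
```
&lt;/pre&gt;
&lt;p&gt;[No lock is held between the read and the write; a writer that loses the race simply re-reads and tries again.]&lt;/p&gt;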
&lt;h4&gt;Iteration 3&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;Goal: improve speed by reducing the number of CLOB updates, because CLOB updates are expensive.&lt;/li&gt;
&lt;li&gt;Added an overflow buffer: a VARCHAR(4000) column where data is added initially. When this column is full, dump it to the CLOB. This eliminated 90% of CLOB updates.&lt;/li&gt;
&lt;li&gt;Reduced the size of the updates.&lt;/li&gt;
&lt;/ul&gt;
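&lt;p&gt;[The overflow buffer can be sketched as follows. The class is hypothetical and uses plain strings in place of database columns; only the 4000-character capacity comes from the presentation.]&lt;/p&gt;
&lt;pre&gt;
```python
BUFFER_CAPACITY = 4000  # matches the VARCHAR(4000) column described above


class UpdateRow:
    """One row: a small buffer column plus a large CLOB column."""

    def __init__(self):
        self.buffer = ""
        self.clob = ""
        self.clob_writes = 0  # counts the expensive operations

    def append(self, update):
        # Assumes a single update always fits in the buffer on its own.
        if len(self.buffer) + len(update) > BUFFER_CAPACITY:
            self.clob += self.buffer  # expensive CLOB write, done rarely
            self.clob_writes += 1
            self.buffer = ""
        self.buffer += update  # cheap write to the small column

    def read_all(self):
        """Return every stored update, oldest first."""
        return self.clob + self.buffer
```
&lt;/pre&gt;
&lt;p&gt;[With 100-character updates, only one append in forty touches the CLOB; the rest go to the small column.]&lt;/p&gt;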
&lt;p&gt;[LinkedIn have had success in moving from a Pull architecture to a Push architecture. However, don't discount Pull architectures. Amazon, for example, use a Pull architecture. In &lt;a href="http://www.acmqueue.com/modules.php?name=Content&amp;#038;pa=showpage&amp;#038;pid=388"&gt;A Conversation with Werner Vogels&lt;/a&gt;, Amazon's CTO, he said that when you visit the front page of Amazon they typically call more than 100 services in order to construct the page.]&lt;/p&gt;
&lt;p&gt;&lt;br/&gt;&lt;br /&gt;
The presentation ends with some tips about scaling. These are oldies but goodies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Can&amp;#8217;t use just one database. Use many databases, partitioned horizontally and vertically.&lt;/li&gt;
&lt;li&gt;Because of partitioning, forget about referential integrity or cross-domain JOINs.&lt;/li&gt;
&lt;li&gt;Forget about 100% data integrity.&lt;/li&gt;
&lt;li&gt;At large scale, cost is a problem: hardware, databases, licenses, storage, power.&lt;/li&gt;
&lt;li&gt;Once you&amp;#8217;re large, spammers and data-scrapers come a-knocking.&lt;/li&gt;
&lt;li&gt;Cache!&lt;/li&gt;
&lt;li&gt;Use asynchronous flows.&lt;/li&gt;
&lt;li&gt;Reporting and analytics are challenging; consider them up-front when designing the system.&lt;/li&gt;
&lt;li&gt;Expect the system to fail.&lt;/li&gt;
&lt;li&gt;Don&amp;#8217;t underestimate your growth trajectory.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=Kk5w0I"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=Kk5w0I" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=Ip7Dhi"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=Ip7Dhi" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=cWNwaI"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=cWNwaI" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=EeXWuI"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=EeXWuI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/CookiesAreForClosers/~4/308193295" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/06/linkedin-architecture#comments" thr:count="38" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/06/linkedin-architecture/feed/atom" thr:count="38" />
		<thr:total>38</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/06/linkedin-architecture</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[Run for office with Contendr]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/308193296/run-for-office-with-contendr" />
		<id>http://hurvitz.org/blog/?p=21</id>
		<updated>2008-09-12T08:32:43Z</updated>
		<published>2008-06-01T20:05:38Z</published>
		<category scheme="http://hurvitz.org/blog" term="Fun" /><category scheme="http://hurvitz.org/blog" term="Ideas" />		<summary type="html"><![CDATA[The Obama campaign is hiring developers to create software for his presidential campaign. It has been suggested that this software be made open-source. But why stop there? Whenever a successful website comes along, someone invariably creates a service that lets anyone churn out a clone in five minutes:

Want your own social network? Ning.
Your own Digg? coRank.
A Wiki [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/06/run-for-office-with-contendr">&lt;p&gt;The Obama campaign is hiring developers to &lt;a href="http://news.slashdot.org/news/08/05/31/2341201.shtml"&gt;create software for his presidential campaign&lt;/a&gt;. It has been suggested that this software be made open-source. But why stop there? Whenever a successful website comes along, someone invariably creates a service that lets anyone churn out a clone in five minutes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Want your own social network? &lt;a href="http://www.ning.com/"&gt;Ning&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Your own Digg? &lt;a href="http://www.corank.com/"&gt;coRank&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;A Wiki to call your own? &lt;a href="http://www.wetpaint.com/"&gt;Wetpaint&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Want to show the Twitter folks how to keep a site running? &lt;a href="http://www.revou.com/"&gt;ReVou&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Starting a presidential campaign? &lt;strong&gt;Contendr!&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div id="attachment_22" class="wp-caption alignnone" style="width: 310px"&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/06/brando_contender.jpg"&gt;&lt;img class="size-medium wp-image-22" title="Marlon Brando in &amp;quot;On The Waterfront&amp;quot;" src="http://hurvitz.org/blog/wp-content/uploads/2008/06/brando_contender-300x225.jpg" alt="Marlon Brando in " width="300" height="225" /&gt;&lt;/a&gt;&lt;p class="wp-caption-text"&gt;He coulda been a contender, if only he&amp;#39;d had Contendr.  (Marlon Brando in &amp;quot;On the Waterfront&amp;quot;)&lt;/p&gt;&lt;/div&gt;
&lt;p&gt;Suggested features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Collect signatures to get the candidate&amp;#8217;s name on the ballot by harvesting .sigs from &lt;a href="http://slashdot.org/"&gt;Slashdot&lt;/a&gt; and other forums.&lt;/li&gt;
&lt;li&gt;Ask for campaign contributions with a tip jar on the website.&lt;/li&gt;
&lt;li&gt;Spread the candidate&amp;#8217;s message by link-spamming the appropriate sites: &lt;a href="http://pajamasmedia.com/instapundit/"&gt;Instapundit&lt;/a&gt; for Republicans or &lt;a href="http://www.dailykos.com/"&gt;Daily Kos&lt;/a&gt; for Democrats. Actually, link-spam both sites; everyone deserves to hear what you&amp;#8217;ve got to say.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The name is available (but sadly, the domain is not). Act now, and help democratize the democratic process!&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=dkEp3I"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=dkEp3I" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=G8F1oi"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=G8F1oi" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=x1JxHI"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=x1JxHI" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=XoM49I"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=XoM49I" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/CookiesAreForClosers/~4/308193296" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/06/run-for-office-with-contendr#comments" thr:count="1" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/06/run-for-office-with-contendr/feed/atom" thr:count="1" />
		<thr:total>1</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/06/run-for-office-with-contendr</feedburner:origLink></entry>
		<entry>
		<author>
			<name>Oren Hurvitz</name>
					</author>
		<title type="html"><![CDATA[JavaOne 2008]]></title>
		<link rel="alternate" type="text/html" href="http://feeds.feedburner.com/~r/CookiesAreForClosers/~3/308193297/javaone-2008" />
		<id>http://hurvitz.org/blog/?p=5</id>
		<updated>2008-09-12T08:28:22Z</updated>
		<published>2008-05-31T18:43:32Z</published>
		<category scheme="http://hurvitz.org/blog" term="Conferences" />		<summary type="html"><![CDATA[I was at JavaOne 2008 this month. I attended mostly server-related sessions, with a few J2SE and J2ME presentations thrown in to see how the other half lives. This is my photo report.
JavaOne is held at Moscone Center in downtown San Francisco.


It was crowded!


People queued up outside the rooms until they were allowed to enter [...]]]></summary>
		<content type="html" xml:base="http://hurvitz.org/blog/2008/05/javaone-2008">&lt;p&gt;I was at JavaOne 2008 this month. I attended mostly server-related sessions, with a few J2SE and J2ME presentations thrown in to see how the other half lives. This is my photo report.&lt;/p&gt;
&lt;div id="attachment_16" class="wp-caption alignnone" style="width: 310px"&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/sanfrancisco-20080509-143027.jpg"&gt;&lt;img class="size-medium wp-image-16" title="Moscone Center" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/sanfrancisco-20080509-143027-300x199.jpg" alt="Moscone Center" width="300" height="199" /&gt;&lt;/a&gt;&lt;p class="wp-caption-text"&gt;Moscone Center&lt;/p&gt;&lt;/div&gt;
&lt;p&gt;JavaOne is held at Moscone Center in downtown San Francisco.&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080506-103041.jpg"&gt;&lt;img class="size-medium wp-image-8 alignnone" title="Crowd" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080506-103041-300x225.jpg" alt="Crowd" width="300" height="225" /&gt;&lt;/a&gt;&lt;br /&gt;
It was crowded!&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080508-103104.jpg"&gt;&lt;img class="alignnone size-medium wp-image-10" title="Lines" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080508-103104-300x225.jpg" alt="Lines" width="300" height="225" /&gt;&lt;/a&gt;&lt;br /&gt;
People queued up outside the rooms until they were allowed to enter (about ten minutes before the session&amp;#8217;s start time). This was surprisingly well-organized: Moscone Center staff stood outside the rooms with large signs containing the room&amp;#8217;s number and made sure everyone stood in the right line. The Esplanade was especially crowded, so the staff took to standing at the end of the line, holding up their sign and shouting their room number every few seconds.&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080509-141820.jpg"&gt;&lt;img class="alignnone size-medium wp-image-17" title="Hang space" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080509-141820-300x199.jpg" alt="Hang space" width="300" height="199" /&gt;&lt;/a&gt;&lt;br /&gt;
There was a space with bean bags and video games, which is apparently de rigueur at any geek gathering these days.&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080509-103233.jpg"&gt;&lt;img class="alignnone size-medium wp-image-14" title="Bathroom queue" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080509-103233-300x199.jpg" alt="Bathroom queue" width="300" height="199" /&gt;&lt;/a&gt;&lt;br /&gt;
Men outnumbered women at least 15:1. This led to an unlikely scene.&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080509-082529.jpg"&gt;&lt;img class="alignnone size-medium wp-image-13" title="General Session" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080509-082529-300x199.jpg" alt="General Session" width="300" height="199" /&gt;&lt;/a&gt;&lt;br /&gt;
The General Sessions were held in a huge auditorium.&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080508-145151.jpg"&gt;&lt;img class="alignnone size-medium wp-image-20" title="Book and music" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080508-145151-300x214.jpg" alt="Book and music" width="300" height="214" /&gt;&lt;/a&gt;&lt;br /&gt;
The regular sessions were in smaller rooms. This guy utilized the time before the session to the fullest.&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;div id="attachment_19" class="wp-caption alignnone" style="width: 310px"&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080508-201345.jpg"&gt;&lt;img class="size-medium wp-image-19" title="Smash Mouth concert" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080508-201345-300x161.jpg" alt="Smash Mouth concert" width="300" height="161" /&gt;&lt;/a&gt;&lt;p class="wp-caption-text"&gt;Smash Mouth concert&lt;/p&gt;&lt;/div&gt;
&lt;p&gt;On Thursday night there was a Smash Mouth concert in the adjacent Yerba Buena Gardens.&lt;br /&gt;
&lt;br/&gt;&lt;/p&gt;
&lt;div id="attachment_18" class="wp-caption alignnone" style="width: 310px"&gt;&lt;a href="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080506-2132241.jpg"&gt;&lt;img class="size-medium wp-image-18" title="Vringo BOF Session" src="http://hurvitz.org/blog/wp-content/uploads/2008/05/javaone-20080506-2132241-300x200.jpg" alt="Vringo BOF Session" width="300" height="200" /&gt;&lt;/a&gt;&lt;p class="wp-caption-text"&gt;Vringo BOF Session&lt;/p&gt;&lt;/div&gt;
&lt;p&gt;My company, &lt;a href="http://www.vringo.com/"&gt;Vringo&lt;/a&gt;, presented a BOF called &lt;a href="https://www28.cplan.com/cc191/session_details.jsp?isid=295216&amp;amp;ilocation_id=191-1&amp;amp;ilanguage=english"&gt;Real-World Challenges in Signing Java Platform, Micro Edition (Java ME Platform) Applications&lt;/a&gt;. The reason the title is so unwieldy is that Sun enthusiastically replaced the text in all the presentations with the One True Marketing Terminology. This turned a simple title such as &amp;#8220;Real-World Challenges in Signing J2ME Applications&amp;#8221; into the monstrosity above.&lt;/p&gt;
&lt;p&gt;This photo shows our CTO, David Goldfarb (right), and mobile developer Chaim Kutnicki (left).&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=h27qqI"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=h27qqI" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=U7vCoi"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=U7vCoi" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=bjNy0I"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=bjNy0I" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~f/CookiesAreForClosers?a=7LkNbI"&gt;&lt;img src="http://feeds.feedburner.com/~f/CookiesAreForClosers?i=7LkNbI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/CookiesAreForClosers/~4/308193297" height="1" width="1"/&gt;</content>
		<link rel="replies" type="text/html" href="http://hurvitz.org/blog/2008/05/javaone-2008#comments" thr:count="2" />
		<link rel="replies" type="application/atom+xml" href="http://hurvitz.org/blog/2008/05/javaone-2008/feed/atom" thr:count="2" />
		<thr:total>2</thr:total>
	<feedburner:origLink>http://hurvitz.org/blog/2008/05/javaone-2008</feedburner:origLink></entry>
	</feed>
