<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cookies are for Closers: Oren Hurvitz&#039;s Blog &#187; Software</title>
	<atom:link href="http://hurvitz.org/blog/category/software/feed" rel="self" type="application/rss+xml" />
	<link>http://hurvitz.org/blog</link>
	<description>Not a baking blog, but possibly half-baked</description>
	<lastBuildDate>Wed, 24 Aug 2011 06:44:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>The OCR Quality of Google Docs</title>
		<link>http://hurvitz.org/blog/2011/04/ocr-quality-of-google-docs</link>
		<comments>http://hurvitz.org/blog/2011/04/ocr-quality-of-google-docs#comments</comments>
		<pubDate>Thu, 28 Apr 2011 20:03:09 +0000</pubDate>
		<dc:creator>Oren Hurvitz</dc:creator>
				<category><![CDATA[Reminisces]]></category>
		<category><![CDATA[Software]]></category>

		<guid isPermaLink="false">http://hurvitz.org/blog/?p=207</guid>
		<description><![CDATA[Yesterday Google released an Android app for Google Docs. The most flashy feature of this app is the ability to take a photo and convert it to text using OCR. I used to work at an OCR company back when a &#8220;smart phone&#8221; was a phone that had three lines of black-on-white text instead of [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday Google released an <a href="https://market.android.com/details?id=com.google.android.apps.docs">Android app for Google Docs</a>. The most flashy feature of this app is the ability to take a photo and convert it to text using OCR. I used to work at an OCR company back when a &#8220;smart phone&#8221; was a phone that had three lines of black-on-white text instead of two, so I was interested to see how well this newfangled OCR works. I tested the OCR capabilities of Google Docs for Android, and compared them with those of several other OCR programs. My conclusion: under ideal conditions the OCR works pretty well. However, real-life conditions are much worse, so this feature is like a conscience in a politician: widely trumpeted but rarely used.</p>
<h3>The Test Page</h3>
<p>My test document was the first page of Steven Levy&#8217;s article about Google from the April 2011 issue of Wired, titled <a href="http://www.wired.com/magazine/2011/03/mf_larrypage/">&#8220;An Unconventional CEO&#8221;</a>. Here&#8217;s what the page looks like:</p>
<div id="attachment_208" class="wp-caption alignnone" style="width: 298px"><img class="size-full wp-image-208 " title="Original Full Page" src="http://hurvitz.org/blog/wp-content/uploads/2011/04/full-page.jpg" alt="" width="288" height="381" /><p class="wp-caption-text">Original Full Page</p></div>
<p style="display: inline !important;">This is not a very challenging document. The text is very high resolution and doesn&#8217;t use many fonts. The layout is a little more challenging: there&#8217;s a header; two columns; and a few graphical elements. Good OCR programs will recognize and preserve this layout.</p>
<h3>Google Docs for Android</h3>
<p>For this test I used a Nexus S phone. Getting good photos was difficult because my hand and the phone itself kept casting shadows on the page. I tried to perform OCR anyway, but this uneven lighting resulted in horrendous performance. (I also tried using the camera&#8217;s flash, but that produced even worse results because the lighting was extremely uneven.) I am nothing if not fair, so I twisted myself like a pretzel until I managed to get a photo with even lighting across the page. For best results you should have the steady hands of a brain surgeon.</p>
<p>Once I managed to get some reasonable photos, I had the Google Docs app convert them to documents. The app doesn&#8217;t perform OCR on the device: it sends the photo to Google&#8217;s servers for processing. This was pretty fast. Here are the results of OCRing a corner of the page:</p>
<div id="attachment_210" class="wp-caption alignnone" style="width: 310px"><img class="size-medium wp-image-210" title="Android Test" src="http://hurvitz.org/blog/wp-content/uploads/2011/04/android-ocr-3-300x225.jpg" alt="" width="300" height="225" /><p class="wp-caption-text">Android Test</p></div>
<p><span style="color: #333399;">ONE AFTERNOON A-BOUT 12 years ago, Larry Page and Sergey Brin gave John Doerr a call. A few months earlier, the Google cofounders had accepted $12.5 million from Kleiner Perkins Canfield &amp; Byers, Doerr’s venture-capital firm, as well as an equal amount from Sequoia Capital. When they took the cash, they agreed</span><br />
<span style="color: #333399;">that they would hire an outsider to replace Page as CEO,</span><br />
<span style="color: #333399;">a common strategy to provide “adult supervision” to inexperienced founders. But now they were reneging. “They said, ‘We’ve changed our mind. We think we can run the company between the two of us/ Doerr recalls. ,ny between me »wu Ul nb, We.. Doerrfg instinct was to immediately sell his shares, but he held off. He m .ide Page and Brin an offer: He would set up meetings for them with the most brilliant Silicon Valley, so they could get a better sense of what the job entailed «After th t h 3 . E</span><br />
<span style="color: #333399;">told them, “if you think we should do a search, we will. And if you don’t want to, then</span></p>
<p>This was the best result out of all of my attempts. You can see that even this result contains many errors: so many that I won&#8217;t bother pointing them out. Nevertheless, most of the text was OCR&#8217;d correctly.</p>
<p style="display: inline !important;">For comparison, here&#8217;s the same page, but this time the photo wasn&#8217;t taken quite as well. The results are much worse:</p>
<div id="attachment_209" class="wp-caption alignnone" style="width: 310px"><img class="size-medium wp-image-209 " title="Poor Lighting" src="http://hurvitz.org/blog/wp-content/uploads/2011/04/android-ocr-2-300x225.jpg" alt="" width="300" height="225" /><p class="wp-caption-text">Poor Lighting</p></div>
<p><span style="color: #333399;">i</span><br />
<span style="color: #333399;"> ome ABOUT 12 years ago, Larry Page and Sergey Brin gave John Doerr a call. A few months earlier, the Google cofounders had accepted $12.5 million from Kleiner Perkins Caufield &amp; Byers, Doerr’s venture-capital firm, as well as an equal amount from Sequoia Capital. When</span><br />
<span style="color: #333399;"> they took the cash, they agreed that they would hire an 0</span><br />
<span style="color: #333399;"> utsider to replace as CEO</span><br />
<span style="color: #333399;"> a common strategy to provide “adult supervìsion” to inexperienced founders. But now the y were renegìng. “They said, ‘We’ve changed our mind. We thi</span><br />
<span style="color: #333399;"> e changed our mind. We think we can run the company between the two of us,&#8221;’ Doerr recalls.</span><br />
<span style="color: #333399;"> immediately sell his shares, but he held off. He made P</span><br />
<span style="color: #333399;"> . ne mane rage and Brin an offer: He would set up meetings for them with the most brilliant CEOs in Silicon Valley, 5° *hey Could get a better sense of what the job entailed. “After that. &#8221; he told them, “if you think we should do a search. we willi And if vnu don&#8217;</span><br />
<span style="color: #333399;"> El E El APH 2011</span><br />
<span style="color: #333399;"> arch, we will. And if you want to, then</span></p>
<p>The second photo is only slightly worse than the first one, but it resulted in a huge drop in OCR quality. (And some of the text is even repeated, e.g. &#8220;changed our mind&#8221;. How does <i>that</i> happen?) Unfortunately, in real-life situations the photos that users take are more likely to resemble the second photo than the first one.</p>
<p>These tests show only a corner of the page. I also tried to photograph the entire page, but this failed miserably: the resulting document contained no text at all. <strong>I suspect Google Docs for Android reduces the resolution of the photo before sending it to the server for OCR!</strong> Otherwise, the OCR should have produced <em>some</em> results. The Nexus S camera has a resolution of 2560&#215;1920. The test page was 10.8&#8243;x7.9&#8243;, so a photo that fills the frame with the page works out to about 240 dpi (dots per inch), which is high enough to produce good OCR results. (The web version of Google Docs managed to produce reasonable results even with a 120 dpi image.) The most likely explanation I can think of is that the OCR was working with a low-resolution version of the page.</p>
<p>Another reason that I suspect Google Docs is downsampling the image is that when I exported the resulting document to HTML, the image in the HTML file was only 1280&#215;960 pixels, i.e. 1.2 megapixels. This is circumstantial evidence, but it&#8217;s consistent with the complete failure to perform OCR on the full page.</p>
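<p>The resolution estimate above is simple arithmetic, and it can be repeated for the smaller image from the HTML export. Here&#8217;s a minimal Python sketch. It assumes, as the estimate in the text does, that the page fills the whole frame, and the function name <code>effective_dpi</code> is made up for this illustration:</p>

```python
def effective_dpi(pixels, inches):
    """Dots per inch when `pixels` pixels span `inches` inches of paper."""
    return pixels / inches


# Page size from the post: 10.8 x 7.9 inches.
PAGE_LONG_IN = 10.8
PAGE_SHORT_IN = 7.9

# Full-resolution Nexus S photo: 2560 x 1920 pixels.
full_long = effective_dpi(2560, PAGE_LONG_IN)
full_short = effective_dpi(1920, PAGE_SHORT_IN)

# Image found in the exported HTML: 1280 x 960 pixels.
small_long = effective_dpi(1280, PAGE_LONG_IN)
small_short = effective_dpi(960, PAGE_SHORT_IN)
megapixels = 1280 * 960 / 1_000_000

print(f"full photo:  {full_long:.0f} x {full_short:.0f} dpi")
print(f"downsampled: {small_long:.0f} x {small_short:.0f} dpi")
print(f"downsampled size: {megapixels:.1f} megapixels")
```

<p>Under that assumption the full photo comes out at roughly 240 dpi along both edges, and the 1280&#215;960 version at roughly 120 dpi.</p>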
<p style="display: inline !important;">What does this mean? If indeed Google is downsampling the photos before performing OCR then they could provide much better OCR results simply by sending the entire photo for processing instead of a lower-resolution version. And if they&#8217;re not downsampling the photos then they have some other processing problem, but that also means that there&#8217;s a lot they could do to improve the results. Let&#8217;s see if they do anything about it!</p>
<h3>Google Docs on the Web</h3>
<p style="display: inline !important;">The rest of these tests were all performed on my PC. I scanned the page using a flatbed scanner (el cheapo Canon, but it&#8217;s more than good enough).</p>
<p>The web version of Google Docs can perform OCR on images. It restricts uploaded images to 2 MB, so I couldn&#8217;t upload the full 300 dpi page because it was 9.5 MB (as a PNG; I didn&#8217;t want to use a lossy format such as JPEG). I reduced the page to 120 dpi (1.7 MB) and uploaded that file. Here are the results:</p>
<p style="display: inline !important;"><span style="color: #333399;">AN UNCONVENTIONALCEO Ten</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;">H | s BY STEVEN LEVY</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;">ONE AFTERNOON ABOUT 12 years ago, Larry Page and Sergey Brin gave John Doerr a call. A few months earlier, the Google cofounders had accepted $12.5 million from Kle`iner Perkins Caufielcl &amp; Byers, Doerr’s venture-capital ?rm, as well as an equal amount from Sequoia Capital. when they took the cash, they agreed that they would hire an outsider to replace Page as CEO, a common strategy to provide “adult supervision” to inexperienced founders. But now they were reneging. &#8220;They said, &#8216;We’ve changed our mind. We think we can run the company between the two of us,’ &#8221; Doerr recalls. Doerr&#8217;s ?rst instinct was to immediately sell his shares, but he held He made Page and Brin an He would set up meetings for them with the most brilliant CEOs in Silicon Valley, so they could get a better sense of what the job entailed. &#8220;After that,&#8221; he told them, &#8220;if you think we should do a search, we will. And if you don&#8217;t want to, then</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;">24,000 employees later, cofounder retakes the topjob at Google. run the company like a startup</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;"><br />
</span></p>
<p style="display: inline !important;"><span style="color: #333399;">I&#8217;ll make a decision about that.&#8221; Page and Brin took a Magical Mystery Tour of high tech royalty: Apple&#8217;s Steve Jobs, Intel&#8217;s Andy Grove, lntuit’s Scott Cook, Amazon .com&#8217;s Jeff Bezus, and others. Then they came back to Doerr. &#8220;We agree with you,&#8221; they told him; they were ready to hire a CEO. But they would</span></p>
<p style="display: inline !important;">&nbsp;</p>
<p style="display: inline !important;">&nbsp;</p>
<p>This is much better than Google Docs for Android: most of the text was converted correctly, even though the scan was very low resolution (120 dpi). However, the layout detection algorithm is pretty poor. To its credit, it correctly detected that the text was in two columns. But it mistakenly assumed that the header used the same columns as the text below, so it broke the header up in a ridiculous way. This is why the sentence &#8220;24,000 employees later, cofounder retakes the topjob at Google. run the company like a startup&#8221; appears in the middle of the text.</p>
<p>Other problems include: not recognizing word breaks (&#8220;UNCONVENTIONALCEO&#8221;; &#8220;topjob&#8221;); confusing &#8220;d&#8221; with &#8220;cl&#8221; in &#8220;Caufield&#8221;; converting &#8220;Bezos&#8221; to &#8220;Bezus&#8221;; not recognizing the &#8220;ff&#8221; <a href="http://en.wikipedia.org/wiki/Ligature_(typography)">ligature</a> (this is why the words &#8220;off&#8221; and &#8220;offer&#8221; are missing).</p>
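<p>Errors like these can be counted instead of eyeballed. One common way is the edit distance between the printed text and the OCR output. The Python sketch below is only an illustration: the function names are made up, and it is not how any of the programs in this post score themselves. The sample pairs are the errors listed above:</p>

```python
def edit_distance(expected, actual):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn `expected`
    into `actual`."""
    # Inside the loop, previous[j] is the distance between the first
    # i-1 characters of `expected` and the first j characters of `actual`.
    previous = list(range(len(actual) + 1))
    for i, exp_char in enumerate(expected, start=1):
        current = [i]
        for j, act_char in enumerate(actual, start=1):
            substitution_cost = 0 if exp_char == act_char else 1
            current.append(min(
                previous[j] + 1,                      # delete exp_char
                current[j - 1] + 1,                   # insert act_char
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(expected, actual):
    """Edit distance divided by the length of the expected text."""
    if not expected:
        raise ValueError("expected text must not be empty")
    return edit_distance(expected, actual) / len(expected)


# (printed text, OCR output) pairs taken from the errors listed above.
samples = [
    ("Caufield", "Caufielcl"),
    ("Bezos", "Bezus"),
    ("top job", "topjob"),
]
for expected, actual in samples:
    print(expected, actual, edit_distance(expected, actual))
```

<p>Running the same comparison over a whole page needs a hand-checked transcription of that page to compare against.</p>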
<p>If Google Docs relaxes its file size limit so that higher-resolution scans can be uploaded, then it&#8217;s likely that it will be able to produce very accurate results on the text itself. However, the layout still won&#8217;t be preserved.</p>
<h3>Tesseract</h3>
<p><a href="http://code.google.com/p/tesseract-ocr/">Tesseract</a> is a free, open-source OCR program. Since <a href="http://googlecode.blogspot.com/2006/08/announcing-tesseract-ocr.html">Google had a hand</a> in making it open-source, I thought perhaps this is the OCR engine that they use in Google Docs, so I wanted to try it out. Here are the results from OCR-ing the 300 dpi scan:</p>
<p><span style="color: #333399;">AN UNCONVENTIONAL CEO</span><br />
<span style="color: #333399;"> Terr years and 24,090 emptayees tater,rf;ofsrrrrder</span><br />
<span style="color: #333399;"> Larry Page retakes the top get; at</span><br />
<span style="color: #333399;"> His goalzto run the cemparsy trite a ata.rttrp aggaia.</span><br />
<span style="color: #333399;"> BY STEVEN LEVY</span><br />
<span style="color: #333399;"> ONE AFTERNOON ABOUT</span><br />
<span style="color: #333399;"> 12 years ago, Larry Page and</span><br />
<span style="color: #333399;"> Sergey Brin gave John Doerr</span><br />
<span style="color: #333399;"> a call. A few months earlier,</span><br />
<span style="color: #333399;"> the Google cofounders had</span><br />
<span style="color: #333399;"> accepted $12.5 million from</span><br />
<span style="color: #333399;"> Kleiner Perkins Caufield &amp;</span><br />
<span style="color: #333399;"> Byers, Doerr’s venture-capital</span><br />
<span style="color: #333399;"> firm, as well as an equal amount</span><br />
<span style="color: #333399;"> from Sequoia Capital. When</span><br />
<span style="color: #333399;"> they took the cash, they agreed that they would hire an outsider to replace Page as CEO,</span><br />
<span style="color: #333399;"> a common strategy to provide “adult supervision” to inexperienced founders. But now</span><br />
<span style="color: #333399;"> they were reneging. “They said, ‘We’ve changed our mind. We think we can run the com-</span><br />
<span style="color: #333399;"> pany between the two of us,” ” Doerr recalls.</span><br />
<span style="color: #333399;"> Doerr’s first instinct was to immediately sell his shares, but he held OIT. He made Page</span><br />
<span style="color: #333399;"> and Brin an offer: He would set up meetings for them with the most brilliant CEOs in</span><br />
<span style="color: #333399;"> Silicon Valley, so they could get a better sense of what the job entailed. “After that,” he</span><br />
<span style="color: #333399;"> told them, “if you think we should do a search, we will. And if you don’t want to, then</span><br />
<span style="color: #333399;"> _-&#8221;</span><br />
<span style="color: #333399;"> .ra ‘T</span><br />
<span style="color: #333399;"> E] E] E] Aeazou</span><br />
<span style="color: #333399;"> .-&#8217;,,~.4&#8242;;?}.7 l</span><br />
<span style="color: #333399;"> si fl</span><br />
<span style="color: #333399;"> I’ll make a decision about that.” Page and</span><br />
<span style="color: #333399;"> Brin took a Magical Mystery Tour of high</span><br />
<span style="color: #333399;"> tech royalty: Apple’s Steve Jobs, Intel’s</span><br />
<span style="color: #333399;"> Andy Grove, Intuit’s Scott Cook, Amazon</span><br />
<span style="color: #333399;"> .com’s Jeff Bezos, and others. Then they</span><br />
<span style="color: #333399;"> came back to Doerr.</span><br />
<span style="color: #333399;"> “We agree with you,” they told him; they</span><br />
<span style="color: #333399;"> were ready to hire a CEO. But they would</span><br />
<span style="color: #333399;"> sr, {&#8220;G1&#8242;Bfi1L1</span></p>
<p>Most of the text was recognized correctly, including words that Google Docs missed such as &#8220;offer&#8221; and &#8220;Caufield&#8221; (although it still got &#8220;off&#8221; wrong). Tesseract did a bad job on the header, and apparently thought some of the image was text (and didn&#8217;t use a dictionary to realize that it was producing gibberish).</p>
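<p>To show what a dictionary check could look like, here&#8217;s a rough Python sketch that scores each output line by the fraction of its words found in a word list. Everything in it is an assumption made for illustration: the function name, the tiny stand-in vocabulary, and the idea of scoring line by line. It says nothing about how Tesseract works internally:</p>

```python
import re


def dictionary_hit_rate(line, vocabulary):
    """Fraction of alphabetic tokens in `line` that appear in `vocabulary`."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", line)]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in vocabulary)
    return hits / len(tokens)


# Tiny stand-in vocabulary for illustration only; a real check would
# load a full word list.
VOCABULARY = {
    "they", "were", "ready", "to", "hire", "a", "ceo", "but", "would",
}

# Two lines taken from the Tesseract output above.
clean_line = "were ready to hire a CEO. But they would"
noisy_line = "E] E] E] Aeazou"

print(dictionary_hit_rate(clean_line, VOCABULARY))
print(dictionary_hit_rate(noisy_line, VOCABULARY))
```

<p>Lines with a low score could be dropped or flagged for review. A real word list would also need the proper names in the article (Doerr, Brin, Caufield), or perfectly good lines would score badly.</p>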
<p>Of course, Tesseract had an easier task than Google Docs because it got a 300 dpi scan whereas Google Docs had only 120 dpi to work with. When I tried to give the 120 dpi scan to Tesseract it failed <strong>miserably</strong>, and produced 100% garbage.</p>
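<p>For anyone who wants to repeat this kind of test, Tesseract is driven from the command line as <code>tesseract imagename outputbase</code>, with <code>-l</code> selecting the language. Below is a small Python wrapper as a sketch. The file name <code>page-300dpi.png</code> and both function names are hypothetical, and which image formats are accepted depends on the installed build:</p>

```python
import shutil
import subprocess


def tesseract_command(image_path, output_base, language="eng"):
    """Build the argument list for the `tesseract` command-line tool.

    Tesseract writes the recognized text to `output_base` + ".txt".
    """
    return ["tesseract", image_path, output_base, "-l", language]


def run_tesseract(image_path, output_base, language="eng"):
    """Run Tesseract if it is installed; return the path of the text file."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("the tesseract executable was not found on PATH")
    subprocess.run(
        tesseract_command(image_path, output_base, language), check=True
    )
    return output_base + ".txt"


# Hypothetical file name for the 300 dpi scan; this only prints the command.
print(" ".join(tesseract_command("page-300dpi.png", "page-300dpi")))
```

<p>Calling <code>run_tesseract</code> with the 300 dpi and 120 dpi versions of the same page would reproduce the comparison described above.</p>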
<h3>ABBYY FineReader</h3>
<p>Finally, I tested what a commercial OCR program can do. <a href="http://buy.abbyy.com/content/frpro/default.aspx">ABBYY FineReader</a> is one of the best programs today, and they have a free trial version, so that&#8217;s the one I used. The results were <strong>by far</strong> the best of the bunch. It recognized almost all of the text correctly; preserved fonts and layout; and recognized the images and saved them. Here&#8217;s a screenshot of the PDF that FineReader created. Note that in the PDF itself, unlike in the images elsewhere in this post, all the text is <strong>editable</strong>:</p>
<div id="attachment_211" class="wp-caption alignnone" style="width: 217px"><img class="size-medium wp-image-211" title="PDF Created by ABBYY FineReader" src="http://hurvitz.org/blog/wp-content/uploads/2011/04/abbyy-article-300dpi-207x300.jpg" alt="" width="207" height="300" /><p class="wp-caption-text">PDF Created by ABBYY FineReader</p></div>
<p>I found only two mistakes in the output from ABBYY FineReader: it changed &#8220;Amazon.com&#8221; to &#8220;Amazon.corn&#8221; (probably due to an overzealous use of the dictionary), and changed &#8220;ILLUSTRATION BY Grafilu&#8221; to &#8220;LLUSTRATIDN BY Grafilll&#8221; in the footer.</p>
<h3>Conclusion</h3>
<p>Between 1996 and 2000 I worked at <a href="http://www.ligatureltd.com/">Ligature</a>, an OCR company, so I&#8217;m familiar with the quality of commercial OCR programs. Even back in 1996, <strong>all</strong> of the top commercial OCR programs produced results similar to what ABBYY FineReader produced in this roundup. I was shocked by how <strong>bad</strong> the free OCR solutions are. Google Docs for Web is the best of them, but even that program is problematic because of its file-size limit and the loss of layout.</p>
<p>As for Google Docs for Android: it produces mediocre results, even when the user goes to great lengths to give it good input. When using a mobile phone in the real world there will usually be many more challenges: the lighting is often bad; the camera isn&#8217;t held precisely perpendicular to the page; the user&#8217;s hand shakes; etc. So my advice is: if you&#8217;re Julian Assange and you want to duplicate super-secret documents in a hurry, nothing beats a flatbed scanner and a top-tier OCR program.</p>
]]></content:encoded>
			<wfw:commentRss>http://hurvitz.org/blog/2011/04/ocr-quality-of-google-docs/feed</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>PureText</title>
		<link>http://hurvitz.org/blog/2009/01/puretext</link>
		<comments>http://hurvitz.org/blog/2009/01/puretext#comments</comments>
		<pubDate>Sat, 17 Jan 2009 17:25:31 +0000</pubDate>
		<dc:creator>Oren Hurvitz</dc:creator>
				<category><![CDATA[Software]]></category>

		<guid isPermaLink="false">http://hurvitz.org/blog/?p=114</guid>
		<description><![CDATA[One of my favorite utilities is Steve Miller&#8217;s PureText. It lets you copy-and-paste text while removing all of the formatting. This is extremely useful; in fact, I used it twice just while writing this blog post. For example, I often want to copy code snippets from my IDE, Eclipse, into Microsoft Word. Here&#8217;s what the [...]]]></description>
			<content:encoded><![CDATA[<p>One of my favorite utilities is Steve Miller&#8217;s <a href="http://www.stevemiller.net/puretext/">PureText</a>. It lets you copy-and-paste text while removing all of the formatting. This is extremely useful; in fact, I used it twice just while writing this blog post.</p>
<p>For example, I often want to copy code snippets from my IDE, Eclipse, into Microsoft Word. Here&#8217;s what the code looks like in Eclipse:</p>
<p><img class="alignnone size-full wp-image-115" title="Code in Eclipse" src="http://hurvitz.org/blog/wp-content/uploads/2009/01/code-in-eclipse.png" alt="Code in Eclipse" width="339" height="28" /></p>
<p>And here&#8217;s what it looks like after pasting into Word:</p>
<p><img class="alignnone size-full wp-image-116" title="Code in Microsoft Word" src="http://hurvitz.org/blog/wp-content/uploads/2009/01/code-in-microsoft-word.png" alt="Code in Microsoft Word" width="293" height="36" /></p>
<p>I blame Bill Gates.</p>
<p>However, with PureText, I simply paste using a different shortcut (Windows+V), and get only the text, without any of the formatting:</p>
<p><img class="alignnone size-full wp-image-117" title="Code in Microsoft Word, without formatting" src="http://hurvitz.org/blog/wp-content/uploads/2009/01/code-in-microsoft-word-without-formatting.png" alt="Code in Microsoft Word, without formatting" width="235" height="33" /></p>
<p>Mission accomplished.</p>
<p>(By the way, Microsoft Word allows you to remove formatting from pasted text, but only after you paste it. Using PureText is faster, and it works with all applications, not just Word.)</p>
]]></content:encoded>
			<wfw:commentRss>http://hurvitz.org/blog/2009/01/puretext/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

