<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Joshua Danish &#187; OCR</title>
	<atom:link href="http://www.joshuadanish.com/tag/ocr/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.joshuadanish.com</link>
	<description></description>
	<lastBuildDate>Fri, 10 Sep 2010 16:26:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Optical Character Recognition (OCR)</title>
		<link>http://www.joshuadanish.com/2009/11/12/optical-character-recognition-ocr/</link>
		<comments>http://www.joshuadanish.com/2009/11/12/optical-character-recognition-ocr/#comments</comments>
		<pubDate>Thu, 12 Nov 2009 21:53:23 +0000</pubDate>
		<dc:creator>Joshua</dc:creator>
		<category><![CDATA[Academic Tools]]></category>
		<category><![CDATA[articles]]></category>
		<category><![CDATA[OCR]]></category>
		<category><![CDATA[PDF]]></category>

		<guid isPermaLink="false">http://www.joshuadanish.com/?p=490</guid>
		<description><![CDATA[If you are like me, you are constantly reading academic documents on the computer, and many of them are scanned images. That makes it difficult to annotate them, copy text for a quotation, or otherwise manipulate the document in the ways that support scholarship. Enter Optical Character Recognition (OCR). This is a general class of technologies [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.adobe.com/products/acrobatpro/"><img src="http://www.joshuadanish.com/wp-content/uploads/2009/07/pdficon_large.gif" alt="pdficon_large" title="pdficon_large" width="32" height="32" class="alignright size-full wp-image-21" /></a>If you are like me, you are constantly reading academic documents on the computer, and many of them are scanned images. That makes it difficult to annotate them, copy text for a quotation, or otherwise manipulate the document in the ways that support scholarship. Enter Optical Character Recognition (OCR). This is a general class of technologies that look at images containing words, figure out where the words are, and convert them into text that you can select and edit. My current tool of choice for converting papers from images to text is <a href="http://www.adobe.com/products/acrobatpro/">Adobe Acrobat</a>, though there are many alternatives. The documents that I typically convert are already in PDF format, so it is convenient to run the OCR feature within Acrobat and then annotate the paper using Acrobat, Preview, or Skim.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.joshuadanish.com/2009/11/12/optical-character-recognition-ocr/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
