ATI FW: [VICUG-L] Optical character recognition (OCR) in Google Docs
Peter Altschul
paltschul at centurytel.net
Wed Jun 23 13:46:19 CDT 2010
source url:
http://googledocs.blogspot.com/2010/06/optical-character-recognition-ocr-in.html
Optical character recognition (OCR) in Google Docs.
Tuesday, June 22, 2010.
A couple of months ago, my co-worker, Mike, showed up at my desk with a pile
of paper, each of the yellowed sheets densely covered with an ancient-looking
typewriter font. His wife had recently discovered parts of her family
chronicles in the attic, typed up by her grandmother many years ago! Now he
was wondering if there was a way for her to continue writing the chronicles
in Google Docs.
The papers sat on my desk for a while, but recently, I returned them to Mike
with a smile, cheerfully telling him that what started as my 20% project is
now ready for everyone to use - Google Docs now officially supports importing
scanned documents. What we launched as an experimental feature for the
Documents List Data API last year is now available on the upload page: check
the "Convert text from PDF or image files to Google Docs documents" option,
upload your scanned images (JPEG, GIF, PNG) or PDFs, and Google Docs will
extract text and formatting from the scans for you to edit away.
For the technically curious: we're using Optical Character Recognition (OCR)
that our friends from Google Books helped us set up. OCR works best with
high-resolution images, and not all formatting may be preserved. The original
images will be included in the new document to make it easier for you to
correct mistakes. Supported languages include English, French, Italian,
German and Spanish, with more languages and character sets on their way.
We're looking forward to getting feedback from you while we keep improving
the feature over the coming months.
And Mike's scanned family chronicles have even been extended by an additional
chapter in Google Docs: his wife recently had a baby boy named James!
Posted by: Jaron Schaeffer, Software Engineer, Google Docs.
VICUG-L is the Visually Impaired Computer User Group List.
Archived on the World Wide Web at
http://listserv.icors.org/archives/vicug-l.html
Signoff: vicug-l-unsubscribe-request at listserv.icors.org
Subscribe: vicug-l-subscribe-request at listserv.icors.org