Friday, July 31, 2009

OCR on PDF

Today I needed to convert a PDF of a scanned book to text so that it would be searchable. I used Craig Taverner's Ruby script, and it worked like a charm once I changed the script to point at the correct path to the tesseract binary.
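
For anyone curious what this kind of pipeline looks like, here is a rough sketch of the general idea. It is not Craig's script: it is a minimal illustration that assumes `pdftoppm` (from poppler-utils) is installed to turn each page into a PNG, and that your tesseract build can read PNG input. The `TESSERACT_PATH` value and all of the method names are made up for the example, so change the path to match your system, which is the same tweak I had to make.

```ruby
require "tmpdir"

# Hypothetical default location; change it to wherever tesseract lives on your machine.
TESSERACT_PATH = "/usr/local/bin/tesseract"

# Build the command that rasterizes every page of the PDF into PNG files
# named <prefix>-<page>.png (assumes pdftoppm from poppler-utils).
def rasterize_command(pdf, prefix, dpi = 300)
  ["pdftoppm", "-r", dpi.to_s, "-png", pdf, prefix]
end

# Build the command that OCRs one image. tesseract writes its result
# to <output_base>.txt.
def ocr_command(image, output_base, tesseract = TESSERACT_PATH)
  [tesseract, image, output_base]
end

# Extract the page number from a file name such as "page-012.png",
# so pages can be sorted numerically rather than alphabetically.
def page_number(path)
  path[/-(\d+)\.png\z/, 1].to_i
end

# Convert a scanned PDF to plain text, one page at a time.
def pdf_to_text(pdf, tesseract = TESSERACT_PATH)
  Dir.mktmpdir do |dir|
    prefix = File.join(dir, "page")
    system(*rasterize_command(pdf, prefix)) or raise "pdftoppm failed"

    pages = Dir.glob("#{prefix}-*.png").sort_by { |p| page_number(p) }
    pages.map do |image|
      base = image.sub(/\.png\z/, "")
      system(*ocr_command(image, base, tesseract)) or raise "tesseract failed on #{image}"
      File.read("#{base}.txt")
    end.join("\n")
  end
end
```

Something like `puts pdf_to_text("book.pdf")` would then print the recognized text, which you can redirect to a file and search with whatever tool you like.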
