Re: textbook scanning for teaching purposes


 

Gudrun,

          Aieeeeeee!!  You've asked a complicated series of questions, but I will try to help out, or at least I hope I can.  Some of this comes from the perspective of having a number of clients who receive PDF files scanned from books as image-only PDFs, either because they were scanned before OCR scanning was commonly available or because whoever scanned them was completely unaware of it once it was.

          For making PDFs from MS-Word documents and the like, I've had great luck with PDFCreator, which you can download from pdfforge.org, and my clients have all reported that the resulting files are accessible.  This software installs a virtual printer on your system, and you can print any file to it to convert it to a PDF.  With materials that originated as text it seems to do a very good job, and the files are smaller, and seemingly cleaner, than what any of the Microsoft Office programs generate when you use the Save as PDF feature.

          The next recommendation may be of no use to you, because I do not believe the free version of the software supports OCR in Swedish.  I've used only the free version and don't know whether more languages are supported in the paid version.  There is a company with the unfortunate name (these days, anyway) of Tracker Software that produces a number of PDF processing tools, and I've been using their free PDF Viewer.  They definitely have a Swedish "language pack" that was updated just a few days ago, but I don't think it comes with the free PDF Viewer.  Anyway, PDF Viewer has a built-in OCR capability for English and, I believe, Spanish, German, and French (it's been too long since the install for me to remember).

          I have to say that its OCR engine is simply incredible for nicely scanned PDFs from books, and it is really good even for some pretty sketchy scans.  You might want to check with them about what you need to OCR and/or send them a few samples.  I have been incredibly pleased with the results I've gotten, and all of my clients who receive image PDFs (which is virtually every one who's in a college setting) now have this installed on their machines so that they can independently OCR process image PDFs they know originated from scans of print materials.  It doesn't play well with JAWS as a reader, but it does work with JAWS for opening files, OCR processing them, and saving them after processing.

Brian
