By Nate Adcock on Fri, 10/04/2013
Mobile document scanning and OCR (Optical Character Recognition) is a great example of the fusion and evolution of technology. What used to require a bulky scanner and special desktop software, almost any smartphone with a camera can now do. There are several apps in the App Store (SmartScan+OCR ($3.99), Perfect OCR ($3.99), etc.) that allow you to scan and recognize printed text and images, but Prizmo ($9.99) is a scanner application for iOS that takes it one step further. It will attempt to detect the text and image areas in a capture and segment them out as separate document fields, preserving the formats (image sections vs. text), translating the results into multiple languages, and even reading them out to you! Prizmo makes it easy for you to arrange and share these fields as needed, and is also a great business card or image scanner.
OCR is one of those old technologies we have come to take for granted without really understanding what it is capable of (what is so great about a scanner, right?). A scanner equipped with OCR software can do some pretty cool stuff, actually. Not only can it detect text or character blocks on a printed page, but on a whiteboard as well. Try holding a 50-pound scanner up to a whiteboard sometime, though (we have whiteboards at my work that are equipped with robot arms for this reason). Enter smartphone cameras used for OCR image processing!
OCR software processes the glyphs or characters detected in a scanned image against a dictionary and converts them to machine-readable text (which is kind of amazing), but it can also deskew images to a readable perspective and apply corrections to fix image distortion. The software can be accurate at detecting words, given the right conditions, but I have yet to find an OCR package that was 100 percent reliable. Prizmo made a few of the expected mistakes in this regard, but still performed well on my iPad mini. The software is a powerhouse scanning package, and though you may flinch a little at the price, it offers an extremely full-featured and flexible processing capability for a mobile app.
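To make the dictionary step above a little more concrete, here is a minimal Python sketch of how misread words might be snapped to their nearest dictionary entry. This is only an illustration, not how Prizmo or any particular OCR engine works internally. The word list, the function names, and the 0.7 similarity cutoff are all assumptions chosen for the example.

```python
import difflib

# Hypothetical mini-dictionary for illustration; a real engine would use
# a far larger word list (and usually language models on top of it).
DICTIONARY = ["scanner", "document", "business", "text", "image"]


def correct_word(raw, dictionary=DICTIONARY, cutoff=0.7):
    """Return the closest dictionary word to `raw`.

    Matching is done on the lowercased word. If nothing in the dictionary
    is similar enough (per `cutoff`), the raw word is returned unchanged.
    """
    matches = difflib.get_close_matches(raw.lower(), dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else raw


def correct_line(line):
    """Apply correct_word to every whitespace-separated token in a line."""
    return " ".join(correct_word(word) for word in line.split())


if __name__ == "__main__":
    # "e" misread as "c", and "m" misread as "rn", are classic OCR slips.
    print(correct_line("scanncr docurnent"))
```

A word with no close dictionary neighbor passes through untouched, which is one reason proper names and unusual terms on a business card tend to come out of OCR with more errors than ordinary prose.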
The interface looks much like the Mac version of the software, though maybe a bit simpler. The app starts and asks if you want to save your documents to iCloud. It is easy to scan and detect images or text on a document or business card by selecting the camera icon in the main tabulated view of your scanned-in documents. Once the camera app appears, you select either Text or Image detection from the menu options along the right. If you want to actually set up an e-document or vCard version (to snag a contact), you'll want to select the file+ symbol instead, and then select the function you want to execute.
The workflow is straightforward for getting text and images into a document. Select + to add a scan item to your document. Like almost all OCR software, it can be finicky in recognizing text and image sections. Nice, flat, well-contrasted documents will do best. The image adjustment features help, but were still only maybe 70–80 percent effective in most situations (and sometimes much worse). When it does get the text and image sections right, it's pretty amazing, and it will even translate text to several other languages.
I could not get the page detection feature to actually straighten out the text perspective in my images, and I am not sure what I was doing wrong. It worked perfectly on some of the sample imagery provided, so I get the impression it works best under certain perspective and contrast conditions.
It had no problem scanning and recognizing text in a simple business card (and correcting the angle), but would not let me add an image to the vCard for some reason. The app has a good help section to give pointers about capturing and setting image properties to get the best recognition, so maybe I need to review that more thoroughly.
After you have sufficiently fiddled with and arranged your document content, you can share it via email, copy the text out to the clipboard, or export it to other apps/services (Evernote, QuickOffice, etc.) as a PDF or text document. You can also have the document read out loud to you at a custom pace in Siri's default voice (which is cool). You can also buy some other high-quality voices through in-app purchases (IAPs). I usually don't like IAPs, but this seems like an OK case for their use (similar to skin updates). It isn't some dependency-building carrot on a stick that was left out to entice you to continually upgrade.
Prizmo is a great OCR scanning app. But OCR is still somewhat of a love/hate proposition. Somehow, I feel the whole capture and processing workflow should be smarter and easier to manage (and much more accurate). Excessive fiddling with or correction of the image/text sections detracts from the convenience of having a handheld scanning capability. I also could not find any way to preserve both text and images in a document in an editable format for other applications to use (either one, but not both). Those gripes aside, the app works well when you take the time to use it in optimal conditions, and it does have several great features to help bring your printed content to the cloud! It is darn cool to hear Siri quoting back to you what you just snapped with your camera!
- Capture text and images dynamically with OCR on iPhone or iPad
- Translate captured text and have your device read it to you
- OCR not effective in many situations