Why Optical Character Recognition Holds Back Progress

by Jesse Spencer | February 3, 2020

Optical Character Recognition (OCR) has been around a long time. So long, in fact, that it has become mired in confusion and mismatched expectations. Most organizations still use a lot of paper and PDF documents. And while some industries are particularly attached to paper, humanity as a whole is a long way off from eradicating the exchange of information through documents and forms.

Enterprise OCR, as provided by “OCR engines” from vendors such as Tesseract, ABBYY, OmniPage, AnyDoc, Transym, Azure, Google, and others, is only a small part of what organizations really need.

I talk to people every day who are looking for better OCR. What they’re really looking for is a better way to get accurate information from data trapped in documents. That is only possible with two things: dealing with OCR killers, and machine reading.

Imperfect Documents Kill OCR

Take a skewed document image, for example. Independent research and our own observations indicate you’ll get at most 40% recognition accuracy. In practice you’ll probably get less, which makes the output pretty much worthless. Errors in dates, numbers, amounts, and names make the data difficult to trust.
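To make the skew problem concrete, here is a minimal sketch of one common way to estimate a page's skew angle before OCR: the projection-profile method. It is an illustration only, not the method used by any of the engines named above. The function names (`make_skewed_page`, `profile_score`, `estimate_skew`) and the synthetic "page" of dark pixel coordinates are assumptions made for this example; a real pipeline would work on scanned images with an image processing library.

```python
import math

def make_skewed_page(angle_deg, width=200, line_ys=(20, 40, 60, 80)):
    """Build a synthetic page: horizontal 'text lines' given as dark pixel
    coordinates, rotated by angle_deg to simulate a skewed scan.
    (Hypothetical helper for illustration only.)"""
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    points = []
    for y in line_ys:
        for x in range(width):
            points.append((x * c - y * s, x * s + y * c))
    return points

def profile_score(points, angle_deg):
    """Undo a candidate rotation and score the horizontal projection profile.
    When text lines are level, dark pixels pile up in a few rows, so the
    sum of squared row counts is large."""
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    rows = {}
    for x, y in points:
        row = round(-x * s + y * c)  # row index after rotating by -angle
        rows[row] = rows.get(row, 0) + 1
    return sum(n * n for n in rows.values())

def estimate_skew(points, max_angle=10.0, step=0.5):
    """Try candidate angles in [-max_angle, max_angle] and return the one
    that gives the sharpest projection profile."""
    steps = int(round(2 * max_angle / step))
    candidates = [-max_angle + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda a: profile_score(points, a))

if __name__ == "__main__":
    page = make_skewed_page(3.0)
    print("estimated skew (degrees):", estimate_skew(page))
```

Once the angle is known, the image can be rotated back by that amount before it is handed to an OCR engine. The search range and step size here are arbitrary choices for the sketch; finer steps trade speed for precision.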

Document skewing isn’t the only OCR killer. Poor scan quality, hole punches, pictures, data in tables, mismatched font types, and text that spans pages all wreak havoc on OCR. And these are just some of the complications. If the solution is manual review, that’s just too time-consuming and expensive for projects with hundreds of thousands or millions of pages.

Machine Reading

Machine reading is what OCR was when it first went mainstream: innovative. It combines powerful data science and image processing tools that strike back at OCR killers.

Machine reading solutions combine OCR that is more than twice as accurate with intelligent integration of the information contained within documents. It’s the heart of modern data integration from physical and electronic documents.


