Previously, I wrote to you all about the amazing technology that is Grooper’s OCR Synthesis, and I let you know that OCR is problematic and causes headaches. When we built Grooper, we knew we couldn’t stop at simply relying on the best OCR. We had to take it further.
This is where the best document image processing software comes in. While there are some off-the-shelf solutions for cleaning up scanned images, we envisioned something more powerful, with easy-to-use pattern matching, Atomic Regular Expression, and the ability to use fuzzy logic.
The New England Patriots have disrupted the norm, and even made enemies, to become the consistent champions they are. To me, it’s obvious why this is the case:
The New England Patriots have implemented the best system for winning in today’s NFL.
Don’t get me wrong, the Patriots have had some great players on their team, but nothing has ever hinged on one specific person at any one specific time. Key players could be injured, yet they keep winning.
Every weather condition that could occur during a game has happened to New England, and they’ve overcome.
The Patriots have found a way to overcome all the crazy variables that can affect a football season and, year after year, continue to win consistently in one of the hardest sports in which to do so.
They have developed a systematic approach and supplied all players and coaching staff with the right tools to get the job done.
The name of the game is collecting data trapped in your documents. Many folks claim to be able to do this, but just as only one of the 32 teams in the NFL gets to win it all, only one solution comes out on top. Grooper is the one to get it done.
OCR is notorious for throwing lots of unknown variables at you, such as low-quality images, lines and specks, logos, and more. The accuracy of your data suffers as a result.
If only there were a system that let you win, no matter what…
Grooper’s unique and highly configurable document image processing software puts in the user’s hands the ability to always get OCR started off on the right foot. Line, box, and barcode detection and removal (among other features) give our extremely capable OCR Synthesis system the best chance at getting accurate character reads.
Without these tools, you’d be fumbling around with bad OCR.
Grooper’s document image processing software has a simple, easy-to-use interface for quickly iterating and seeing results. It also lets you create objects independently of one another to get tight results, then combine those results in specific and powerful ways to get the exact data you’re after.
It’s like the spread offense, where multiple receivers give you multiple angles of attack.
I won’t bore you with explanations of distance algorithms, but I can say most of you have probably experienced fuzzy logic in your life. If you’ve ever typed something into a search engine and it corrected you with a suggestion, that is a type of fuzzy logic. Spelling correction as you compose a text is another.
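For readers who do want a quick peek, here is a minimal sketch of the kind of distance algorithm that sits behind a "did you mean" suggestion. It is not Grooper’s code: the function names `edit_distance` and `suggest` and the tiny word list are made up for illustration, and Levenshtein distance is only one of several common ways to measure how far apart two strings are.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the fewest single-character inserts,
    deletes, or substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def suggest(word: str, vocabulary: list[str]) -> str:
    """Return the vocabulary entry closest to `word` (hypothetical helper)."""
    return min(vocabulary, key=lambda v: edit_distance(word.lower(), v.lower()))


if __name__ == "__main__":
    # Illustrative word list only; a real system would use a much larger one.
    print(suggest("invoce", ["invoice", "voice", "income"]))
```

The misspelled "invoce" is one edit away from "invoice" and further from the other candidates, so "invoice" is the suggestion that comes back.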
But, again, Grooper goes further than simple fuzzy logic, and combines it with the power of pattern matching gained from Regular Expression. And so, Fuzzy RegEx is born.
The better you get at writing powerful patterns, the more accurate your results will be. The whole idea is that you’re telling the system, as specifically as possible, what you want to find. When OCR throws junk at you, the system can see past the mess and give you what you need. When it encounters characters in the text that don’t match your pattern, it transforms them into what you need. And when it replaces a character, it tells you how confident it is that what it found matches what you want.
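To make that idea concrete, here is a toy sketch of pattern matching that tolerates OCR mistakes and reports a confidence. Everything in it is an assumption for illustration: the fixed-length `DATE_PATTERN`, the `LOOKALIKES` table, the `fuzzy_find` function, and the simple confidence formula are not Grooper’s Fuzzy RegEx, which works on full regular expressions.

```python
import string

DIGITS = set(string.digits)

# Hypothetical fixed-length pattern: one set of allowed characters per position.
# This one describes a date shaped like 12/31/2019.
DATE_PATTERN = [DIGITS, DIGITS, {"/"}, DIGITS, DIGITS, {"/"},
                DIGITS, DIGITS, DIGITS, DIGITS]

# Assumed table of characters that OCR commonly confuses.
LOOKALIKES = {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8", "|": "/"}


def fuzzy_find(text, pattern, min_confidence=0.75):
    """Slide the pattern across the text, repairing look-alike characters.

    Returns (corrected_text, confidence) for the best window, or None.
    Confidence here is simply the share of characters that needed no repair.
    """
    best = None
    for start in range(len(text) - len(pattern) + 1):
        window = text[start:start + len(pattern)]
        fixed, swaps = [], 0
        for ch, allowed in zip(window, pattern):
            if ch in allowed:
                fixed.append(ch)
                continue
            guess = LOOKALIKES.get(ch)
            if guess not in allowed:
                break  # no plausible repair, so this window is not a match
            fixed.append(guess)
            swaps += 1
        else:
            confidence = 1 - swaps / len(pattern)
            if confidence >= min_confidence and (best is None or confidence > best[1]):
                best = ("".join(fixed), confidence)
    return best


if __name__ == "__main__":
    print(fuzzy_find("Due: l2/31/2O19 net", DATE_PATTERN))
```

In the sample text, OCR read a "1" as "l" and a "0" as "O". The sketch swaps both back to get 12/31/2019, and because two of the ten characters were repaired, it reports a confidence of about 0.8 rather than a perfect score.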
Good coaches are made or broken in the intensity and randomness of the 4th quarter of a game, and by the choices they make to get through it. Good data is broken by the random problems presented by OCR, and Fuzzy RegEx lets you get around those problems.
A successful, well-implemented system is the true key to repeated, sustainable success. This does not happen by accident, or without a lot of experience and hard work.
We here at BIS have forged the powerful systems of Grooper out of more than 30 years of experience implementing real solutions that help our clients reach their business goals. These goals are fueled by real, accurate data.