Mobile Development and OCR

4 MINUTE READ | January 24, 2012

PMG

For our most recent mobile project we needed OCR to extract the data from a check image, so I spent a lot of time scouring the internet for a good, strong OCR SDK that works on both the iPhone and Android platforms. It's pretty easy to find an OCR document reader in the Apple App Store or Android Market, but that doesn't help in our case, where we need OCR integrated into the application itself. So I thought I'd compile a quick list of the different options I was able to find out there.

WiseTrend is one of many cloud-based OCR platforms that let you send an image to their server, which reads it and sends back the text in a computer-readable format. This is actually a rather brilliant solution, because it takes all of the intensive processing that OCR requires off the mobile device and puts it on a powerful machine that can devote all of its resources to reading through pages and pages of text. It's also fantastic that cloud-based options like WiseTrend work across all mobile platforms, because all they require is an internet connection. Pricing on options like this can come to over $500 a month if you have a high-traffic application. That's a pretty big investment. The main issue for us was that the images we needed read were checks. Sending sensitive information like that out to a third-party cloud server is never a good idea, no matter how much you trust the company.
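To make that round trip concrete, here is a minimal Python sketch of what the client side of a cloud OCR call might look like. The endpoint URL, the `image` form field, the bearer-token header, and the `text` key in the JSON reply are all assumptions made up for illustration. They are not WiseTrend's actual API, so check your provider's documentation for the real names.

```python
import json
import urllib.request
import uuid

# Hypothetical endpoint, used only for illustration.
EXAMPLE_ENDPOINT = "https://ocr.example.com/v1/recognize"


def build_ocr_request(image_bytes, endpoint=EXAMPLE_ENDPOINT, api_key="YOUR_API_KEY"):
    """Wrap raw image bytes in a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="image"; filename="scan.jpg"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")

    request = urllib.request.Request(endpoint, data=head + image_bytes + tail, method="POST")
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    request.add_header("Authorization", f"Bearer {api_key}")  # assumed auth scheme
    return request


def parse_ocr_response(raw_body):
    """Pull the recognized text out of an assumed JSON reply like {"text": "..."}."""
    return json.loads(raw_body)["text"]
```

Sending it would be a matter of passing the request to `urllib.request.urlopen` and feeding the response body to `parse_ocr_response`. Notice that the full image leaves the device in the request body, which is exactly the concern with check images.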

ABBYY is a software company that specializes in OCR, and they actually have a fully functional Mobile OCR Engine that works on iPhone and Android as well as Windows Mobile! There are a number of different libraries like this one, all of which cost about two arms and a leg. From what my research turned up, ABBYY Mobile OCR Engine itself costs around $15K plus an extra 20% of your app's sales profit. The great thing is that it's all done on the device itself, so there are no security issues with using ABBYY for our specific needs. It's just a lot to spend on an application that's going to be given away for free.
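For a rough sense of how the two price tags compare, here is a small back-of-the-envelope sketch in Python. It uses only the approximate figures quoted in this post ($500 a month for a cloud service, $15K plus 20% of profit for an on-device license); the function name and the simple linear cost model are my own assumptions, and real quotes will differ.

```python
import math


def months_to_break_even(license_fee, monthly_cloud_fee, monthly_profit=0.0, profit_share=0.20):
    """Months until a one-time license costs less than a monthly cloud fee.

    Returns None if the profit share alone costs as much as the cloud fee,
    in which case the license never catches up.
    """
    monthly_saving = monthly_cloud_fee - profit_share * monthly_profit
    if monthly_saving <= 0:
        return None
    return math.ceil(license_fee / monthly_saving)


# A free app earns no sales profit, so only the flat fee matters.
print(months_to_break_even(15000, 500))  # prints 30
```

Under those assumptions, a free app would need to run for about two and a half years before the one-time license worked out cheaper than the monthly cloud bill.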

The Tesseract OCR engine was developed at HP Labs between 1985 and 1995 and released as open source in 2005. Development on Tesseract picked up again in 2006, with sponsorship from Google. It is considered to be one of the most accurate open source OCR engines currently available. Unfortunately, Tesseract was written in a combination of C and C++, so porting it for use on the iPhone or Android platforms can take a lot of work. Luckily for us, there are developers out there willing to do that work for free.

Both of these projects show how you can use the Tesseract library on an iPhone. Of course, they're both open source and still in development, so it could take a fair amount of work, even digging through the Tesseract library itself, to make sure it's all working properly. But when it's free, it can be hard to find a more viable option. Also, since Tesseract is sponsored by Google, there is a set of tools for Android out there that allow your application to interact with the Tesseract library.
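Before investing in a mobile port, it can be worth trying Tesseract on a desktop machine against a few sample images. The Python sketch below simply shells out to the `tesseract` command-line tool. It assumes a reasonably recent Tesseract is installed and on your PATH, and that it accepts `stdout` as the output target; the helper names are hypothetical and this is not how the iPhone or Android wrappers work internally.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, language="eng"):
    """Build the argument list: tesseract <image> stdout -l <language>."""
    return ["tesseract", image_path, "stdout", "-l", language]


def recognize(image_path, language="eng"):
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,
        text=True,
        check=True,  # raise if Tesseract exits with an error
    )
    return result.stdout
```

Calling `recognize("check.png")` on a scanned sample gives you raw text to inspect, which is a cheap way to see whether the engine copes with your kind of document at all.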

Obviously these three options aren't the only choices for OCR out there; they're just the most logical ones in my opinion. Hopefully this will help you make the right decision, whatever your OCR needs may be. In the end our client decided to drop OCR from their application, for now. Considering the amount of extra work or money it would take depending on which library we chose, I wholly understand. OCR is a pretty massive undertaking, so before you make any final decisions about which library to use, I'd suggest you pick up where I left off and do some research of your own. Test some different libraries out (most paid ones have free trial versions you can implement) and make sure that the technology is accurate enough for the kinds of documents you need read. When it comes down to it, it's better to have no OCR at all than to have inaccurate, slow OCR in any kind of application, mobile or otherwise.
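One simple way to put a number on "accurate enough" is to run each candidate library over a handful of documents whose contents you already know and score the results. The Python sketch below is one possible scoring approach, not a standard benchmark: it reports the fraction of samples that match exactly and an average similarity ratio from the standard library's `difflib`. The function names and the sample values are made up for illustration.

```python
import difflib


def similarity(expected, recognized):
    """Similarity ratio between 0.0 and 1.0 after trimming whitespace."""
    expected, recognized = expected.strip(), recognized.strip()
    if not expected and not recognized:
        return 1.0
    return difflib.SequenceMatcher(None, expected, recognized, autojunk=False).ratio()


def evaluate(samples):
    """Score a non-empty list of (expected_text, recognized_text) pairs.

    Returns (exact_match_rate, mean_similarity).
    """
    exact = sum(1 for expected, got in samples if expected.strip() == got.strip())
    mean_similarity = sum(similarity(e, g) for e, g in samples) / len(samples)
    return exact / len(samples), mean_similarity


# Made-up example: one perfect read and one with a single wrong digit.
exact_rate, mean_sim = evaluate([("12345", "12345"), ("12345", "12845")])
```

For something like a check, the exact-match rate is probably the number to watch, since a single misread digit makes the whole field wrong.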

