OCR libraries, programs, SDKs
OCR libraries, programs, SDKs, experience, recommendations

1lacca
19 Apr, 2007 - 12:54 AM
Post #1

code.rascal
Joined: 11 Aug, 2005
Posts: 3,822



Thanked: 12 times
I've posted a similar thread in the Java forum, but I think it is interesting on a wider scale, too.
So do you have any experience with OCR software, libraries, or SDKs?
What would you recommend, what are the pitfalls, how do you optimize performance, etc.?
If you write about one, please mention the platform it is available for, its licence, and maybe the supported character sets.
Did you integrate a library into your software, or use a standalone application?

1lacca
RE: OCR Libraries, Programs, SDKs
29 Apr, 2007 - 01:10 AM
Post #2

So it looks like I am the only one interested in this, but since I've seen threads like this go mostly unanswered in several forums, I thought I would sum up my findings so far.
I have tried a couple of applications. As a first pass, I just wanted to see what they could do with a screenshot of a PDF file, to approximate ideal scanning. The next round will be testing the engines with real scans and finding the right scanning settings and the preprocessing the images need.
Simple OCR (commercial) was a letdown, since it didn't support Latin2 characters (or at least the demo didn't), so it went nuts on my sample page.
I was really excited about GOCR and Conjecture, since it is open source OCR that seemed like the easiest to incorporate into an application. Well, it had problems with Latin2, too. Maybe it could be extended in some way, but if there was an out-of-the-box solution...
Tesseract from Google: same problem as above, only US charset support, although since it is open source and the training code is included, it might be worth a second look if everything else fails...
And here come the big guns. These are commercial, but they seem to be fine:
ABBYY FineReader 8.0 - this one worked perfectly.
Scansoft OmniPage 15.0 - at first I started it in some kind of batch mode, and it just hung on the first page. Then I found out that it can run as a normal application, like ABBYY FR, so I tried that too, and it worked fine. I'll give the batch mode another go, because it seems to have very interesting automation that would suit my needs well.
Both of the latter two have support for a multitude of languages and charsets.
Anyway, unfortunately it looks like I'll have to go with one of the commercial ones. Both have SDK licenses (probably as COM modules) that sound interesting, although I think right now they are too expensive for my customer. So I probably won't implement a full-scale integration, and some things will have to be done manually for now, but since the amount of OCRing is minimal, it should work out fine...
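
For anyone who wants to repeat the Latin2 test on their own sample page, here is a minimal sketch. It assumes "Latin2" means ISO 8859-2 and that you have a hand-typed reference text for the page. It lists the non-ASCII Latin2 characters in the reference and counts how many of each show up in the engine's output. The function name and the sample strings are made up for illustration; no OCR engine is called here.

```python
from collections import Counter


def latin2_survival(reference: str, ocr_output: str) -> dict:
    """Map each non-ASCII ISO 8859-2 character found in `reference`
    to a tuple (count in reference, count in ocr_output)."""
    ref_counts = Counter(reference)
    out_counts = Counter(ocr_output)
    result = {}
    for ch, expected in ref_counts.items():
        if ch.isascii():
            continue  # plain ASCII is not the interesting case
        try:
            ch.encode("iso8859_2")
        except UnicodeEncodeError:
            continue  # not representable in Latin2, so skip it
        result[ch] = (expected, out_counts.get(ch, 0))
    return result


if __name__ == "__main__":
    # Hypothetical sample: reference text and an accent-stripped OCR result.
    reference = "árvíztűrő tükörfúrógép"
    ocr_output = "arvizturo tukorfurogep"
    survival = latin2_survival(reference, ocr_output)
    for ch, (expected, found) in sorted(survival.items()):
        print(f"{ch!r}: expected {expected}, found {found}")
```

This only compares character counts, so it says nothing about whether a character landed in the right place; it is just a quick way to see which accented letters an engine drops.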


enlai_chu
RE: OCR Libraries, Programs, SDKs
3 May, 2007 - 08:51 PM
Post #3

New D.I.C Head

Joined: 3 May, 2007
Posts: 1


I highly recommend TOCR (http://www.transym.com). It's no-frills and it works really, really well. It's not free, but it's CHEAP for what you get. And support is responsive, too. Unfortunately, it doesn't run on *nix, only Windows. You get a library you can work into your code. All I had to do was modify the VB demo for my needs, and it's been excellent.

Enlai
PS no, I don't work for them - I'm a paying customer!

1lacca
RE: OCR Libraries, Programs, SDKs
3 May, 2007 - 11:48 PM
Post #4

Thank you Enlai, I'm downloading the trial right now, it looks promising!

1lacca
RE: OCR Libraries, Programs, SDKs
4 May, 2007 - 12:38 AM
Post #5

So I've tried TOCR, and it was about on par with GOCR. It doesn't support Latin2 characters and simply skipped a large part of my test data, but what it recognized was 99% correct. However, the provided SDK might be useful in some cases.
I've only tested it with the application provided, so if there is a way to feed it some other charsets, it might be useful (although it seemed to have some problems with accents...).
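
The post doesn't say how the 99% figure was measured. One common way to put a number on it is character-level accuracy: the edit distance between the OCR output and a hand-typed reference, divided by the reference length. The sketch below assumes that definition; the function names are illustrative and not part of any of the products mentioned.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Character accuracy in [0, 1], relative to the reference length."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    # Hypothetical strings, only to show the call.
    print(char_accuracy("OCR libraries", "OCR 1ibraries"))
```

Note that with this measure skipped text counts as errors. To get a figure for "what it recognized" only, compare the output against just the part of the reference the engine actually attempted.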

premwithme
RE: OCR Libraries, Programs, SDKs
30 Dec, 2007 - 10:49 AM
Post #6

New D.I.C Head

Joined: 30 Dec, 2007
Posts: 1

Hey,


I am new to this forum. I have downloaded the GOCR OCR software,

but I don't know how to run it.
There is no EXE file in the downloaded folder, only one batch file, and when I double-click that, a DOS window opens and closes within a second.

Can anyone help me?

bye

iowa
RE: OCR Libraries, Programs, SDKs
16 Sep, 2008 - 06:40 PM
Post #7

New D.I.C Head

Joined: 16 Sep, 2008
Posts: 3

TOCR is nice

