This is for testing the accuracy of OCR conversion of a document.


maple41
Sep 7, 2010, 08:30 AM
This is for testing the accuracy of OCR conversion of a document. If I want to be certain I have a positive match of a word or phrase containing 15 characters, how many correct characters do I need to match? For example, if I can only match 5 characters in the proper sequence, what is the probability this is a positive match? The same question applies to a numeric sequence of 15 digits.
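
One rough way to put numbers on this question is to ask how likely it is that k of the 15 positions agree purely by chance. The sketch below is only an illustration under strong assumptions that are not in the original post: every character is treated as independent and uniformly random over the alphabet (26 letters or 10 digits), which real text and real OCR errors do not obey. The function name `chance_match_probability` is hypothetical.

```python
from math import comb


def chance_match_probability(n: int, k: int, alphabet_size: int) -> float:
    """Probability that at least k of n positions agree by pure chance.

    Assumption (for illustration only): each position matches independently
    with probability 1 / alphabet_size, i.e. characters are uniform and random.
    """
    p = 1.0 / alphabet_size
    # Binomial tail: sum the probability of exactly i matches for i = k..n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Compare a 15-letter phrase (26 symbols) with a 15-digit number (10 symbols).
    for k in (5, 10, 15):
        letters = chance_match_probability(15, k, 26)
        digits = chance_match_probability(15, k, 10)
        print(f"at least {k} of 15: letters={letters:.3e} digits={digits:.3e}")
```

The smaller the chance probability, the more confident you can be that the match is real. Because digits have a smaller alphabet, a given number of matching digits is weaker evidence than the same number of matching letters.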

smoothy
Sep 7, 2010, 08:42 AM
How about using the following?

The quick brown fox jumped over the lazy dog's back 1234567890 times.

Repeat it a number of times. It contains every letter of the alphabet and every digit, and it has long been used as a test to catch garbling of transmitted messages.
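
As a rough sketch of how that test could be scored, the Python below checks that the sentence really covers every letter and digit, then counts how many characters of a known reference survive, in order, in the OCR output. It uses the standard library's `difflib.SequenceMatcher`. The helper names and the idea of feeding the OCR result in as a plain string are assumptions for illustration, not part of any particular OCR product.

```python
import string
from difflib import SequenceMatcher

# The test sentence from the post above (written here with the apostrophe).
REFERENCE = "The quick brown fox jumped over the lazy dog's back 1234567890 times."


def covers_letters_and_digits(text: str) -> bool:
    """True if text contains every ASCII letter (any case) and every digit."""
    required = set(string.ascii_lowercase + string.digits)
    return required <= set(text.lower())


def matched_characters(reference: str, ocr_output: str) -> int:
    """Count characters that match in the same order in both strings."""
    matcher = SequenceMatcher(None, reference, ocr_output, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Fraction of reference characters recovered in order by the OCR output."""
    return matched_characters(reference, ocr_output) / len(reference)


if __name__ == "__main__":
    # Hypothetical OCR result with one misread character ('i' read as '1').
    scanned = REFERENCE.replace("quick", "qu1ck")
    print(covers_letters_and_digits(REFERENCE))
    print(f"{character_accuracy(REFERENCE, scanned):.3f}")
```

Running this over many repetitions of the sentence gives an average character accuracy for a given scanner and font, which is one way to judge how many matching characters you can realistically expect.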

ScottGem
Sep 7, 2010, 09:03 AM
I'm not sure what you are asking here. OCR uses algorithms to try to determine what the actual text is. Those algorithms are proprietary secrets of the software vendor.