    browjo Posts: 4, Reputation: 1
    New Member
     
    #1

    Nov 17, 2011, 07:41 PM
    Epson Perfection V700, scanning typewritten text
    We have a zillion historical dockets that need entering into a very simple Excel spreadsheet.

    All the dockets are 50 to 75 years old. They have some "foxing" but are otherwise very well archived.

    We could go on simply typing directly into Excel, but the fact is it will take four of us at least 20 to 30 years to finish at the current rate!

    We have access to an Epson Perfection V700 Photo flatbed scanner. Currently this is operating on Adobe Professional to convert to PDF, but we can access a licenced copy of Omnipage software.

    The data is quite straightforward:

    - Date commenced (eight digits, typewritten)
    - Author (typewritten)
    - Subject matter (typewritten)
    - Location (typewritten)
    - Docket number (four digits, handwritten)
    - Date finalised (eight digits, handwritten)
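Whatever software ends up reading the dockets, the field rules listed above (eight-digit dates, four-digit docket number) can be used to catch misreads before they reach the spreadsheet. The following is only a minimal Python sketch of that idea. The field names and the `check_record` helper are hypothetical, and the date layout DDMMYYYY is an assumption, since the thread does not say how the eight digits are ordered.

```python
import re
from datetime import datetime

# Assumption: eight-digit dates are laid out as DDMMYYYY. Change DATE_FORMAT if not.
DATE_FORMAT = "%d%m%Y"
EIGHT_DIGITS = re.compile(r"\d{8}")
FOUR_DIGITS = re.compile(r"\d{4}")


def check_record(record):
    """Return the names of fields that look wrong and need a human check."""
    problems = []
    for field in ("date_commenced", "date_finalised"):
        value = record.get(field, "").strip()
        if not EIGHT_DIGITS.fullmatch(value):
            problems.append(field)  # wrong length or a non-digit such as 'O' or 'l'
            continue
        try:
            datetime.strptime(value, DATE_FORMAT)
        except ValueError:
            problems.append(field)  # eight digits, but not a real calendar date
    if not FOUR_DIGITS.fullmatch(record.get("docket_number", "").strip()):
        problems.append("docket_number")
    for field in ("author", "subject", "location"):
        if not record.get(field, "").strip():
            problems.append(field)  # empty text field
    return problems
```

Rows that come back with an empty list can go straight into the index; anything else can be queued for manual typing, which is likely to include many of the handwritten fields.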


    Does anyone have any ideas how we can tweak the scanner to improve accuracy and efficiency, please?

    - Is PDF converter software better to use than OCR here?

    - Will tweaking the contrast, resolution, etc. and using overlays decrease the error rate?

    - Is Omnipage our best option?

    John B
    cdad Posts: 12,700, Reputation: 1438
    Internet Research Expert
     
    #2

    Nov 17, 2011, 08:20 PM
    I'm going to step in with this note. Are you sure the paper can withstand the process of being put under light like that? If it can, then you're OK.

    A question I have is: at what color depth are you scanning the documents? If they are black print on white paper, then choose a low-bit black/white scan. That way there is no confusion (or less confusion) for your OCR software.
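To illustrate why a black/white scan can help, here is a rough Python sketch of what thresholding does to a grayscale page: every pixel is forced to either ink or paper, so faded background and light stains drop out. It uses Otsu's method to pick the cut-off, which is a swapped-in textbook technique and not necessarily what the Epson driver or any OCR package does internally. The function names and the list-of-lists image format are assumptions for the example.

```python
def otsu_threshold(pixels):
    """Pick the grey level (0-255) that best separates dark ink from light paper."""
    hist = [0] * 256
    for row in pixels:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0      # number of pixels at or below the candidate threshold
    sum_dark = 0    # sum of their grey levels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance: large when the two groups are well separated.
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels):
    """Return a copy of the image with ink as 0 and everything else as 255."""
    t = otsu_threshold(pixels)
    return [[0 if p <= t else 255 for p in row] for row in pixels]
```

On a test page, comparing the OCR result from the thresholded image against the one from the original colour scan should show quickly whether the lower bit depth is worth it.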
    ScottGem Posts: 64,966, Reputation: 6056
    Computer Expert and Renaissance Man
     
    #3

    Nov 18, 2011, 04:52 AM
    PDF and OCR are different things. What's not clear is what your purpose is here. Are you trying to catalog these documents or do you really need to convert the whole document?

    There are third-party services that can do this for you.

    I would NOT use Excel for this. I would use a database program like Access to catalog these files. I would first enter docket #, dates, author and location to create your record. Maybe include a brief description. You could hire temps to do this.
    browjo Posts: 4, Reputation: 1
    New Member
     
    #4

    Nov 18, 2011, 10:33 PM
    Thanks Scott & Cal,

    The purpose of the exercise is to create an online index of the contents of the many thousands of files ("dockets") we hold.

    We certainly have no intention of ever scanning whole documents, given the size of the task... just the items mentioned, namely:

    Author, date commenced, subject matter etc... so that people who want to access the archives can work out which dockets are likely to be of interest to them.

    Yes, we do know that PDF and OCR are different.

    Yes, Access might be a better database, but your suggestion, Scott, still leaves us with the enormity of the task (and no, we will not be paying temps or third-party operators for 20 to 30 years to get the work done).

    Cal - yes, the dockets are very well archived and can easily cope with the process of scanning. The originals were blue/black typewritten and handwritten on white.

    As mentioned, the dockets have some foxing and the base colour has faded to ivory/parchment. But we only need to capture black on white, of course.

    Looking forward to your thoughts

    John B
    cdad Posts: 12,700, Reputation: 1438
    Internet Research Expert
     
    #5

    Nov 19, 2011, 06:23 AM
    Try scanning a test document or two in 8-bit grayscale (black/white) and see if that helps you pull the lettering out more clearly. It would also save a ton of drive space.
    ScottGem Posts: 64,966, Reputation: 6056
    Computer Expert and Renaissance Man
     
    #6

    Nov 19, 2011, 08:58 AM
    OK, do as Dad suggested and test the scanning. You may be able to create a template that only scans certain parts of the page. But frankly, I think you may find data entry faster.
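The template idea can be pictured as a set of named boxes on the page, each cut out and recognised on its own. The Python sketch below shows only the cropping step, on an image held as a list of pixel rows. The zone names and coordinates are made up for illustration and would have to be measured from a real docket; it also assumes every docket uses the same layout.

```python
# Hypothetical zones, each given as (left, top, right, bottom) fractions of the page.
ZONES = {
    "date_commenced": (0.05, 0.05, 0.45, 0.10),
    "author": (0.05, 0.10, 0.95, 0.15),
    "subject": (0.05, 0.15, 0.95, 0.25),
    "location": (0.05, 0.25, 0.95, 0.30),
    "docket_number": (0.70, 0.05, 0.95, 0.10),
    "date_finalised": (0.55, 0.90, 0.95, 0.95),
}


def crop_zone(pixels, box):
    """Cut one rectangular zone out of an image stored as a list of rows."""
    height = len(pixels)
    width = len(pixels[0])
    left, top, right, bottom = box
    x0, x1 = round(left * width), round(right * width)
    y0, y1 = round(top * height), round(bottom * height)
    return [row[x0:x1] for row in pixels[y0:y1]]


def split_page(pixels, zones=ZONES):
    """Return a dict mapping each zone name to its cropped sub-image."""
    return {name: crop_zone(pixels, box) for name, box in zones.items()}
```

Using fractions of the page rather than pixel counts keeps the same template usable if the scan resolution is changed during testing.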
