Following on from my previous post on this subject, I promised an update once I’d seen the Kofax demo.
Kofax works in a similar manner to the other more advanced solutions I’ve looked at. I did learn a thing or two from the demo, though, that I thought I’d mention:
1. To improve the metadata on the document, use a reference database to ensure you’ve got the correct information. For example, if the Client Number is a piece of metadata and you have an Access database with all your client numbers, you can connect to this database and retrieve additional client information (e.g. name, address or whatever else you want to map from the database).
2. OCR is possible but limited. For the forms my client wants to scan, all of the information is handwritten. While the various advanced solutions I’ve looked at all use OCR, it takes a lot of work to get it right and it’s only 60-80% correct at best. So this is not suitable for bulk scanning on the scale I’m looking at (200,000 pages). The one case in which the accuracy of OCR could be improved is where the data on the document is written in cells or boxes. OCR has a better chance of reading data laid out this way.
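To make point 1 a bit more concrete, here is a rough sketch of a reference-database lookup in Python. It is not how Kofax does it internally (I only saw the demo), and it uses an in-memory SQLite table as a stand-in for the Access database. The table name, column names, sample client and the `needs_review` flag are all made up for illustration.

```python
import sqlite3


def build_reference_db():
    """Create a throwaway reference database (stand-in for the Access db)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per client, keyed on the client number.
    conn.execute(
        "CREATE TABLE clients ("
        "client_number TEXT PRIMARY KEY, name TEXT, address TEXT)"
    )
    conn.execute(
        "INSERT INTO clients VALUES (?, ?, ?)",
        ("C1001", "Example Client Ltd", "1 Sample Street"),
    )
    conn.commit()
    return conn


def enrich_metadata(conn, metadata):
    """Return a copy of the scanned metadata with client details added.

    If the client number is not in the reference database, the document
    is flagged for manual review instead of being filed with bad data.
    """
    row = conn.execute(
        "SELECT name, address FROM clients WHERE client_number = ?",
        (metadata["client_number"],),
    ).fetchone()

    enriched = dict(metadata)  # leave the caller's dict untouched
    if row is None:
        enriched["needs_review"] = True
        return enriched

    enriched["client_name"], enriched["client_address"] = row
    enriched["needs_review"] = False
    return enriched
```

Calling `enrich_metadata(conn, {"client_number": "C1001"})` returns the original metadata plus the client name and address from the reference table, while an unknown client number comes back with `needs_review` set to `True`. The useful side effect is that the lookup doubles as validation: a number that doesn’t match anything is probably a keying or recognition error.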
We’re outsourcing the bulk scanning to a third party who will bring in scanners and people to get it done. All we need to do is ensure that the metadata captured during scanning is mapped into SharePoint correctly... but that’s a post for another day.