Subject: OCRchie
Date: Mon, 12 Mar 2001 13:10:28 -0800 (PST)
From: kjdavies@telus.net
To: fateman@cs.berkeley.edu

Hi Richard,

I've downloaded OCRchie and want to use it in a project at work. To do so, I've more or less reimplemented the code, keeping most of the algorithmic base. I've also removed all of the TCL code from my copy of the source; I wanted a simple engine and TCL was not needed.

I will admit now that the symbol set is quite small. All I need is to be able to recognize digits, with no concern about letters, symbols, or words (inter-word gaps specifically).

A few improvements I made:

  . when processing an image, a sufficient memory block is allocated once, instead of a row-by-row allocation. This simplified the code considerably and made it run faster. The entire image is then loaded into this block.

  . I ditched the run-length encoding; I found that casting rays through the original image worked just as well and I was able to avoid a couple of semi-expensive steps.

  . removed the cleverness from the skew determination and used brute force. It now examines the entire range in one-degree intervals, then tenth-degree intervals about the degree with the highest standard deviation. The program runs a little slower, but I've found (with the sample image set included) that I'm always within 0.4 degrees of the skew.

  . separated the file manipulation from the raster manipulation (the base imaging code was taken from another project, as may be obvious from the other transformations in the ITK directory).

  . made the dictionary entirely text-based.

  . changed the way the lookups were done. I use a std::map<> to (hopefully) find exactly the bit pattern determined; if I don't find exactly the right one, I examine all entries, find the closest (fewest bits different), and apply the symbol found. I then add the new bit pattern to the std::map<>. I find this works rather well for the admittedly small symbol set I use.

  . changed the grid used for bit pattern determination from 5x5 to 5x6 (my bit pattern keys are 32 bits; this lets me get a little more precision essentially free).

  . changed the letter segmentation from 'contiguous blackspace' to 'rectangle with black, divided by white'. This was easier to implement and met my needs, but could cause some problems when ligatures and kerning come into play.

  . made the threshold for 'blackspace marking' configurable.

Some other changes I've been considering are:

  . allowing more than single glyphs to be found (ligatures and kerning can be a problem; I'd like to be able to have 'fi' and 'VA' come out appropriately when OCRing printed text).

  . ... I'm sure there were one or two more, but I can't remember them now. I think one was to examine the average character width, then examine the whitespace between characters to hopefully find spaces; a similar procedure could be used for finding gaps between lines.

I'll send a second message with the source tarball attached.

I'd like to thank you for your work; this project has made it easy to build the OCR engine I need... simple, quick, and effective.

Keith

--
Keith Davies
kjdavies@telus.net
http://a1a90975.sympatico.bconnected.net/kjdavies
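
For illustration, the lookup described in the list of improvements (an exact match in a std::map<>, falling back to the entry with the fewest differing bits, then caching the new pattern) might be sketched roughly as below. This is not the code from the tarball. The names Pattern, GlyphDictionary, add, lookup, and hammingDistance are hypothetical, and the sketch assumes each key is a 32-bit integer holding the 30 bits of a 5x6 grid.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical sketch, not the code from the tarball.
// A pattern is assumed to hold the 30 cells of a 5x6 grid in a 32-bit key.
typedef std::uint32_t Pattern;

// Number of bit positions in which two patterns differ.
inline int hammingDistance(Pattern a, Pattern b) {
    Pattern diff = a ^ b;
    int count = 0;
    while (diff != 0) {
        diff &= diff - 1;  // clear the lowest set bit
        ++count;
    }
    return count;
}

class GlyphDictionary {
public:
    // Register a known pattern for a symbol.
    void add(Pattern pattern, char symbol) { entries_[pattern] = symbol; }

    // Look up a pattern. Returns false only when the dictionary is empty.
    // On an inexact match, the closest entry's symbol is used and the new
    // pattern is cached so the next lookup of it is an exact hit.
    bool lookup(Pattern pattern, char& symbol) {
        std::map<Pattern, char>::const_iterator exact = entries_.find(pattern);
        if (exact != entries_.end()) {
            symbol = exact->second;
            return true;
        }
        if (entries_.empty()) {
            return false;
        }

        // Linear scan over all entries for the fewest differing bits.
        std::map<Pattern, char>::const_iterator best = entries_.begin();
        int bestDistance = hammingDistance(pattern, best->first);
        for (std::map<Pattern, char>::const_iterator it = entries_.begin();
             it != entries_.end(); ++it) {
            int distance = hammingDistance(pattern, it->first);
            if (distance < bestDistance) {
                best = it;
                bestDistance = distance;
            }
        }

        symbol = best->second;
        entries_[pattern] = symbol;  // cache the newly seen pattern
        return true;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::map<Pattern, char> entries_;
};
```

With this sketch, a pattern that differs from a stored one by a single bit resolves to the stored symbol and is then added to the map. One consequence of caching in this way is that a misread pattern stays in the dictionary under the wrong symbol. That is probably tolerable for a small, digits-only symbol set, but a larger set might want a maximum accepted distance.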