Subject: OCRchie
Date: Mon, 12 Mar 2001 13:10:28 -0800 (PST)
From: kjdavies@telus.net
To: fateman@cs.berkeley.edu

Hi Richard,

I've downloaded OCRchie and want to use it in a project at work.  To
do so, I've more or less reimplemented the code, keeping most of the
algorithmic base.  I've also removed all of the TCL code from my copy
of the source; I wanted a simple engine and TCL was not needed.  I
will admit now that my symbol set is quite small: all I need is to
recognize digits, with no concern for letters, symbols, or words
(inter-word gaps specifically).

A few improvements I made:

. when processing an image, a sufficient memory block is allocated
once, instead of a row-by-row allocation.  This simplified the code
considerably and made it run faster.  The entire image is then
loaded into this block.

. I ditched the run-length encoding; I found that casting rays
through the original image worked just as well, and I was able to
avoid a couple of semi-expensive steps.

. removed the cleverness from the skew determination and used
brute force.  It examines the entire range in one-degree steps,
then in tenth-degree steps around the degree with the highest
standard deviation.  The program runs a little slower, but I've
found (with the sample image set included) that I'm always within
0.4 degrees of the actual skew.

. separated the file manipulation from the raster manipulation (the
base imaging code was taken from another project, as may be obvious
from the other transformations in the ITK directory).

. made the dictionary entirely text-based

. changed the way the lookups were done - I use a std::map<> to
(hopefully) find the exact bit pattern determined; if there is no
exact match, I examine all entries, find the closest (fewest bits
different), and apply its symbol.  I then add the new bit pattern
to the std::map<>.  I find this works rather well for the
admittedly small symbol set I use.

. changed the grid used for bit pattern determination from 5x5 to
5x6 (my bit pattern keys are 32 bits, so the 30 cells still fit;
this gets me a little more precision essentially for free).

. changed the letter segmentation from 'contiguous blackspace' to
'rectangle with black, divided by white'.  This was easier to
implement and met my needs, but could cause some problems when
ligatures and kerning come into play.

. made the threshold for 'blackspace marking' configurable.
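
As a rough illustration of the lookup change described above, here is
a minimal sketch.  It is not the code from the tarball: the names
Pattern, bitDifference and lookupSymbol are made up for this example,
and it assumes the 5x6 grid is packed into the low 30 bits of a
32-bit key.

```cpp
#include <bitset>
#include <cstdint>
#include <limits>
#include <map>

// Hypothetical names throughout; not taken from the OCRchie source.
using Pattern = std::uint32_t;  // assumed: 5x6 grid packed into the low 30 bits

// Number of grid cells in which two patterns differ.
inline int bitDifference(Pattern a, Pattern b) {
    return static_cast<int>(std::bitset<32>(a ^ b).count());
}

// Return the symbol for `key`.  On an exact miss, scan every entry for
// the closest pattern (fewest differing bits), then store `key` under
// that symbol so the next lookup is an exact hit.
// Returns '?' only when the dictionary is empty.
inline char lookupSymbol(std::map<Pattern, char>& dict, Pattern key) {
    auto exact = dict.find(key);
    if (exact != dict.end()) return exact->second;

    if (dict.empty()) return '?';

    int best = std::numeric_limits<int>::max();
    char symbol = '?';
    for (const auto& entry : dict) {
        const int d = bitDifference(key, entry.first);
        if (d < best) {
            best = d;
            symbol = entry.second;
        }
    }
    dict[key] = symbol;  // remember the new pattern
    return symbol;
}
```

One consequence of adding every near miss to the map is that a wrong
guess is remembered too, so this approach probably suits a small,
well-separated symbol set such as digits better than a full alphabet.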

Some other changes I've been considering are:

. allowing more than single glyphs to be found (ligatures and
kerning can be a problem; I'd like to be able to have 'fi' and
'VA' come out appropriately when OCRing printed text).

. ... I'm sure there were one or two more, but I can't remember them
now.  I think one was to examine the average character width, then
examine the whitespace between characters to hopefully find spaces;
a similar procedure could be used for finding gaps between lines.
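
The space-finding idea above might look something like the sketch
below.  Nothing here exists in the code yet: Box, findSpaces and the
ratio parameter are hypothetical, and the default of half the average
character width is only a guess that would need tuning.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical glyph extent on one text line: [left, right) in pixels.
struct Box {
    int left;
    int right;
};

// For each gap between neighbouring glyphs (ordered left to right),
// report whether it looks like a space.  `ratio` is an assumed tuning
// knob: a gap wider than ratio * (average glyph width) counts.
inline std::vector<bool> findSpaces(const std::vector<Box>& boxes,
                                    double ratio = 0.5) {
    std::vector<bool> spaces;
    if (boxes.size() < 2) return spaces;

    double total = 0.0;
    for (const Box& b : boxes) total += b.right - b.left;
    const double avgWidth = total / static_cast<double>(boxes.size());

    for (std::size_t i = 1; i < boxes.size(); ++i) {
        const int gap = boxes[i].left - boxes[i - 1].right;
        spaces.push_back(gap > ratio * avgWidth);
    }
    return spaces;
}
```

The same function could be reused for gaps between lines by feeding it
the top and bottom of each line instead of the left and right of each
character.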

I'll send a second message with source tarball attached.  I'd like
to thank you for your work; this project has made it easy to build
the OCR engine I need... simple, quick, and effective.

Keith
--
Keith Davies
kjdavies@telus.net
http://a1a90975.sympatico.bconnected.net/kjdavies