Fumitaka KIMURA Shinji TSURUOKA Yasuji MIYAKE Malayappan SHRIDHAR
This paper discusses a lexicon-directed algorithm for recognizing unconstrained handwritten words (cursive, discrete, or mixed) such as those encountered in mail pieces. The procedure consists of binarization, presegmentation, intermediate feature extraction, segmentation-recognition, and post-processing. The steps from binarization through intermediate feature extraction are applied once per input word, while segmentation-recognition and post-processing are repeated for every lexicon word. The algorithm is essentially non-hierarchical: character segmentation and recognition are performed together in a single segmentation-recognition process. The paper describes and discusses the results of a performance evaluation on a large handwritten address block database, along with algorithm improvements aimed at higher recognition accuracy and speed. Experimental studies with about 3000 word images indicate that an overall accuracy of 91% to 98%, depending on the size of the lexicon (assumed to contain the correct word), is achievable at a processing speed of 20 to 30 words per minute on a typical workstation.
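The lexicon-directed control flow described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dynamic-programming formulation, the function names `match_word` and `rank_lexicon`, the `char_cost` callback, and the `max_merge` limit are all assumptions made for illustration. The sketch assumes presegmentation has already split the word image into `n_segments` pieces and that some character scorer returns a cost for reading pieces `i..j` as a given character.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature: cost of reading segments [i, j) as character ch.
CharCost = Callable[[str, int, int], float]


def match_word(word: str, n_segments: int, char_cost: CharCost,
               max_merge: int = 3) -> float:
    """Minimum total cost of grouping all segments into len(word) consecutive
    groups, one group per character, each group holding 1..max_merge segments.
    Segmentation and recognition are decided jointly in this single pass."""
    inf = float("inf")
    # best[k][j]: minimum cost of matching the first k characters
    # of the word to the first j segments.
    best = [[inf] * (n_segments + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, ch in enumerate(word, start=1):
        for j in range(1, n_segments + 1):
            for i in range(max(0, j - max_merge), j):
                if best[k - 1][i] < inf:
                    cost = best[k - 1][i] + char_cost(ch, i, j)
                    if cost < best[k][j]:
                        best[k][j] = cost
    return best[len(word)][n_segments]


def rank_lexicon(lexicon: Sequence[str], n_segments: int, char_cost: CharCost,
                 max_merge: int = 3) -> List[Tuple[float, str]]:
    """Repeat the matching for every lexicon word and sort by ascending cost.
    The preprocessing that produced the segments is done once, outside."""
    scored = [(match_word(w, n_segments, char_cost, max_merge), w)
              for w in lexicon]
    return sorted(scored)
```

In this sketch, a word that cannot be aligned to the available segments (for example, more characters than segments) receives an infinite cost and sorts last. A real system would replace `char_cost` with a trained character classifier and would likely normalize costs by word length; both choices are left open here.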
Fumitaka KIMURA Shuji NISHIKAWA Tetsushi WAKABAYASHI Yasuji MIYAKE Toshio TSUTSUMIDA
This paper consists of two parts. The first part is a comparative study of handwritten ZIP code numeral recognition using seventeen typical feature vectors and seven statistical classifiers. It is the counterpart of the sister paper "Handwritten Postal Code Recognition by Neural Network - A Comparative Study" in this special issue. The second part studies a procedure for synthesizing features from the original feature vectors. To reduce the dimensionality of the synthesized feature vector, the effect of dimension reduction on classification accuracy is examined. In recognition experiments using a large number of numeral samples collected from real postal ZIP codes, the best synthesized feature vector, of size 400, achieves remarkably higher recognition accuracy than any of the original feature vectors.