Most conventional methods used in character recognition extract geometrical features, such as stroke direction and connectivity, and compare them with reference patterns in a stored dictionary. Unfortunately, geometrical features are easily degraded by blurs and stains, and by the graphical designs such as used in Japanese newspaper headlines. This noise must be removed before recognition commences, but no preprocessing method is perfectly accurate. This paper proposes a method for recognizing degraded characters as well as characters printed on graphical designs. This method extracts features from binary images, and a new similarity measure, the complementary similarity measure, is used as a discriminant function; it compares the similarity and dissimilarity of binary patterns with reference dictionary patterns. Experiments are conducted using the standard character database ETL-2, which consists of machine-printed Kanji, Hiragana, Katakana, alphanumeric, and special characters. The results show that our method is much more robust against noise than the conventional geometrical-feature method. It also achieves high recognition rates of over 97% for characters with textured foregrounds, over 99% for characters with textured backgrounds, over 98% for outline fonts and over 99% for reverse contrast characters. The experiments for recognizing both the fontstyles and character category show that it also achieves high recognition rates against noise.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Minako SAWAKI, Norihiro HAGITA, "Recognition of Degraded Machine-Printed Characters Using a Complementary Similarity Measure and Error-Correction Learning" in IEICE TRANSACTIONS on Information,
vol. E79-D, no. 5, pp. 491-497, May 1996, doi: .
Abstract: Most conventional methods used in character recognition extract geometrical features, such as stroke direction and connectivity, and compare them with reference patterns in a stored dictionary. Unfortunately, geometrical features are easily degraded by blurs and stains, and by the graphical designs such as used in Japanese newspaper headlines. This noise must be removed before recognition commences, but no preprocessing method is perfectly accurate. This paper proposes a method for recognizing degraded characters as well as characters printed on graphical designs. This method extracts features from binary images, and a new similarity measure, the complementary similarity measure, is used as a discriminant function; it compares the similarity and dissimilarity of binary patterns with reference dictionary patterns. Experiments are conducted using the standard character database ETL-2, which consists of machine-printed Kanji, Hiragana, Katakana, alphanumeric, and special characters. The results show that our method is much more robust against noise than the conventional geometrical-feature method. It also achieves high recognition rates of over 97% for characters with textured foregrounds, over 99% for characters with textured backgrounds, over 98% for outline fonts and over 99% for reverse contrast characters. The experiments for recognizing both the fontstyles and character category show that it also achieves high recognition rates against noise.
URL: https://global.ieice.org/en_transactions/information/10.1587/e79-d_5_491/_p
Copy
@ARTICLE{e79-d_5_491,
author={Minako SAWAKI, Norihiro HAGITA, },
journal={IEICE TRANSACTIONS on Information},
title={Recognition of Degraded Machine-Printed Characters Using a Complementary Similarity Measure and Error-Correction Learning},
year={1996},
volume={E79-D},
number={5},
pages={491-497},
abstract={Most conventional methods used in character recognition extract geometrical features, such as stroke direction and connectivity, and compare them with reference patterns in a stored dictionary. Unfortunately, geometrical features are easily degraded by blurs and stains, and by the graphical designs such as used in Japanese newspaper headlines. This noise must be removed before recognition commences, but no preprocessing method is perfectly accurate. This paper proposes a method for recognizing degraded characters as well as characters printed on graphical designs. This method extracts features from binary images, and a new similarity measure, the complementary similarity measure, is used as a discriminant function; it compares the similarity and dissimilarity of binary patterns with reference dictionary patterns. Experiments are conducted using the standard character database ETL-2, which consists of machine-printed Kanji, Hiragana, Katakana, alphanumeric, and special characters. The results show that our method is much more robust against noise than the conventional geometrical-feature method. It also achieves high recognition rates of over 97% for characters with textured foregrounds, over 99% for characters with textured backgrounds, over 98% for outline fonts and over 99% for reverse contrast characters. The experiments for recognizing both the fontstyles and character category show that it also achieves high recognition rates against noise.},
keywords={},
doi={},
ISSN={},
month={May},}
Copy
TY - JOUR
TI - Recognition of Degraded Machine-Printed Characters Using a Complementary Similarity Measure and Error-Correction Learning
T2 - IEICE TRANSACTIONS on Information
SP - 491
EP - 497
AU - Minako SAWAKI
AU - Norihiro HAGITA
PY - 1996
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E79-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 1996
AB - Most conventional methods used in character recognition extract geometrical features, such as stroke direction and connectivity, and compare them with reference patterns in a stored dictionary. Unfortunately, geometrical features are easily degraded by blurs and stains, and by the graphical designs such as used in Japanese newspaper headlines. This noise must be removed before recognition commences, but no preprocessing method is perfectly accurate. This paper proposes a method for recognizing degraded characters as well as characters printed on graphical designs. This method extracts features from binary images, and a new similarity measure, the complementary similarity measure, is used as a discriminant function; it compares the similarity and dissimilarity of binary patterns with reference dictionary patterns. Experiments are conducted using the standard character database ETL-2, which consists of machine-printed Kanji, Hiragana, Katakana, alphanumeric, and special characters. The results show that our method is much more robust against noise than the conventional geometrical-feature method. It also achieves high recognition rates of over 97% for characters with textured foregrounds, over 99% for characters with textured backgrounds, over 98% for outline fonts and over 99% for reverse contrast characters. The experiments for recognizing both the fontstyles and character category show that it also achieves high recognition rates against noise.
ER -