Toshio TSUTSUMIDA Toshihiro MATSUI Tadashi NOUMI Toru WAKAHARA
Through comparing the results of two successive IPTP Character Recognition Competitions which focused on 3-digit handprinted postal codes, we herein analyze the methodologies of the submitted algorithms along with the substituted or rejected patterns of these algorithms. Regarding their methodologies, lesser diversity was apparent specifically concerning the contour-chain code based on local stroke directions and statistical discriminant functions for feature extraction and discrimination. Analysis of the patterns demonstrated that the misrecognized patterns being most often improved were categorized as a decrease in peculiarly shaped handwritten characters or heavy-handed and disconnected strokes. However, most of the remaining misrecognitions were still classed as peculiarly shaped handwriting as commonly shared between the best three algorithms. From these analyses, we could delineate a direction to be taken for developing more effective methodologies and clarify the remaining problems to be overcome by the subsequent intensive research. Furthermore, we evaluate in this article our multi-expert recognition system for achieving higher recognition performances by means of combining complementary recognition algorithms. We performed a subsequent investigation of the Candidate Appearance Likelihood Method using novel experimental conditions and a new examination of the application of the neural network as the combining method for accumulating the broader candidate appearances. The results obtained confirm that combining through the neural network constitutes one of the most effective ways of making the multi-expert recognition system a reality.
Keiji GYOHTEN Tomoko SUMIYA Noboru BABAGUCHI Koh KAKUSHO Tadahiro KITAHASHI
This paper describes COCE (COordinative Character Extractor), a method for extracting printed Japanese characters and their character strings from all sorts of document images. COCE is based on a multi-agent system in which each agent tries to find a character string and extracts the characters in it. For adaptability, the agents are allowed to look after arbitrary parts of documents and extract the characters using only knowledge that is independent of the layout. Moreover, the agents check and correct their results, sometimes with the help of other agents. Experimental results verify the effectiveness of our approach.
Naoaki YAMANAKA Francis PITCHO Hiroaki SATO
This letter studies the Peak Cell Rate (PCR) policing of ATM connections that consist of multiple cell flow components. It is shown that the conventional methods proposed for policing the aggregate flow do not use the network's resources efficiently. This letter proposes a simple and efficient UPC (Usage Parameter Control) mechanism based on a tandem leaky bucket for multi-component ATM connections. The results show that network resource requirements can be minimized, with reasonable hardware complexity.
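As a rough illustration of leaky-bucket policing, the sketch below chains two buckets so that a cell must conform to both. It uses the virtual-scheduling form of the Generic Cell Rate Algorithm, GCRA(I, L). The class name `GCRA`, the function `police_tandem`, the parameter values, and the rule that a cell rejected by the first stage never reaches the second are assumptions made for illustration; the letter's actual tandem mechanism is not specified in the abstract.

```python
class GCRA:
    """Virtual-scheduling form of GCRA(I, L); a hypothetical helper class."""

    def __init__(self, increment, limit):
        self.increment = increment  # I: nominal inter-cell time (1 / peak cell rate)
        self.limit = limit          # L: tolerance, e.g. a CDV tolerance
        self.tat = 0.0              # theoretical arrival time of the next cell

    def conforms(self, t):
        """Return True and update state if a cell arriving at time t conforms."""
        if t < self.tat - self.limit:
            return False            # cell arrived too early; state is unchanged
        self.tat = max(t, self.tat) + self.increment
        return True


def police_tandem(arrivals, stages):
    """Pass each arrival through the stages in order (assumed tandem rule).

    all() short-circuits, so a cell rejected by an earlier stage is never
    offered to a later one.
    """
    return [all(stage.conforms(t) for stage in stages) for t in arrivals]


if __name__ == "__main__":
    stages = [GCRA(increment=2, limit=0), GCRA(increment=4, limit=4)]
    print(police_tandem([0, 1, 2, 4, 6, 8], stages))
```

Here the first stage enforces a tight peak rate with no tolerance, while the second enforces a lower rate with some tolerance; the specific split between stages is illustrative only.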
Most conventional methods used in character recognition extract geometrical features, such as stroke direction and connectivity, and compare them with reference patterns in a stored dictionary. Unfortunately, geometrical features are easily degraded by blurs and stains, and by graphical designs such as those used in Japanese newspaper headlines. This noise must be removed before recognition commences, but no preprocessing method is perfectly accurate. This paper proposes a method for recognizing degraded characters as well as characters printed on graphical designs. The method extracts features from binary images and uses a new similarity measure, the complementary similarity measure, as a discriminant function; it compares the similarity and dissimilarity of binary patterns with reference dictionary patterns. Experiments are conducted using the standard character database ETL-2, which consists of machine-printed Kanji, Hiragana, Katakana, alphanumeric, and special characters. The results show that our method is much more robust against noise than the conventional geometrical-feature method. It also achieves high recognition rates of over 97% for characters with textured foregrounds, over 99% for characters with textured backgrounds, over 98% for outline fonts, and over 99% for reverse-contrast characters. Experiments on recognizing both the font style and the character category show that the method also achieves high recognition rates in the presence of noise.
Ju YE Masahiro TANAKA Tetsuzo TANINO
The efficiency of genetic algorithms has been attracting the attention of the genetic algorithm community. Over the last decade, considerable research has focused on improving the performance of genetic algorithms. However, this work generally stays within the framework of the natural evolutionary mechanism, and the major genetic operators, crossover and mutation, are activated by prior probabilities. An operator based on a prior probability is random; that is, unexpected individuals are frequently operated on, while expected individuals are sometimes not operated on. Moreover, since the evaluation function is the link between the genetic algorithm and the problem to be solved, it provides the heuristic information for evolutionary search. Therefore, how this kind of heuristic information (present and past) is used influences the efficiency of evolutionary search. This paper presents, as an attempt, a eugenics-based genetic algorithm (EGA) -- a genetic algorithm that reflects the human's decision will (eugenics) and fully utilizes the heuristic information provided by the evaluation function for the decisions. In other words, EGA = evolutionary mechanisms + human's decision will + heuristic information. In EGA, the ideas of positive eugenics and negative eugenics are applied as the principle of selection, and selections are activated not by prior probabilities but by the evaluation values of individuals. A method of genealogical chain-based selection for mutation is proposed, which avoids the blindness of stochastic mutation and the disruptive problem of mutation. A control strategy of reasonable competitions is proposed, which brings the effects of crossover and mutation into full play.
Three examples are given to show the good performance of EGA: the minimization of a standard optimization test function (De Jong's test function F2), a typical combinatorial optimization problem (the traveling salesman problem), and a nonlinear system identification problem.
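For reference, De Jong's test function F2 is the two-variable Rosenbrock function, usually stated on the domain -2.048 <= x1, x2 <= 2.048 with its minimum value 0 at (1, 1). A minimal sketch follows; the function name `de_jong_f2` and the constant `F2_BOUNDS` are chosen here for illustration and do not come from the paper.

```python
# Search domain commonly used with F2 (applies to both variables).
F2_BOUNDS = (-2.048, 2.048)


def de_jong_f2(x1, x2):
    """De Jong's F2 (two-variable Rosenbrock function), to be minimized."""
    return 100.0 * (x1 * x1 - x2) ** 2 + (1.0 - x1) ** 2


if __name__ == "__main__":
    print(de_jong_f2(1.0, 1.0))   # value at the known minimizer (1, 1)
    print(de_jong_f2(0.0, 0.0))   # value at the origin
```

The narrow curved valley around the minimum is what makes this function a common stress test for search operators.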
Among the various characters invented by human beings, Kanji (Chinese characters) are outstanding in their diversity and complexity, in contrast to Roman alphanumerics. Machine recognition of handwritten characters therefore requires particular techniques. This paper concerns commercially available optical character readers (OCRs) and the recognition techniques used therein. Methods of character recognition are classified by pattern type and by feature extraction method in order to discuss the present state of character recognition. With regard to structural analysis, the proposed techniques are classified and discussed according to the extraction method of line segments and axial correlation. Finally, various techniques are compared with one another using a common database, so as to understand the present state of character recognition and to discuss technical trends.
Masato SUZUKI Nei KATO Hirotomo ASO Yoshiaki NEMOTO
In the recognition of handprinted characters, it is important to remove character distortions caused by writers' habits. Many image conversion methods have been proposed to remove distortions and obtain better features, but some distortions cannot be removed by these methods. One example is the case of parallel strokes that are spread out in a fan shape. In this paper, we propose a new image conversion method, "Transformation based on Partial Inclination Detection (TPID)", which is applied just before normalization and is intended to remove several kinds of distortions in the image of each character. TPID constructs transformation functions from inclination angles detected in some subspaces of the character's image, and converts the image using these transformation functions. TPID is especially suitable for correcting the inclinations of horizontal and vertical strokes of a character, which has a powerful impact on the quality of the characteristic features. In recognition experiments using ETL9B, the largest database of handprinted characters in Japan, we have obtained a recognition rate of 99.08%, which is the best to our knowledge.
Kanad KEENI Hiroshi SHIMODAIRA Tetsuro NISHINO Yasuo TAN
Devanagari is the most widely used script in India. Here, a method is introduced for recognizing Devanagari characters using a neural network. The proposed method reduces the number of output units necessary compared with a conventional neural network in which classification is done on a winner-take-all basis. An automatic coding procedure for representing the output layer of the network and a different method for the final classification are also proposed. Along with the automatic coding procedure, a heuristic method for representing the output units by exploiting the structural information of Devanagari characters is also demonstrated. In addition, it is shown by random representation of the output layer that the representation affects the generalization and performance of the network. The proposed automatic representation gave a recognition rate of 98.09% for 44 categories.
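To make the idea of reducing output units concrete, the sketch below encodes 44 categories with 6 binary output units instead of 44 winner-take-all units, and classifies by choosing the nearest codeword. Plain binary codes and squared-distance decoding are assumptions used purely for illustration; the abstract does not describe the paper's automatic coding procedure or its final classification rule, and the names `make_codes` and `decode` are hypothetical.

```python
def make_codes(num_classes):
    """Assign each class index a fixed-length binary codeword (assumed scheme)."""
    bits = max(1, (num_classes - 1).bit_length())
    return [[(c >> b) & 1 for b in reversed(range(bits))]
            for c in range(num_classes)]


def decode(outputs, codes):
    """Return the class whose codeword is closest to the network outputs."""
    def squared_distance(code):
        return sum((o - c) ** 2 for o, c in zip(outputs, code))
    return min(range(len(codes)), key=lambda k: squared_distance(codes[k]))


if __name__ == "__main__":
    codes = make_codes(44)
    print(len(codes[0]))                                   # number of output units
    print(decode([0.1, 0.9, 0.2, 0.8, 0.9, 0.1], codes))   # nearest class index
```

With such a coding, which classes share bits becomes a design choice, which is consistent with the abstract's observation that the output representation affects generalization.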
Table-form document structure analysis is an important problem in the document processing domain. This paper presents a new method called Box-Driven Reasoning (BDR) to robustly analyze the structure of table-form documents that include touching characters and broken lines. Real documents are copied repeatedly and overlaid with printed data, resulting in characters that touch cells and lines that are broken. Most previous methods employ a line-oriented approach, but touching characters and broken lines make the procedure fail at an early stage. BDR deals with regions directly in contrast with other previous methods and a reduced resolution image is introduced to supplement information deteriorated by noise. Experimental tests show that BDR reliably recognizes cells and strings in document images with touching characters and broken lines.
Tsutomu MIYASATO Haruo NOMA Fumio KISHINO
This paper describes the results of tests that measured the allowable delay between images and tactile information via a force feedback device. In order to investigate the allowable delay, two experiments were performed: 1) subjective evaluation in real space and 2) subjective evaluation in virtual space using a force feedback device.
For similarity methods to work well, the image must be blurred before being input. However, the relationship between the blurring operation and the similarity is not fully understood. To clarify this relationship, in this paper the effect of blurring is investigated by expressing a figure f(x) as the sum of higher derivatives of f(x, σ), and a simple similarity between figures is then mathematically formulated in terms of the relation between visual patterns. By modifying this formulation, we propose pluralized simple similarity to increase the allowance in different view of multiple similarity. The similarity maintains higher allowance without any discernible loss of distinguishing power. We verify the effectiveness of the pluralized simple similarity through some experiments.
"Continuous nonlinearity" is stressed as a fundamental principle in pattern recognition including handprinted Kanji character recognition. "Continuity" in template matching and "spatial nonlinearity" in structural analysis should be unified toward deriving a higher level of recognition algorithm. At the same time, continuous nonlinearity in the temporal axis is important, as is the case of simultaneous processing of segmentation and recognition for touching characters. The above viewpoint is discussed in the following examples: nonlinear normalization, directional pattern matching, locally maximized similarity, relaxation matching, dynamic programming matching, segmentation of character string using dynamic programming, and exhaustive matching for character extraction on complex background.
Tetsuo ASANO Desh RANJAN Thomas ROOS
Digital halftoning is a well-known technique in image processing to convert an image having several bits for brightness levels into a binary image consisting only of black and white dots. A great number of algorithms have been presented for this problem, some of which have been evaluated only by visual comparison. In this paper we formulate the digital halftoning problem as a combinatorial problem which allows an exact solution with graph-theoretic tools. For this, we consider a d-dimensional grid of $n := N^d$ pixels ($d \geq 1$). For each pixel, we define a so-called k-neighborhood, $k \in \{0, \ldots, N-1\}$, which is the set of at most $(2k+1)^d$ pixels that can be reached from the current pixel within a distance of k. In order to solve the digital halftoning problem, we minimize the sum of distances of all k-neighborhoods between the original picture and the halftoned one. We show that the problem can be solved in linear time in the one-dimensional case, while a polynomial-time algorithm looks hopeless in higher dimensions, including the usual two-dimensional case. We present an exact algorithm for the one-dimensional case which runs in O(n) time if k is regarded as a constant. For the two-dimensional case we present fast approximation techniques based on space-filling curves. An experimental comparison of several implementations of approximate algorithms shows that our algorithms are of practical interest.
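The objective described above can be sketched for the one-dimensional case as follows. The sketch assumes that the "distance" of a k-neighborhood is the absolute difference between the summed brightness of the original and of the halftoned image over that neighborhood; this choice, and the names `neighborhood_cost` and `brute_force_halftone`, are assumptions for illustration. The exhaustive search is exponential in n and is not the paper's linear-time algorithm; it only makes the objective concrete for tiny inputs.

```python
from itertools import product


def neighborhood_cost(original, halftone, k):
    """Sum over all pixels of |sum(original) - sum(halftone)| on the k-neighborhood."""
    n = len(original)
    total = 0.0
    for i in range(n):
        lo, hi = max(0, i - k), min(n, i + k + 1)  # neighborhood clipped at borders
        total += abs(sum(original[lo:hi]) - sum(halftone[lo:hi]))
    return total


def brute_force_halftone(original, k):
    """Try every binary image of the same length and keep the cheapest one."""
    best = min(product((0, 1), repeat=len(original)),
               key=lambda b: neighborhood_cost(original, b, k))
    return list(best)


if __name__ == "__main__":
    gray = [0.5, 0.5, 0.5, 0.5]   # brightness values in [0, 1]
    print(brute_force_halftone(gray, k=1))
```

Clipping the neighborhood at the image border is why a neighborhood holds at most, rather than exactly, 2k + 1 pixels in one dimension.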
This letter proposes a new shaping algorithm (CRSA: CDV Reduction Shaping Algorithm) that can freely reduce the maximum CDV value of a cell stream to any predetermined value. When CRSA is used, there is a trade-off between the shaping delay and the reduction in maximum CDV achieved. The output of a shaper using CRSA (CR-shaper) satisfies the Peak Cell Rate Reference Algorithm set with the CR-shaper parameters.
Toshinori MORI Kaoru SHINOZAKI
This paper proposes a method to predict and control noise voltage caused by electrostatic discharge (ESD) to electronic equipment. The relationship between grounding system configurations for a typical set of equipment and ESD immunity has been derived using a mechanism of ground potential variations. An equivalent circuit representing ground elements as lumped constants enables us to predict the transient ground potential differences between PCB (Printed Circuit Board) ground planes connected via signal cables, as well as the induced noise voltage at the receiving end. The calculation shows that the contribution of ground potential differences to noise voltage is comparable to that of the electromagnetic coupling between the discharge current on the enclosure and the circuit loops. The calculation also shows some characteristic results; for example, the induced noise voltage depends strongly on the imbalance in ground cable lengths and on the impedance of the ground conductors connecting PCBs, especially when the equipment uses a single-point grounding system. These characteristics were confirmed by measurements of induced ground potential differences, noise voltage, and immunity levels. Thus the proposed method is shown to be very effective for analyzing the dependence of ESD immunity on grounding conditions and for improving ESD immunity in equipment design.
Woo-Chan PARK Shi-Wha LEE Oh-Young KWON Tack-Don HAN Shin-Dug KIM
A model for a floating point adder/subtractor which can perform rounding and addition/subtraction operations in parallel is presented. The major requirements and structure to achieve this goal are described and algebraically verified. The processing flow of the conventional floating point addition/subtraction operation consists of alignment, addition/subtraction, normalization, and rounding stages. In general, the rounding stage requires a high speed adder for the increment, increasing the overall execution time and occupying a large amount of chip area. Furthermore, it entails additional execution time and hardware logic for a renormalization stage, which may be needed when the rounding operation causes an overflow. A floating point adder/subtractor performing addition/subtraction and IEEE rounding in parallel is designed by optimizing the operational flow of the floating point addition/subtraction operation. The floating point adder/subtractor presented requires neither additional execution time nor a high speed adder for the rounding operation. In addition, the renormalization step is not required because the rounding step is performed prior to the normalization operation. Thus, performance improvement and a cost-effective design can be achieved by this approach.
Hidekazu KANEKO Tohru KIRYU Yoshiaki SAITOH
A novel method of multichannel surface EMG processing has been developed to compensate for the distortion in bipolar surface EMG signals due to the movement of innervation zones. The distortion of bipolar surface EMG signals was mathematically described as a filtering function. A compensating technique for such distorted bipolar surface EMG signals was developed for the brachial biceps during dynamic contractions in which the muscle length and tension change. The technique is based on multichannel surface EMG measurement, a method for estimating the movement of an innervation zone, and the inverse filtering technique. As a result, the distorted EMG signals were compensated and transformed into nearly identical waveforms, independent of the movement of the innervation zone.
Akira MATSUBAYASHI Shuichi UENO
It is known that the problem of determining, given a planar graph G with maximum vertex degree at most 4 and integers m and n, whether G is embeddable in an $m \times n$ grid with unit congestion is NP-hard. In this paper, we show that it is also NP-complete to determine whether G is embeddable in a $k \times n$ grid with unit congestion for any fixed integer $k \geq 3$. In addition, we show a necessary and sufficient condition for G to be embeddable in a $2 \times \infty$ grid with unit congestion, and show that G satisfying the condition is embeddable in a $2 \times |V(G)|$ grid. Based on the characterization, we suggest a linear time algorithm for recognizing graphs embeddable in a $2 \times \infty$ grid with unit congestion.
Hiroshi KONDO Shuji TUTUMI Satoshi MIKURIYA
A simple and convenient approach for a radial symmetrical point detection is proposed. In this paper the real part-only synthesis is utilized in order to make an origin symmetric pattern of the original image and to perform automatically the calculation of its autocorrelation for the detection of the symmetry center of the image.
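The origin-symmetric pattern mentioned above follows from a basic Fourier property: for a real signal f, keeping only the real part of its discrete Fourier transform and transforming back gives (f(x) + f(-x)) / 2, with indices taken modulo the signal length. The one-dimensional sketch below demonstrates just this step using a naive DFT; the function names are chosen here for illustration, and the subsequent autocorrelation-based detection of the symmetry center described in the paper is not reproduced.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def real_part_only_synthesis(signal):
    """Discard the imaginary part of the spectrum and transform back."""
    real_only = [complex(c.real, 0.0) for c in dft(signal)]
    return [v.real for v in idft(real_only)]


if __name__ == "__main__":
    f = [0.0, 1.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    g = real_part_only_synthesis(f)
    # g[t] should match (f[t] + f[-t mod n]) / 2 up to rounding error.
    print([round(v, 6) for v in g])
```

For images the same property holds with a two-dimensional transform, where f(-x) becomes the image reflected through the origin.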
Hiroyoshi YAMADA Yoshio YAMAGUCHI Masakazu SENGOKU
A new superresolution technique is proposed for high-resolution estimation in scattering analysis. In a complicated multipath propagation environment, it is not enough to estimate only the delay times of the signals; additional information is required to identify the signal paths. The proposed method can estimate the frequency characteristic of each signal in addition to its delay time. The modified (Root) MUSIC algorithm is known as a technique that can treat both parameters (frequency characteristic and delay time). However, that method relies on some approximations in the signal decorrelation, which sometimes cause problems. Therefore, further modification is needed to apply the method to complicated scattering analysis. In this paper, we propose applying a time-domain null filtering scheme to reduce some of the dominant signal components. A simple experiment shows that the new technique can enhance the estimation accuracy of the frequency characteristic in the Root-MUSIC algorithm.