The search functionality is under construction.

Keyword Search Result

[Keyword] document image(10hit)

1-10hit
  • Text Line Segmentation in Handwritten Document Images Using Tensor Voting

    Toan Dinh NGUYEN  Gueesang LEE  

     
    PAPER-Image

      Vol:
    E94-A No:11
      Page(s):
    2434-2441

    A novel grouping approach to segment text lines from handwritten documents is presented. In this text line segmentation algorithm, for each text line, a text string that connects the center points of the characters in this text line is built. The text lines are then segmented using the resulting text strings. Since the characters of the same text line are situated close together and aligned on a smooth curve, 2D tensor voting is used to reduce the conflicts when building these text strings. First, the text lines are represented by separate connected components. The center points of these connected components are then encoded by second order tensors. Finally, a voting process is applied to extract the curve saliency values and normal vectors, which are used to remove outliers and build the text strings. The experimental results obtained from the test dataset of the ICDAR 2009 Handwriting Segmentation Contest show that the proposed method achieves a high detection rate and recognition accuracy.

  • Document Image Retrieval for QA Systems Based on the Density Distributions of Successive Terms

    Koichi KISE  Shota FUKUSHIMA  Keinosuke MATSUMOTO  

     
    PAPER-Document Image Retrieval

      Vol:
    E88-D No:8
      Page(s):
    1843-1851

    Question answering (QA) is the task of retrieving an answer in response to a question by analyzing documents. Although most of the effort in developing QA systems is devoted to dealing with electronic text, we consider it also necessary to develop systems for document images. In this paper, we propose a method of document image retrieval for such QA systems. Since the task is not to retrieve all relevant documents but to find the answer somewhere in the documents, retrieval should be precision oriented. The main contribution of this paper is a method of improving the precision of document image retrieval by taking into account the co-occurrence of successive terms in a question. The indexing scheme is based on two-dimensional distributions of terms, and the weight of co-occurrence is measured by calculating the density distributions of terms. The proposed method was tested using 1253 pages of documents about Major League Baseball with 20 questions and was found to be superior to the baseline method proposed by the authors.

  • Logical Structure Analysis of Document Images Based on Emergent Computation

    Yasuto ISHITANI  

     
    PAPER-Document Structure

      Vol:
    E88-D No:8
      Page(s):
    1831-1842

    A new method for logical structure analysis of document images is proposed in this paper as the basis for a document reader which can extract logical information from various printed documents. The proposed system consists of five basic modules: text line classification, object recognition, object segmentation, object grouping, and object modification. Emergent computation, which is a key concept of artificial life, is adopted for the cooperative interaction among modules in the system in order to achieve effective and flexible behavior of the whole system. It has three principal advantages over other methods: adaptive system configuration for various and complex logical structures, robust document analysis tolerant of erroneous feature detection, and feedback of high-level logical information to the low-level physical process for accurate analysis. Experimental results obtained for 150 documents show that the method is adaptable, robust, and effective for various document structures.

  • Skew Detection and Reconstruction of Color-Printed Document Images

    Yi-Kai CHEN  Jhing-Fa WANG  

     
    PAPER

      Vol:
    E84-D No:8
      Page(s):
    1018-1024

    Large numbers of color-printed documents are now published every day. Some OCR approaches for color-printed document images have been proposed, but they do not work properly if the input images are skewed. In past years, many algorithms have been proposed to detect the skew of monochrome document images, but none of them processes color-printed document images. All of these methods assume that text is printed in black on a white background and cannot be applied to detect skew in color-printed document images. In this paper, we propose an algorithm to detect the skew angle of a color-printed document image and reconstruct it. Our approach first determines the variation of the color-transition count at each angle (from -45 to +45), and the angle of maximal variation is regarded as the skew angle. Then, a scanning-line model reconstructs the image. We tested 100 color-printed document images of various kinds and obtained good results (93 successes and 7 failures). For an A4-size image, the average processing time is 2.76 seconds and the reconstruction time is 3.97 seconds on a Pentium III 733 PC.
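
    The angle search described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of a per-line transition rate (count divided by line length), the variance as the measure of "variation", and the rule that ignores short scan lines are all assumptions made for illustration. The image is assumed to be a 2D list of color labels.

```python
import math
from statistics import pvariance


def transition_variation(image, angle_deg):
    """Variance of the color-transition rate over scan lines tilted by angle_deg.

    Assumption: 'variation' is taken to be the population variance of
    transitions per pixel; the abstract does not define it precisely.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    height, width = len(image), len(image[0])
    last = {}         # last color seen on each scan line
    transitions = {}  # number of color changes on each scan line
    lengths = {}      # number of pixels assigned to each scan line
    for x in range(width):
        for y in range(height):
            # Index of the tilted scan line that passes through (x, y).
            line = round(y * cos_t - x * sin_t)
            color = image[y][x]
            if line in last and last[line] != color:
                transitions[line] = transitions.get(line, 0) + 1
            last[line] = color
            lengths[line] = lengths.get(line, 0) + 1
    # Ignore short scan lines near the corners (hypothetical rule).
    rates = [transitions.get(k, 0) / n
             for k, n in lengths.items() if n >= width // 2]
    return pvariance(rates) if len(rates) > 1 else 0.0


def detect_skew(image, angles=range(-45, 46)):
    """Return the candidate angle (in degrees) with maximal variation."""
    return max(angles, key=lambda a: transition_variation(image, a))


def synthetic_page(width=40, height=40):
    """Hypothetical test page: 4-pixel 'text' bands alternating with blank bands."""
    return [[(x // 2) % 2 if (y // 4) % 2 == 0 else 0 for x in range(width)]
            for y in range(height)]


if __name__ == "__main__":
    print(detect_skew(synthetic_page()))
```

    The intuition is that scan lines aligned with the text alternate between lines full of transitions (through the text) and lines with none (between text lines), so the variation peaks when the scan angle matches the page orientation.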

  • Image Classification Using Kolmogorov Complexity Measure with Randomly Extracted Blocks

    Jun KONG  Zheru CHI  

     
    PAPER-Image Processing,Computer Graphics and Pattern Recognition

      Vol:
    E81-D No:11
      Page(s):
    1239-1246

    Image classification is an important task in document image analysis and understanding, page segmentation-based document image compression, and image retrieval. In this paper, we present a new approach for distinguishing textual images from pictorial images using the Kolmogorov Complexity (KC) measure with randomly extracted blocks. In this approach, a number of blocks are extracted randomly from a binarized image and each block image is converted into a one-dimensional binary sequence using either horizontal or vertical scanning. The complexities of these blocks are then computed, and the mean value and standard deviation of the block complexities are used to classify the image as textual or pictorial based on two simple fuzzy rules. Experimental results on different textual and pictorial images show that the KC measure with randomly extracted blocks can efficiently classify 29 out of 30 images. The performance of our approach, which needs no explicit training process, compares favorably with that of a neural network-based approach.
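
    The block-sampling step can be sketched in Python as follows. Kolmogorov complexity is not computable, so this sketch swaps in the zlib-compressed length of each block as a stand-in; the abstract does not say which estimator the authors use. The function names, block size, and block count are hypothetical, and the two fuzzy rules are omitted because the abstract does not state them.

```python
import random
import zlib
from statistics import mean, pstdev


def block_complexity(image, top, left, size, vertical=False):
    """Compressed length per pixel of one block (a proxy for KC, not KC itself)."""
    rows = [row[left:left + size] for row in image[top:top + size]]
    if vertical:
        # Vertical scanning: read the block column by column.
        rows = list(zip(*rows))
    sequence = bytes(bit for row in rows for bit in row)
    return len(zlib.compress(sequence, 9)) / len(sequence)


def complexity_stats(image, n_blocks=20, size=16, seed=0):
    """Mean and standard deviation of the complexity of randomly placed blocks."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    values = [
        block_complexity(image,
                         rng.randrange(height - size + 1),
                         rng.randrange(width - size + 1),
                         size)
        for _ in range(n_blocks)
    ]
    return mean(values), pstdev(values)


if __name__ == "__main__":
    noise_rng = random.Random(1)
    flat = [[0] * 64 for _ in range(64)]
    noise = [[noise_rng.randint(0, 1) for _ in range(64)] for _ in range(64)]
    print(complexity_stats(flat))
    print(complexity_stats(noise))
```

    The `image` argument is assumed to be a binarized image stored as a 2D list of 0/1 values. A classifier would then threshold or fuzzify the returned mean and standard deviation.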

  • Feature Space Design for Statistical Image Recognition with Image Screening

    Koichi ARIMURA  Norihiro HAGITA  

     
    PAPER-Image Processing,Computer Graphics and Pattern Recognition

      Vol:
    E81-D No:1
      Page(s):
    88-93

    This paper proposes a method for designing feature spaces in a two-stage image recognition method that improves recognition accuracy and efficiency in statistical image recognition. The two stages are (1) image screening and (2) image recognition. Statistical image recognition methods require many calculations to spatially match subimages against reference patterns of the specified objects to be detected in input images. Our image screening method is effective in lowering the calculation load and improving recognition accuracy. This method selects a candidate set of subimages similar to those in the object class by using a lower dimensional feature vector, while rejecting the rest. Since the set of selected subimages is recognized by using a higher dimensional feature vector, overall recognition efficiency is improved. The classifier for recognition is designed from the selected subimages and also improves recognition accuracy, since the selected subimages are less contaminated than the originals. Even when conventional recognition methods based on linear transformation algorithms, i.e., principal component analysis (PCA) and projection pursuit (PP), are applied to the recognition stage in our method, recognition accuracy and efficiency may be improved. A new criterion, called a screening criterion, for measuring the overall efficiency and accuracy of image recognition is introduced to efficiently design the feature spaces of image screening and recognition. The feature space for image screening is designed empirically so as to take a lower number of dimensions, referred to as LS, and a larger value of the screening criterion. Then, the recognition feature space, whose number of dimensions is referred to as LR, is designed under the condition LSLR. Two detection tasks were conducted in order to examine the performance of image screening. One task is to detect the eye and mouth areas in a face image and the other is to detect the text area in a document image. The experimental results demonstrate that image screening for these two tasks improves both recognition accuracy and throughput when compared to the conventional one-stage recognition method.

  • Modified MCR Expression of Binary Document Images

    Supoj CHINVEERAPHAN  Abdel Malek B.C. ZIDOURI  Makoto SATO  

     
    LETTER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E78-D No:4
      Page(s):
    503-507

    As a first step toward developing a system to analyze or recognize patterns contained in images, it is important to provide a good base representation that can efficiently facilitate the interpretation of such patterns. Since the structural features of basic patterns in document images, such as characters or tables, are horizontal and vertical stroke components, we propose a new expression of document images based on the MCR expression that can express such features of the text and tabular components of an image well.

  • Classification of Document Image Blocks Using MCR Stroke Index

    AbdelMalek B.C. ZIDOURI  Supoj CHINVEERAPHAN  Makoto SATO  

     
    LETTER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E78-D No:3
      Page(s):
    290-294

    In this paper we introduce a new feature called the stroke index for document image analysis. It is based on the minimum covering run (MCR) expression method. This stroke index is a function of the number of horizontal and vertical runs in the original image and of the number of runs in the MCR expression. As document images may present a variety of patterns such as graphs, text, or pictures, it is necessary for image understanding to classify these different patterns into categories beforehand. Here we show how one could use this stroke index for such applications as classification or segmentation. It also gives insight into the possibility of stroke extraction from document images, in addition to classifying different patterns in a compound image.

  • Document Image Segmentation and Layout Analysis

    Takashi SAITOH  Toshifumi YAMAAI  Michiyoshi TACHIKAWA  

     
    PAPER

      Vol:
    E77-D No:7
      Page(s):
    778-784

    A system for segmenting document images and ordering text areas is described and applied to complex printed page layouts in both Japanese and English. There is no need to make any assumptions about the shape of blocks; hence the segmentation technique can handle not only skewed images without skew correction but also documents where columns are not rectangular. In this technique, based on the bottom-up strategy, the connected components are extracted from the reduced image and classified according to their local information. The connected components classified as characters are then merged into lines, and the lines are merged into areas. Extracted text areas are classified as body, caption, header, or footer. A tree graph of the layout of the body texts is made, and the texts are ordered by preorder traversal of the graph. We introduce the concept of an influence range of each node and a procedure for handling titles, thus obtaining good results on various documents. The total system is fast and compact.
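
    The ordering step mentioned in this abstract, preorder traversal of a layout tree, can be shown with a short Python sketch. The `Area` class and the example tree are hypothetical; how the authors build the tree, and their influence-range and title-handling procedures, are not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class Area:
    """A node of a hypothetical layout tree: a text area and its sub-areas."""
    label: str
    children: list["Area"] = field(default_factory=list)


def reading_order(node):
    """Preorder traversal: visit the node first, then each child left to right."""
    order = [node.label]
    for child in node.children:
        order.extend(reading_order(child))
    return order


if __name__ == "__main__":
    page = Area("page", [
        Area("title"),
        Area("column 1", [Area("paragraph 1"), Area("paragraph 2")]),
        Area("column 2", [Area("paragraph 3")]),
    ])
    print(reading_order(page))
```

    Preorder suits reading order because a parent area (for example a column) is visited before the areas nested under it, and siblings keep their left-to-right order in the tree.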

  • Representing, Utilizing and Acquiring Knowledge for Document Image Understanding

    Koichi KISE  Noboru BABAGUCHI  

     
    PAPER

      Vol:
    E77-D No:7
      Page(s):
    770-777

    This paper discusses the role of knowledge in document image understanding from the viewpoints of representation, utilization and acquisition. For the representation of knowledge, we propose two models, a layout model and a content model, which represent knowledge about the layout structure and content of a document, respectively. For the utilization of knowledge, we implement layout analysis and content analysis which utilize a layout model and a content model, respectively. The strategy of hypothesis generation and verification is introduced in order to integrate these two kinds of analysis. For the acquisition of knowledge, we propose a method of incremental acquisition of a layout model from a stream of example documents. From the experimental results of document image understanding and knowledge acquisition using 50 samples of visiting cards, we verified the effectiveness of the proposed method.