
Keyword Search Result

[Keyword] text detection (6 hits)

1-6 hits
  • VTD-FCENet: A Real-Time HD Video Text Detection with Scale-Aware Fourier Contour Embedding Open Access

    Wocheng XIAO  Lingyu LIANG  Jianyong CHEN  Tao WANG  

     
    LETTER-Image Recognition, Computer Vision
    Publicized: 2023/12/07
    Vol: E107-D No:4
    Page(s): 574-578

    Video text detection (VTD) aims to localize text instances in videos and has wide applications in downstream tasks. To cope with the variations across scenes and text instances, existing VTD methods typically integrate multiple models and feature fusion strategies. Such sophisticated components can improve detection accuracy, but they limit suitability for real-time applications. This paper aims to achieve real-time VTD with an adaptive, lightweight, end-to-end framework. Unlike previous methods that represent text in the spatial domain, we model text instances in the Fourier domain. Specifically, we propose a scale-aware Fourier Contour Embedding (FCE) method, which not only models arbitrarily shaped text contours in videos as compact signatures, but also adaptively selects proper feature scales in the backbone during training. We then construct VTD-FCENet for real-time VTD, encoding the temporal correlations of adjacent frames with scale-aware FCE in a lightweight and adaptive manner. Quantitative evaluations on the ICDAR2013 Video, Minetto and YVT benchmark datasets show that VTD-FCENet achieves state-of-the-art or competitive detection accuracy while simultaneously enabling real-time text detection on HD videos.
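Editor's note: as a rough illustration of the contour representation this abstract refers to, the sketch below describes a closed contour by a few low-frequency Fourier coefficients. It is not the authors' VTD-FCENet or their scale-aware FCE; the resampling length, the number of coefficients `k`, and the function names are assumptions made only for the example.

```python
# Generic Fourier-descriptor sketch of a closed contour (illustrative only;
# not the authors' scale-aware FCE implementation).
import numpy as np

def fourier_contour_signature(contour_xy: np.ndarray, k: int = 5, n: int = 64) -> np.ndarray:
    """Compact signature: the 2k+1 lowest-frequency complex Fourier
    coefficients of the contour, resampled to n points."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]           # points as complex numbers
    idx = np.linspace(0, len(z) - 1, n)
    z = np.interp(idx, np.arange(len(z)), z)               # resample (np.interp accepts complex fp)
    coeffs = np.fft.fft(z) / n
    return np.concatenate([coeffs[:k + 1], coeffs[-k:]])   # frequencies -k..k

def contour_from_signature(sig: np.ndarray, k: int = 5, n: int = 64) -> np.ndarray:
    """Invert the truncated representation back to n (x, y) contour points."""
    coeffs = np.zeros(n, dtype=complex)
    coeffs[:k + 1] = sig[:k + 1]
    coeffs[-k:] = sig[k + 1:]
    z = np.fft.ifft(coeffs) * n
    return np.stack([z.real, z.imag], axis=1)

# Usage: an elongated "text-like" contour compressed to 11 coefficients.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
contour = np.stack([120 + 80 * np.cos(t), 40 + 15 * np.sin(t)], axis=1)
sig = fourier_contour_signature(contour)
print(sig.shape, contour_from_signature(sig).shape)        # (11,) (64, 2)
```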

  • Single-Line Text Detection in Multi-Line Text with Narrow Spacing for Line-Based Character Recognition

    Chee Siang LEOW  Hideaki YAJIMA  Tomoki KITAGAWA  Hiromitsu NISHIZAKI  

     
    PAPER-Image Recognition, Computer Vision
    Publicized: 2023/08/31
    Vol: E106-D No:12
    Page(s): 2097-2106

    Text detection is a crucial pre-processing step in optical character recognition (OCR) for accurately recognizing text, including both printed fonts and handwritten characters, in documents. While current deep-learning-based text detection tools can detect text regions with high accuracy, they often treat multiple lines of text as a single region. Line-based character recognition requires dividing the text into individual lines, which calls for a line detection technique. This paper develops a new approach to single-line detection in OCR that builds on the existing Character Region Awareness For Text detection (CRAFT) model and incorporates a deep neural network specialized in line segmentation. Even so, this method may still detect multiple lines as a single text region when multi-line text has narrow line spacing. To address this, we also introduce a post-processing algorithm that detects single text regions using the output of the single-line segmentation. Our proposed method successfully detects single lines, even in multi-line text with narrow line spacing, and hence improves the accuracy of OCR.
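Editor's note: to make the line-splitting step concrete, here is a minimal sketch of a classic projection-profile split of a binary text region into single lines. It is only a simple stand-in for the CRAFT-based segmentation network and post-processing described above; the `min_gap` threshold and function name are assumptions for the example.

```python
# Classic projection-profile line splitting (illustrative stand-in only;
# the paper itself uses a CRAFT-based segmentation network + post-processing).
import numpy as np

def split_into_lines(binary_region: np.ndarray, min_gap: int = 2):
    """binary_region: 2-D array, 1 = ink, 0 = background.
    Returns end-exclusive (top, bottom) row ranges, one per text line."""
    ink_rows = binary_region.sum(axis=1) > 0     # rows that contain any ink
    lines, start, gap = [], None, 0
    for r, has_ink in enumerate(ink_rows):
        if has_ink:
            if start is None:
                start = r                        # a new line begins
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:                   # blank band wide enough: line ends
                lines.append((start, r - gap + 1))
                start, gap = None, 0
    if start is not None:                        # line running to the bottom edge
        lines.append((start, len(ink_rows)))
    return lines

# Usage: two synthetic ink bands separated by a narrow blank gap.
region = np.zeros((20, 50), dtype=np.uint8)
region[2:8, 5:45] = 1
region[11:17, 5:45] = 1
print(split_into_lines(region))                  # [(2, 8), (11, 17)]
```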

  • Spectral Fluctuation Method: A Texture-Based Method to Extract Text Regions in General Scene Images

    Yoichiro BABA  Akira HIROSE  

     
    PAPER-Pattern Recognition
    Vol: E92-D No:9
    Page(s): 1702-1715

    To obtain the text information contained in a scene image, we first need to extract text regions from the image before recognizing the text. In this paper, we examine human vision and propose a novel method to extract text regions by evaluating textural variation. Human beings are often attracted by textural variation in scenes, which causes foveation. We hypothesize that text has a similar property that distinguishes it from the natural background. In our method, we calculate the spatial variation of texture to obtain a map of the likelihood that each location belongs to a text region, evaluating changes in the local spatial spectrum as the textural variation. We investigate two options for evaluating the spectrum, based on one- and two-dimensional Fourier transforms. In this paper we put particular emphasis on the one-dimensional transform, which functions like a Gabor filter. The proposed method can be applied to a wide range of characters, mainly because it employs neither templates nor heuristics concerning character size, aspect ratio, specific direction, alignment, and so on. We demonstrate that the method effectively extracts text regions from various general scene images and present a quantitative evaluation using publicly available databases.
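Editor's note: as a rough illustration of the idea, the sketch below computes a local 1-D horizontal spectrum in sliding windows and scores how strongly it changes between neighbouring windows, yielding a coarse text-likelihood map. The window size, stride and distance measure are assumptions, not the paper's exact spectral fluctuation definition.

```python
# Coarse "spectral fluctuation" style map (illustrative; window size, stride
# and the distance measure are assumptions, not the paper's exact definition).
import numpy as np

def spectral_fluctuation_map(gray: np.ndarray, win: int = 16, step: int = 8) -> np.ndarray:
    """gray: 2-D image. Returns an (H, n_windows) map where large values mark
    strong changes of the local 1-D horizontal spectrum between neighbours."""
    h, w = gray.shape
    xs = list(range(0, w - win + 1, step))
    # Magnitude spectrum of each horizontal window, computed per row.
    spectra = np.stack(
        [np.abs(np.fft.rfft(gray[:, x:x + win], axis=1)) for x in xs], axis=1
    )                                            # shape (H, n_windows, win//2 + 1)
    fluct = np.zeros((h, len(xs)))
    # L2 distance between the spectra of neighbouring windows.
    fluct[:, 1:] = np.linalg.norm(np.diff(spectra, axis=1), axis=2)
    return fluct

# Usage: a high-frequency "text-like" band on an otherwise smooth image.
img = np.full((64, 128), 128.0)
img[20:30, 40:90] += 60 * np.sign(np.sin(np.arange(50) * 1.5))
m = spectral_fluctuation_map(img)
print(m.shape)                                   # (64, 15)
print(m[25].argmax())                            # peaks where the band meets the background
```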

  • Geometrical, Physical and Text/Symbol Analysis Based Approach of Traffic Sign Detection System

    Yangxing LIU  Takeshi IKENAGA  Satoshi GOTO  

     
    PAPER
    Vol: E90-D No:1
    Page(s): 208-216

    Traffic sign detection is a valuable part of future driver support systems. In this paper, we present a novel framework that accurately detects traffic signs in a single color image by analyzing their geometrical, physical and text/symbol features. First, we utilize an elaborate edge detection algorithm to extract an edge map with accurate edge-pixel gradient information. Second, we efficiently extract 2-D geometric primitives (circles, ellipses, rectangles and triangles) from the edge map. Third, candidate traffic sign regions are selected by analyzing the intrinsic color features, which are invariant to different illumination conditions, of each region enclosed by a geometric primitive. Finally, a text and symbol detection algorithm is introduced to classify true traffic signs. Experimental results demonstrate the ability of our algorithm to detect traffic signs across different sizes, shapes, colors and illumination conditions.
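Editor's note: as a small illustration of the kind of illumination-tolerant colour test used when selecting candidate sign regions, the sketch below thresholds brightness-normalised (chromaticity) values for red-dominant pixels. The thresholds and function name are assumptions for the example, not the paper's intrinsic colour model.

```python
# Brightness-normalised (chromaticity) test for "red-dominant" pixels
# (illustrative thresholds only, not the paper's intrinsic colour model).
import numpy as np

def red_sign_mask(rgb: np.ndarray, r_min: float = 0.5, g_max: float = 0.3) -> np.ndarray:
    """rgb: HxWx3 image. True where the colour, after dividing out overall
    brightness, looks like the red rim of a sign."""
    rgb = rgb.astype(np.float64) + 1e-6             # avoid division by zero
    chroma = rgb / rgb.sum(axis=2, keepdims=True)   # r + g + b == 1 per pixel
    return (chroma[..., 0] > r_min) & (chroma[..., 1] < g_max)

# Usage: dark red and bright red give the same decision; grey does not.
img = np.zeros((1, 3, 3), dtype=np.uint8)
img[0, 0] = (120, 10, 10)                        # dark red
img[0, 1] = (240, 30, 30)                        # bright red
img[0, 2] = (90, 90, 90)                         # grey
print(red_sign_mask(img)[0])                     # [ True  True False]
```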

  • A Contour-Based Robust Algorithm for Text Detection in Color Images

    Yangxing LIU  Satoshi GOTO  Takeshi IKENAGA  

     
    PAPER-Image Recognition, Computer Vision
    Vol: E89-D No:3
    Page(s): 1221-1230

    Text detection in color images has become an active research area over the past few decades. In this paper, we present a novel approach for accurately detecting text in color images, possibly with a complex background. The proposed algorithm combines connected component analysis with texture feature analysis of candidate text region contours. First, we utilize an elaborate color image edge detection algorithm to extract all possible text edge pixels. Connected component analysis is performed on these edge pixels to detect the external contour and possible internal contours of potential text regions. The gradient and geometrical characteristics of each region contour are carefully examined to construct candidate text regions and to discard some non-text regions. Each candidate text region is then verified with texture features derived from the wavelet domain. Finally, the expectation-maximization algorithm is introduced to binarize each text region to prepare the data for recognition. In contrast to previous approaches, our algorithm combines the efficiency of connected-component-based methods with the robustness of texture-based analysis. Experimental results show that the proposed algorithm is robust in text detection with respect to different character sizes, orientations, colors and languages, and provides reliable text binarization results.
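Editor's note: the sketch below illustrates only the connected-component stage of such a pipeline, labelling connected edge pixels and keeping components with text-plausible bounding boxes. The area and aspect-ratio thresholds are assumptions; the wavelet-domain texture verification and EM binarization described above are not reproduced.

```python
# Connected-component stage only (illustrative thresholds; the wavelet-domain
# texture verification and EM binarization from the paper are not reproduced).
import numpy as np
from scipy import ndimage

def candidate_text_boxes(edge_map: np.ndarray, min_area: int = 8, max_aspect: float = 10.0):
    """edge_map: 2-D boolean array of edge pixels.
    Returns (top, left, bottom, right) boxes of text-plausible components."""
    labels, _ = ndimage.label(edge_map)          # label connected edge pixels (4-connectivity by default)
    boxes = []
    for sl in ndimage.find_objects(labels):      # bounding slices per component
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        if h * w >= min_area and max(h, w) / max(min(h, w), 1) <= max_aspect:
            boxes.append((sl[0].start, sl[1].start, sl[0].stop, sl[1].stop))
    return boxes

# Usage: one stroke-like component is kept, an isolated noise pixel is dropped.
edges = np.zeros((12, 20), dtype=bool)
edges[3:6, 2:10] = True
edges[9, 15] = True
print(candidate_text_boxes(edges))               # [(3, 2, 6, 10)]
```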

  • Caption Detection Algorithm Using Temporal Information in Video

    Chung-Ho SHIN  Chul-Hyun KWON  Su-Yeon KIM  Sang-Hui PARK  

     
    LETTER-Image Processing, Image Pattern Recognition
    Vol: E87-D No:2
    Page(s): 487-490

    A novel caption text detection and recognition algorithm using the temporal nature of video is proposed in this paper. A text registration technique locates the temporal and spatial positions of captions in video from accumulated frame difference information. Experimental results show that the proposed method is effective and robust. A high processing speed is also achieved, since no time-consuming operation is involved.
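Editor's note: a minimal sketch of the accumulated frame-difference idea behind this kind of caption localization follows. Caption pixels stay nearly constant while the scene behind them changes, so pixels whose inter-frame differences remain small over several frames are caption candidates. The clip length, threshold and function name are assumptions, not the paper's text registration procedure.

```python
# Accumulated frame-difference sketch (clip length and threshold are
# illustrative; this is not the paper's text registration procedure).
import numpy as np

def caption_candidate_mask(frames: np.ndarray, diff_thresh: float = 8.0) -> np.ndarray:
    """frames: (T, H, W) grayscale clip. True where the mean absolute
    inter-frame difference stays small, i.e. where the content is static."""
    diffs = np.abs(np.diff(frames.astype(np.float64), axis=0))   # (T-1, H, W)
    return diffs.mean(axis=0) < diff_thresh

# Usage: a static bright "caption" block over a randomly changing background.
rng = np.random.default_rng(0)
frames = rng.integers(0, 255, size=(10, 40, 64)).astype(np.float64)
frames[:, 30:36, 10:50] = 230.0                  # overlaid static caption region
mask = caption_candidate_mask(frames)
print(mask[32, 20], mask[5, 5])                  # True (caption), False (moving scene)
```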