
IEICE TRANSACTIONS on Information

  • Impact Factor

    0.72

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.4


Volume E90-D No.8  (Publication Date:2007/08/01)

    Special Section on Image Recognition and Understanding
  • FOREWORD Open Access

    Yoichi SATO  

     
    FOREWORD

      Page(s):
    1133-1133
  • Generation of Training Data by Degradation Models for Traffic Sign Symbol Recognition

    Hiroyuki ISHIDA  Tomokazu TAKAHASHI  Ichiro IDE  Yoshito MEKADA  Hiroshi MURASE  

     
    PAPER

      Page(s):
    1134-1141

    We present a novel training method for recognizing traffic sign symbols. The symbol images captured by a car-mounted camera suffer from various forms of image degradation. To cope with degradations, similarly degraded images should be used as training data. Our method artificially generates such training data from original templates of traffic sign symbols. Degradation models and a GA-based algorithm that simulates actual captured images are established. The proposed method enables us to obtain training data of all categories without exhaustively collecting them. Experimental results show the effectiveness of the proposed method for traffic sign symbol recognition.

  • Data Hiding in Binary Images with Distortion-Minimizing Capabilities by Optimal Block Pattern Coding and Dynamic Programming Techniques

    I-Shi LEE  Wen-Hsiang TSAI  

     
    PAPER

      Page(s):
    1142-1150

    A new method for data hiding in binary images based on block pattern coding and dynamic programming with distortion-minimizing capabilities is proposed. Up to three message data bits can be embedded into each 2×2 block of an input image by changing the block's pixel pattern into another, which represents the value of the message data bits as a code according to a block pattern encoding table. Extraction of hidden message data is accomplished by block pattern decoding. To minimize the resulting image distortion, two optimization techniques are proposed. The first is to use multiple block pattern encoding tables, from which an optimal one is selected specifically for each input image, and the second is to use a dynamic programming algorithm to divide the message data into bit segments for optimal embedding in the sense of minimizing the number of binary bit flips. Accordingly, not only can more data bits be embedded in an image block on average, but the resulting image distortion is also reduced in an optimal way. Experimental results are also included to show the effectiveness of the proposed approach.
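
    The block pattern idea in the abstract above can be illustrated with a minimal sketch. The encoding table used here is hypothetical (the paper selects an optimal table per image, and its tables are not given in this listing), and the dynamic-programming segmentation step is not shown. Each 3-bit code is assigned two complementary 2×2 patterns, and embedding picks whichever one needs fewer pixel flips.

```python
from itertools import product

# Hypothetical encoding table (an assumption, not the paper's table):
# a 2x2 block is a 4-tuple of binary pixels (a, b, c, d), and its 3-bit
# code is (a^d, b^d, c^d). Every code is then represented by exactly two
# patterns that are bitwise complements of each other.
PATTERNS = list(product((0, 1), repeat=4))
TABLE = {p: (p[0] ^ p[3], p[1] ^ p[3], p[2] ^ p[3]) for p in PATTERNS}


def hamming(a, b):
    """Number of pixel positions in which two blocks differ."""
    return sum(x != y for x, y in zip(a, b))


def embed(block, bits):
    """Return the pattern that encodes `bits` and is closest to `block`."""
    candidates = [p for p, code in TABLE.items() if code == tuple(bits)]
    return min(candidates, key=lambda p: hamming(p, block))


def extract(block):
    """Decode the 3 message bits carried by a 2x2 block."""
    return TABLE[tuple(block)]


cover = (1, 1, 0, 1)
message = (1, 0, 1)
stego = embed(cover, message)
print(stego, hamming(cover, stego), extract(stego))
```

    With this particular table, the two candidate patterns for any code are complements, so at most two of the four pixels ever need to be flipped.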

  • Pruned Resampling: Probabilistic Model Selection Schemes for Sequential Face Recognition

    Atsushi MATSUI  Simon CLIPPINGDALE  Takashi MATSUMOTO  

     
    PAPER

      Page(s):
    1151-1159

    This paper proposes probabilistic pruning techniques for a Bayesian video face recognition system. The system selects the most probable face model using model posterior distributions, which can be calculated using a Sequential Monte Carlo (SMC) method. A combination of two new pruning schemes at the resampling stage significantly boosts computational efficiency by comparison with the original online learning algorithm. Experimental results demonstrate that this approach achieves better performance in terms of both processing time and ID error rate than a contrasting approach with a temporal decay scheme.

  • An Approximation Method of the Quadratic Discriminant Function and Its Application to Estimation of High-Dimensional Distribution

    Shinichiro OMACHI  Masako OMACHI  Hirotomo ASO  

     
    PAPER

      Page(s):
    1160-1167

    In statistical pattern recognition, it is important to estimate the distribution of patterns precisely to achieve high recognition accuracy. In general, precise estimation of the parameters of the distribution requires a great number of sample patterns, especially when the feature vector obtained from the pattern is high-dimensional. For some pattern recognition problems, such as face recognition or character recognition, very high-dimensional feature vectors are necessary and there are never enough sample patterns for estimating the parameters. In this paper, we focus on estimating the distribution of high-dimensional feature vectors with a small number of sample patterns. First, we define a function called the simplified quadratic discriminant function (SQDF). SQDF can be estimated with a small number of sample patterns and approximates the quadratic discriminant function (QDF). SQDF has fewer parameters and requires less computational time than QDF. The effectiveness of SQDF is confirmed by three types of experiments. Next, as an application of SQDF, we propose an algorithm for estimating the parameters of the normal mixture. The proposed algorithm is applied to face recognition and character recognition problems which require high-dimensional feature vectors.

  • Accuracy Improvement of Pulmonary Nodule Detection Based on Spatial Statistical Analysis of Thoracic CT Scans

    Hotaka TAKIZAWA  Shinji YAMAMOTO  Tsuyoshi SHIINA  

     
    PAPER

      Page(s):
    1168-1174

    This paper describes a novel discrimination method of pulmonary nodules based on statistical analysis of thoracic computed tomography (CT) scans. Our previous Computer-Aided Diagnosis (CAD) system can detect pulmonary nodules from CT scans, but, at the same time, yields many false positives. In order to reduce the false positives, the method proposed in the present paper uses a relationship between pulmonary nodules, false positives and image features in CT scans. The trend of variation of the relationships is acquired through statistical analysis of a set of CT scans prepared for training. In testing, by use of the trend, the method predicts the appearances of pulmonary nodules and false positives in a CT scan, and improves the accuracy of the previous CAD system by modifying the system's output based on the prediction. The method is applied to 218 actual thoracic CT scans with 386 actual pulmonary nodules. The receiver operating characteristic (ROC) analysis is used to evaluate the results. The area under the ROC curve (Az) is statistically significantly improved from 0.918 to 0.931.

  • Real-Time Space Carving Using Graphics Hardware

    Christian NITSCHKE  Atsushi NAKAZAWA  Haruo TAKEMURA  

     
    PAPER

      Page(s):
    1175-1184

    Reconstruction of real-world scenes from a set of multiple images is a topic in computer vision and 3D computer graphics with many interesting applications. Attempts have been made at real-time reconstruction on PC cluster systems. While these provide enough performance, they are expensive and less flexible. Approaches that use GPU hardware acceleration on single workstations achieve real-time frame rates for novel-view synthesis, but do not provide an explicit volumetric representation. This work shows our efforts in developing a GPU hardware-accelerated framework for providing a photo-consistent reconstruction of a dynamic 3D scene. High performance is achieved by employing a shape-from-silhouette technique in advance. Since the entire processing is done on a single PC, the framework can be applied in mobile environments, enabling a wide range of further applications. We explain our approach using programmable vertex and fragment processors and compare it to highly optimized CPU implementations. We show that the new approach can outperform the latter by more than one order of magnitude and give an outlook on interesting future enhancements.

  • Extraction of Finger-Vein Patterns Using Maximum Curvature Points in Image Profiles

    Naoto MIURA  Akio NAGASAKA  Takafumi MIYATAKE  

     
    PAPER

      Page(s):
    1185-1194

    A biometrics system for identifying individuals using the pattern of veins in a finger was previously proposed. The system has the advantage of being resistant to forgery because the pattern is inside a finger. Infrared light is used to capture an image of a finger that shows the vein patterns, which have various widths and brightnesses that change temporally as a result of fluctuations in the amount of blood in the vein, depending on temperature, physical conditions, etc. To robustly extract the precise details of the depicted veins, we developed a method of calculating local maximum curvatures in cross-sectional profiles of a vein image. This method can extract the centerlines of the veins consistently without being affected by the fluctuations in vein width and brightness, so its pattern matching is highly accurate. Experimental results show that our method extracted patterns robustly when vein width and brightness fluctuated, and that the equal error rate for personal identification was 0.0009%, which is much better than that of conventional methods.
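
    The core step named in the abstract above, locating local curvature maxima in a cross-sectional brightness profile, can be sketched as follows. This is a simplified illustration under stated assumptions: the profile is a plain list of brightness values, derivatives are taken by central differences, and the function names are hypothetical. Scoring, multi-direction scanning, and centerline connection are not covered.

```python
def curvature(profile):
    """Curvature k = P'' / (1 + P'^2)^(3/2), using central differences."""
    k = [0.0] * len(profile)
    for i in range(1, len(profile) - 1):
        d1 = (profile[i + 1] - profile[i - 1]) / 2.0
        d2 = profile[i + 1] - 2.0 * profile[i] + profile[i - 1]
        k[i] = d2 / (1.0 + d1 * d1) ** 1.5
    return k


def vein_centers(profile):
    """Indices where the curvature is positive and a local maximum.

    A vein appears as a dark dent in the profile, so its center is a
    point of positive (concave-up) curvature.
    """
    k = curvature(profile)
    return [i for i in range(1, len(k) - 1)
            if k[i] > 0 and k[i] >= k[i - 1] and k[i] > k[i + 1]]


narrow_vein = [9, 9, 9, 6, 3, 6, 9, 9, 9]
wide_vein = [9, 9, 7, 5, 3, 5, 7, 9, 9]
print(vein_centers(narrow_vein), vein_centers(wide_vein))
```

    Both synthetic profiles yield the same center index even though the dents differ in width, which is the kind of insensitivity to vein width that the abstract describes.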

  • Lighting Independent Skin Tone Detection Using Neural Networks

    Marvin DECKER  Minako SAWAKI  

     
    LETTER

      Page(s):
    1195-1198

    Skin tone detection in conditions where illuminant intensity and/or chromaticity can vary often comes with high computational time or low accuracy. Here a technique is presented that integrates chromaticity and intensity normalization with a neural skin tone classification network to achieve robust classification faster than other approaches.

  • Inverse Motion Compensation for DCT Block with Unrestricted Motion Vectors

    Min-Cheol HWANG  Seung-Kyun KIM  Sung-Jea KO  

     
    LETTER

      Page(s):
    1199-1201

    Existing methods for inverse motion compensation (IMC) in the DCT domain have not considered the unrestricted motion vector (UMV). In the existing methods, IMC is performed to deal with the UMV in the spatial domain after the inverse DCT (IDCT). We propose an IMC method which can deal with the UMV directly in the DCT domain without the use of the IDCT/DCT required by the existing methods. The computational complexity of the proposed method is about half that of the brute-force method operating in the spatial domain. Experimental results show that the proposed method can efficiently reduce the processing time with similar visual quality.

  • Regular Section
  • Analysis of Test Generation Complexity for Stuck-At and Path Delay Faults Based on τk-Notation

    Chia Yee OOI  Thomas CLOUQUEUR  Hideo FUJIWARA  

     
    PAPER-Complexity Theory

      Page(s):
    1202-1212

    In this paper, we discuss the relationship between the test generation complexity for path delay faults (PDFs) and that for stuck-at faults (SAFs) in combinational and sequential circuits using the recently introduced τk-notation. We also introduce a class of cyclic sequential circuits that are easily testable, namely two-column distributive state-shiftable finite state machine realizations (2CD-SSFSM). Then, we discuss the relevant conjectures and unsolved problems related to the test generation for sequential circuits with PDFs under different clock schemes and test generation models.

  • MARK-OPT: A Concurrency Control Protocol for Parallel B-Tree Structures to Reduce the Cost of SMOs

    Tomohiro YOSHIHARA  Dai KOBAYASHI  Haruo YOKOTA  

     
    PAPER-Database

      Page(s):
    1213-1224

    In this paper, we propose a new concurrency control protocol for parallel B-tree structures capable of reducing the cost of structure-modification operations (SMOs) compared to conventional protocols such as ARIES/IM and INC-OPT. We call this protocol the MARK-OPT protocol, since it marks the lowest SMO occurrence point during optimistic latch-coupling operations. The marking reduces middle phases for spreading an X latch and removes needless X latches. In addition, we propose three variations of MARK-OPT, which focus on tree structure changes from other transactions. Moreover, the proposed protocols are deadlock-free and satisfy the physical consistency requirement for B-trees. These properties indicate that the proposed protocols are suitable as concurrency control protocols for B-tree structures. To compare the performance of the proposed protocols, INC-OPT, and ARIES/IM, we implement these protocols on an autonomous disk system adopting the Fat-Btree structure, a form of parallel B-tree structure. Experimental results in various environments indicate that the proposed protocols always improve system throughput, and that 2P-REP-MARK-OPT is the most useful protocol in high-update environments. Additionally, to mitigate access skew, data should be migrated between PEs. We also demonstrate that MARK-OPT improves the system throughput under data migration and reduces the time for data migration to balance load distribution.

  • Quality Evaluation for Document Relation Discovery Using Citation Information

    Kritsada SRIPHAEW  Thanaruk THEERAMUNKONG  

     
    PAPER-Data Mining

      Page(s):
    1225-1234

    Assessment of discovered patterns is an important issue in the field of knowledge discovery. This paper presents an evaluation method that utilizes citation (reference) information to assess the quality of discovered document relations. With the concept of transitivity as direct/indirect citations, a series of evaluation criteria is introduced to define the validity of discovered relations. Two kinds of validity, called soft validity and hard validity, are proposed to express the quality of the discovered relations. For the purpose of impartial comparison, the expected validity is statistically estimated based on the generative probability of each relation pattern. The proposed evaluation is investigated using more than 10,000 documents obtained from a research publication database. With frequent itemset mining as a process to discover document relations, the proposed method was shown to be a powerful way to evaluate the relations in four aspects: soft/hard scoring, direct/indirect citation, relative quality over the expected value, and comparison to human judgment.

  • A Variable-Length Coding Adjustable for Compressed Test Application

    Hideyuki ICHIHARA  Toshihiro OHARA  Michihiro SHINTANI  Tomoo INOUE  

     
    PAPER-Dependable Computing

      Page(s):
    1235-1242

    Test compression/decompression using variable-length coding is an efficient method for reducing the test application cost, i.e., test application time and the size of the storage of an LSI tester. However, some coding techniques impose slow test application, and consequently a large test application time is required despite the high compression. In this paper, we clarify the fact that test application time depends on the compression ratio and the length of codewords, and then propose a new Huffman-based coding method for achieving small test application time in a given test environment. The proposed coding method adjusts both the compression ratio and the minimum length of the codewords to the test environment. Experimental results show that the proposed method can achieve small test application time while keeping a high compression ratio.
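
    The two quantities the abstract above names, compression ratio and minimum codeword length, can be made concrete with a plain Huffman code. The sketch below is ordinary Huffman coding over fixed-size test-data blocks, not the adjusted coding proposed in the paper; the 4-bit block size and the sample data are assumptions for illustration.

```python
import heapq
from collections import Counter


def huffman_lengths(symbols):
    """Map each distinct symbol to its Huffman codeword length."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries are (weight, unique tie-breaker, {symbol: depth}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level down.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical test set split into 4-bit blocks.
blocks = ["0000"] * 6 + ["1111"] * 3 + ["0101"] * 2 + ["0011"] * 1
lengths = huffman_lengths(blocks)
original_bits = 4 * len(blocks)
compressed_bits = sum(lengths[b] for b in blocks)
print(lengths, original_bits, compressed_bits, min(lengths.values()))
```

    A decoder that consumes one compressed bit per tester clock emits a whole block each time a codeword completes, so very short codewords can demand output faster than the circuit under test can accept it. That is one way to read the abstract's point that codeword length, and not only the compression ratio, affects test application time.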

  • Managing Contradictions in Multi-Agent Systems

    Ruben FUENTES-FERNANDEZ  Jorge J. GOMEZ-SANZ  Juan PAVON  

     
    PAPER-Distributed Cooperation and Agents

      Page(s):
    1243-1250

    The specification of a Multi-Agent System (MAS) involves the identification of a large number of entities and their relationships. This is a non-trivial task that requires managing different views of the system. Many problems concerning this issue originate in the presence of contradictory goals and tasks, inconsistencies, and unexpected behaviours. Such troublesome configurations should be detected and prevented during the development process in order to study alternative ways to cope with them. In this paper, we present methods and tools that support the management of contradictions during the analysis and design of MAS. Contradiction management in MAS has to consider both individual (i.e. agent) and social (i.e. organization) aspects, and their dynamics. Such issues have already been considered in social sciences, and more concretely in the Activity Theory, a social framework for the study of interactions in activity systems. Our approach applies knowledge from Activity Theory in MAS, especially its base of contradiction patterns. That requires a formalization of this social theory in order to be applicable in a software engineering context and its adaptation to agent-oriented methodologies. Then, it will be possible to check the occurrence of contradiction patterns in a MAS specification and provide solutions to those situations. This technique has been validated by implementing an assistant for the INGENIAS Development Kit and has been tested with several case studies. This paper shows part of one of these experiments for a web application.

  • ACE-INPUTS: A Cost-Effective Intelligent Public Transportation System

    Jongchan LEE  Sanghyun PARK  Minkoo SEO  Sang-Wook KIM  

     
    PAPER-Distributed Cooperation and Agents

      Page(s):
    1251-1261

    With the rapid adoption of mobile devices and location-based services (LBS), applications that provide nearby information, such as recommendations of sightseeing resorts, are becoming more and more popular. Meanwhile, traffic congestion in cities has led to the development of mobile public transportation systems. In such applications, mobile devices need to communicate with servers via wireless communications, and servers must process queries from a very large number of devices. However, because users cannot ignore the cost of wireless communications and server capacities are limited, there is strong demand for decreasing the communication between central servers and devices and for reducing the burden on servers. Therefore, in this paper, we propose a cost-effective intelligent public transportation system, ACE-INPUTS, which utilizes a mobile device to retrieve the bus routes to reach a destination from the current location at the lowest wireless communication cost. To accomplish this task, ACE-INPUTS maintains a small amount of information on bus stops and bus routes in a mobile device and runs a heuristic routing algorithm based on such information. Only when a user asks for more accurate route information or issues a "leave later query" does ACE-INPUTS entrust the task to a server in which real-time traffic and bus location information is being collected. By separating the roles of mobile devices and servers, ACE-INPUTS is able to provide bus routes at the lowest wireless communication cost and reduces the burden on servers. Experimental results have revealed that ACE-INPUTS is effective and scalable in most experimental settings.

  • Ontology-Based Context Modeling and Reasoning for U-HealthCare

    Eun Jung KO  Hyung Jik LEE  Jeun Woo LEE  

     
    PAPER-Artificial Intelligence and Cognitive Science

      Page(s):
    1262-1270

    In order to prepare the health care industry for an increasingly aging society, a ubiquitous health care infrastructure is certainly needed. In a ubiquitous computing environment, it is important that all applications and middleware should be executed on an embedded system. To provide personalized health care services to users anywhere and anytime, a context-aware framework should convert low-level context to high-level context. Therefore, ontology and rules were used in this research to convert low-level context to high-level context. In this paper, we propose context modeling and context reasoning in a context-aware framework which is executed on an embedded wearable system in a ubiquitous computing environment for U-HealthCare. The objective of this research is the development of the standard ontology foundation for health care services and context modeling. A system for knowledge inference technology and intelligent service deduction is also developed in order to recognize a situation and provide customized health care service. Additionally, the context-aware framework was tested experimentally.

  • Media Accessibility for Low-Vision Users in the MPEG-21 Multimedia Framework

    Truong Cong THANG  Seungji YANG  Yong Man RO  Edward K. WONG  

     
    PAPER-Rehabilitation Engineering and Assistive Technology

      Page(s):
    1271-1278

    Ethical and legal requirements have made accessibility a crucial feature in any information system. This paper presents a content adaptation framework, based on the MPEG-21 standard, to help low-vision users have better accessibility to visual content. We first present an overview of MPEG-21 Digital Item Adaptation (DIA) and the low-vision description tool which enables interoperable content adaptation. This description tool lists seven low-vision symptoms, namely loss of fine detail, lack of contrast, central vision loss, peripheral vision loss, hemianopia, light sensitivity, and need of light. Then we propose a systematic contrast-enhancement method to improve the content visibility for low-vision users, focusing on the first two symptoms. The effectiveness of the low-vision description tool and our adaptation framework is verified by experiments with an adaptation test-bed. The major advantages of the proposed approach include 1) support of a wide range of low-vision conditions, and 2) customized content adaptation to specific characteristics of each user.

  • Hierarchical Behavior-Knowledge Space for Highly Reliable Handwritten Numeral Recognition

    Jangwon SUH  Jin Hyung KIM  

     
    PAPER-Pattern Recognition

      Page(s):
    1279-1285

    In this article, we propose the Hierarchical Behavior-Knowledge Space as an extension of the Behavior-Knowledge Space (BKS). Hierarchical BKS utilizes ranked level individual classifiers and automatically expands its behavioral knowledge in order to satisfy a given reliability requirement. From the statistical viewpoint, its decisions are as optimal as those of the original BKS, and the reliability threshold is a lower bound of the estimated reliability. Several comparisons with the original BKS and unanimous voting are shown with some experiments.
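
    For readers unfamiliar with the baseline, the original (non-hierarchical) BKS combination rule with a reliability threshold can be sketched as below. This is a generic illustration of BKS as commonly described, not the hierarchical extension proposed in the paper; the function names and the toy data are assumptions.

```python
from collections import Counter, defaultdict


def train_bks(decisions, labels):
    """Build the BKS table: one label histogram per tuple of classifier outputs."""
    table = defaultdict(Counter)
    for d, y in zip(decisions, labels):
        table[tuple(d)][y] += 1
    return table


def classify(table, decision, threshold):
    """Return the most frequent label of the matching cell, or None to reject.

    The input is rejected when the cell was never seen in training or when
    the share of the winning label falls below `threshold`.
    """
    cell = table.get(tuple(decision))
    if not cell:
        return None
    label, count = cell.most_common(1)[0]
    reliability = count / sum(cell.values())
    return label if reliability >= threshold else None


# Toy training set: outputs of two numeral classifiers and the true digit.
decisions = [(3, 3), (3, 3), (3, 8), (3, 8), (3, 8), (3, 8)]
labels = [3, 3, 8, 8, 8, 3]
table = train_bks(decisions, labels)
print(classify(table, (3, 8), 0.7), classify(table, (3, 8), 0.9))
```

    Raising the threshold trades recognition rate for reliability: the cell (3, 8) has an estimated reliability of 0.75, so it is accepted at a threshold of 0.7 and rejected at 0.9.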

  • Online Speech Detection and Dual-Gender Speech Recognition for Captioning Broadcast News

    Toru IMAI  Shoei SATO  Shinichi HOMMA  Kazuo ONOE  Akio KOBAYASHI  

     
    PAPER-Speech and Hearing

      Page(s):
    1286-1291

    This paper describes a new method to detect speech segments online while identifying gender attributes for efficient dual gender-dependent speech recognition and broadcast news captioning. The proposed online speech detection performs dual-gender phoneme recognition and detects a start-point and an end-point based on the ratio between the cumulative phoneme likelihood and the cumulative non-speech likelihood with a very small delay from the audio input. While obtaining the speech segments, the phoneme recognizer also identifies gender attributes with high discrimination in order to guide the subsequent dual-gender continuous speech recognizer efficiently. As soon as the start-point is detected, the continuous speech recognizer with parallel gender-dependent acoustic models starts a search and allows search transitions between male and female in a speech segment based on the gender attributes. Speech recognition experiments on conversational commentaries and field reporting from Japanese broadcast news showed that the proposed speech detection method was effective in reducing the false rejection rate from 4.6% to 0.53% and also in reducing recognition errors in comparison with a conventional method using adaptive energy thresholds. It was also effective in identifying the gender attributes, with a correct rate of 99.7% of words. With the new speech detection and the gender identification, the proposed dual-gender speech recognition significantly reduced the word error rate by 11.2% relative to a conventional gender-independent system, while keeping the computational cost feasible for real-time operation.

  • Boundary Detection in Echocardiographic Images Using Markovian Level Set Method

    Jierong CHENG  Say-Wei FOO  

     
    PAPER-Image Recognition, Computer Vision

      Page(s):
    1292-1300

    Owing to the large amount of speckle noise and ill-defined edges present in echocardiographic images, computer-based boundary detection of the left ventricle has proved to be a challenging problem. In this paper, a Markovian level set method for boundary detection in long-axis echocardiographic images is proposed. It combines a Markov random field (MRF) model, which makes use of local statistics, with a level set method, which handles topological changes, to detect a continuous and smooth boundary. Experimental results show that higher accuracy can be achieved with the proposed method compared with two related MRF-based methods.

  • A VLSI Design of a Pipelining and Area-Efficient Reed-Solomon Decoder

    Wei-min WANG  Du-yan BI  Xing-min DU  Lin-hua MA  

     
    LETTER-VLSI Systems

      Page(s):
    1301-1303

    A novel high-speed and area-efficient Reed-Solomon decoder is proposed, which employs a pipelined architecture of the minimized modified Euclid (ME) algorithm. The logic synthesis and simulation results of its VLSI implementation show that it not only can operate at a higher clock frequency but also consumes fewer hardware resources.

  • Mining Text and Visual Links to Browse TV Programs in a Web-Like Way

    Xin FAN  Hisashi MIYAMORI  Katsumi TANAKA  Mingjing LI  

     
    LETTER-Human-computer Interaction

      Page(s):
    1304-1307

    As the amount of recorded TV content is increasing rapidly, people need active and interactive browsing methods. In this paper, we use both text information from closed captions and visual information from video frames to generate links to enable users to easily explore not only the original video content but also augmented information from the Web. This solution especially shows its superiority when the video content cannot be fully represented by closed captions. A prototype system was implemented and some experiments were carried out to prove its effectiveness and efficiency.

  • An Adaptive Image Resizing Algorithm in DCT Domain

    Hai-Feng XU  Song-Yu YU  Ci WANG  

     
    LETTER-Image Processing and Video Processing

      Page(s):
    1308-1311

    A novel image resizing algorithm is proposed. In our method, the downsampling consists of three steps: the first-round downsampling, the interim upsampling, and the second-round downsampling. During the first-round downsampling, the downsampling operation unit size is selected between one single 16×16 block and four 8×8 blocks. To distinguish the selected downsampling operation unit size, the interim upsampling and the second-round downsampling are required. The DCT coefficients of the interim upsampled image indicate the selected downsampling unit size. The DCT coefficients are converted in a manner similar to a lifting step and simultaneously downsampled in the second round. The information about the selected operation unit size is contained in the final downsampled image. Experimental results demonstrate that the proposed method achieves better results than the relevant existing method.

  • Acceleration of DCT Processing with Massive-Parallel Memory-Embedded SIMD Matrix Processor

    Takeshi KUMAKI  Masakatsu ISHIZAKI  Tetsushi KOIDE  Hans Jurgen MATTAUSCH  Yasuto KURODA  Hideyuki NODA  Katsumi DOSAKA  Kazutami ARIMOTO  Kazunori SAITO  

     
    LETTER-Image Processing and Video Processing

      Page(s):
    1312-1315

    This paper reports an efficient Discrete Cosine Transform (DCT) processing method for images using a massive-parallel memory-embedded SIMD matrix processor. The matrix-processing engine has 2,048 2-bit processing elements, which are connected by a flexible switching network, and supports 2-bit 2,048-way bit-serial and word-parallel operations with a single command. For compatibility with this matrix-processing architecture, the conventional DCT algorithm has been improved in arithmetic order and the vertical/horizontal-space one-dimensional (1D) DCT processing has been further developed. Evaluation results of the matrix-engine-based DCT processing show that the necessary clock cycles per image block can be reduced by 87% in comparison to a conventional DSP architecture. The determined performances in MOPS and MOPS/mm² are better than those of a conventional DSP by factors of 8 and 5.6, respectively.

  • Efficient Rate Control Scheme Using Complexity of Macro Block for MPEG-2 Transcoder

    Sang-Min KWAK  Jae-Gon KIM  Jong-Ki HAN  

     
    LETTER-Image Processing and Video Processing

      Page(s):
    1316-1319

    When the bit rate of a compressed video sequence is reduced by a frequency domain transcoder system, the rate control scheme plays a very important role in maintaining consistent video quality. In this paper, we propose an efficient rate control scheme based on the complexity of the macroblock (MB), whereas conventional transcoding schemes use that of a picture. Since the frequency domain transcoder has to calculate the spatial activity of an MB to adjust the quantization step, a process of converting the DCT (Discrete Cosine Transform) data into spatial data is normally required. The proposed scheme calculates the spatial activity from the DCT data without converting them to the pixel domain.
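
    Why spatial activity can be obtained without an inverse DCT follows from Parseval's relation: for an orthonormal DCT, the pixel variance of a block equals the energy of its AC coefficients divided by the number of pixels. The sketch below checks this numerically. Measuring activity as block variance is an assumption made here for illustration; the abstract does not state the exact activity measure used by the proposed scheme.

```python
import math

N = 8  # block size


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form)."""
    def scale(u):
        return math.sqrt((1.0 if u == 0 else 2.0) / N)

    def basis(x, u):
        return math.cos((2 * x + 1) * u * math.pi / (2 * N))

    return [[scale(u) * scale(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]


def variance_from_dct(coeffs):
    """Pixel variance from DCT data: AC energy divided by N^2."""
    total_energy = sum(c * c for row in coeffs for c in row)
    ac_energy = total_energy - coeffs[0][0] ** 2
    return ac_energy / (N * N)


def variance_from_pixels(block):
    """Reference value: population variance computed in the pixel domain."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


# Arbitrary synthetic block used only to compare the two computations.
block = [[(x * 8 + y) % 17 for y in range(N)] for x in range(N)]
print(variance_from_dct(dct2(block)), variance_from_pixels(block))
```

    In a transcoder the coefficients are already available from the bitstream, so only the sum of squares in `variance_from_dct` would be needed; the forward transform appears here solely to generate test data.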

  • Fast Intra-Mode Prediction Algorithm in H.264/AVC Video Coding

    Jong-Ho KIM  Byung-Gyu KIM  Chang-Sik CHO  

     
    LETTER-Image Processing and Video Processing

      Page(s):
    1320-1323

    A fast intra-mode decision algorithm is proposed on the basis of an inter-mode block type for inter-frames (P-slices). Each macroblock (MB) type has its own intra prediction modes (I16MB and 8×8 chroma: 4 modes, I4MB and I8MB: 9 modes). This procedure creates a large computational complexity in addition to the inter mode decision procedure. In most cases, there is a high correlation between the best inter-mode block type and the direction of the texture edge or object boundary. Therefore, only a small number of intra-prediction modes are chosen to determine the best intra mode based on this correlation. We experimentally verify that the proposed scheme can significantly reduce the overall encoding time with a negligible loss of image quality and a minimal bit increase. The average loss in PSNR was -0.012 to 0.036 dB and the bit increment was approximately -0.194 to 0.751%.