IEICE TRANSACTIONS on Information

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.4

Volume E87-D No.6  (Publication Date:2004/06/01)

    Special Section on Human Communication I
  • FOREWORD

    Shogo NISHIDA  

     
    FOREWORD

      Page(s):
    1289-1289
  • "Man-Computer Symbiosis" Revisited: Achieving Natural Communication and Collaboration with Computers

    Neal LESH  Joe MARKS  Charles RICH  Candace L. SIDNER  

     
    INVITED PAPER

      Page(s):
    1290-1298

    In 1960, the famous computer pioneer J.C.R. Licklider described a vision for human-computer interaction that he called "man-computer symbiosis." Licklider predicted the development of computer software that would allow people "to think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own." More than 40 years later, one rarely encounters any computer application that comes close to capturing Licklider's notion of human-like communication and collaboration. We echo Licklider by arguing that true symbiotic interaction requires at least the following three elements: a complementary and effective division of labor between human and machine; an explicit representation in the computer of the user's abilities, intentions, and beliefs; and the utilization of nonverbal communication modalities. We illustrate this argument with various research prototypes currently under development at Mitsubishi Electric Research Laboratories (USA).

  • Bottles: A Transparent Interface as a Tribute to Mark Weiser

    Hiroshi ISHII  

     
    INVITED PAPER

      Page(s):
    1299-1311

    This paper first discusses the misinterpretation of the concept of "ubiquitous computing" that Mark Weiser originally proposed in 1991. Weiser's main message was not the ubiquity of computers, but the transparency of interface that determines users' perception of digital technologies seamlessly embedded in our physical environment. To explore Weiser's philosophy of transparency in interfaces, this paper presents the design of an interface that uses glass bottles as "containers" and "controls" for digital information. The metaphor is a perfume bottle: instead of scent, the bottles have been filled with music -- classical, jazz, and techno music. Opening each bottle releases the sound of a specific instrument accompanied by dynamic colored light. Physical manipulation of the bottles -- opening and closing -- is the primary mode of interaction for controlling their musical contents. The bottles illustrate Mark Weiser's vision of the transparent (or invisible) interface that weaves itself into the fabric of everyday life. The bottles also exploit the emotional aspects of glass bottles that are tangible and visual, and evoke the smell of perfume and the taste of exotic beverages. This paper describes the design goals of the bottle interface, the arrangement of musical content, the implementation of the wireless electromagnetic tag technology, and the feedback from users who have played with the system.

  • Dialogue Languages and Persons with Disabilities

    Akira ICHIKAWA  

     
    INVITED PAPER

      Page(s):
    1312-1319

    Any utterance in a dialogue, whether in spoken language or sign language, has functions that enable recipients to understand it easily in real time and to control the conversation smoothly in spite of its volatile characteristics. In this paper, we present evidence of these functions obtained experimentally. Prosody plays a very important role not only in spoken language (aural language) but also in sign language (visual language) and finger braille (tactile language). Skilled users of a language may detect word boundaries in utterances and estimate sentence structure immediately using prosody. The gestures and glances of a recipient may influence the utterances of the sender, leading to amendments of the contents of utterances and smooth exchanges in turn. Individuality and emotion in utterances are also very important aspects of effective communication support systems for persons with disabilities, even more so than for non-disabled persons. Some trials carried out to develop these systems are also reported. The trials described herein are universal in design.

  • Designing a Group Communication Media that is Connectedness Oriented

    Takeshi OHGURO  Kazuhiro KUWABARA  Koji KAMEI  

     
    PAPER

      Page(s):
    1320-1327

    Connectedness oriented communication denotes a mode of communication in which the activities of communication are more important than the contents of communication. It is targeted at maintaining and enhancing human social relationships. As our lifestyles and societies are shifting along with the progress of Information Technology, communication media that are connectedness oriented will play an important role. In this paper we propose a medium called FaintPop, an example of new media suited to connectedness oriented communication. It is a communication medium designed for a community, through which a sense of connectedness can be shared among members. Furthermore, it provides a general overview of the communication activities occurring in the community. We discuss several principles and points in designing the medium, especially regarding user interaction. Results and findings from an experiment using the medium are reported.

  • Transparent Gaze Communications for Multiparty Videoconference System

    Thitiporn LERTRUSDACHAKUL  Akinori TAGUCHI  Terumasa AOKI  Hiroshi YASUDA  

     
    PAPER

      Page(s):
    1328-1337

    This paper addresses issues in developing teleconferencing support for collaboration, focusing on the realistic sensation domain. It argues that gaze communication is an important mechanism for enabling the visual channel and social presence in human-human communication. We propose a new approach to establishing multiple eye contacts and community awareness in a multiparty videoconference (VC). Participants can be aware of being recognized from any remote site while they are talking with each other. Community awareness means the ability to be aware of group communication in the videoconference: a participant can recognize who is talking with whom and identify any communicative groups in the conference. An intelligent image arrangement based on a unique camera position is built and simulated. The systematic placement of images serves gaze communication by exploiting the relationship between gaze direction and image position. The experimental results show that the proposed approach significantly improves interpersonal communication compared with a conventional VC system.

  • Multimodal Story-based Communication: Integrating a Movie and a Conversational Agent

    Yukiko I. NAKANO  Toshiyasu MURAYAMA  Toyoaki NISHIDA  

     
    PAPER

      Page(s):
    1338-1346

    In story-based communication, where a message is conveyed in story form, it is important to embody the story with expressive materials. However, it is quite difficult for users to create rich multimedia content using multimedia editing tools. This paper proposes a web-based multimedia environment, SPOC (Stream-oriented Public Opinion Channel), which aims to help unskilled users convert their stories into TV-like programs very easily. The system can produce digital camera work for graphics and video clips, and can automatically generate an agent animation according to a narration text. Evaluation experiments showed that SPOC is easy to use and easy to learn for novice users. Given a short instruction, the subjects not only mastered the operations of the software, but also succeeded in creating highly original programs. In the subjective evaluation, the subjects answered that they enjoyed using the software without feeling difficulty. These results suggest that this system reduces the user's cost in making a program, and encourages communication in a network community.

  • Robotic Hand System for Non-verbal Communication

    Kiyoshi HOSHINO  Ichiro KAWABUCHI  

     
    PAPER

      Page(s):
    1347-1353

    The purpose of this study is to design a humanoid robotic hand system capable of conveying feelings and sensitivities through finger movement, for non-verbal communication between humans and robots in the near future. The study proceeded in four steps. First, a small and lightweight robotic hand was developed for use on a humanoid, following the concept of extracting the minimum required motor functions and implementing them in the robot. Second, the basic characteristics of the movement were checked experimentally, a simple feedforward control mechanism was designed based on velocity control, and a system capable of tracking joint time-series commands with arbitrary input patterns was realized. Third, to evaluate the resulting system, tracking performance for sinusoidal inputs of different frequencies was studied, and spatial and temporal accuracy were investigated. Fourth, sign language motions were generated as examples of information transmission by finger movement. The results indicate that this robotic hand can transmit information promptly and with comparatively high accuracy through its movement.

  • Wearable Moment Display Device for Nonverbal Communications

    Hideyuki ANDO  Maki SUGIMOTO  Taro MAEDA  

     
    PAPER

      Page(s):
    1354-1360

    There has recently been considerable interest in research on wearable non-grounded force displays. However, none have been developed for communicating nonverbal information (e.g., tennis and golf swings). We propose a small and lightweight wearable force display that presents motion timing and direction. The display outputs a torque using rotational moment and mechanical brakes. We explain the principle of this device, and describe measurements of the actual torque and torque sensitivity experiments.

  • LifeMinder: A Wearable Healthcare Support System with Timely Instruction Based on the User's Context

    Kazushige OUCHI  Takuji SUZUKI  Miwako DOI  

     
    PAPER

      Page(s):
    1361-1369

    Management of diet and exercise is especially significant in preventing "lifestyle-related diseases" for patients and subclinical cases. This paper introduces a questionnaire survey on diabetic regimens that targets 38 professional users such as physicians and nurses at a diabetic clinic. Based on the results of the questionnaire survey, a design concept for a wearable healthcare support system has been developed to provide patients with timely instruction in accordance with their current context. On the basis of this design concept, we developed a prototype of a wearable healthcare support system called "LifeMinder". "LifeMinder" is composed of a wristwatch-shaped wearable sensor module and a personal digital assistant (PDA). The sensor module measures 3-axis acceleration, pulse rate, galvanic skin reflex (GSR), and skin temperature. The PDA receives this data via Bluetooth(TM) and recognizes the patient's general behavior such as "walking" or "eating". The recognition of these behaviors reduces the patient's mental and physical burden in daily healthcare and assists in support of medical treatment.

  • Comic Image Decomposition for Reading Comics on Cellular Phones

    Masashi YAMADA  Rahmat BUDIARTO  Mamoru ENDO  Shinya MIYAZAKI  

     
    PAPER

      Page(s):
    1370-1376

    This paper presents a system for reading comics on cellular phones. Because it is difficult to display high-resolution images in a low-resolution cellular phone environment, comic images must be divided into frames, and contents such as speech text must be displayed at a comfortable reading size. We have developed a scheme for decomposing comic images into their constituent elements: frames, speech text, and drawings. We implemented a system on the Internet for a cellular phone company in our country that provides downloadable comic data and a program for reading.

  • New Cycling Environments Using Multimodal Knowledge and Ad-hoc Network

    Sachiyo YOSHITAKI  Yutaka SAKANE  Yoichi TAKEBAYASHI  

     
    PAPER

      Page(s):
    1377-1385

    We have been developing new cycling environments by using knowledge sharing and speech communication. We have offered multimodal knowledge contents to share knowledge on safe and exciting cycling. We accumulated 140 content items, focused on issues such as riding techniques, troubleshooting, and preparation for cycling. We have also offered a new way of speech communication using ad-hoc wireless LAN technology for safe cycling. Group cycling requires frequent communication to lead the group safely. Speech communication achieves spontaneous communication between group members without looking around or speaking loudly. Experimental results from actual cycling have shown the effectiveness of sharing multimodal knowledge contents and speech communication. Our newly developed environment has the advantage of increasing multimodal knowledge through the accumulation of personal experiences of actual cycling.

  • NTM-Agent: Text Mining Agent for Net Auction

    Yukitaka KUSUMURA  Yoshinori HIJIKATA  Shogo NISHIDA  

     
    PAPER

      Page(s):
    1386-1396

    Net auctions have been widely utilized with the recent development of the Internet. However, there are too many items for bidders to easily select the most suitable one. We aim to support bidders in net auctions by automatically generating a table that contains the features of several items for comparison. We construct a system called NTM-Agent (Net auction Text Mining Agent). The system collects web pages of items and extracts the items' features from the pages. After that, it generates a table that contains the extracted features. This research focuses on two problems in the process. The first problem is that if the system collects items automatically, the results contain items that differ from the user's target items. The second problem is that the descriptions in net auctions are not uniform: there are different formats such as sentences, items, and tables, and the subjects of some sentences are omitted. Therefore, it is difficult to extract the information from the descriptions by conventional methods of information extraction. This research proposes methods to solve these problems. For the first problem, NTM-Agent filters the items by correlation rules about the keywords in the titles and the item descriptions. These rules are created semi-automatically by a support tool. For the second problem, NTM-Agent extracts the information by distinguishing the formats. It also learns the feature values from plain examples for future extraction.

  • A Spoken Dialogue Interface for TV Operations Based on Data Collected by Using WOZ Method

    Jun GOTO  Kazuteru KOMINE  Masaru MIYAZAKI  Yeun-Bae KIM  Noriyoshi URATANI  

     
    PAPER

      Page(s):
    1397-1404

    The development of multi-channel digital broadcasting has generated a demand not only for new services but also for smart and highly functional capabilities in all broadcast-related devices. This is especially true of TV receivers on the viewer's side. With the aim of achieving a friendly interface that anybody can use with ease, we built a prototype spoken dialogue interface for TV operation based on data collected by using the Wizard of Oz (WOZ) method. At the current stage of our research, we are using this system to investigate the usefulness and problem areas of an interactive voice interface for TV operation.

  • TAJODA: Proposed Tactile and Jog Dial Interface for the Blind

    Chieko ASAKAWA  Hironobu TAKAGI  Shuichi INO  Tohru IFUKUBE  

     
    PAPER

      Page(s):
    1405-1414

    There is a critical difference between sighted people and the blind in obtaining information. Screen reading technology assists blind people in accessing digital documents by themselves, helping to bridge this gap. However, these days documents are becoming much more visual, using various types of visual effects so that sighted people can explore the information intuitively at a glance. It is very hard to convey visual effects non-visually and intuitively while retaining the original effects. In addition, it takes a long time to explore the information, since blind people use the keyboard for exploration, while sighted people use eye movement. This research aims at improving the non-visual exploration interface and improving the quality of non-visual information. To this end, TAJODA (tactile jog dial interface) was proposed. It presents verbal information (text information) in the form of speech, while nonverbal information (visual effects) is represented in the form of tactile sensations. It uses a jog dial as an exploration device, which makes it possible to explore forward or backward intuitively in the speech information by spinning the jog dial clockwise or counterclockwise. It also integrates a tactile device to represent visual effects non-visually. Both speech and tactile information can be synchronized with the dial movements. The speed of spinning the dial affects the speech rate. The main part of this paper describes an experimental evaluation of the effectiveness of the proposed TAJODA interface. The experimental system used a preprocessed recorded human voice as test data. The training sessions showed that it was easy to learn how to use TAJODA. The comparison test session clearly showed that the subjects could perform the comparison task using TAJODA significantly faster (2.4 times faster) than with the comparison method that is closest to the existing screen reading function. Through this experiment, our results showed that TAJODA can drastically improve the non-visual exploration interface.

  • A Haptic Interface for Two-Handed 6DOF Manipulation-SPIDAR-G&G System

    Jun MURAYAMA  Yanlin LUO  Katsuhito AKAHANE  Shoichi HASEGAWA  Makoto SATO  

     
    PAPER

      Page(s):
    1415-1421

    In this paper, we propose a new haptic interface for two-handed manipulation. The system, named the SPIDAR-G&G system, consists of a pair of string-based 6DOF haptic devices called SPIDAR-G for both hands. By grasping the grip of each SPIDAR-G in each of the user's hands, the user can manipulate one virtual object with their right hand and the other one with their left hand cooperatively, while the user senses interaction force. We evaluated the system by measuring the completion time of a 3D pointing task, and demonstrated enhanced interactivity with virtual objects.

  • Surface Deformation Displays for Virtual Environment Using the Fuzzy Model

    MinKee PARK  Hideki HASHIMOTO  

     
    PAPER

      Page(s):
    1422-1432

    In this paper, a new method for displaying a surface deformation is proposed to provide sufficient realism in a virtual environment. The approach is based on the fuzzy model, and adding only one rule to the fuzzy model is sufficient to display a surface deformation. Furthermore, designers can easily determine which parameters should be used and how much they should be changed in order to alter shapes as required. The proposed method is thus a simple but effective technique that can also be applied to real-time operation and makes it possible to act on several surface points simultaneously. The results of a computer simulation are also given to demonstrate the validity of the proposed algorithm.

  • Partition Timing Routing Protocol in Wireless Ad Hoc Networks

    Jen-Yi HUANG  Hsi-Han CHEN  Lung-Jen WANG  Chung-Hsien LIN  Wen-Shyong HSIEH  

     
    PAPER

      Page(s):
    1433-1437

    Ad hoc networks are wireless transmission networks that consist of many mobile hosts. They operate without support from other communication infrastructure such as base stations, and directly use wireless networks for data transmission. This paper provides a general explanation of related protocols for setting up routes and their possible problems. In addition, related research is described, along with its methods of solving these problems and reducing the possibility of their occurring. Then, a novel constructive protocol called Partition-Timing Routing Protocol (PTR) is presented. If any covered node needs to transmit data to others outside the scope, it has to be managed by a core node. This protocol is able to adjust neighboring nodes covered in the scope, to select certain nodes to be their own core node. In addition, the timing for updating and adjusting the data of the covered scope is different from other methods, and at the same time it reduces the load of the entire network and makes it more flexible.

  • Modeling Email Communications

    Yihjia TSAI  Ching-Chang LIN  Ping-Nan HSIAO  

     
    PAPER

      Page(s):
    1438-1445

    Recently, the small-world network model has been popular to describe a wide range of networks such as human social relations and networks formed by biological entities. The network model achieves a small diameter with relatively few links as measured by the ratio of clustering coefficient and the number of links. It is quite natural to consider email communication similar to social network patterns. Quite surprisingly, we find from our empirical study that local email networks follow a different type of network model that falls into the category of scale-free network. We propose new network models to describe such communication structure.

  • Emerging Market for Mobile Remote Physiological Monitoring Services

    Timothy BOLT  Sadahiko KANO  Akihisa KODATE  

     
    PAPER

      Page(s):
    1446-1453

    This paper offers an initial analysis of economic and market issues in the development and deployment of mobile remote physiological monitoring services for medical patients through wireless wearable sensors and actuators. Examining the characteristics of the service technologies and related industries, this study focuses on the structure, participants and roles of standardisation of the layers within the emerging mobile remote physiological monitoring industry. The study concludes that the structure of the emerging mobile remote physiological monitoring industry will be oriented around service provision, be integrated with other personal / patient data storage services and be heavily influenced by the interplay of technological developments, the health market structure, existing players and regulation. Additionally, the key players are likely to be the system integrators and service providers concentrating on large institutional customers. A focus of the paper is analysing both the causes and implications of a modular, horizontally layered industry structure likely to result from the mix of technologies, suppliers and customers as this market develops. The paper discusses why, although horizontal specialisation is the most likely outcome, there is little risk of key layers becoming commoditised. The paper also discusses the appropriate types and levels of standardisation and equipment certification activities that should be encouraged, along with the groups and industries from which the pressure for these will come.

  • Structures of Human Relations and User-Dynamics Revealed by Traffic Data

    Masaki AIDA  Keisuke ISHIBASHI  Hiroyoshi MIWA  Chisa TAKANO  Shin-ichi KURIBAYASHI  

     
    PAPER

      Page(s):
    1454-1460

    The number of customers of a service for Internet access from cellular phones in Japan has been explosively increasing for some time. We analyze the relation between the number of customers and the volume of traffic, with a view to finding clues to the structure of human relations among the very large set of potential customers of the service. The traffic data reveals that this structure is a scale-free network, and we calculate the exponent that governs the distribution of node degree in this network. The data also indicates that people who have many friends tend to subscribe to the service at an earlier stage. These results are useful for investigating various fields, including marketing strategies, the propagation of rumors, the spread of computer viruses, and so on.
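    As an illustration of what "the exponent that governs the distribution of node degree" means, the sketch below estimates the exponent of a power-law degree distribution by maximum likelihood. It assumes node degrees are observed directly and uses the continuous approximation; the abstract does not say how the authors derive the exponent from traffic data, so the function name `powerlaw_exponent_mle` and the parameter `d_min` are hypothetical and this is not the paper's method.

```python
import math
from typing import Iterable


def powerlaw_exponent_mle(degrees: Iterable[float], d_min: float = 1.0) -> float:
    """Continuous maximum-likelihood estimate of gamma in p(d) ~ d ** -gamma, d >= d_min.

    Assumption: degrees are observed directly (the paper infers structure from traffic).
    """
    # Keep only the tail the power law is assumed to describe.
    tail = [d for d in degrees if d >= d_min]
    log_sum = sum(math.log(d / d_min) for d in tail)
    if not tail or log_sum <= 0.0:
        raise ValueError("need at least one degree strictly above d_min")
    # Closed-form estimator: gamma = 1 + n / sum(ln(d_i / d_min)).
    return 1.0 + len(tail) / log_sum
```

    For discrete degree data this continuous estimator is only an approximation, and the choice of `d_min` strongly affects the result.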

  • Study on Relationship between Technostress and Antisocial Behavior on Computers

    Nobuyo KASUGA  Katsuhito ITOH  Shin'ichi OISHI  Tomomasa NAGASHIMA  

     
    PAPER

      Page(s):
    1461-1465

    This study was conducted to examine the relationship between technostress (techno-centered tendency) and antisocial behavior on computers. Questionnaire data from computer operators were analyzed by multivariate analysis. The results of the analysis indicated that a high techno-centered tendency has a strong relationship with antisocial behavior on computers. Among the component factors of techno-centered tendency, absorption in operating computers was shown to have the strongest association with antisocial behavior on computers.

  • Judgment Biases of Temporal Order during Apparent Self-Motion

    Wataru TERAMOTO  Hiroshi WATANABE  Hiroyuki UMEMURA  Katsunori MATSUOKA  Shinichi KITA  

     
    PAPER

      Page(s):
    1466-1476

    A virtual reality system is one of the most useful tools for investigating the characteristics of human perception in a dynamic visual environment because we can easily and appropriately manipulate parameters of three-dimensional visual stimuli in accordance with our purpose. In the present study we examined how the brain processes local stimuli during the global sensation of self-motion (vection) in view of temporal information processing -- perceptual latency -- with a temporal order judgment task. In Experiment 1 we demonstrated that the targets in the left visual field were perceived prior to those in the right visual field when an observer stared at rightward optokinetic stimuli or perceived self-motion leftward, and vice versa. Especially at 16.0 deg of target eccentricity the biases were much larger with the continuous exposure of optokinetic stimuli than with their intermittent exposure; the former compelled observers to perceive self-motion and the latter hardly did. In Experiment 2 we examined the relationship between the occurrence of vection and temporal order judgments with the exposure duration of optokinetic stimuli fixed between conditions, and showed that the biases were larger when vection occurred than when it did not. In Experiment 3 we showed that the biases were not modulated by the speed of optokinetic stimuli and not related to the speed of perceived self-motion. This phenomenon can be explained based on exogenous components of attention, the shift of the reference frame for determining the order in which objects come into awareness, and an imbalance between hemispheric activities. The mechanism is ecologically reasonable in that it allows us to be aware of incoming events as soon as possible and to avoid dangerous situations.

  • A Basic Study on Teammates' Mental Workload among Ship's Bridge Team

    Koji MURAI  Yuji HAYASHI  Seiji INOKUCHI  

     
    PAPER

      Page(s):
    1477-1483

    Ship handling when leaving and entering port is always carried out by a captain, deck officers, and quartermasters, and sometimes includes a pilot. Navigational watch keeping at sea, except in narrow channels or under restricted visibility, is carried out by the deck officer and quartermaster. They achieve safe and efficient navigational watch keeping through teamwork on the ship's bridge. The importance of teamwork has been recognized in the shipping world, and training and education methods for it are also being considered. However, the evaluation of these methods is not clear, because it depends on the experience of the trainers. Therefore, we need an evaluation method of teamwork for education and training in ship handling. In this paper, we define ship's bridge teamwork as being shown by 1) a change of mental workload level and 2) a change of mental workload over time. We attempt to evaluate teammates' mental workload on the ship's bridge using the R-R interval of the subjects' heart rate variability, in the following three steps: 1) confirm the evaluation of the mental workload of a ship's navigator with the R-R interval; 2) evaluate teamwork with the R-R interval during oral presentations at meetings as pre-experiments; 3) evaluate the teammates' mental workload among the ship's bridge team when leaving port. The results showed that the method using the R-R interval was sufficient for the evaluation of teamwork effects.
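    For readers unfamiliar with the measure, the R-R interval is the time between successive R peaks (heartbeats) in an electrocardiogram. The sketch below shows one simple way to compute R-R intervals and summary statistics from a list of R-peak timestamps. The function names, the units, and the choice of statistics are assumptions made for illustration; the abstract does not state how the authors map R-R intervals to mental workload.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def rr_intervals(beat_times_s: Sequence[float]) -> List[float]:
    """Return successive R-R intervals (seconds) from R-peak timestamps (seconds)."""
    if len(beat_times_s) < 2:
        raise ValueError("need at least two R-peak timestamps")
    # Difference between each beat and the one before it.
    intervals = [later - earlier
                 for earlier, later in zip(beat_times_s, beat_times_s[1:])]
    if any(interval <= 0 for interval in intervals):
        raise ValueError("timestamps must be strictly increasing")
    return intervals


def summarize(intervals: Sequence[float]) -> Dict[str, float]:
    """Mean R-R interval, its population standard deviation, and the
    heart rate (beats per minute) corresponding to the mean interval."""
    mean_rr = mean(intervals)
    return {
        "mean_rr_s": mean_rr,
        "sd_rr_s": pstdev(intervals),
        "hr_bpm": 60.0 / mean_rr,
    }
```

    A call such as `summarize(rr_intervals(timestamps))` could be applied to each team member's recording over a sliding window to follow changes over time, which is one plausible reading of "a change of mental workload over time".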

  • The Effects of the Timing of Commercial Breaks on the Loss of Attention

    Noriko NAGATA  Sanae H. WAKE  Mieko OHSUGA  Seiji INOKUCHI  

     
    LETTER

      Page(s):
    1484-1487

    Commercial breaks are often placed at the climax of stories in recent TV programs in Japan, which may have serious effects on audiences, especially children, since this practice disturbs their concentration. The experiment measured the psycho-physiological state of four children before and after commercials. The results showed that distracting attention in this way delays the next peak of attention.

  • Regular Section
  • Defect Level Prediction Using Multi-Model Fault Coverage

    Shyue-Kung LU  

     
    PAPER-Dependable Computing

      Page(s):
    1488-1495

    As we enter the deep submicron era, the cost of maintaining the quality of shipped products increases significantly. Unfortunately, even 100% coverage of the widely used single stuck-at faults cannot guarantee that the defect level of the shipped chips is low enough. This is due to the fact that the stuck-at fault model does not cover all catastrophic defects. Moreover, it is difficult to estimate the difference between stuck-at fault coverage and defect coverage. Multiple fault models or test techniques are usually adopted in the test process, each having its corresponding fault coverage. However, the relationship between the defect level and those individual fault coverages remains to be explored. In this paper, we first propose the concept of multi-model fault coverage (MFC) instead of the fault coverage based on a single fault model. The multi-model fault coverage for nonequiprobable faults is presented, and the multi-model fault coverage for equiprobable faults is shown to be a special case of nonequiprobable faults. The relationship between defect level, fabrication yield, and multi-model fault coverage is then derived. We also analyze the defect level error between the predicted defect level and the physical defect level. An algorithm is also proposed for estimating the number of fault models required in order to achieve sufficient accuracy. Experimental results show that multi-model fault coverage can be used to predict the defect level more precisely. As the number of fault models increases, the defect level error decreases significantly. Our approach is efficient for product quality prediction, especially for deep sub-micron devices.
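    For orientation, the classical single-model relationship between defect level, yield, and fault coverage (the Williams-Brown model, DL = 1 - Y^(1 - T)) can be sketched as follows. This is not the multi-model formula derived in the paper, which the abstract does not state; the function name `defect_level` and the example values are illustrative assumptions.

```python
def defect_level(process_yield: float, fault_coverage: float) -> float:
    """Williams-Brown estimate of defect level: DL = 1 - Y ** (1 - T).

    process_yield  -- fraction of manufactured chips that are defect-free, 0 < Y <= 1
    fault_coverage -- fraction of modeled faults detected by the tests, 0 <= T <= 1

    Assumption: a single fault model; this is NOT the paper's multi-model formula.
    """
    if not 0.0 < process_yield <= 1.0:
        raise ValueError("process_yield must be in (0, 1]")
    if not 0.0 <= fault_coverage <= 1.0:
        raise ValueError("fault_coverage must be in [0, 1]")
    return 1.0 - process_yield ** (1.0 - fault_coverage)


if __name__ == "__main__":
    # Hypothetical yield and coverage values, chosen only to show the trend.
    for coverage in (0.90, 0.99, 1.00):
        level = defect_level(0.5, coverage)
        print(f"coverage={coverage:.2f} -> defect level={level:.4f}")
```

    Under this single-model formula, 100% coverage predicts a defect level of zero, which is the prediction the abstract says does not hold in practice and which motivates combining several fault models.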

  • Design of a Robust LSP Quantizer for a High-Quality 4-kbit/s CELP Speech Coder

    Yusuke HIWASAKI  Kazunori MANO  Kazutoshi YASUNAGA  Toshiyuki MORII  Hiroyuki EHARA  Takao KANEKO  

     
    PAPER-Speech and Hearing

      Page(s):
    1496-1506

    This paper presents an efficient LSP quantizer implementation for low bit-rate coders. The major feature of the quantizer is that it uses a truncated cepstral distance criterion for the code selection procedure. This approach has generally been considered too computationally costly. We utilized the quantizer with a moving-average predictor, two-stage-split vector quantizer and delayed decision. We have investigated the optimal parameter settings in this case and incorporated the quantizer thus obtained into an ITU-T 4-kbit/s speech coding candidate algorithm with a bit budget of 21 bits. The objective performance is better than that with a conventional weighted mean-square criterion, while the complexity is still kept to a reasonable level. The paper also describes the codebook design and techniques that were employed to achieve robustness in noisy channel conditions.

  • Noise Post-Processing for Low Bit-Rate CELP Coders

    Hiroyuki EHARA  Kazutoshi YASUNAGA  Koji YOSHIDA  Yusuke HIWASAKI  Kazunori MANO  Takao KANEKO  

     
    PAPER-Speech and Hearing

      Page(s):
    1507-1516

    This paper presents a newly developed noise post-processing (NPP) algorithm and the results of several tests demonstrating its subjective performance. This NPP algorithm is designed to improve the subjective performance of low bit-rate code excited linear prediction (CELP) decoding under background noise conditions. The NPP algorithm is based on a stationary noise generator and improves the subjective quality of noisy signal input. A backward adaptive detector identifies noisy input signal frames from decoded LSF, energy, and pitch parameters. The noise generator estimates and produces stationary noise signals using past line spectral frequency (LSF) and energy parameters. The stationary noise generator has a frame erasure concealment (FEC) scheme designed for stationary noise signals and therefore improves the speech decoder's robustness against frame erasure under background noise conditions. The algorithm has been applied to the following CELP decoders: 1) a candidate algorithm of the ITU-T 4-kbit/s speech coding standard and 2) existing ITU-T standards, the G.729 and G.723.1 series. In both cases, NPP improved the subjective performance of the baseline decoders. Improvements of approximately 0.25 CMOS (CCR MOS: comparison category rating mean opinion score) and around 0.2-0.8 DMOS (DCR MOS: degradation category rating mean opinion score) were demonstrated in the results of our subjective tests when applied to the 4-kbit/s decoder and G.729/G.723.1 decoders respectively. Other test results show that NPP improves the subjective performance of a G.729 decoder by around 0.45 in DMOS under both error-free and frame-erasure conditions, and a further improvement of around 0.2 DMOS is achieved by the FEC scheme in the noise generator.

  • Elliptic vs. Rectangular Blending for Multi-Projection Displays

    Tsuyoshi MINAKAWA  Masami YAMASAKI  

     
    PAPER-Image Processing and Video Processing

      Page(s):
    1517-1526

    We compared two edge-blending methods for multi-projection displays, elliptic and rectangular blending, by simulating three common situations: (1) an inaccurately estimated calibration parameter, (2) a worn projector lamp, and (3) a shifted viewpoint. We used a two-level-of-detail display including a high-gain rear-projection screen in the simulation to demonstrate an extreme case. The comparisons showed how strongly inaccurate elements affect a composite besides affecting the appearance itself. A subjective assessment was also carried out to obtain the evaluations of actual users. The simulation results showed that in many cases elliptic blending is more effective than rectangular blending.

  • 3D Structure from a Single Calibrated View Using Distance Constraints

    Rubin GONG  Gang XU  

     
    PAPER-Image Recognition, Computer Vision

      Page(s):
    1527-1536

    We propose a new method to recover scene points from a single calibrated view using a subset of distances among the points. This paper first introduces the problem and its relationship with the perspective-n-point problem. Then the number of distances required to uniquely recover the scene points is explored. The result is then developed into a practical vision algorithm that calculates the initial coordinates of the points using distance constraints. Finally, SQP (Sequential Quadratic Programming) is used to optimize the initial estimates. It minimizes a cost function defined as the sum of squared reprojection errors while keeping the specified distance constraints strictly satisfied. Both simulation data and real scene images have been used to test the proposed method, and good results have been obtained.

  • AGSphere: Multiresolution Structure of Directional Relationship on Surface Parts

    HyungSeok KIM  Kwangyun WOHN  

     
    PAPER-Computer Graphics

      Page(s):
    1537-1544

    We present a new method for multiresolution rendering of a complex object. Our method uses viewer-centered features, including the silhouette, in generating a multiresolution model. Because the silhouette of an object depends on the position of the viewer, it is difficult to generate in real time. We propose the AGSphere for real-time management of the silhouette. The AGSphere easily identifies silhouette parts and manages them in a multiresolution manner. The primary feature handled by the AGSphere is the silhouette from the viewer, but the AGSphere can also be used for other directional features such as the light silhouette. In this paper, we show experimental results for the silhouette from either the viewer or the light. The efficiency of the proposed method is compared with other methods. We also propose a new texture map generation method for use with the multiresolution geometry. The generated texture map has a valid mapping function for the multiresolution geometry while minimizing texture distortions.

  • A Novel Approach to Sampling the Coiled Tubing Surface with an Application for Monte Carlo Direct Lighting

    Chung-Ming WANG  Peng-Cheng WANG  

     
    PAPER-Computer Graphics

      Page(s):
    1545-1553

    Sampling is important for many applications in research areas such as graphics, vision, and image processing. In this paper, we present a novel stratified sampling algorithm (SSA) for the coiled tubing surface with a given probability density function. The algorithm is developed from the inverse function of the integration for the areas of the coiled tubing surface. We exploit a Hierarchical Allocation Strategy (HAS) to preserve sample stratification when generating any desired number of samples. This permits us to reduce variance when applying our algorithm to Monte Carlo Direct Lighting for realistic image generation. We accelerate the sampling process using a segmentation technique in the integration domain. Our algorithm thus runs 324 orders of magnitude faster when using the faster SSA algorithm, where the order of magnitude is proportional to the number of samples. Finally, we employ a parabolic interpolation technique to decrease the average errors incurred by the segmentation technique. This permits us to produce nearly constant average errors, independent of the number of samples. The proposed algorithm is novel, computationally efficient, and feasible for realistic image generation using the Monte Carlo method.

  • Three Point Based Registration for Binocular Augmented Reality

    Steve VALLERAND  Masayuki KANBARA  Naokazu YOKOYA  

     
    PAPER-Multimedia Pattern Processing

      Page(s):
    1554-1565

    To register virtual objects in vision-based augmented reality systems, the relation between the real and virtual worlds must be estimated. This paper presents a three-point vision-based registration method for video see-through augmented reality systems using binocular cameras. The proposed registration method is based on a combination of monocular and stereoscopic registration methods. A correction method is proposed that optimizes the registration by correcting the 2D positions of the marker feature points in the images. Also, an extraction strategy based on color information is put forward to make the system robust to fast user motion. In addition, a quantification method is used to evaluate the stability of the produced registration. Timing and stability results are presented. The proposed registration method is shown to be more stable than the standard stereoscopic registration method and to be independent of the distance. Even when the user moves quickly, our system succeeds in producing stable three-point based registration. Therefore, our proposed methods can be considered interesting alternatives for producing the registration in binocular augmented reality systems when only three points are available.

  • Multi-Dipole Sources Identification from EEG Topography Using System Identification Method

    Xiaoxiao BAI  Qinyu ZHANG  Yohsuke KINOUCHI  Tadayoshi MINATO  

     
    PAPER-Biological Engineering

      Page(s):
    1566-1574

    The goal of source localization in the brain is to estimate a set of parameters representing source characteristics; one such parameter is the number of sources. We propose a method that combines the Powell algorithm with the information criterion method to determine the optimal dipole number. The potential errors are calculated by the Powell algorithm with the concentric 4-sphere head model and 32 electrodes, and the number of dipoles is then determined by the information criterion method from these potential errors. This method identifies the dipole number with high accuracy while requiring only a small amount of EEG data, because (1) only one EEG topography is used in the computation, (2) 32 electrodes are used to obtain the EEG data, and (3) the optimal dipole number can be obtained by this method. To show that our method is efficient, precise, and robust to noise, 10% white noise is introduced to test the method theoretically. Some investigations are presented to show that our method is an advanced approach for determining the optimal dipole number.

  • A Timing Driven Crosstalk Optimizer for Gridded Channel Routing

    Shih-Hsu HUANG  Yi-Siang HSU  Chiu-Cheng LIN  

     
    LETTER-Computer Components

      Page(s):
    1575-1581

    The relative window method provides quantitative crosstalk delay degradation for post-layout timing analysis in deep sub-micron VLSI design. However, to the best of our knowledge, the relative window method has not been applied to crosstalk minimization in the gridded channel routing problem. Most conventional crosstalk optimizers use only the coupling length to estimate the crosstalk. In this paper, we present a post-layout timing driven crosstalk optimizer based on the relative window method. According to the relative signal arrival time and the coupling length, we define a delay degradation graph to describe the crosstalk between nets in a routing solution. Our optimization goal is to maximize the time slack by iteratively improving the delay degradation graph without increasing the channel height. Benchmark data consistently show that our post-layout timing driven crosstalk optimizer can further improve the routing solution obtained by a conventional crosstalk optimizer.

  • ILP-Based Program Path Analysis for Bounding Worst-Case Inter-Task Cache Conflicts

    Hiroyuki TOMIYAMA  Nikil DUTT  

     
    LETTER-System Programs

      Page(s):
    1582-1587

    The unpredictable behavior of cache memory makes it difficult to statically analyze the worst-case performance of real-time systems. This problem is further exacerbated in preemptive multitask systems because of inter-task cache interference, called Cache-Related Preemption Delay (CRPD). This paper proposes an approach to finding a tight upper bound on the CRPD which a task might impose on lower-priority tasks. Our method finds the program execution path which requires the maximum number of cache blocks using an integer linear programming technique. Experimental results show that our approach provides up to 69% tighter bounds on CRPD than a conservative approach.

  • Decaying Obsolete Information in Finding Recent Frequent Itemsets over Data Streams

    Joong Hyuk CHANG  Won Suk LEE  

     
    LETTER-Databases

      Page(s):
    1588-1592

    A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Consequently, the knowledge embedded in a data stream is likely to change over time. However, most mining algorithms or frequency approximation algorithms for a data stream are not able to adaptively extract recent changes of information in the stream. This is because the obsolete information of old transactions, which may be no longer useful or possibly invalid at present, is regarded as being as important as that of recent transactions. This paper proposes an information decay method for finding recent frequent itemsets in a data stream. The effect of old transactions on the mining result of a data stream is gradually diminished as time goes by. Furthermore, the decay rate of information can be flexibly adjusted, which enables a user to define the desired lifetime of the information of a transaction in a data stream.
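
    The decay idea can be illustrated with a minimal sketch, assuming a plain exponential decay applied once per transaction. The class name `DecayedItemsetCounter`, the parameters `decay` and `max_size`, and the lazy update are hypothetical choices made here for illustration; the abstract does not give the authors' actual data structure or decay formula.

```python
from itertools import combinations


class DecayedItemsetCounter:
    """Illustrative sketch only: exponentially decayed itemset counts."""

    def __init__(self, decay=0.9, max_size=2):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay        # weight multiplier applied per new transaction
        self.max_size = max_size  # largest itemset size that is tracked
        self.tid = 0              # number of transactions seen so far
        self.total = 0.0          # decayed count of all transactions
        self.counts = {}          # itemset -> (decayed count, tid of last update)

    def add(self, transaction):
        """Process one transaction (any iterable of hashable, sortable items)."""
        self.tid += 1
        # Every older transaction loses weight; the new one has weight 1.
        self.total = self.total * self.decay + 1.0
        items = sorted(set(transaction))
        for size in range(1, self.max_size + 1):
            for itemset in combinations(items, size):
                count, last = self.counts.get(itemset, (0.0, self.tid))
                # Lazy decay: catch up on the transactions missed since `last`.
                count = count * self.decay ** (self.tid - last) + 1.0
                self.counts[itemset] = (count, self.tid)

    def support(self, itemset):
        """Decayed support of an itemset as of the latest transaction."""
        if self.total == 0.0:
            return 0.0
        key = tuple(sorted(set(itemset)))
        count, last = self.counts.get(key, (0.0, self.tid))
        return count * self.decay ** (self.tid - last) / self.total

    def frequent(self, min_support):
        """All tracked itemsets whose decayed support reaches min_support."""
        result = {}
        for key in self.counts:
            s = self.support(key)
            if s >= min_support:
                result[key] = s
        return result
```

    With `decay=0.5`, an item seen only in the latest transaction can outrank an item seen twice earlier, which is the behaviour the abstract describes. Setting `decay=1.0` reduces the sketch to ordinary support counting. A smaller `decay` shortens the effective lifetime of a transaction's information.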

  • Parallel Information Retrieval with Query Expansion

    Yoojin CHUNG  

     
    LETTER-Contents Technology and Web Information Systems

      Page(s):
    1593-1595

    An information retrieval (IR) system with query expansion is implemented in a low-cost, high-performance PC cluster environment. We study how query performance is affected by query expansion and two declustering methods using two standard Korean test collections. According to the experiments, the greedy method shows about 20% enhancement overall when compared with the lexical method.

  • Compensation of Speech Coding Distortion for Wireless Speech Recognition

    Hong Kook KIM  

     
    LETTER-Speech and Hearing

      Page(s):
    1596-1600

    In this paper, we perform experiments to show that the quantization noise caused by low-bit-rate speech coding can be characterized as a white noise process. Then, the signal-to-quantization noise ratio of the decoded speech for a given bit-rate is estimated by observing the perceptual speech quality equivalent to that of artificially generated noisy speech obtained by adding a white Gaussian noise source. This information is incorporated into the parameter tuning of a noise-robust compensation algorithm for speech recognition so that the compensation algorithm performs better over the range of estimated SNRs. Finally, we apply the compensation algorithm to a connected digit string recognition system that utilizes speech signals decoded by the GSM adaptive multi-rate (AMR) speech coder. It is shown that the noise-robust compensation algorithm reduces word error rates by 15% or more at low bit-rate modes of the AMR speech coder.

  • Video-Based Augmented Reality under Orthography without Euclidean Calibration

    Yongduek SEO  Ki-Sang HONG  

     
    LETTER-Multimedia Pattern Processing

      Page(s):
    1601-1605

    An algorithm is developed for augmenting a real video with virtual graphics objects without computing Euclidean information. For this, we design a method of specifying the virtual camera that performs Euclidean orthographic projection in recovered affine space. In addition, our method has the capability of generating views of objects shaded by virtual light sources. Our novel formulation and experimental results are presented.