IEICE global.ieice.org Site

Keyword Search Result

[Keyword] verb(63hit)

41-60hit(63hit)

Bidirectional Eye Contact for Human-Robot Communication
Dai MIYAUCHI Akio NAKAMURA Yoshinori KUNO

PAPER

Vol: E88-D, No: 11, Page(s): 2509-2516
Eye contact is an effective means of controlling human communication, such as in starting communication. It seems that we can make eye contact if we simply look at each other. However, this alone does not establish eye contact. Both parties also need to be aware of being watched by the other. We propose a method of bidirectional eye contact satisfying these conditions for human-robot communication. When a human wants to start communication with a robot, he/she watches the robot. If it finds a human looking at it, the robot turns to him/her, changing its facial expressions to let him/her know its awareness of his/her gaze. When the robot wants to initiate communication with a particular person, it moves its body and face toward him/her and changes its facial expressions to make the person notice its gaze. We show several experimental results to prove the effectiveness of this method. Moreover, we present a robot that can recognize hand gestures after making eye contact with the human to show the usefulness of eye contact as a means of controlling communication.
Estimation of Radiated Power of Radio Transmitters Using a Reverberation Chamber
Tsutomu SUGIYAMA Takashi SHINOZUKA Ken IWASAKI

PAPER-Measurements

Vol: E88-B, No: 8, Page(s): 3158-3163
A procedure for estimating the radiated power of a radio transmitter is proposed based on a statistical property of the field intensity time variation distribution in a reverberation chamber. When randomly varying multipath waves produced by stirrers in a reverberation chamber are received together with a direct wave, the resulting mixed waves are regarded as a kind of multipath waves. Theoretical and experimental results are reported regarding a procedure for estimating radiated power from the 63.2% value of the CDF (Cumulative Distribution Function) of an envelope of multipath waves.
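The role of the 63.2% point can be illustrated with a small simulation. This is only a sketch under a simplifying assumption that is not taken from the paper: the envelope is treated as purely Rayleigh distributed (no direct wave), in which case the CDF equals 1 - exp(-1) ≈ 0.632 exactly at the RMS value of the envelope, so squaring the 63.2% point approximates the mean power. The function names and parameter values are hypothetical.

```python
import math
import random

def envelope_samples(sigma, n, seed=0):
    """Rayleigh envelope samples: magnitude of two independent Gaussian components."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]

def cdf_point(samples, p):
    """Empirical value below which a fraction p of the samples fall."""
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[idx]

# For a Rayleigh envelope, F(r) = 1 - exp(-r^2 / mean_power),
# so F equals 1 - exp(-1) when r^2 equals the mean power.
P_RMS = 1.0 - math.exp(-1.0)

samples = envelope_samples(sigma=1.5, n=100_000)
r632 = cdf_point(samples, P_RMS)

mean_power = sum(r * r for r in samples) / len(samples)  # direct average of r^2
estimated_power = r632 ** 2                              # estimate from the 63.2% point
```

With a direct wave present the envelope is no longer Rayleigh, so this simplified relation is only a starting point for the procedure the abstract describes.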
Statistical Characteristics of E-Field Distribution in a Reverberation Chamber
Katsushige HARIMA

PAPER-Measurements

Vol: E88-B, No: 8, Page(s): 3127-3132
A statistically uniform E-field is created in a reverberation chamber by moving mechanical stirrers to vary boundary conditions. The uniformity of the spatial electric-field distribution in an ideal reverberation chamber can be theoretically estimated by calculating the probability density function of its distribution. However, uniformity in an actual chamber is affected by the dimensions of the chamber and the structure of the stirrers. We experimentally and theoretically evaluated the effect of stirrers on the spatial uniformity of the average, median, and maximum electric-field distributions. When the dimensions of a chamber equipped with effective stirrers are large compared to the wavelength at the operating frequency, that is, when resonant modes above approximately 105 exist below the operating frequency, the spatial uniformity experimentally evaluated agrees well with theoretical values estimated by calculating the probability density function of their distributions.
Harmonicity Based Dereverberation for Improving Automatic Speech Recognition Performance and Speech Intelligibility
Keisuke KINOSHITA Tomohiro NAKATANI Masato MIYOSHI

PAPER-Speech Enhancement

Vol: E88-A, No: 7, Page(s): 1724-1731
A speech signal captured by a distant microphone is generally smeared by reverberation, which severely degrades both the speech intelligibility and Automatic Speech Recognition (ASR) performance. Previously, we proposed a single-microphone dereverberation method, named "Harmonicity based dEReverBeration (HERB)." HERB estimates the inverse filter for an unknown room transfer function by utilizing an essential feature of speech, namely harmonic structure. In previous studies, improvements in speech intelligibility were shown solely with spectrograms, and improvements in ASR performance were simply confirmed with a matched-condition acoustic model. In this paper, we undertook a further investigation of HERB's potential as regards the above two factors. First, we examined speech intelligibility by means of objective indices. As a result, we found that HERB is capable of improving the speech intelligibility to approximately that of clean speech. Second, since HERB alone could not improve the ASR performance sufficiently, we further analyzed the HERB mechanism with a view to achieving further improvements. Taking the analysis results into account, we proposed an appropriate ASR configuration and conducted experiments. Experimental results confirmed that, if HERB is used with an ASR adaptation scheme such as MLLR and a multicondition acoustic model, it is very effective for improving ASR performance even in unknown severely reverberant environments.
A High Presence Shared Space Communication System Using 2D Background and 3D Avatar
Kyohei YOSHIKAWA Takashi MACHIDA Kiyoshi KIYOKAWA Haruo TAKEMURA

INVITED PAPER

Vol: E87-D, No: 12, Page(s): 2532-2539
Displaying a 3D geometric model of a user in real time is an advantage for a telecommunication system because depth information is useful for nonverbal communication such as finger-pointing and gesturing that contain 3D information. However, the range image acquired by a rangefinder suffers from errors due to image noises and distortions in depth measurement. On the other hand, a 2D image is free from such errors. In this paper, we propose a new method for a shared space communication system that combines the advantages of both 2D and 3D representations. A user is represented as a 3D geometric model in order to exchange nonverbal communication cues. A background is displayed as a 2D image to give the user adequate information about the environment of the remote site. Additionally, a high-resolution texture taken by a video camera is projected onto the 3D geometric model of the user. This is done because the low resolution of the image acquired by the rangefinder makes it difficult to exchange facial expressions. Furthermore, to fill in the data occluded by the user, old pixel values are used for the user area in the 2D background image. We have constructed a prototype of a high presence shared space communication system based on our method. Through a number of experiments, we have found that our method is more effective for telecommunication than a method with only a 2D or 3D representation.
Reverberation Cue as a Control Parameter of Distance in Virtual Audio Environment
Han-gil MOON Jung-Uk NOH Koeng-Mo SUNG Dae-young JANG

LETTER-Engineering Acoustics

Vol: E87-A, No: 7, Page(s): 1822-1826
Over the last twenty years, 3-D audio technologies have advanced significantly despite the difficulties in implementing them. However, their performance in providing information, especially about the distance of a sound source, remains imperfect. Therefore, more research on distance cues is indispensable to achieve more effective technology. In this paper, we try to show how the conventional cues change as the distance of a sound source varies, by means of measured impulse responses using the swept-sine method and modeled impulse responses using CATT Acoustics. It is well known that the conventional cues comprise loudness, spectral information, reverberation and binaural information. Among these, we focus on the reverberation cue to describe the distance of a sound source. Some studies have shown that reverberation can give listeners absolute distance information, but implementation using this cue is unfeasible because there are no well-defined parameters. In this paper, we also try to validate reverberation as a feasible distance cue by suggesting early decay time (EDT) and clarity index, C80, as the parameters for controlling the perceived distance with the reverberation cue.
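For readers unfamiliar with the two parameters named above, the following sketch computes them from an impulse response using the common room-acoustics definitions: EDT as six times the time the backward-integrated (Schroeder) decay curve takes to fall 10 dB, and C80 as the ratio in dB of energy arriving before 80 ms to energy arriving after. It is not the authors' procedure. The synthetic exponentially decaying response, the sampling rate, and all function names are assumptions made purely for illustration.

```python
import math

def schroeder_decay_db(h):
    """Backward-integrated energy decay curve in dB, normalised to 0 dB at t = 0."""
    energy = [x * x for x in h]
    remaining = 0.0
    curve = [0.0] * len(energy)
    for i in range(len(energy) - 1, -1, -1):
        remaining += energy[i]
        curve[i] = remaining
    total = curve[0]
    return [10.0 * math.log10(c / total) for c in curve]

def early_decay_time(h, fs):
    """EDT in seconds: time to the first -10 dB point, extrapolated to 60 dB."""
    for n, level in enumerate(schroeder_decay_db(h)):
        if level <= -10.0:
            return 6.0 * n / fs
    raise ValueError("impulse response does not decay by 10 dB")

def clarity_c80(h, fs):
    """C80 in dB: energy before 80 ms divided by energy after 80 ms."""
    split = round(0.080 * fs)
    early = sum(x * x for x in h[:split])
    late = sum(x * x for x in h[split:])
    return 10.0 * math.log10(early / late)

# Hypothetical test signal: an ideal exponential decay that drops 60 dB in T60 seconds.
FS = 8000    # sampling rate in Hz (assumed)
T60 = 0.5    # reverberation time in seconds (assumed)
h = [10.0 ** (-3.0 * n / (FS * T60)) for n in range(2 * FS)]

edt = early_decay_time(h, FS)
c80 = clarity_c80(h, FS)
```

For an ideal exponential decay EDT coincides with the reverberation time, so differences between the two in measured responses reflect the shape of the early part of the decay.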
"Man-Computer Symbiosis" Revisited: Achieving Natural Communication and Collaboration with Computers
Neal LESH Joe MARKS Charles RICH Candace L. SIDNER

INVITED PAPER

Vol: E87-D, No: 6, Page(s): 1290-1298
In 1960, the famous computer pioneer J.C.R. Licklider described a vision for human-computer interaction that he called "man-computer symbiosis." Licklider predicted the development of computer software that would allow people "to think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own." More than 40 years later, one rarely encounters any computer application that comes close to capturing Licklider's notion of human-like communication and collaboration. We echo Licklider by arguing that true symbiotic interaction requires at least the following three elements: a complementary and effective division of labor between human and machine; an explicit representation in the computer of the user's abilities, intentions, and beliefs; and the utilization of nonverbal communication modalities. We illustrate this argument with various research prototypes currently under development at Mitsubishi Electric Research Laboratories (USA).
Robotic Hand System for Non-verbal Communication
Kiyoshi HOSHINO Ichiro KAWABUCHI

PAPER

Vol: E87-D, No: 6, Page(s): 1347-1353
The purpose of this study is to design a humanoid robotic hand system that is capable of conveying feelings and sensitivities by finger movement for non-verbal communication between humans and robots in the near future. In this paper, the study proceeds in four steps. First, a small, lightweight robotic hand was developed for use on a humanoid, following the concept of extracting the minimum required motor functions and implementing them in the robot. Second, basic characteristics of the movement were checked by experiments, a simple feedforward control mechanism was designed based on velocity control, and a system capable of tracking time-series joint commands with arbitrary input patterns was realized. Third, tracking performance for sinusoidal inputs of different frequencies was studied to evaluate the system thus realized, and its spatial and temporal accuracy were investigated. Fourth, sign language motions were generated as examples of information transmission by finger movement. The results thus obtained indicated that this robotic hand is capable of transmitting information promptly and with comparatively high accuracy through its movement.
Improved HMM Separation for Distant-Talking Speech Recognition
Tetsuya TAKIGUCHI Masafumi NISHIMURA

PAPER

Vol: E87-D, No: 5, Page(s): 1127-1137
In distant-talking speech recognition, the recognition accuracy is seriously degraded by reverberation and environmental noise. A robust speech recognition technique for such environments, HMM separation and composition, has been described previously. HMM separation estimates the model parameters of the acoustic transfer function using adaptation data uttered from an unknown position in noisy and reverberant environments, and HMM composition builds an HMM of noisy and reverberant speech, using the acoustic transfer function estimated by HMM separation. Previously, HMM separation has been applied to the acoustic transfer function based on a single Gaussian distribution. However, the improvement was smaller than expected for impulse responses with long reverberations. This is because the variance of the acoustic transfer function in each frame increases, since the length of the impulse response of the room reverberation is longer than that of the spectral analysis window. In this paper, HMM separation is extended to estimate the acoustic transfer function based on Gaussian mixture components in order to compensate for the greater variability of the acoustic transfer function, and the re-estimation formulae are derived. In addition, this paper introduces a technique to adapt the noise weight for each mel-spaced frequency in order to improve the performance of HMM separation in the linear-spectral domain, since the use of HMM separation in the linear-spectral domain sometimes causes a negative mean output due to the subtraction operation. The extended HMM separation is evaluated on distant-talking speech recognition tasks. The results of the experiments clarify the effectiveness of the proposed method.
Two Methodology-Trials Using Higher Order Correlation for Reverberation Measurement of Noisy Acoustic Room
Kiminobu NISHIMURA Mitsuo OHTA

PAPER-Audio/Speech Coding

Vol: E87-A, No: 3, Page(s): 598-604
In this paper, we first consider how to illustrate the effect of background noise on the measurement of room acoustics under background noise of an arbitrary distribution type. Two kinds of estimation methods are proposed to evaluate a proper reverberation time of a room by observing real unrefined decay curves, which cannot smoothly realize a sufficient decay of 60 dB in a low frequency region, especially when contaminated by background noise. In the first method, an observation equation is derived from a stochastic model by means of the well-known Sabine's differential equation, which is approximately rewritten in a matched form of difference equation, especially to preserve its original physical meaning and functional linearity in the reverberation parameter. The effect of background noise is eliminated by employing a generalized state estimation algorithm based on Bayes' theorem. In the second method, after reflecting the effect of background noise in an observation equation of the measuring model, a well-known mutual information criterion is introduced to estimate a reverberation time, based in particular on the basic property of statistical independence between signal and background noise. Finally, the effectiveness of the proposed methods is experimentally confirmed by applying them to the actual measurement of a reverberation time in an actual living situation of a room contaminated by background noise. The proposed methods are, however, techniques that actively use higher order correlation beyond a linear one, and so they are methodology-trials which should coexist with other techniques.
Decision Tree Based Disambiguation of Semantic Roles for Korean Adverbial Postpositions
Seong-Bae PARK

LETTER-Natural Language Processing

Vol: E86-D, No: 8, Page(s): 1459-1463
Case postpositions in Korean usually have more than one semantic role. Among the various postpositions, the adverbial postpositions make the development of Korean-based machine translation systems especially difficult, because they have more semantic roles than the others. In this paper, we describe a new method for resolving semantic ambiguities of adverbial postpositions using decision tree induction. The lack of training examples in decision tree induction is overcome by clustering words into classes using a kind of greedy algorithm. The cross validation results show that the presented method achieves an accuracy of 76.5% on average, which is a 20.3% improvement over the baseline method.
Blind Source Separation of Acoustic Signals Based on Multistage ICA Combining Frequency-Domain ICA and Time-Domain ICA
Tsuyoki NISHIKAWA Hiroshi SARUWATARI Kiyohiro SHIKANO

PAPER-Digital Signal Processing

Vol: E86-A, No: 4, Page(s): 846-858
We propose a new algorithm for blind source separation (BSS), in which frequency-domain independent component analysis (FDICA) and time-domain ICA (TDICA) are combined to achieve a superior source-separation performance under reverberant conditions. Generally speaking, conventional TDICA fails to separate source signals under heavily reverberant conditions because of the low convergence in the iterative learning of the inverse of the mixing system. On the other hand, the separation performance of conventional FDICA also degrades significantly because the independence assumption of narrow-band signals collapses when the number of subbands increases. In the proposed method, the separated signals of FDICA are regarded as the input signals for TDICA, and we can remove the residual crosstalk components of FDICA by using TDICA. The experimental results obtained under the reverberant condition reveal that the separation performance of the proposed method is superior to those of TDICA- and FDICA-based BSS methods.
A Child Verb Learning Model Based on Syntactic Bootstrapping
Tiansheng XU Zenshiro KAWASAKI Keiji TAKIDA Zheng TANG

PAPER-Artificial Intelligence, Cognitive Science

Vol: E85-D, No: 6, Page(s): 985-993
This paper presents a child verb learning model mainly based on syntactic bootstrapping. The model automatically learns 4-5-year-old children's linguistic knowledge of verbs, including subcategorization frames and thematic roles, using a text in dialogue format. Subcategorization frame acquisition of verbs is guided by the assumption of the existence of nine verb prototypes. These verb prototypes are extracted based on syntactic bootstrapping and some psycholinguistic studies. Thematic roles are assigned by syntactic bootstrapping and other psycholinguistic hypotheses. The experiments are performed on the data from the CHILDES database. The results show that the learning model successfully acquires linguistic knowledge of verbs and also suggest that psycholinguistic studies of child verb learning may provide important hints for linguistic knowledge acquisition in natural language processing (NLP).
Verb Ellipsis Resolution in Japanese Sentence Using Surface Expressions and Examples
Masaki MURATA Hitoshi ISAHARA

PAPER-Natural Language Processing

Vol: E85-D, No: 4, Page(s): 767-772
Verb phrases are sometimes omitted in natural language (ellipsis). It is necessary to resolve verb phrase ellipses in language understanding, machine translation, and dialogue processing. This paper describes a practical way to resolve verb phrase ellipses by using surface expressions and examples. To make heuristic rules for ellipsis resolution, we classified verb phrase ellipses by checking whether the referent of a verb phrase ellipsis appears in the surrounding sentences or not. We experimented with the resolution of verb phrase ellipses on a novel and obtained a recall rate of 73% and a precision rate of 66% on test sentences. When the referent of a verb phrase ellipsis appeared in the surrounding sentences, the accuracy rate was high. When the referent did not appear in the surrounding sentences, the accuracy rate was not so high. Since the analysis of this phenomenon is very difficult, it is valuable to propose a way of solving the problem to a certain extent. As corpora become larger and machine performance becomes greater, the corpus-based method will become effective.
Evaluation of Electric-Field Uniformity in a Reverberation Chamber for Radiated Immunity Testing
Katsushige HARIMA Yukio YAMANAKA

LETTER

Vol: E84-B, No: 9, Page(s): 2618-2621
In using a reverberation chamber for radiated immunity testing, it is important to determine the number of discrete steps through which the stirrer rotates and the number of probe locations for a given test volume in the chamber. This is because they affect the uniformity and calibration of the field in the test volume. We experimentally evaluated the effect of the numbers of stirrers and their steps on the field uniformity, and the effect of the number of probe locations on field calibration.
Subjective Assessment of the Desired Echo Return Loss for Subband Acoustic Echo Cancellers
Sumitaka SAKAUCHI Yoichi HANEDA Shoji MAKINO Masashi TANAKA Yutaka KANEDA

PAPER-Engineering Acoustics

Vol: E83-A, No: 12, Page(s): 2633-2639
We investigated the dependence of the desired echo return loss on frequency for various hands-free telecommunication conditions by subjective assessment. The desired echo return loss as a function of frequency (DERLf) is an important factor in the design and performance evaluation of a subband echo canceller, and it is a measure of what is considered an acceptable echo caused by electrical loss in the transmission line. The DERLf during single-talk was obtained as attenuated band-limited echo levels that subjects did not find objectionable when listening to the near-end speech and its band-limited echo under various hands-free telecommunication conditions. When we investigated the DERLf during double-talk, subjects also heard the speech in the far-end room from a loudspeaker. The echo was limited to a 250-Hz bandwidth assuming the use of a subband echo canceller. The test results showed that: (1) when the transmission delay was short (30 ms), the echo component around 2 to 3 kHz was the most objectionable to listeners; (2) as the transmission delay rose to 300 ms, the echo component around 1 kHz became the most objectionable; (3) when the room reverberation time was relatively long (about 500 ms), the echo component around 1 kHz was the most objectionable, even if the transmission delay was short; and (4) the DERLf during double-talk was about 5 to 10 dB lower than that during single-talk. Use of these DERLf values will enable the design of more efficient subband echo cancellers.
FDTD Analysis of Electromagnetic Fields in a Reverberation Chamber
Katsushige HARIMA

LETTER-Electromagnetic Compatibility

Vol: E81-B, No: 10, Page(s): 1946-1950
The Finite-Difference Time-Domain (FDTD) method is applied to the analysis of electromagnetic fields in a reverberation chamber. The chamber is used for radiated susceptibility/immunity measurement of electromagnetic compatibility (EMC) test and measurement of the radiated power of radio transmitters. The analytical results defined the distribution of the electric field in the reverberation chamber and clarified the effect on field uniformity of the size of the chamber and the number, size, and position of stirrers.
Use of Multimodal Information in Facial Emotion Recognition
Liyanage C. DE SILVA Tsutomu MIYASATO Ryohei NAKATSU

PAPER-Artificial Intelligence and Cognitive Science

Vol: E81-D, No: 1, Page(s): 105-114
Detection of facial emotions is mainly addressed by computer vision researchers based on facial display. Detection of vocal expressions of emotions is also found in research work done by acoustic researchers. Most of these research paradigms are devoted purely to visual or purely to auditory human emotion detection. However, we found that it is very interesting to consider both auditory and visual information together for processing, since we hope this kind of multimodal information processing will become a datum of information processing in the future multimedia era. Through several intensive subjective evaluation studies, we found that human beings recognize Anger, Happiness, Surprise and Dislike by their visual appearance, compared to voice only detection. When the audio track of each emotion clip is dubbed with a different type of auditory emotional expression, Anger, Happiness and Surprise were still video dominant. However, the Dislike emotion gave mixed responses for different speakers. In both studies we found that the Sadness and Fear emotions were audio dominant. As a conclusion to the paper, we propose a method of facial emotion detection using a hybrid approach, which uses multimodal information for facial emotion recognition.
A Proposal of Five-Degree-of-Freedom 3D Nonverbal Voice Interface
Tatsuhiro YONEKURA Rikako NARISAWA Yoshiki WATANABE

PAPER-Human Communications and Ergonomics

Vol: E79-A, No: 2, Page(s): 242-247
This paper proposes a new three-dimensional pointing device that emphasizes user friendliness and freedom from cable clutter. The proposed method utilizes five degrees of freedom via the medium of the non-verbal human voice. That is, the spatial direction of the sound source, the type of the voice phoneme and the tone of the voice phoneme are utilized. The input voice is analyzed with respect to these factors and then mapped to effects predefined for the human interface. In this paper the estimated spatial direction is used for three-dimensional movement of the virtual object as three degrees of freedom. Both the type and the tone of the voice phoneme are used for the remaining two degrees of freedom. Since vocalization of the nonverbal human voice is an everyday task, and the intonation of the voice can be quite easily and intentionally controlled by human vocal ability, the proposed scheme is a new three-dimensional spatial interaction medium. In this sense, this paper realizes a cost-effective and handy nonverbal interface scheme without any artificial worn equipment that might cause physical and psychological fatigue. Using the prototype, the authors evaluate the performance of the scheme from both static and dynamic points of view, show some advantages in look and feel, and then consider possible applications of the proposed scheme.
A Model for Explaining a Phenomenon in Creative Concept Formation
Koichi HORI

PAPER-Artificial Intelligence and Cognitive Science

Vol: E76-D, No: 12, Page(s): 1521-1527
This paper gives a model to explain one phenomenon found in the process of creative concept formation, i.e. the phenomenon that people often get trapped in some state where the mental world remains nebulous and sometimes suddenly make a jump to a new concept. This phenomenon has been qualitatively explained mainly by philosophers, but there have not been models for explaining it quantitatively. Such a model is necessary in a new research field that studies systems for aiding human creative activities. So far, the work on creation aid has not had a theoretical background, and the systems have been built based only on trial and error. The model given in this paper explains some aspects of the phenomena found in creative activities and gives some suggestions for future systems for aiding creative concept formation.
