
Keyword Search Result

[Keyword] CTI(8214hit)

5301-5320hit(8214hit)

  • Designing Target Cost Function Based on Prosody of Speech Database

    Kazuki ADACHI  Tomoki TODA  Hiromichi KAWANAMI  Hiroshi SARUWATARI  Kiyohiro SHIKANO  

     
    PAPER-Speech Synthesis and Prosody

      Vol:
    E88-D No:3
      Page(s):
    519-524

    This research aims to construct a high-quality Japanese TTS (Text-to-Speech) system that has high flexibility in treating prosody. Many TTS systems have implemented a prosody control system, but such systems have been fundamentally designed to output speech with a standard pitch and speech rate. In this study, we employ a unit selection-concatenation method and also introduce an analysis-synthesis process to provide precisely controlled prosody in output speech. Speech quality degrades in proportion to the amount of prosody modification; therefore, a target cost for prosody is set to evaluate the prosodic difference between target prosody and speech candidates in such a unit selection system. However, the conventional cost ignores the original prosody of speech segments, although it is assumed that the quality deterioration tendency varies in relation to the pitch or speech rate of the original speech. In this paper, we propose a novel cost function design based on the prosody of speech segments. First, we recorded nine databases of Japanese speech with different prosodic characteristics. Then, with respect to the speech databases, we investigated the relationships between the amount of prosody modification and the perceptual degradation. The results indicate that the tendency of perceptual degradation differs according to the prosodic features of the original speech. On the basis of these results, we propose a new cost function design, which changes the cost function according to the prosody of a speech database. Results of preference testing of synthetic speech show that the proposed cost functions generate speech of higher quality than the conventional method.

  • Dynamic and Adaptive Morphing of Three-Dimensional Mesh Using Control Maps

    Tong-Yee LEE  Chien-Chi HUANG  

     
    PAPER-Computer Graphics

      Vol:
    E88-D No:3
      Page(s):
    646-651

    This paper describes a dynamic and adaptive scheme for three-dimensional mesh morphing. Using several control maps, the connectivity of intermediate meshes is dynamically changed and the mesh vertices are adaptively modified. The 2D control maps in parametric space, which include a curvature map, an area deformation map and a distance map, are used to schedule the insertion and deletion of vertices in each frame. Then, the vertices are adaptively moved to better positions using a weighted centroidal Voronoi diagram (WCVD), and a Delaunay triangulation is finally used to determine the connectivity of the mesh. In contrast to most previous work, the intermediate mesh connectivity changes gradually and is much less complicated. We demonstrate several examples of aesthetically pleasing morphs created by the proposed method.

  • Speech Enhancement by Spectral Subtraction Based on Subspace Decomposition

    Takahiro MURAKAMI  Tetsuya HOYA  Yoshihisa ISHIDA  

     
    PAPER-Speech and Hearing

      Vol:
    E88-A No:3
      Page(s):
    690-701

    This paper presents a novel algorithm for spectral subtraction (SS). The method is derived from a relation between the spectrum obtained by the discrete Fourier transform (DFT) and that obtained by a subspace decomposition method. Using this relation, it is shown that a noise reduction algorithm based on subspace decomposition leads to an SS method in which noise components in an observed signal are eliminated by subtracting the variance of the noise process in the frequency domain. Moreover, it is shown that the method can significantly reduce computational complexity in comparison with the method based on the standard subspace decomposition. In a similar manner to the conventional SS methods, our method also exploits the variance of the noise process estimated from a preceding segment where speech is absent but noise is present. In order to detect such non-speech segments more reliably, a novel robust voice activity detector (VAD) is then proposed. The VAD utilizes the spread of eigenvalues of an autocorrelation matrix corresponding to the observed signal. Simulation results show that the proposed method yields improved enhancement quality in comparison with the conventional SS-based schemes.
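
    As a point of reference for the conventional SS methods mentioned in the abstract above, the following is a minimal sketch of plain power spectral subtraction. It is not the subspace-based method of the paper. The function name `spectral_subtract`, the non-overlapping rectangular frames, and the `floor` parameter are illustrative assumptions.

```python
import numpy as np

def spectral_subtract(noisy, noise_only, frame_len=256, floor=0.01):
    """Conventional power spectral subtraction on non-overlapping frames.

    noisy      : 1-D signal containing speech plus noise
    noise_only : 1-D speech-free segment used to estimate the noise power
    floor      : assumed spectral floor (fraction of the noisy power kept)
    """
    # Estimate the noise power spectrum by averaging over speech-free frames.
    n_frames = len(noise_only) // frame_len
    noise_frames = noise_only[:n_frames * frame_len].reshape(n_frames, frame_len)
    noise_power = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2, axis=0)

    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, frame_len):
        spec = np.fft.rfft(noisy[start:start + frame_len])
        power = np.abs(spec) ** 2
        # Subtract the noise power; the floor prevents negative power values.
        clean_power = np.maximum(power - noise_power, floor * power)
        # Reuse the noisy phase, as conventional SS does.
        clean_spec = np.sqrt(clean_power) * np.exp(1j * np.angle(spec))
        out[start:start + frame_len] = np.fft.irfft(clean_spec, n=frame_len)
    return out
```

    A practical implementation would normally use overlapping windowed frames and a voice activity detector to pick the speech-free segment; both are omitted here to keep the sketch short.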

  • Block Adaptive Beamforming via Parallel Projection Method

    Wen-Hsien FANG  Hsien-Sen HUNG  Chun-Sem LU  Ping-Chi CHU  

     
    PAPER-Antennas and Propagation

      Vol:
    E88-B No:3
      Page(s):
    1227-1233

    This paper presents a simple yet effective approach to the design of block adaptive beamformers via the parallel projection method (PPM), which is an extension of the classic projection onto convex sets (POCS) method to scenarios with inconsistent sets. The proposed approach begins with the construction of the convex constraint sets in which the weight vector of the adaptive beamformer lies. The convex sets are judiciously chosen to force the weights to possess some desirable properties or to meet some prescribed rules. Based on the minimum variance criterion and a fixed gain in the look direction, two constraint sets, the minimum variance constraint set and the gain constraint set, are considered. For every input block of data, the weights of the proposed beamformer can then be determined by iteratively projecting the weight vector onto these convex sets until it converges. The simulations provided show that the proposed beamformer offers superior performance compared with previous works in various scenarios, and in general with lower computational overhead.
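
    The iteration described in the abstract above can be sketched generically as follows. This is only an illustration of parallel (averaged) projections, not the beamformer of the paper: the weight vector is taken as real-valued, the gain set is modeled as a hyperplane, and a norm ball stands in for the minimum variance set, whose exact form the abstract does not give. All function names are hypothetical.

```python
import numpy as np

def project_hyperplane(w, a, g=1.0):
    """Project w onto {w : a.w = g}, a fixed gain toward steering vector a."""
    return w + (g - a @ w) / (a @ a) * a

def project_ball(w, radius):
    """Project w onto {w : ||w|| <= radius} (assumed stand-in constraint)."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def parallel_projection(w0, projections, weights, n_iter=200):
    """Iterate w <- sum_i weights[i] * P_i(w); weights are assumed to sum to 1."""
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(n_iter):
        # All projections are computed from the same w, then averaged.
        w = sum(c * p(w) for c, p in zip(weights, projections))
    return w

# Usage: two constraint sets with equal weights.
a = np.array([1.0, 0.0])
w = parallel_projection(
    np.array([5.0, 5.0]),
    [lambda v: project_hyperplane(v, a), lambda v: project_ball(v, 2.0)],
    [0.5, 0.5],
)
```

    Unlike sequential POCS, every projection here starts from the same current point, which is what allows the averaged iteration to remain well defined when the sets do not intersect.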

  • Objective Quality Assessment of Wideband Speech Coding

    Nobuhiko KITAWAKI  Kou NAGAI  Takeshi YAMADA  

     
    PAPER-Network

      Vol:
    E88-B No:3
      Page(s):
    1111-1118

    Recently, wideband speech communication using 7 kHz-wideband speech coding, as described in ITU-T Recommendations G.722, G.722.1, and G.722.2, has become increasingly necessary for use in advanced IP telephony using PCs, since, for this application, hands-free communication using separate microphones and loudspeakers is indispensable, and in this situation wideband speech is particularly helpful in enhancing the naturalness of communication. An objective quality measurement methodology for wideband-speech coding has been studied, its essential components being an objective quality measure and an input test signal. This paper describes Wideband-PESQ conforming to the draft Annex to ITU-T Recommendation P.862, "Perceptual Evaluation of Speech Quality (PESQ)," as the objective quality measure, by evaluating the consistency between the subjectively evaluated MOS (Mean Opinion Score) and objectively estimated MOS. This paper also describes the verification of artificial voice conforming to Recommendation P.50 "Artificial Voices," as the input test signal for such measurements, by evaluating the consistency between the objectively estimated MOS using a real voice and that obtained using an artificial voice.

  • Location-Aware Power-Efficient Directional MAC Protocol in Ad Hoc Networks Using Directional Antenna

    Tetsuro UEDA  Shinsuke TANAKA  Dola SAHA  Siuli ROY  Somprakash BANDYOPADHYAY  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E88-B No:3
      Page(s):
    1169-1181

    The use of directional antennas in ad hoc wireless networks can largely reduce radio interference, thereby improving the utilization of the wireless medium. Our major contribution in this paper is to devise a MAC protocol that exploits the advantages of directional antennas in ad hoc networks for improved system performance. In this paper, we illustrate a MAC protocol for ad hoc networks using directional antennas with the objective of effective utilization of the shared wireless medium. In order to implement an effective MAC protocol in this context, a node should know how to set its transmission direction to transmit a packet to its neighbors and to avoid transmission in other directions where data communications are already in progress. In this paper, we propose a receiver-centric approach for location tracking and a MAC protocol, so that nodes become aware of their neighborhood and also of the direction of the nodes for communicating directionally. A node develops its location-awareness from this neighborhood-awareness and direction-awareness. In this context, researchers usually assume that the gain of directional antennas is equal to the gain of the corresponding omni-directional antenna. However, for a given amount of input power, the range R with a directional antenna will be much larger than that with an omni-directional antenna. In this paper, we also propose a two-level transmit power control mechanism in order to approximately equalize the transmission range R of an antenna operating in omni-directional and directional modes. This will not only improve medium utilization but also help to conserve the power of the transmitting node during directional transmission. Our proposed directional MAC protocol can be effective both in ITS (Intelligent Transportation System), which we simulate in String and Parallel Topology, and in any community network, which we simulate in Random Topology. The performance evaluation on the QualNet network simulator clearly indicates the efficiency of our protocol.

  • A Simple Leakage-Resilient Authenticated Key Establishment Protocol, Its Extensions, and Applications

    SeongHan SHIN  Kazukuni KOBARA  Hideki IMAI  

     
    PAPER-Information Security

      Vol:
    E88-A No:3
      Page(s):
    736-754

    Authenticated Key Establishment (AKE) protocols enable two entities, say a client (or a user) and a server, to share common session keys in an authentic way. In this paper, we review the previous AKE protocols, all of which turn out to be insecure, under the following realistic assumptions: (1) High-entropy secrets that should be stored on devices may leak out due to accidents such as bugs or mis-configurations of the system; (2) The size of a human-memorable secret, i.e. a password, is short enough to memorize, but large enough to avoid on-line exhaustive search; (3) TRM (Tamper-Resistant Modules) used to store secrets are not perfectly free from bugs and mis-configurations; (4) A client remembers only one password, even if he/she communicates with several different servers. Then, we propose a simple leakage-resilient AKE protocol (cf.[41]) which is described as follows: the client keeps one password in mind and stores one secret value on devices, both of which are used to establish an authenticated session key with the server. The advantages of leakage-resilient AKEs over the previous AKEs are that the former are secure against active adversaries under the above-mentioned assumptions and have immunity to the leakage of stored secrets from a client and a server (or servers), respectively. In addition, the advantage of the proposed protocol to is the reduction of memory size of the client's secrets. We also extend our protocol to allow updating of the secret values registered in server(s) or the password remembered by a client. Some applications and the formal security proof in the standard model of our protocol are also provided.

  • Address Autoconfiguration for Event-Driven Sensor Network

    Shinji MOTEGI  Kiyohito YOSHIHARA  Hiroki HORIUCHI  

     
    PAPER-Network

      Vol:
    E88-B No:3
      Page(s):
    950-957

    An event-driven sensor network composed of a large number of sensor nodes has been widely studied. A sensor node sends packets to a sink when the node detects an event. For the sink to receive packets it fails to acquire, the sink must send re-transmission requests to the sensor node. To send the requests to the sensor node using unicast, the network address of the sensor node is required to distinguish the sensor node from others. Since it is difficult to allocate the address manually to a large number of nodes, a reasonable option is to use existing address autoconfiguration methods. However, these methods waste the limited energy of the sensor nodes because they use a number of control messages to allocate a permanent address to every node. In this paper, we propose an energy-efficient address autoconfiguration method for the event-driven sensor network. The proposed method allocates a temporary address only to a sensor node which detects an event, on an on-demand basis. By performing simulation studies, we evaluated the proposed method and compared it with one of the existing methods based on the number of control messages for the address allocation. The results show that the number of control messages of the proposed method is small compared to that of the existing method. We also evaluated the processing time overhead of the proposed method using the implemented system. Although the proposed method has a little extra overhead, the results show the processing time is short enough for practical use.

  • Multiuser MIMO Beamforming for Single Data Stream Transmission in Frequency-Selective Fading Channels

    Huy Hoang PHAM  Tetsuki TANIGUCHI  Yoshio KARASAWA  

     
    PAPER

      Vol:
    E88-A No:3
      Page(s):
    651-659

    In this paper, we propose a multiple-input multiple-output (MIMO) beamforming scheme for a multiuser system in frequency-selective fading channels. The maximum signal-to-noise and interference ratio (MSINR) is adopted as a criterion to determine the transmit and receive weight vectors. In order to maximize the output SINR over all users, two algorithms for base station are considered: the first algorithm is based on the receive weight vector optimization and the second algorithm is based on an iterative update of both transmit and receive weight vectors. Based on the result of single user MIMO beamforming, we analyze the interference channels cancellation ability of multiuser MIMO system. The first algorithm is a simple method and the second algorithm is a performative solution. Through computer simulations, it is shown that multiuser communication system is achievable using the proposed methods in frequency-selective fading condition.

  • SDC: A Scalable Approach to Collect Data in Wireless Sensor Networks

    Niwat THEPVILOJANAPONG  Yoshito TOBE  Kaoru SEZAKI  

     
    PAPER-Software Platform Technologies

      Vol:
    E88-B No:3
      Page(s):
    890-902

    In this paper, we present the Scalable Data Collection (SDC) protocol, a tree-based protocol for collecting data over multi-hop wireless sensor networks. The design of the protocol aims to satisfy the requirements of sensor networks in which every sensor transmits sensed data to a sink node periodically or spontaneously. The sink nodes construct the tree by broadcasting a HELLO packet to discover the child nodes. A sensor receiving this packet decides on an appropriate parent to which it will attach; it then broadcasts the HELLO packet to discover its own child nodes. Based on this process, the tree is quickly created without flooding of any routing packets. SDC avoids periodic updating of routing information, but the tree will be reconstructed upon node failures or the addition of new nodes. The state required on each sensor is constant and independent of network size, so SDC scales better than the existing protocols. Moreover, each sensor can make forwarding decisions without any knowledge of geographical information. We evaluate the performance of SDC by using the ns-2 simulator and comparing it with Directed Diffusion, DSR, AODV, and OLSR. The simulation results demonstrate that SDC achieves a much higher delivery ratio and lower delay as well as scalability in various scenarios.
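
    The HELLO-based tree construction described in the abstract above can be sketched as a breadth-first traversal. The abstract does not state how a sensor chooses an "appropriate parent", so this sketch assumes the simplest rule: a node attaches to the first sender whose HELLO it hears. The function name `build_tree` and the adjacency-list input are hypothetical.

```python
from collections import deque

def build_tree(neighbors, sink):
    """Return {node: parent} built by HELLO rebroadcast starting at the sink.

    neighbors : dict mapping each node to the nodes within its radio range
    sink      : root of the collection tree (its parent is None)
    """
    parent = {sink: None}
    queue = deque([sink])
    while queue:
        sender = queue.popleft()        # sender broadcasts a HELLO packet
        for node in neighbors[sender]:
            if node not in parent:      # assumed rule: first HELLO heard wins
                parent[node] = sender
                queue.append(node)      # node rebroadcasts to find children
    return parent

# Usage: a four-node network in which C can hear both A and B.
links = {"S": ["A", "B"], "A": ["S", "C"], "B": ["S", "C"], "C": ["A", "B"]}
tree = build_tree(links, "S")
```

    Each node stores only its parent, which matches the abstract's point that per-sensor state is constant and independent of network size.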

  • Dialogue Speech Recognition by Combining Hierarchical Topic Classification and Language Model Switching

    Ian R. LANE  Tatsuya KAWAHARA  Tomoko MATSUI  Satoshi NAKAMURA  

     
    PAPER-Spoken Language Systems

      Vol:
    E88-D No:3
      Page(s):
    446-454

    An efficient, scalable speech recognition architecture combining topic detection and topic-dependent language modeling is proposed for multi-domain spoken language systems. In the proposed approach, the inferred topic is automatically detected from the user's utterance, and speech recognition is then performed by applying an appropriate topic-dependent language model. This approach enables users to freely switch between domains while maintaining high recognition accuracy. As topic detection is performed on a single utterance, detection errors may occur and propagate through the system. To improve robustness, a hierarchical back-off mechanism is introduced where detailed topic models are applied when topic detection is confident and wider models that cover multiple topics are applied in cases of uncertainty. The performance of the proposed architecture is evaluated when combined with two topic detection methods: unigram likelihood and SVMs (Support Vector Machines). On the ATR Basic Travel Expression Corpus, both methods provide a significant reduction in WER (9.7% and 10.3%, respectively) compared to a single language model system. Furthermore, recognition accuracy is comparable to performing decoding with all topic-dependent models in parallel, while the required computational cost is much reduced.
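
    The hierarchical back-off idea in the abstract above can be illustrated with a short sketch. The confidence threshold, the pooling of leaf scores to score a wider model, and the names `covers` and `select_language_model` are assumptions made for illustration; the abstract does not specify how confidence is measured or combined.

```python
def covers(ancestor, topic, parent_of):
    """Return True if `ancestor` is `topic` itself or lies above it in the tree."""
    while topic is not None:
        if topic == ancestor:
            return True
        topic = parent_of.get(topic)
    return False

def select_language_model(scores, parent_of, threshold=0.6, root="general"):
    """Pick the most specific topic model whose confidence passes the threshold.

    scores    : {leaf_topic: detection score}, assumed to sum to 1
    parent_of : {topic: parent_topic} describing the topic hierarchy
    """
    topic = max(scores, key=scores.get)
    confidence = scores[topic]
    while confidence < threshold and topic != root:
        topic = parent_of[topic]
        # A wider model covers every leaf beneath it, so pool their scores.
        confidence = sum(s for leaf, s in scores.items()
                         if covers(topic, leaf, parent_of))
    return topic

# Usage: detection is split between two accommodation topics, so back off.
hierarchy = {"hotel": "accommodation", "hostel": "accommodation",
             "flight": "transport", "accommodation": "general",
             "transport": "general"}
chosen = select_language_model(
    {"hotel": 0.45, "hostel": 0.35, "flight": 0.2}, hierarchy)
```

    When the detector is confident about a single leaf topic, the loop body never runs and the detailed model is used directly.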

  • A Kernel-Based Fisher Discriminant Analysis for Face Detection

    Takio KURITA  Toshiharu TAGUCHI  

     
    PAPER-Pattern Recognition

      Vol:
    E88-D No:3
      Page(s):
    628-635

    This paper presents a modification of kernel-based Fisher discriminant analysis (FDA) to design a one-class classifier for face detection. In face detection, it is reasonable to assume that "face" images cluster in a certain way, but "non face" images usually do not cluster since different kinds of images are included. It is difficult to model "non face" images as a single distribution in the discriminant space constructed by the usual two-class FDA. Also, the dimension of the discriminant space constructed by the usual two-class FDA is bounded by 1. This means that we cannot obtain a higher dimensional discriminant space. To overcome these drawbacks of the usual two-class FDA, the discriminant criterion of FDA is modified such that the trace of the covariance matrix of the "face" class is minimized and the sum of squared errors between the average vector of the "face" class and the feature vectors of "non face" images is maximized. By this modification, a higher dimensional discriminant space can be obtained. Experiments are conducted on "face" and "non face" classification using face images gathered from the available face databases and many face images on the Web. The results show that the proposed method can outperform the support vector machine (SVM). A close relationship between the proposed kernel-based FDA and kernel-based Principal Component Analysis (PCA) is also discussed.

  • MOS-Bounded Diodes for On-Chip ESD Protection in Deep Submicron CMOS Process

    Ming-Dou KER  Kun-Hsien LIN  Che-Hao CHUANG  

     
    PAPER-Semiconductor Materials and Devices

      Vol:
    E88-C No:3
      Page(s):
    429-436

    New diode structures without the field-oxide boundary across the p/n junction for ESD protection are proposed. An NMOS (PMOS) is specially inserted into the diode structure to form the NMOS-bounded (PMOS-bounded) diode, which is used to block the field oxide isolation across the p/n junction in the diode structure. The proposed N(P)MOS-bounded diodes can provide more efficient ESD protection to the internal circuits than the other diode structures. The N(P)MOS-bounded diodes can be used in the I/O ESD protection circuits, power-rail ESD clamp circuits, and the ESD conduction cells between the separated power lines. In the experimental results, the human-body-model ESD level of an ESD protection circuit with the proposed N(P)MOS-bounded diodes is greater than 8 kV in a 0.35-µm CMOS process.

  • A Performance Prediction of Clock Generation PLLs: A Ring Oscillator Based PLL and an LC Oscillator Based PLL

    Takahito MIYAZAKI  Masanori HASHIMOTO  Hidetoshi ONODERA  

     
    PAPER-Integrated Electronics

      Vol:
    E88-C No:3
      Page(s):
    437-444

    This paper discusses performance prediction of clock generation PLLs using a ring oscillator based VCO (RingVCO) and an LC oscillator based VCO (LCVCO). For clock generation, we generally design PLLs using RingVCOs because of their superiority in tunable frequency range, chip area and power consumption, in spite of their poor noise characteristics. In the future, it is predicted that operating frequency will rapidly increase and supply voltage will dramatically decrease. In addition, strict noise performance will be required. Under these conditions, it is clear neither how the performance of each PLL will change nor how the performance difference between the two PLLs will change. This paper predicts and compares future performances of PLLs using a RingVCO and an LCVCO with a qualitative evaluation by an analytical approach and with design experiments based on predicted process parameters. Our discussion reveals that the relative performance difference between the two PLLs will be unchanged. As technology advances, the power dissipation and chip area of both PLLs favorably decrease, while the noise characteristics of both PLLs degrade, which indicates that low-noise PLL circuit design will become more important.

  • Tracking of Speaker Direction by Integrated Use of Microphone Pairs in Equilateral-Triangle

    Yusuke HIOKA  Nozomu HAMADA  

     
    PAPER

      Vol:
    E88-A No:3
      Page(s):
    633-641

    In this report, we propose a tracking algorithm of speaker direction using microphones located at vertices of an equilateral triangle. The method realizes tracking by minimizing a performance index that consists of the cross spectra at three different microphone pairs in the triangular array. We adopt the steepest descent method to minimize it, and for guaranteeing global convergence to the correct direction with high accuracy, we alter the performance index during the adaptation depending on the convergence state. Through some computer simulation and experiments in a real acoustic environment, we show the effectiveness of the proposed method.

  • Azim: Direction-Based Service System for Both Indoors and Outdoors

    Yohei IWASAKI  Nobuo KAWAGUCHI  Yasuyoshi INAGAKI  

     
    PAPER-Application

      Vol:
    E88-B No:3
      Page(s):
    1034-1044

    In this paper, we propose an advanced location-based service that we call a direction-based service, which utilizes both the position and direction of a user. The direction-based service enables a user to point to an object of interest for command or investigation. We also describe the design, implementation and evaluation of a direction-based service system named Azim. With this system, the direction of the user can be obtained by a magnetic-based direction sensor. The sensor is also used for azimuth-based position estimation, in which a user's position is estimated by having the user point to and measure azimuths of several markers or objects whose positions are already known. Because this approach does not require any other accurate position sensors or positive beacons, it can be deployed cost-effectively. Also, because the measurements are naturally associated with some degree of error, the position is calculated as a probability distribution. The calculation considers the error of direction measurement and pre-obtained field information such as obstacles and magnetic field disturbance, which enables robust position measurements even in geomagnetically disturbed environments. For wide-area use, the system also utilizes a wireless LAN to obtain rough position information by identifying base stations. We have implemented a prototype system for the proposed method and some applications for the direction-based services. Furthermore, we have conducted experiments both indoors and outdoors, and demonstrated that the positioning accuracy of the proposed method is sufficient for a direction-based service.

  • Design Optimization of Active Shield Circuits for Digital Noise Suppression Based on Average Noise Evaluation

    Retdian A. NICODIMUS  Hiroto SUZUKI  Kazuyuki WADA  Shigetaka TAKAGI  

     
    PAPER

      Vol:
    E88-A No:2
      Page(s):
    444-450

    A design optimization of active shield circuits using a noise averaging method is proposed. The relation between the averaged noise and the design parameters of the active shield circuit, such as circuit gain and on-chip layout, is examined. A simple design guideline is also provided. Simulation results show that the active shield circuit designed by the proposed optimization method improves noise suppression performance by about 28% over the conventional one.

  • A Noise Reduction Method for Non-stationary Noise Based on Noise Reconstruction System with ALE

    Naoto SASAOKA  Yoshio ITOH  Kensaku FUJII  

     
    LETTER-Digital Signal Processing

      Vol:
    E88-A No:2
      Page(s):
    593-596

    A noise reduction technique to reduce background noise in noisy speech is proposed. We previously proposed a noise reduction method that uses a noise reconstruction system. However, since a residual speech signal is included in the input signal of the noise reconstruction filter (NRF) used for reconstructing the background noise, a long-time average of the error signal for estimating the background noise is needed so as not to estimate the speech signal. Therefore, the ability to track non-stationary noise is decreased. In order to solve this problem, we propose a noise reconstruction system with an adaptive line enhancer (ALE). Since the ALE works to obtain a signal dominated by noise components, the input signal of the NRF includes only a few speech components. Therefore, we can give the NRF a high tracking ability.

  • Extracting Translation Equivalents from Bilingual Comparable Corpora

    Hiroyuki KAJI  

     
    PAPER-Natural Language Processing

      Vol:
    E88-D No:2
      Page(s):
    313-323

    An improved method for extracting translation equivalents from bilingual comparable corpora according to contextual similarity was developed. This method has two main features. First, a seed bilingual lexicon--which is used to bridge contexts in different languages--is adapted to the corpora from which translation equivalents are to be extracted. Second, the contextual similarity is evaluated by using a combination of similarity measures defined in opposite directions. An experiment using Wall Street Journal and Nihon Keizai Shimbun corpora, together with the EDR bilingual dictionary, demonstrated the effectiveness of the method; it produced lists of candidate translation equivalents with an accuracy of around 30% for frequently occurring unknown words. The method thus proved to be useful for improving the coverage of a bilingual lexicon.

  • Prospects and Problems in Fabrication of MgB2 Josephson Junctions

    Kenji UEDA  Michio NAITO  

     
    INVITED REVIEW PAPER

      Vol:
    E88-C No:2
      Page(s):
    226-231

    We briefly survey recent developments in the thin film synthesis and junction fabrication of MgB2 toward superconducting electronics. The most serious problem in the thin film synthesis of MgB2 is the high vapor pressure required for phase stability. This problem makes in-situ film growth difficult. However, there has been substantial progress in thin film technology for MgB2 in the past three years. The low-temperature thin-film process in a UHV chamber can produce high-quality MgB2 films with Tc 35 K. Furthermore, technology to produce single-crystal epitaxial MgB2 films has recently been developed by using hybrid physical-chemical vapor deposition. With regard to Josephson junctions, various types of junctions have been fabricated, all of which indicate that MgB2 has potential for superconducting devices that operate at 20-30 K, the temperature reached by current commercial cryocoolers.
