The MOS switch with the bootstrapped technique is widely used in low-voltage switched-capacitor circuits. However, a switched-capacitor circuit with the bootstrapped technique can be a dangerous design approach in nano-scale CMOS processes because of gate-oxide transient overstress. The impact of gate-oxide transient overstress on the MOS switch in a switched-capacitor circuit is investigated in this work using a sample-and-hold amplifier (SHA) in a 130-nm CMOS process. After overstress on the MOS switch of the SHA with a unity-gain buffer, the circuit performance in the time domain and frequency domain is measured to verify the impact of gate-oxide reliability on circuit performance. Oxide breakdown in the switch device degrades the circuit performance of the bootstrapped switch technique.
Tobias CINCAREK Hiromichi KAWANAMI Ryuichi NISIMURA Akinobu LEE Hiroshi SARUWATARI Kiyohiro SHIKANO
In this paper, the development, long-term operation and portability of a practical ASR application in a real environment are investigated. The target application is a speech-oriented guidance system installed at the local community center. The system has been exposed to ordinary people since November 2002. More than 300 hours, or more than 700,000 inputs, have been collected over four years. The outcome is a rare example of a large-scale real-environment speech database. A simulation experiment is carried out with this database to investigate how the system's performance improves during the first two years of operation. The purpose is to determine empirically the amount of real-environment data that has to be prepared to build a system with reasonable speech recognition performance and response accuracy. Furthermore, the relative importance of developing the main system components, i.e. the speech recognizer and the response generation module, is assessed. Although the results depend on the system's modeling capacities and domain complexity, they show that overall performance stagnates after employing about 10-15 k utterances for training the acoustic model, 40-50 k utterances for training the language model and 40-50 k utterances for compiling the question and answer (Q&A) database. The Q&A database was most important for improving the system's response accuracy. Finally, the portability of the well-trained first system prototype to a different environment, a local subway station, is investigated. Since collection and preparation of large amounts of real data is impractical in general, only one month of data from the new environment is employed for system adaptation. While the speech recognition component of the first prototype has a high degree of portability, the response accuracy is lower than in the first environment. The main reason is a domain difference between the two systems, since they are installed in different environments.
This implies that it is imperative to take the behavior of users under real conditions into account to build a system with high user satisfaction.
A method for detecting interconnect open faults in CMOS combinational circuits by applying a ramp voltage to the power supply terminal is proposed. By applying a ramp voltage, the method automatically assigns a known logic value to a fault location; as a result, it requires only one test vector to detect a fault as a delay fault or as an erroneous logic value at the primary outputs. In this paper, we show the fault detectability and effectiveness of the proposed method through simulation-based and theoretical analysis. We also show that the method is applicable to every fault location in a circuit and to open faults with any value. Finally, we show ATPG results that are suitable for the proposed method.
The depth-first sphere decoder (SD) and the K-best algorithm have been widely studied as near optimum detectors. Depth-first SD has a non-deterministic computational throughput and K-best requires a sorting unit whose complexity is significant when a large K is used together with high modulation constellations. In this letter, we propose a MIMO detector that employs the trellis structure instead of the conventional tree searching. This detector can keep the computational throughput constant and reduce the complexity because the sorting is not required. From the simulation and complexity analysis, we investigate the advantage and drawback of the proposed detector.
Toshiaki KAMADA Nobuaki MINEMATSU Takashi OSANAI Hisanori MAKINAE Masumi TANIMOTO
In forensic voice telephony speaker verification, we may be requested to identify a speaker in a very noisy environment, unlike the conditions in general research. In a noisy environment, we first process the speech by clarifying it. However, the previous study of speaker verification from clarified speech did not yield satisfactory results. In this study, we experimented on speaker verification with clarification of speech in a noisy environment, and we examined the relationship between improved acoustic quality and speaker verification results. Moreover, experiments with realistic noise such as a crime prevention alarm and power supply noise were conducted, and speaker verification accuracy in a realistic environment was examined. We confirmed the validity of speaker verification with clarification of speech in a realistic noisy environment.
Jin-Song ZHANG Satoshi NAKAMURA
An efficient way to develop large-scale speech corpora is to collect phonetically rich ones that have high coverage of phonetic contextual units. The sentence set, usually called the minimum set, should have a small text size in order to reduce the collection cost. It can be selected by a greedy search algorithm from a large mother text corpus. With the inclusion of more and more phonetic contextual effects, the number of different phonetic contextual units increases dramatically, making the search a non-trivial issue. In order to improve the search efficiency, we previously proposed a so-called least-to-most-ordered greedy search based on the conventional algorithms. This paper evaluates these algorithms in order to show their different characteristics. The experimental results show that the least-to-most-ordered methods successfully achieve smaller objective sets in significantly less computation time than the conventional ones. This algorithm has already been applied to the development of a number of speech corpora, including a large-scale phonetically rich Chinese speech corpus, ATRPTH, which played an important role in developing our multi-language translation system.
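The conventional greedy selection mentioned above can be sketched as a greedy set cover. This is a minimal illustration only, not the least-to-most-ordered variant from the paper; the function names, the toy corpus, and the use of character bigrams as stand-ins for phonetic contextual units are all assumptions made for the example.

```python
def greedy_select(sentences, units_of):
    """Greedy set cover: repeatedly pick the sentence that covers
    the most units that are still uncovered."""
    # Every unit that appears anywhere in the mother corpus.
    remaining = set().union(*(units_of(s) for s in sentences))
    chosen = []
    while remaining:
        # Sentence with the largest number of still-uncovered units.
        best = max(sentences, key=lambda s: len(units_of(s) & remaining))
        gain = units_of(best) & remaining
        if not gain:  # safety stop; cannot happen with the set built above
            break
        chosen.append(best)
        remaining -= gain
    return chosen


def bigrams(sentence):
    """Hypothetical stand-in for phonetic contextual units."""
    return {sentence[i:i + 2] for i in range(len(sentence) - 1)}


# Toy "mother corpus" (assumption, for illustration only).
corpus = ["abcd", "ab", "cd", "bcde", "de"]
selected = greedy_select(corpus, bigrams)
print(selected)
```

Each pass rescans the whole corpus, which is why the search becomes costly as the number of unit types grows.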
Amin SAEEDFAR Hiroyasu SATO Kunio SAWAYA
This paper presents different approaches for the analysis of a thin-wire antenna in the presence of a de-ionized water box at different temperatures, treated as a high-permittivity three-dimensional dielectric body. Continuing the authors' previous work, the coupled tensor-volume/line integral equations are first solved by using a Galerkin-based moment method (MoM) consisting of a combination of entire-domain and sub-domain basis functions, including three-dimensional polynomials of different degrees. Then, the accuracy of this MoM, specifically for a high-permittivity dielectric scatterer, is substantiated by comparing its numerical results with those of the FDTD method and some experimental data.
Much research has shown that a carefully designed auto rate medium access control can utilize the underlying physical multi-rate capability to exploit the time-variation of the channel. In this paper, we develop a simple analytical model to elucidate the rule that maximizes the throughput of RTS/CTS based multi-rate wireless local area networks. Based on the discovered rule, we propose two distributed fair auto rate medium access control schemes called FARM and FARM+ from the viewpoint of throughput fairness and time-share fairness, respectively. With the proposed schemes, after receiving an RTS frame, the receiver selectively returns the CTS frame to inform the transmitter of the maximum feasible rate probed by the signal-to-noise ratio (SNR) of the received RTS frame. The key feature of the proposed schemes is that they are capable of maintaining throughput/time-share fairness in asymmetric situations where the distribution of SNR varies with stations. Extensive simulation results show that the proposed schemes outperform the existing throughput/time-share fair auto rate schemes in time-varying channel conditions.
Shoei SATO Akio KOBAYASHI Kazuo ONOE Shinichi HOMMA Toru IMAI Tohru TAKAGI Tetsunori KOBAYASHI
We present a novel method of integrating the likelihoods of multiple feature streams, representing different acoustic aspects, for robust speech recognition. The integration algorithm dynamically calculates a frame-wise stream weight so that a higher weight is given to a stream that is robust to a variety of noisy environments or speaking styles. Such a robust stream is expected to show discriminative ability. A conventional method proposed for the recognition of spoken digits calculates the weights from the entropy of the whole set of HMM states. This paper extends the dynamic weighting to a real-time large-vocabulary continuous speech recognition (LVCSR) system. The proposed weight is calculated in real-time from mutual information between an input stream and active HMM states in a search space without an additional likelihood calculation. Furthermore, the mutual information takes the width of the search space into account by calculating the marginal entropy from the number of active states. In this paper, we integrate three features that are extracted through auditory filters by taking into account the human auditory system's ability to extract amplitude and frequency modulations. Due to this, features representing energy, amplitude drift, and resonant frequency drifts, are integrated. These features are expected to provide complementary clues for speech recognition. Speech recognition experiments on field reports and spontaneous commentary from Japanese broadcast news showed that the proposed method reduced error words by 9.2% in field reports and 4.7% in spontaneous commentaries relative to the best result obtained from a single stream.
Futoshi FURUTA Kazuo SAITOH Akira YOSHIDA Hideo SUZUKI
We have designed a superconductor-semiconductor hybrid analog-to-digital (A/D) converter and experimentally evaluated its performance at sampling frequencies up to 18.6 GHz. The A/D converter consists of a superconductor front-end circuit and a semiconductor back-end circuit. The front-end circuit includes a sigma-delta modulator and an interface circuit, which is for transmitting data signal to the semiconductor back-end circuit. The semiconductor back-end circuit performs decimation filtering. The design of the modulator was modified to reduce effects of integrator leak and thermal noise on signal-to-noise ratio (SNR). Using the improved modulator design, we achieved a bit-accuracy close to the ideal value. The hybrid architecture enabled us to reduce the integration scale of the front-end circuit to fewer than 500 junctions. This simplicity makes feasible a circuit based on a high TC superconductor as well as on a low TC superconductor. The experimental results show that the hybrid A/D converter operated perfectly and that SNR was 84.8 dB (bit accuracy~13.8 bit) at a band width of 9.1 MHz. This converter has the highest performance of all sigma-delta A/D converters.
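The relation between the reported SNR and bit accuracy can be checked with the standard formula for an ideal quantizer driven by a full-scale sine wave, ENOB = (SNR - 1.76 dB) / 6.02 dB. The abstract does not state which formula the authors used, so the sketch below is an assumption; the function name is hypothetical.

```python
def effective_bits(snr_db):
    """Effective number of bits from SNR in dB, assuming the
    ideal-quantizer relation SNR = 6.02 * N + 1.76 dB."""
    return (snr_db - 1.76) / 6.02


# SNR reported in the abstract at a bandwidth of 9.1 MHz.
enob = effective_bits(84.8)
print(round(enob, 2))
```

Under this assumption, the result agrees with the bit accuracy of about 13.8 bits quoted in the abstract.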
Masakiyo FUJIMOTO Kentaro ISHIZUKA
This paper addresses the problem of voice activity detection (VAD) in noisy environments. The VAD method proposed in this paper is based on a statistical model approach, and estimates statistical models sequentially without a priori knowledge of noise. Namely, the proposed method constructs a clean speech / silence state transition model beforehand, and sequentially adapts the model to the noisy environment by using a switching Kalman filter when a signal is observed. In this paper, we carried out two evaluations. In the first, we observed that the proposed method significantly outperforms conventional methods as regards voice activity detection accuracy in simulated noise environments. Second, we evaluated the proposed method on a VAD evaluation framework, CENSREC-1-C. The evaluation results revealed that the proposed method significantly outperforms the baseline results of CENSREC-1-C as regards VAD accuracy in real environments. In addition, we confirmed that the proposed method helps to improve the accuracy of concatenated speech recognition in real environments.
This letter presents a race-free mixed serial-parallel comparison (RFMSPC) scheme which uses both serial and parallel CAMs in a match line. A self-reset search line scheme for the serial CAM is proposed to avoid the timing race problem and additional timing penalties. Various 32-entry CAMs are designed using a 90 nm 1.2 V CMOS process to verify the proposed RFMSPC scheme. The results show that the RFMSPC reduces power consumption by 40%, 53% and 63% at the cost of a 4%, 6% and 16% increase in search time for 1, 2, and 4 serial CAM bits in a match line, respectively.
In this comment we point out that the mapping from carry-propagation adders to carry-save adders in the context of shift-and-add multiplication is inconsistent. Based on this it is shown that the implementation in Ref.[1] does not achieve any complexity reduction in practice.
Yukihiro BANDOH Kazuya HAYASE Seishi TAKAMURA Kazuto KAMIKURA Yoshiyuki YASHIMA
Realistic representations using extremely high quality images are becoming increasingly popular. For example, digital cinemas can now display moving pictures composed of high-resolution digital images. Although these applications focus on increasing the spatial resolution only, higher frame-rates are being considered to achieve more realistic representations. Since increasing the frame-rate increases the total amount of information, efficient coding methods are required. However, the statistical properties of high frame-rate video have not been clarified. This paper establishes a mathematical model of the relationship between frame-rate and bit-rate for high frame-rate video. A coding experiment confirms the validity of the mathematical model.
Areeyata SRIPETCH Poompat SAENGUDOMLERT
In a power grid used to distribute electricity, optical fibers can be inserted inside overhead ground wires to form an optical network infrastructure for data communications. Dense wavelength division multiplexing (DWDM)-based optical networks present a promising approach to achieve a scalable backbone network for power grids. This paper proposes a complete optimization procedure for optical network designs based on an existing power grid. We design a network as a subgraph of the power grid and divide the network topology into two layers: backbone and access networks. The design procedure includes physical topology design, routing and wavelength assignment (RWA) and optical amplifier placement. We formulate the problem of topology design in two steps: selecting the concentrator nodes and their node members, and finding the connections among concentrators subject to the two-connectivity constraint on the backbone topology. Selection and connection of concentrators are done using integer linear programming (ILP). We solve the RWA and optical amplifier placement problems together since they are closely related. Since the ILP for solving these two problems becomes intractable with increasing network size, we propose a simulated annealing approach. We choose a neighborhood structure based on path-switching operations using k shortest paths for each source and destination pair. The optimal number of optical amplifiers is found by local search among these neighbors. We present numerical results for several randomly generated power grid topologies.
Jakyong JUN Sangwon KANG Thomas R. FISCHER
In this paper, a block-constrained trellis coded quantization (BC-TCQ) algorithm is combined with an algebraic codebook to produce an algebraic trellis code (ATC) to be used in ACELP coding. In ATC, the set of allowed algebraic codebook pulse positions is expanded, and the expanded set is partitioned into subsets of pulse positions; the trellis branches are labeled with these subsets. The list Viterbi algorithm (LVA) is used to select the excitation codevector. The combination of an ATC codebook and LVA trellis search algorithm is denoted as an ATC-LVA block code. The ATC-LVA block code is used as the fixed codebook of the AMR-WB 8.85 kbps mode, reducing complexity compared to the conventional algebraic codebook.
Image segmentation is an essential processing step for many image analysis applications. In this paper, a novel image segmentation algorithm using fuzzy C-means clustering (FCM) with spatial constraints based on Markov random field (MRF) via Bayesian theory is proposed. Due to disregard of spatial constraint information, the FCM algorithm fails to segment images corrupted by noise. In order to improve the robustness of FCM to noise, a powerful model for the membership functions that incorporates local correlation is given by MRF defined through a Gibbs function. Then spatial information is incorporated into the FCM by Bayesian theory. Therefore, the proposed algorithm has both the advantages of the FCM and MRF, and is robust to noise. Experimental results on the synthetic and real-world images are given to demonstrate the robustness and validity of the proposed algorithm.
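For reference, the membership update of plain FCM, without the MRF spatial constraint that the paper adds, can be sketched as follows. The function name, the one-dimensional data and the fuzzifier value m = 2 are assumptions chosen for illustration; the paper's own membership model is not reproduced here.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of a scalar sample x in each cluster:
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d_i = |x - c_i|."""
    dists = [abs(x - c) for c in centers]
    # A sample sitting exactly on a center belongs fully to that cluster
    # (shared equally if several centers coincide with it).
    if 0.0 in dists:
        hits = dists.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Toy example: one pixel intensity and two cluster centers (assumed values).
u = fcm_memberships(1.0, [0.0, 4.0])
print(u)
```

Because each membership depends only on the sample's own distance to the centers, a noisy pixel is classified without regard to its neighbors, which is the weakness the spatial constraint is meant to address.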
Takafumi OKUYAMA Kenta YASUKAWA Katsunori YAMAOKA
Delay jitter degrades the quality of delay-sensitive live media streaming. We investigate the use of multipath transmission with two paths to reduce delay jitter and, in this paper, propose a nearly equal delay path set configuration (NEED-PC) scheme that further improves the performance of the multipath delay jitter reduction method for delay-sensitive live media streaming. The NEED-PC scheme configures a pair of maximally node-disjoint paths that have nearly equal path delays and satisfy a given delay constraint. The results of our simulation experiments show that path sets configured by the NEED-PC scheme exhibit better delay jitter reduction characteristics than a conventional scheme that chooses the shortest path as the primary path. We evaluate the performance of path sets configured by the NEED-PC scheme and find that the NEED-PC scheme reduces delay jitter when it is applied to a multipath delay jitter reduction method. We also investigate the trade-off between reduced delay jitter and the increased traffic load incurred by applying multipath transmission to more flows. The results show that the NEED-PC scheme is practically effective even if the amount of additional redundant traffic caused by using multipath transmission is taken into account.
Yuta YAMATO Yusuke NAKAMURA Kohei MIYASE Xiaoqing WEN Seiji KAJIHARA
Per-test diagnosis based on the X-fault model is an effective approach for a circuit with physical defects of non-deterministic logic behavior. However, the extensive use of vias and buffers in a deep-submicron circuit and the unpredictable order relation among threshold voltages at the fanout branches of a gate have not been fully addressed by conventional per-test X-fault diagnosis. To take these factors into consideration, this paper proposes an improved per-test X-fault diagnosis method, featuring (1) an extended X-fault model to handle vias and buffers and (2) the use of occurrence probabilities of logic behaviors for a physical defect to handle the unpredictable relation among threshold voltages. Experimental results show the effectiveness of the proposed method.
Kentaro ISODA Takuya SAKAMOTO Toru SATO
Orbit estimation of space debris, objects of no inherent value orbiting the earth, is important for avoiding collisions with spacecraft. The Kamisaibara Spaceguard Center radar system was built in 2004 as the first radar facility in Japan devoted to the observation of space debris. To detect smaller debris, coherent integration is effective in improving the SNR (signal-to-noise ratio). However, it is difficult to apply coherent integration to real data because the motions of the targets are unknown. An effective algorithm is proposed for echo detection and orbit estimation of the faint echoes from space debris. The characteristics of the evaluation function are utilized by the algorithm. Experiments show that the proposed algorithm improves the SNR by 8.32 dB and enables estimation of orbital parameters accurately enough to allow re-tracking with a single radar.
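Why unknown target motion makes coherent integration difficult can be illustrated with a toy model. The sketch below assumes noise-free, unit-amplitude echoes whose phase advances by a constant step per pulse; the names and values are hypothetical and this is not the algorithm proposed in the paper.

```python
import cmath
import math


def integrate(echoes, phase_step):
    """Coherently sum echoes after removing an assumed per-pulse phase step."""
    return sum(e * cmath.exp(-1j * phase_step * k) for k, e in enumerate(echoes))


n = 16           # number of pulses (assumed)
true_step = 0.7  # true phase advance per pulse in radians (assumed)
echoes = [cmath.exp(1j * true_step * k) for k in range(n)]

matched = abs(integrate(echoes, true_step))  # phases aligned: amplitude n
mismatched = abs(integrate(echoes, 0.0))     # motion ignored: echoes partly cancel

# Ideal SNR gain of coherent integration with uncorrelated noise.
ideal_gain_db = 10 * math.log10(n)
print(matched, mismatched, ideal_gain_db)
```

With the correct phase step the echoes add in phase, while an incorrect step makes them largely cancel, so the achievable gain depends on how well the target motion is estimated.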