Yasuyuki SUZUKI Zin YAMAZAKI Masayuki MAMADA
A monolithic modulator driver IC based on InP HBTs with a new circuit topology -- called a functional distributed circuit (FDC) -- for over-80-Gb/s optical transmission systems has been developed. The FDC topology includes a wide-band amplifier designed using a distributed circuit, a digital function designed using a lumped circuit, and broadband impedance matching between the lumped circuit and distributed circuit to enable both wider bandwidth and digital functions. The driver IC integrated with a 2:1 multiplexing function produces 2.6-Vp-p (differential output: 5.2 Vp-p) and 2.4-Vp-p (differential output: 4.8 Vp-p) output-voltage swings with less than 450-fs and 530-fs rms jitter at 80 Gb/s and 90 Gb/s, respectively. To the best of our knowledge, this is equivalent to the highest data rate operation yet reported for monolithic modulator drivers. When it was mounted in a module, the driver IC successfully achieved electro-optical modulation using a dual-drive LiNbO3 Mach-Zehnder modulator up to 90 Gb/s. These results indicate that the FDC has the potential to realize high-speed and functional ICs for over-80-Gb/s transmission systems.
Junghyeun HWANG Hisakazu KIKUCHI Shogo MURAMATSU Jaeho SHIN
The error diffusion filter in this paper is optimized with respect to the ideal blue noise pattern corresponding to a single tone level. The filter coefficients are optimized by the minimization of the squared error norm between the Fourier power spectra of the resulting halftone and the blue noise pattern. During the process of optimization, the binary pattern power spectrum matching algorithm is applied with the aid of a new blue noise model. The number of the optimum filters is equal to that of different tones. The visual fidelity of the bilevel halftones generated by the error diffusion filters is evaluated in terms of a weighted signal-to-noise ratio, Fourier power spectra, and others. Experimental results have demonstrated that the proposed filter set generates satisfactory bilevel halftones of grayscale images.
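As a point of reference for the halftoning process described above, the following is a minimal sketch of generic error diffusion with a pluggable filter. The function name `error_diffuse` and the kernel layout are assumptions made for illustration, and the Floyd-Steinberg weights are only a well-known placeholder, not the tone-dependent optimized filters proposed in the paper.

```python
def error_diffuse(image, kernel, threshold=0.5):
    """Binarize a grayscale image (rows of floats in [0, 1]) by error diffusion.

    `kernel` maps (dy, dx) offsets of not-yet-visited neighbors to weights.
    Both the name and this layout are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    work = [list(row) for row in image]          # copy so the input is untouched
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 1 if old >= threshold else 0   # quantize to a bilevel output
            out[y][x] = new
            err = old - new                      # quantization error at this pixel
            for (dy, dx), weight in kernel.items():
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    work[ny][nx] += err * weight # push the error to the neighbors
    return out


# Floyd-Steinberg weights: a placeholder filter, NOT the paper's optimized one.
FLOYD_STEINBERG = {(0, 1): 7 / 16, (1, -1): 3 / 16, (1, 0): 5 / 16, (1, 1): 1 / 16}

flat = [[0.25] * 16 for _ in range(16)]          # a single-tone test patch
halftone = error_diffuse(flat, FLOYD_STEINBERG)
density = sum(map(sum, halftone)) / (16 * 16)    # fraction of "on" pixels
```

In a tone-dependent scheme such as the one the abstract describes, a different `kernel` would presumably be selected per input tone level; that selection is not shown here.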
Wei CHEN Gang LIU Jun GUO Shinichiro OMACHI Masako OMACHI Yujing GUO
In speech recognition, confidence annotation adopts a single confidence feature or a combination of different features for classification. These confidence features are always extracted from decoding information. However, it has been shown that about 30% of the knowledge used in human speech understanding is derived from high-level information. Thus, how to extract a high-level confidence feature that is statistically independent of decoding information is worth researching in speech recognition. In this paper, a novel confidence feature extraction algorithm based on latent topic similarity is proposed. The topic distribution of each word and the topic distribution of its context in one recognition result are first obtained using the latent Dirichlet allocation (LDA) topic model, and then the proposed word confidence feature is extracted by determining the similarity between these two topic distributions. The experiments show that the proposed feature adds a new information source to the confidence features with a good complementary effect and, when combined with confidence features from decoding information, effectively improves the performance of confidence annotation.
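The idea of scoring a word by how well its topic distribution agrees with that of its context can be sketched as follows. This is a minimal illustration under stated assumptions: the topic distributions are taken as given (for example from an LDA model trained elsewhere), the context distribution is assumed to be the plain average over the other words in the hypothesis, and cosine similarity is assumed as the similarity measure. The abstract does not specify any of these choices, and the function names are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def context_distribution(word_topics, index):
    """Average topic distribution of every word except the one at `index`.

    Assumed definition of "context"; requires at least two words.
    """
    others = [dist for i, dist in enumerate(word_topics) if i != index]
    n_topics = len(word_topics[0])
    return [sum(dist[t] for dist in others) / len(others) for t in range(n_topics)]


def topic_confidence(word_topics):
    """One similarity score per word: higher means more consistent with context."""
    return [
        cosine_similarity(dist, context_distribution(word_topics, i))
        for i, dist in enumerate(word_topics)
    ]


# Toy hypothesis with two topics; the third word is off-topic relative to the rest.
hypothesis = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
scores = topic_confidence(hypothesis)
```

In this toy input the third word receives the lowest score, which is the behavior one would want from a feature meant to flag words that do not fit their context.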
Sung Soo KIM Chang Woo HAN Nam Soo KIM
In this letter, we present useful features accounting for pronunciation prominence and propose a classification technique for prominence detection. A set of phone-specific features is extracted based on a forced alignment of the test pronunciation provided by a speech recognition system. These features are then applied to traditional classifiers such as the support vector machine (SVM), artificial neural network (ANN), and adaptive boosting (AdaBoost) to detect the place of prominence.
Sanaz SEYEDIN Seyed Mohammad AHADI
This paper presents a novel noise-robust feature extraction method for speech recognition. It is based on making the Minimum Variance Distortionless Response (MVDR) power spectrum estimation method robust against noise. This robustness is obtained by modifying the distortionless constraint of the MVDR spectral estimation method via weighting the sub-band power spectrum values based on the sub-band signal-to-noise ratios. The optimum weighting is obtained by employing the experimental findings of psychoacoustics. According to our experiments, this technique is successful in modifying the power spectrum of speech signals and making it robust against noise. When evaluated on the Aurora 2 recognition task, the proposed method outperformed both the baseline MFCC features and the MVDR-based features in different noisy conditions.
Sanghyun SEO Eunjung CHO Giorgi AROSHVILI Chong JIN Dimitris PAVLIDIS Laurence CONSIDINE
The paper presents a systematic study of in-situ passivated AlN/GaN Metal Insulator Semiconductor Field Effect Transistors (MISFETs) with submicron gates. DC, high frequency small signal, large signal and low frequency dispersion effects are reported. The DC characteristics are analyzed in conjunction with the power performance of the device at high frequencies. Studies of the low frequency characteristics are presented and the results are compared with those of AlGaN/GaN High Electron Mobility Transistors (HEMTs). Small signal measurements showed a current gain cutoff frequency and maximum oscillation frequency of 49.9 GHz and 102.3 GHz respectively. The overall characteristics of the device include a peak current density of 335 mA/mm, peak extrinsic transconductance of 130 mS/mm, a maximum output power density of 533 mW/mm with peak power added efficiency (P.A.E.) of 41.3% and linear gain of 17 dB. The maximum frequency dispersion of transconductance and output resistance of the fabricated MISFETs is 20% and 21% respectively.
Do-Gil LEE Gumwon HONG Seok Kee LEE Hae-Chang RIM
The construction of annotated corpora requires considerable manual effort. This paper presents a pragmatic method to minimize human intervention in the construction of a Korean part-of-speech (POS) tagged corpus. Instead of focusing on improving the performance of conventional automatic POS taggers, we devise a discriminative POS tagger which can selectively produce either a single analysis or multiple analyses based on the tagging reliability. The proposed approach uses two decision rules to judge the tagging reliability. Experimental results show that the proposed approach can effectively control the quality of the corpus and the amount of manual annotation through the threshold value of the rule.
Yasushi YUMINAKA Yasunori TAKAHASHI Kenichi HENMI
This paper presents a Pulse-Width Modulation (PWM) pre-emphasis technique which utilizes time-domain information processing to increase the data rate for a given bandwidth of interconnection. The PWM pre-emphasis method does not change the pulse amplitude as for conventional FIR pre-emphasis, but instead exploits timing resolution. This fits well with recent CMOS technology trends toward higher switching speeds and lower supply voltage. We discuss multiple-valued data transmission based on time-domain pre-emphasis techniques in consideration of higher-order channel effects. Also, a new data-dependent adaptive time-domain pre-emphasis technique is proposed to compensate for the data-dependent jitter.
Chunxiao JIANG Xin MA Canfeng CHEN Jian MA Yong REN
Dynamic spectrum access has become a focal issue recently, in which identifying the available spectrum plays an important role. Much work has been done on secondary users (SUs) synchronously accessing a primary user's (PU's) network. However, on the one hand, an SU may have no knowledge of the PU's communication protocols; on the other, communications among PUs may not be based on a synchronous scheme at all. To address these problems, this paper advances a strategy for SUs to search for available spectrum with asynchronous MAC-layer sensing. With this method, SUs need not know the communication mechanisms in the PU's network when dynamically accessing it. We focus on four aspects: 1) the strategy for searching available channels; 2) the vacating strategy when PUs come back; 3) estimation of channel parameters; 4) the impact of SUs' interference on the PU's data rate. The simulations show that our search strategy not only achieves nearly 50% lower interference probability than equal allocation of the total search time, but also adapts well to time-varying channels. Moreover, access by our strategies attains 150% more access time than random access. The moment matching estimator shows good performance in estimating and tracking time-varying channels.
In this paper, we propose a new method that employs two novel features, correlation density (Cd) and fractal dimension (Fd), to recognize emotional states contained in speech. The former feature obtained by a list of parametric filters reflects the broad frequency components and the fine structure of lower frequency components, contributed by unvoiced phones and voiced phones, respectively; the latter feature indicates the non-linearity and self-similarity of a speech signal. Comparative experiments based on Hidden Markov Model and K Nearest Neighbor methods are carried out. The results show that Cd and Fd are much more closely related with emotional expression than the features commonly used.
Cooperation is an attractive approach to improving the spectrum sensing performance of cognitive systems experiencing deep shadowing and fading. In this letter, an efficient weight-based cooperative spectrum sensing scheme is proposed. Simulation results show that the proposed scheme has better accuracy than "AND," "OR," and "half-voting" combination schemes and has similar spectrum sensing accuracy but with lower computational and communication complexity in comparison to the "optimal data fusion" rule.
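For readers unfamiliar with the baseline combination schemes named above, the following minimal sketch shows hard-decision fusion at a fusion center. The "half-voting" rule is assumed here to mean "at least half of the users report the channel as occupied", and `weighted_rule` is a generic weighted vote included only for contrast. It is a hypothetical illustration and does not reproduce the weighting scheme proposed in the letter, which the abstract does not detail.

```python
def and_rule(decisions):
    """Declare the channel occupied only if every user detects a signal."""
    return all(decisions)


def or_rule(decisions):
    """Declare the channel occupied if any user detects a signal."""
    return any(decisions)


def half_voting_rule(decisions):
    """Declare occupied if at least half of the users detect a signal (assumed definition)."""
    return 2 * sum(1 for d in decisions if d) >= len(decisions)


def weighted_rule(decisions, weights, threshold=0.5):
    """Hypothetical weighted vote: users with larger weights count for more."""
    total = sum(weights)
    score = sum(w for d, w in zip(decisions, weights) if d) / total
    return score >= threshold


# One reliable user (weight 0.8) detects the primary user; two shadowed users do not.
local_decisions = [True, False, False]
reliability = [0.8, 0.1, 0.1]
results = {
    "AND": and_rule(local_decisions),
    "OR": or_rule(local_decisions),
    "half-voting": half_voting_rule(local_decisions),
    "weighted": weighted_rule(local_decisions, reliability),
}
```

The example illustrates why weighting can help under deep shadowing: the unweighted "AND" and "half-voting" rules are outvoted by the two shadowed users, whereas the weighted vote follows the user assumed to be reliable.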
Seong-Jun HAHM Yuichi OHKAWA Masashi ITO Motoyuki SUZUKI Akinori ITO Shozo MAKINO
We propose an improved reference speaker weighting (RSW) and speaker cluster weighting (SCW) approach that uses an aspect model. The concept of the approach is that the adapted model is a linear combination of a few latent reference models obtained from a set of reference speakers. The aspect model has specific latent-space characteristics that differ from the orthogonal basis vectors of eigenvoice. The aspect model is a "mixture-of-mixture" model. We first calculate a small number of latent reference models as mixtures of distributions of the reference speakers' models, and then the latent reference models are mixed to obtain the adapted distribution. The mixture weights are calculated based on the expectation maximization (EM) algorithm. We use the obtained mixture weights for interpolating mean parameters of the distributions. Both training and adaptation are performed based on likelihood maximization with respect to the training and adaptation data, respectively. We conduct a continuous speech recognition experiment using a Korean database (KAIST-TRADE). The results are compared to those of conventional MAP, MLLR, RSW, eigenvoice, and SCW methods. An absolute word accuracy improvement of 2.06 points was achieved using the proposed method, even though we use only 0.3 s of adaptation data.
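The final interpolation step, in which mixture weights combine the mean parameters of a few reference models, can be sketched as a weighted sum. This is only an illustration of the linear combination itself: the weights are taken as given, whereas the paper estimates them with the EM algorithm, and the function name `interpolate_means` is hypothetical.

```python
def interpolate_means(reference_means, weights):
    """Return the weighted sum of reference mean vectors.

    `reference_means` is a list of equal-length mean vectors, one per
    (latent) reference model; `weights` must sum to 1. EM estimation of
    the weights is outside the scope of this sketch.
    """
    if len(reference_means) != len(weights):
        raise ValueError("need exactly one weight per reference model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    dim = len(reference_means[0])
    return [
        sum(w * mean[d] for w, mean in zip(weights, reference_means))
        for d in range(dim)
    ]


# Two toy reference models with 2-dimensional means.
means = [[0.0, 2.0], [4.0, 6.0]]
adapted = interpolate_means(means, [0.25, 0.75])
```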
Makoto SAKAI Norihide KITAOKA Kazuya TAKEDA
Acoustic feature transformation is widely used to reduce dimensionality and improve speech recognition performance. In this letter we focus on dimensionality reduction methods that minimize the average classification error. Unfortunately, minimization of the average classification error may cause considerable overlaps between distributions of some classes. To mitigate risks of considerable overlaps, we propose a dimensionality reduction method that minimizes the maximum classification error. We also propose two interpolated methods that can describe the average and maximum classification errors. Experimental results show that these proposed methods improve speech recognition performance.
Akinori ITO Shun'ichiro ABE Yoiti SUZUKI
In this paper, we propose a novel data hiding technique for G.711-coded speech based on the LSB substitution method. The novel feature of the proposed method is that a low-bitrate encoder, G.726 ADPCM, is used as a reference for deciding how many bits can be embedded in a sample. Experiments showed that the method outperformed the simple LSB substitution method and the selective embedding method proposed by Aoki. We achieved 4-kbit/s embedding with almost no subjective degradation of speech quality, and 10 kbit/s while maintaining good quality.
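For context, the simple LSB substitution baseline mentioned above can be sketched as follows for 8-bit code words such as G.711 samples. This sketch embeds a fixed number of bits in every sample. It does not implement the proposed method, which uses a G.726 ADPCM encoder as a reference to decide how many bits each sample can carry. The function names and the payload layout (one small integer per sample) are assumptions for illustration.

```python
def embed_lsb(codewords, payload, n_bits=1):
    """Replace the n_bits least significant bits of each 8-bit code word.

    `payload` holds one integer in range(2 ** n_bits) per code word used.
    """
    if len(payload) > len(codewords):
        raise ValueError("payload is longer than the cover signal")
    mask = (1 << n_bits) - 1
    stego = list(codewords)
    for i, value in enumerate(payload):
        if not 0 <= value <= mask:
            raise ValueError("payload value does not fit in n_bits")
        stego[i] = (stego[i] & ~mask & 0xFF) | value   # clear the LSBs, then set them
    return stego


def extract_lsb(codewords, count, n_bits=1):
    """Read back the n_bits least significant bits of the first `count` code words."""
    mask = (1 << n_bits) - 1
    return [c & mask for c in codewords[:count]]


cover = [0x55, 0xAA, 0xFF, 0x00]       # toy 8-bit code words
secret = [1, 0, 1, 1]                  # one bit per sample
stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))
```

With one bit per sample, each code word changes by at most one quantization step. An adaptive scheme would vary `n_bits` from sample to sample.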
Subjects in Electromagnetic Compatibility (EMC) research that have been presented at meetings of the IEICE Technical Committee on Electromagnetic Compatibility (EMCJ) are overviewed and categorized. The temporal changes in the proportions of the categorized subjects among the total number of presentations each year are also shown. Finally, speculative opinions are presented on what EMC subjects will be studied in the near future.
Hae Young LEE Seung-Min PARK Tae Ho CHO
This paper presents an approach to implementing simulation models for SAM fuzzy controllers without the use of external components. The approach represents a fuzzy controller as a composition of simple simulation models which involve only basic operations.
Yoshiharu AKIYAMA Hiroshi YAMANE Nobuo KUWABARA
We investigated the effect of a high-speed power line communication (PLC) signal induced into a very high-speed digital subscriber line (VDSL) system by conductive coupling based on a network model. Four electronic devices with AC mains and telecommunication ports were modeled using a 4-port network, and the parameters of the network were obtained by measuring impedance and transmission loss. We evaluated the decoupling factor from the mains port to the telecommunication port of a VDSL modem using these parameters for the four electric and electronic devices. The results indicate that the mean values of the decoupling factor for the differential and common mode signals were more than 88 and 62 dB, respectively, in the frequency range of a PLC system. Taking into consideration the decoupling factor Ld, the average transmission signal powers of VDSL and PLC, the desired and undesired (DU) ratio, and the transmission loss of a typical 300-m-long indoor telecommunication line, we conclude that the VDSL system cannot be disturbed by the PLC signal induced into the VDSL modem from the AC mains port in normal installation.
Kohei MIYASE Xiaoqing WEN Seiji KAJIHARA Yuta YAMATO Atsushi TAKASHIMA Hiroshi FURUKAWA Kenji NODA Hideaki ITO Kazumi HATAYAMA Takashi AIKYO Kewal K. SALUJA
Capture-safety, defined as the avoidance of timing errors due to unduly high launch switching activity in capture mode during at-speed scan testing, is critical in avoiding test-induced yield loss. Although several sophisticated techniques are available for reducing capture IR-drop, there are few complete capture-safe test generation flows. This paper addresses the problem by proposing a novel and practical capture-safe test generation flow, featuring (1) a complete capture-safe test generation flow; (2) reliable capture-safety checking; and (3) effective capture-safety improvement by combining X-bit identification and X-filling with low launch-switching-activity test generation. The proposed flow minimizes test data inflation and is compatible with existing automatic test pattern generation (ATPG) flows. The techniques proposed in the flow achieve capture-safety without changing the circuit-under-test or the clocking scheme.
The ability to find the speaker's face region in a video is useful for various applications. In this work, we develop a novel technique to find this region within different time windows, which is robust against the changes of view, scale, and background. The main thrust of our technique is to integrate audiovisual correlation analysis into a video segmentation framework. We analyze the audiovisual correlation locally by computing quadratic mutual information between our audiovisual features. The computation of quadratic mutual information is based on the probability density functions estimated by kernel density estimation with adaptive kernel bandwidth. The results of this audiovisual correlation analysis are incorporated into graph cut-based video segmentation to resolve a globally optimum extraction of the speaker's face region. The setting of any heuristic threshold in this segmentation is avoided by learning the correlation distributions of speaker and background by expectation maximization. Experimental results demonstrate that our method can detect the speaker's face region accurately and robustly for different views, scales, and backgrounds.
Tzong-Lin WU Jun FAN Francesco de PAULIS Chuen-De WANG Antonio Ciccomancini SCOGNA Antonio ORLANDI
Noise coupling on power distribution networks (PDNs) or between PDNs and signal traces is becoming one of the main challenges in designing high-speed digital circuits operating above 1 GHz. Developing an efficient and accurate modeling method is essential for understanding the noise coupling mechanism and then solving the problem. In addition, the development of new noise mitigation technology is also important for future high-speed circuit systems. In this invited paper, a novel modeling methodology based on a physics-based equivalent circuit model will be introduced, and an example of multilayer PCB circuits will be modeled and validated with good accuracy. Based on the periodic structure concept, several new electromagnetic bandgap (EBG) structures, such as coplanar EBG, photonic crystal power layer (PCPL), and ground surface perturbation lattice (GSPL), will be introduced for the mitigation of power/ground noise. The trade-offs of all these structures will be discussed.