Masaki AMEMIYA Jun YAMAWAKU Toshio MORIOKA
Conventional WDM systems multiplex channels with different signal bandwidths using fixed and equal channel spacing. As a result, their spectral efficiency is rather poor. If the wavelength and the bandwidth of each channel in a WDM system could be freely changed as needed, a variety of services with different signal bandwidths could be accommodated efficiently. This is expected to yield high spectral efficiency. For this purpose, this paper proposes a WDM optically amplified system that combines optical power splitting with homodyne detection; its use in three configurations, point-to-point, ring (center to remote nodes), and peer-to-peer, is described. Coherent optical systems generally need a frequency-stable local light source in addition to a sending light source in each WDM channel. We improve cost effectiveness by proposing that the output of one light source be divided to yield both the local light for frequency selection by homodyne detection and the sending light, which is externally modulated by the transmission signal. In this configuration, the local light level is kept low to permit high levels of sending power. The key problem is how to obtain a high SNR with low-level local light. This paper derives the optimum receiving loss condition that maximizes the SNR with local light levels as low as -20 dBm for the point-to-point configuration. For the ring configuration, the system overcomes the optical power loss created by splitting numbers over 1,000 even if the local lights are as low as 0 dBm. The ring configuration can, therefore, flexibly accommodate many users and services. We also elucidate the relation between SNR and BER for DPSK homodyne detection in a bandwidth-flexible system.
Bayesian combining of confidence measures is proposed for speech recognition. Bayesian combining is achieved by estimating the joint pdf of the confidence feature vector in the correct and incorrect hypothesis classes. In addition, the adaptation of a confidence score using the pdf is presented. The proposed methods reduced the classification error rate by 18% relative to the conventional single-feature-based confidence scoring method in an isolated-word Out-of-Vocabulary rejection test.
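The idea of combining confidence features through class-conditional joint pdfs can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the joint pdf is modeled, so a single multivariate Gaussian per class is assumed here, and the feature values, the prior, and all function names are hypothetical.

```python
import numpy as np

def fit_gaussian(features):
    """Estimate mean and covariance of a (n_samples, n_features) array."""
    mean = features.mean(axis=0)
    # Small diagonal term keeps the covariance invertible (assumption).
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mean, cov

def log_gaussian_pdf(x, mean, cov):
    """Log density of a multivariate Gaussian evaluated at each row of x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2 * np.pi))

def combined_confidence(x, correct_model, incorrect_model, prior_correct=0.5):
    """Posterior P(correct | x) from the two class-conditional joint pdfs."""
    log_c = log_gaussian_pdf(x, *correct_model) + np.log(prior_correct)
    log_i = log_gaussian_pdf(x, *incorrect_model) + np.log(1.0 - prior_correct)
    return 1.0 / (1.0 + np.exp(log_i - log_c))

# Synthetic two-dimensional confidence features (purely illustrative data).
rng = np.random.default_rng(0)
correct = rng.normal([2.0, 1.5], 0.5, size=(500, 2))
incorrect = rng.normal([0.0, 0.0], 0.5, size=(500, 2))
correct_model = fit_gaussian(correct)
incorrect_model = fit_gaussian(incorrect)

queries = np.array([[2.0, 1.5], [0.0, 0.0]])
scores = combined_confidence(queries, correct_model, incorrect_model)
```

A hypothesis would then be accepted when its combined score exceeds a rejection threshold chosen on held-out data.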
Jianfeng QIANG Hiroshi HARADA Hiromitsu WAKANA Ping ZHANG
Multipath propagation of radio signals introduces frequency selectivity. OFDMA systems suffer greatly from frequency-selective fading, which is an important limiting factor in their performance, especially in subband-based multiple-user access schemes. In this paper, we propose a method of subband selection and handover to improve system performance over the frequency-selective channel. Two subband selection algorithms are presented to accurately select the subband with high channel gain and avoid channel notches. A random access procedure employing subband selection is presented as an example. The effects of subband selection are also given. The selection effectively improves the performance of frame synchronization, frequency synchronization, channel estimation, and bit error rate (BER). The investigations show that the proposed scheme is promising for reliable communication over frequency-selective fading channels.
Zhicheng SUI Qingji ZENG Shilin XIAO
Based on traffic prediction, a new reservation method is proposed to reduce the delay and the void filling ratio at the edge nodes of optical burst switching networks. Simulation studies show that our method achieves a significant improvement and has a dynamic optimum weight value in a certain offset time.
Shinsuke TAKAOKA Fumiyuki ADACHI
In this letter, pilot-assisted adaptive prediction iterative channel estimation in the frequency domain is presented for the antenna diversity reception of orthogonal frequency division multiplexing (OFDM) signals. Frequency-domain adaptive prediction filtering is applied to iterative channel estimation to improve the tracking capability against frequency-domain variations in a severe frequency-selective fading channel. Also, in order to track the changing fading environment, the tap weights of the frequency-domain prediction filter are updated using the simple NLMS algorithm. The updating of tap weights is incorporated into the iterative channel estimation loop to achieve a faster convergence rate. The average bit error rate (BER) performance in a frequency-selective Rayleigh fading channel is evaluated by computer simulation. It is confirmed that the frequency-domain adaptive prediction iterative channel estimation provides better BER performance than the conventional iterative channel estimation schemes.
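The NLMS tap-weight update mentioned above can be illustrated with a generic one-step linear predictor running along the subcarrier index. This is a minimal sketch, not the letter's estimator: the pilot handling, the iterative loop, and antenna diversity are omitted, and the two-path channel, the step size `mu`, and the function name are assumptions made for the example.

```python
import numpy as np

def nlms_predict(sequence, num_taps=2, mu=0.5, eps=1e-8):
    """One-step linear prediction along `sequence` with NLMS tap updates."""
    w = np.zeros(num_taps, dtype=complex)
    errors = []
    for k in range(num_taps, len(sequence)):
        x = sequence[k - num_taps:k][::-1]      # most recent sample first
        y = np.dot(w, x)                        # predicted value at index k
        e = sequence[k] - y                     # a priori prediction error
        # Normalized LMS update: step scaled by the input power.
        w = w + mu * e * np.conj(x) / (eps + np.vdot(x, x).real)
        errors.append(e)
    return w, np.array(errors)

# Hypothetical two-path channel frequency response over 256 subcarriers:
# H[n] = 1 + 0.5 * exp(-j * theta * n)
theta = 2 * np.pi * 8 / 64
subcarriers = np.arange(256)
H = 1.0 + 0.5 * np.exp(-1j * theta * subcarriers)

w, errors = nlms_predict(H)
```

Normalizing by the input power makes the effective step size insensitive to the channel gain level, which is one reason NLMS is a common choice when the fading environment changes.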
This paper proposes a Voice Activity Detection (VAD) algorithm using a Radial Basis Function (RBF) network. The k-means clustering and Least Mean Square (LMS) algorithms are used to adapt the RBF network to the underlying speech condition. The inputs to the RBF network are three parameters of a Code Excited Linear Prediction (CELP) coder, which work stably under various background noise levels. An adaptive hangover threshold is applied in the RBF-VAD to reduce errors, because the threshold value involves a trade-off in the VAD decision. The experimental results show that the proposed VAD algorithm achieves better performance than G.729 Annex B at any noise level.
Masayuki ARAI Satoshi FUKUMOTO Kazuhiko IWASAKI
In this paper, we present a model for evaluating the effectiveness of (2, 1, m) convolutional-code-based packet-level FEC, under the condition of a limited buffer size in which the number of available packets is restricted for recovery. We analytically derive the post-reconstruction receiving rate, i.e., the probability that a lost packet is received or recovered before the buffer limit is reached. We show numerical examples of the analytical results and demonstrate that the buffer size at the same level as m gives sufficient recovery performance.
Ki-Chai KIM Sung Min LIM Min Seok KIM
This letter presents a technique for reducing the electromagnetic fields that penetrate through a narrow slot in a planar conducting screen. When a plane wave is incident on the narrow slot, the aperture electric field is controlled by two parallel wires connected across the slot. The magnitude of the penetrated electromagnetic fields is governed by the electric field distribution on the slot aperture. The results show that the magnitude of the penetrated electromagnetic field can be effectively reduced by installing the two parallel wires on the slot.
Norio YAMAGAKI Hideki TODE Koso MURAKAMI
Recently, various types of traffic have increased on the Internet with the development of broadband networks. However, it is difficult to guarantee QoS for each traffic type in current network environments. Moreover, it has been reported that bandwidth can be allocated to flows unfairly, and this can be an important issue for QoS guarantees. Therefore, we have proposed a flow-based queue management scheme, called Dual Metrics Fair Queueing (DMFQ), to improve the fairness and QoS per flow. DMFQ discards arrival packets by considering not only the arrival rate per flow but also the flow succession time. In addition, we have confirmed the effectiveness of DMFQ through several computer simulations. In this paper, we implement DMFQ with hardware for high-speed operation. Concretely, we propose the design policies and show the hardware design results.
An all-digital CMOS duty cycle correction (DCC) circuit with a fixed rising edge was proposed to achieve the wide correction ranges of input duty cycle and PVT variations, the low standby power and the fast recovery from the standby mode for use in multi-phase clock systems. SPICE simulations showed that this DCC adjusts the output duty cycle to 50±0.7% for the wide range of input duty cycle from 15% to 85% at the input frequency of 1 GHz, within the commercial range of PVT corners. The all-digital implementation and the use of a toggle flip flop at the input stage enabled the wide correction ranges of PVT variations and input duty cycle, respectively.
Vasaka VISOOTTIVISETH Hiroyuki KIDO Katsuyoshi IIDA Youki KADOBAYASHI Suguru YAMAGUCHI
Although IP Multicast offers efficient data delivery for large group communications, the most critical issue delaying widespread deployment of IP Multicast is the scalability of multicast forwarding state as the number of multicast groups increases. Sender-Initiated Multicast (SIM) was proposed as an alternative multicast forwarding scheme for small group communications with incremental deployment capability. The key feature of SIM is its Preset mode with the automatic SIM tunneling function, which maintains forwarding information state only on the branching routers. To demonstrate how SIM increases scalability with respect to the number of groups, in this paper we evaluate the proposed protocol through both simulations and real experiments. From the network operator's point of view, the bandwidth consumption, the memory requirements for state and signaling per session in routers, and the processing overhead are considered as evaluation parameters. Finally, we investigate strategies for incremental deployment.
Arata KAWAMURA Youji IIGUNI Yoshio ITOH
A noise reduction technique that uses linear prediction to remove noise components in speech signals has been proposed previously. The noise reduction works well for additive white noise signals, because the coefficients of the linear predictor converge such that the prediction error becomes white. In this method, the linear predictor is updated by a gradient-based algorithm with a fixed step-size. However, the optimal value of the step-size changes with the values of the prediction coefficients. In this paper, we propose a noise reduction system using the linear predictor with a variable step-size. The optimal value of the step-size also depends on the variance of the white noise; however, the variance is unknown. We therefore introduce a speech/non-speech detector, and estimate the variance in non-speech segments where the observed signal includes only noise components. The simulation results show that the noise reduction capability of the proposed system is better than that of the conventional one with a fixed step-size.
SeongHan SHIN Kazukuni KOBARA Hideki IMAI
Authenticated Key Establishment (AKE) protocols enable two entities, say a client (or a user) and a server, to share common session keys in an authentic way. In this paper, we review the previous AKE protocols, all of which turn out to be insecure, under the following realistic assumptions: (1) High-entropy secrets that should be stored on devices may leak out due to accidents such as bugs or mis-configurations of the system; (2) The size of a human-memorable secret, i.e. a password, is short enough to memorize, but large enough to avoid on-line exhaustive search; (3) TRM (Tamper-Resistant Modules) used to store secrets are not perfectly free from bugs and mis-configurations; (4) A client remembers only one password, even if he/she communicates with several different servers. Then, we propose a simple leakage-resilient AKE protocol (cf.[41]) which is described as follows: the client keeps one password in mind and stores one secret value on devices, both of which are used to establish an authenticated session key with the server. The advantages of leakage-resilient AKEs over the previous AKEs are that the former are secure against active adversaries under the above-mentioned assumptions and have immunity to the leakage of stored secrets from a client and a server (or servers), respectively. In addition, the advantage of the proposed protocol to is the reduction of memory size of the client's secrets. We also extend our protocol to allow updating of the secret values registered in server(s) or the password remembered by a client. Some applications and the formal security proof in the standard model of our protocol are also provided.
Ian R. LANE Tatsuya KAWAHARA Tomoko MATSUI Satoshi NAKAMURA
An efficient, scalable speech recognition architecture combining topic detection and topic-dependent language modeling is proposed for multi-domain spoken language systems. In the proposed approach, the inferred topic is automatically detected from the user's utterance, and speech recognition is then performed by applying an appropriate topic-dependent language model. This approach enables users to freely switch between domains while maintaining high recognition accuracy. As topic detection is performed on a single utterance, detection errors may occur and propagate through the system. To improve robustness, a hierarchical back-off mechanism is introduced where detailed topic models are applied when topic detection is confident and wider models that cover multiple topics are applied in cases of uncertainty. The performance of the proposed architecture is evaluated when combined with two topic detection methods: unigram likelihood and SVMs (Support Vector Machines). On the ATR Basic Travel Expression Corpus, both methods provide a significant reduction in WER (9.7% and 10.3%, respectively) compared to a single language model system. Furthermore, recognition accuracy is comparable to performing decoding with all topic-dependent models in parallel, while the required computational cost is much reduced.
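The hierarchical back-off described above can be sketched as a walk up a topic tree until the detection confidence is high enough. This is a hedged illustration only: the abstract does not specify the hierarchy, the confidence measure, or the threshold, so the topic names, the score-summing rule for wider nodes, and the value 0.6 below are all assumptions.

```python
def covers(node, topic, parents):
    """Return True if `node` is `topic` itself or one of its ancestors."""
    while topic is not None:
        if topic == node:
            return True
        topic = parents[topic]
    return False

def select_language_model(topic_scores, parents, threshold=0.6):
    """Pick the most specific model whose detection confidence is high enough.

    topic_scores: dict mapping leaf topic -> detection confidence in [0, 1]
    parents: dict mapping each node -> its wider parent (root maps to None)
    """
    node = max(topic_scores, key=topic_scores.get)
    confidence = topic_scores[node]
    while confidence < threshold and parents[node] is not None:
        node = parents[node]
        # Assumed rule: a wider node's confidence is the total score
        # of the leaf topics it covers.
        confidence = sum(s for t, s in topic_scores.items()
                         if covers(node, t, parents))
    return node

# Hypothetical two-level topic hierarchy for a travel-domain system.
parents = {
    "airport": "transport", "train": "transport",
    "hotel": "stay", "restaurant": "stay",
    "transport": "general", "stay": "general",
    "general": None,
}

confident = select_language_model(
    {"airport": 0.8, "train": 0.1, "hotel": 0.05, "restaurant": 0.05}, parents)
uncertain = select_language_model(
    {"airport": 0.45, "train": 0.35, "hotel": 0.1, "restaurant": 0.1}, parents)
ambiguous = select_language_model(
    {"airport": 0.3, "train": 0.2, "hotel": 0.3, "restaurant": 0.2}, parents)
```

In this toy setup a confident detection keeps the detailed topic model, while less certain detections fall back to a model covering several topics or, at the root, to a single general model.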
Ming-Dou KER Kun-Hsien LIN Che-Hao CHUANG
New diode structures without the field-oxide boundary across the p/n junction for ESD protection are proposed. An NMOS (PMOS) is specially inserted into the diode structure to form the NMOS-bounded (PMOS-bounded) diode, which is used to block the field oxide isolation across the p/n junction in the diode structure. The proposed N(P)MOS-bounded diodes can provide more efficient ESD protection to the internal circuits than the other diode structures. The N(P)MOS-bounded diodes can be used in the I/O ESD protection circuits, power-rail ESD clamp circuits, and the ESD conduction cells between the separated power lines. From the experimental results, the human-body-model ESD level of an ESD protection circuit with the proposed N(P)MOS-bounded diodes is greater than 8 kV in a 0.35-µm CMOS process.
Sungyun JUNG Jongmok SON Keunsung BAE
In this paper, we propose a new feature extraction method that combines HMT-based denoising and weighted filter bank analysis for robust speech recognition. The proposed method is made up of two stages in cascade. The first stage is a denoising process based on the wavelet-domain Hidden Markov Tree model, and the second is filter bank analysis with weighting coefficients obtained from the residual noise in the first stage. To evaluate the performance of the proposed method, recognition experiments were carried out for additive white Gaussian and pink noise with signal-to-noise ratios from 25 dB to 0 dB. Experimental results demonstrate the superiority of the proposed method to the conventional ones.
Takio KURITA Toshiharu TAGUCHI
This paper presents a modification of kernel-based Fisher discriminant analysis (FDA) to design a one-class classifier for face detection. In face detection, it is reasonable to assume that "face" images cluster in a certain way, but "non face" images usually do not cluster since different kinds of images are included. It is difficult to model "non face" images as a single distribution in the discriminant space constructed by the usual two-class FDA. Also, the dimension of the discriminant space constructed by the usual two-class FDA is bounded by 1. This means that we cannot obtain a higher dimensional discriminant space. To overcome these drawbacks of the usual two-class FDA, the discriminant criterion of FDA is modified such that the trace of the covariance matrix of the "face" class is minimized and the sum of squared errors between the average vector of the "face" class and the feature vectors of "non face" images is maximized. By this modification, a higher dimensional discriminant space can be obtained. Experiments are conducted on "face" and "non face" classification using face images gathered from the available face databases and many face images on the Web. The results show that the proposed method can outperform the support vector machine (SVM). A close relationship between the proposed kernel-based FDA and kernel-based Principal Component Analysis (PCA) is also discussed.
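A linear (non-kernel) version of the modified criterion can be sketched as a generalized eigenvalue problem. This is an assumption-laden illustration rather than the paper's formulation: the criterion is taken to be maximizing the scatter of "non face" samples around the "face" mean subject to unit "face" covariance in the projected space, and the synthetic data, the regularizer `reg`, and the function name are hypothetical.

```python
import numpy as np

def one_class_fda(face, nonface, dim=2, reg=1e-6):
    """Return the face mean and a (n_features, dim) projection matrix."""
    mu = face.mean(axis=0)
    # Within-class term: covariance of the "face" class only.
    Sw = np.cov(face, rowvar=False) + reg * np.eye(face.shape[1])
    # Between term: squared deviations of "non face" samples from the face mean.
    diff = nonface - mu
    Sb = diff.T @ diff
    # Whiten with respect to Sw, then keep the leading eigenvectors of Sb.
    L = np.linalg.cholesky(Sw)
    Linv = np.linalg.inv(L)
    vals, vecs = np.linalg.eigh(Linv @ Sb @ Linv.T)
    order = np.argsort(vals)[::-1][:dim]
    A = Linv.T @ vecs[:, order]
    return mu, A

# Synthetic stand-ins: a tight "face" cluster and widely spread "non face" data.
rng = np.random.default_rng(0)
face = rng.normal(0.0, 0.3, size=(200, 5))
nonface = rng.normal(0.0, 3.0, size=(400, 5))

mu, A = one_class_fda(face, nonface, dim=2)
face_dist = np.mean(np.sum(((face - mu) @ A) ** 2, axis=1))
nonface_dist = np.mean(np.sum(((nonface - mu) @ A) ** 2, axis=1))
```

Because the between term is built from individual "non face" samples rather than a single class mean, its rank is not limited to 1, so `dim` can exceed 1. A sample could then be classified by thresholding its distance from the projected face mean.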
Wen-Hsien FANG Hsien-Sen HUNG Chun-Sem LU Ping-Chi CHU
This paper presents a simple yet effective approach to the design of block adaptive beamformers via the parallel projection method (PPM), which is an extension of the classic projection onto convex sets (POCS) method to inconsistent-set scenarios. The proposed approach begins with the construction of the convex constraint sets in which the weight vector of the adaptive beamformer lies. The convex sets are judiciously chosen to force the weights to possess some desirable properties or to meet some prescribed rules. Based on the minimum variance criterion and a fixed gain in the look direction, two constraint sets, the minimum variance constraint set and the gain constraint set, are considered. For every input block of data, the weights of the proposed beamformer can then be determined by iteratively projecting the weight vector onto these convex sets until it converges. Simulations show that the proposed beamformer provides superior performance compared with previous works in various scenarios, in general with lower computational overhead.
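The parallel projection iteration can be sketched as a relaxed, weighted average of the projections onto each constraint set. The example below is illustrative only: the gain constraint is modeled as the hyperplane a^H w = 1, while the minimum variance constraint set is replaced by a simple norm ball as a stand-in (its exact form is not given in the abstract); the steering vector, radius, weights, and starting point are all assumptions.

```python
import numpy as np

def project_gain(w, a):
    """Project w onto the hyperplane {w : a^H w = 1} (fixed look-direction gain)."""
    return w + a * (1.0 - np.vdot(a, w)) / np.vdot(a, a).real

def project_ball(w, radius):
    """Project w onto the ball {w : ||w|| <= radius} (stand-in constraint set)."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def parallel_projection(w, projections, weights, relaxation=1.0, iterations=1000):
    """Iterate w <- w + relaxation * (sum_i weights[i] * P_i(w) - w)."""
    for _ in range(iterations):
        averaged = sum(wt * proj(w) for wt, proj in zip(weights, projections))
        w = w + relaxation * (averaged - w)
    return w

# Hypothetical 4-element array with a broadside steering vector.
a = np.ones(4, dtype=complex)
w0 = np.array([2.0, -1.0, 0.5j, 1.0])

w = parallel_projection(
    w0,
    projections=[lambda v: project_gain(v, a), lambda v: project_ball(v, 1.0)],
    weights=[0.5, 0.5],
)
```

Unlike sequential POCS, all projections are computed from the same iterate, so they can be evaluated in parallel, and the averaged update remains well defined even when the sets do not intersect.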
Nobuhiko KITAWAKI Kou NAGAI Takeshi YAMADA
Recently, wideband speech communication using 7 kHz-wideband speech coding, as described in ITU-T Recommendations G.722, G.722.1, and G.722.2, has become increasingly necessary for use in advanced IP telephony using PCs, since, for this application, hands-free communication using separate microphones and loudspeakers is indispensable, and in this situation wideband speech is particularly helpful in enhancing the naturalness of communication. An objective quality measurement methodology for wideband-speech coding has been studied, its essential components being an objective quality measure and an input test signal. This paper describes Wideband-PESQ conforming to the draft Annex to ITU-T Recommendation P.862, "Perceptual Evaluation of Speech Quality (PESQ)," as the objective quality measure, by evaluating the consistency between the subjectively evaluated MOS (Mean Opinion Score) and objectively estimated MOS. This paper also describes the verification of artificial voice conforming to Recommendation P.50 "Artificial Voices," as the input test signal for such measurements, by evaluating the consistency between the objectively estimated MOS using a real voice and that obtained using an artificial voice.
This paper describes a dynamic and adaptive scheme for three-dimensional mesh morphing. Using several control maps, the connectivity of intermediate meshes is dynamically changed and the mesh vertices are adaptively modified. The 2D control maps in parametric space, which include a curvature map, an area deformation map and a distance map, are used to schedule the insertion and deletion of vertices in each frame. Then, the vertices are adaptively moved to better positions using a weighted centroidal Voronoi diagram (WCVD), and a Delaunay triangulation is finally used to determine the connectivity of the mesh. In contrast to most previous work, the intermediate mesh connectivity changes gradually and is much less complicated. We demonstrate several examples of aesthetically pleasing morphs created by the proposed method.