Seong-Jun HAHM Yuichi OHKAWA Masashi ITO Motoyuki SUZUKI Akinori ITO Shozo MAKINO
We propose an improved reference speaker weighting (RSW) and speaker cluster weighting (SCW) approach that uses an aspect model. The concept of the approach is that the adapted model is a linear combination of a few latent reference models obtained from a set of reference speakers. The aspect model has specific latent-space characteristics that differ from the orthogonal basis vectors of eigenvoice. The aspect model is a "mixture-of-mixture" model. We first calculate a small number of latent reference models as mixtures of the distributions of the reference speakers' models, and then the latent reference models are mixed to obtain the adapted distribution. The mixture weights are calculated using the expectation maximization (EM) algorithm. We use the obtained mixture weights to interpolate the mean parameters of the distributions. Both training and adaptation are performed by likelihood maximization with respect to the training and adaptation data, respectively. We conduct a continuous speech recognition experiment using a Korean database (KAIST-TRADE). The results are compared with those of conventional MAP, MLLR, RSW, eigenvoice, and SCW. An absolute word accuracy improvement of 2.06 points was achieved using the proposed method, even though only 0.3 s of adaptation data was used.
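The two-stage "mixture-of-mixture" interpolation of mean parameters described above can be illustrated with a minimal sketch. The function names and the hard-coded weights are hypothetical; in the abstract the weights are estimated with EM, which is not reproduced here.

```python
def latent_reference_means(speaker_means, speaker_weights):
    """Stage 1: each latent reference mean is a weighted sum of reference-speaker means.

    speaker_means:   one mean vector per reference speaker
    speaker_weights: one weight vector (summing to 1) per latent reference model
    """
    dim = len(speaker_means[0])
    return [
        [sum(w * mu[d] for w, mu in zip(weights, speaker_means)) for d in range(dim)]
        for weights in speaker_weights
    ]


def adapted_mean(latent_means, latent_weights):
    """Stage 2: the adapted mean is a weighted sum of the latent reference means."""
    dim = len(latent_means[0])
    return [
        sum(w * m[d] for w, m in zip(latent_weights, latent_means)) for d in range(dim)
    ]
```

With three reference speakers and two latent models, only two weights need to be estimated at adaptation time, which is consistent with the abstract's use of very short adaptation data.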
Jaewoon KIM Youngjin PARK Soonwoo LEE Yoan SHIN
TR-UWB (Transmitted Reference-Ultra Wide Band) systems have low system complexity since they transmit data with the corresponding reference signals and demodulate the data through correlation using these received signals. However, the BER (Bit Error Rate) performance in the conventional TR-UWB systems is sensitive to the SNR (Signal-to-Noise Ratio) of the reference templates used in the correlator. We propose an improved recursive transceiver structure that effectively minimizes the BER for TR-UWB systems by increasing the SNR of reference templates.
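As an illustration of the conventional transmitted-reference principle mentioned above (not the proposed recursive structure), the following noise-free sketch sends a reference pulse followed by a data pulse and demodulates by correlating the received frame with a delayed copy of itself. The pulse shape, the delay, and the function names are assumptions made for the example.

```python
def tr_frame(bit, pulse, delay):
    """Build one frame: a reference pulse, then a data pulse of polarity `bit` (+1 or -1).

    `delay` is the spacing in samples and is assumed to be at least len(pulse),
    so that the reference and data pulses do not overlap.
    """
    frame = [0.0] * (delay + len(pulse))
    for i, p in enumerate(pulse):
        frame[i] += p                # reference pulse
        frame[delay + i] += bit * p  # data pulse
    return frame


def tr_demodulate(frame, delay):
    """Correlate the frame with itself shifted by `delay` and decide on the sign."""
    corr = sum(frame[i] * frame[i + delay] for i in range(len(frame) - delay))
    return 1 if corr >= 0 else -1
```

Because the received reference pulse itself serves as the correlation template, any noise on it enters the decision directly, which is why the template SNR governs the BER.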
Takeshi YAMADA Yuki KASUYA Yuki SHINOHARA Nobuhiko KITAWAKI
This paper describes non-reference objective quality evaluation for noise-reduced speech. First, a subjective test is conducted in accordance with ITU-T Rec. P.835 to obtain the speech quality, the noise quality, and the overall quality of noise-reduced speech. Based on the results, we then propose an overall quality estimation model. The unique point of the proposed model is that the estimation of the overall quality is done only using the previously estimated speech quality and noise quality, in contrast to conventional models, which utilize the acoustical features extracted. Finally, we propose a non-reference objective quality evaluation method using the proposed model. The results of an experiment with different noise reduction algorithms and noise types confirmed that the proposed method gives more accurate estimates of the overall quality compared with the method described in ITU-T Rec. P.563.
Kenji SUGIYAMA Naoya SAGARA Yohei KASHIMURA
With DCT coding, block artifacts and mosquito noise appear in decoded pictures. Post filtering must be controlled so that it reduces these degradations without causing side effects. Decoding information is useful if the filter is inside or close to the encoder; however, such control is difficult with independent post filtering, such as in a display. In this case, control requires estimating the artifact from the decoded picture alone. In this work, we describe an estimation method that determines the mosquito noise block and level. In this method, the ratio of spatial activity is taken between the mosquito block and the neighboring flat block. We test the proposed method using reconstructed pictures coded with different quantization scales. The results are mostly reasonable across the different quantization scales.
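A ratio of spatial activity between a candidate block and a neighboring flat block could be computed along the following lines. The activity definition (sum of absolute differences between adjacent pixels), the direction of the ratio, and the `eps` guard are assumptions for illustration; the abstract does not specify them.

```python
def spatial_activity(block):
    """Sum of absolute differences between horizontally and vertically adjacent pixels."""
    rows, cols = len(block), len(block[0])
    horizontal = sum(
        abs(block[r][c] - block[r][c + 1]) for r in range(rows) for c in range(cols - 1)
    )
    vertical = sum(
        abs(block[r][c] - block[r + 1][c]) for r in range(rows - 1) for c in range(cols)
    )
    return horizontal + vertical


def activity_ratio(candidate_block, flat_neighbor, eps=1.0):
    """Activity of the candidate block relative to its flat neighbor.

    `eps` (hypothetical) avoids division by zero when the neighbor is perfectly flat.
    """
    return spatial_activity(candidate_block) / (spatial_activity(flat_neighbor) + eps)
```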
Toru YAMADA Yoshihiro MIYAMOTO Yuzo SENDA Masahiro SERIZAWA
This paper presents a reduced-reference video-quality estimation method suitable for individual end-user quality monitoring of IPTV services. With the proposed method, the activity values for individual given-size pixel blocks of an original video are transmitted to end-user terminals. At the end-user terminals, the video quality of a received video is estimated on the basis of the activity difference between the original video and the received video. Psychovisual weightings and video-quality score adjustments for fatal degradations are applied to improve estimation accuracy. In addition, low-bit-rate transmission is achieved by using temporal sub-sampling and by transmitting only the lower six bits of each activity value. The proposed method achieves accurate video quality estimation using only low-bit-rate original video information (15 kbps for SDTV). The correlation coefficient between actual subjective video quality and estimated quality is 0.901 with 15 kbps side information. The proposed method does not need computationally demanding spatial and gain-and-offset registrations. Therefore, it is suitable for real-time video-quality monitoring in IPTV services.
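The idea of sending only the lower six bits of each block activity can be sketched as follows. The activity definition (rounded mean absolute deviation from the block mean) and the wrap-around rule used to recover the difference are assumptions for this example; the abstract does not state either.

```python
def block_activity(block):
    """Hypothetical activity measure: rounded mean absolute deviation from the block mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return round(sum(abs(p - mean) for p in pixels) / len(pixels))


def lower_six_bits(value):
    """Side information actually transmitted: the six least significant bits."""
    return value & 0x3F


def wrapped_difference(received_activity, sent_bits):
    """Activity difference recovered modulo 64, mapped to the range [-32, 31].

    Assumes the true difference between received and original activity is small
    enough that the discarded upper bits carry no information about it.
    """
    diff = (lower_six_bits(received_activity) - sent_bits) % 64
    return diff - 64 if diff >= 32 else diff
```

For instance, an original activity of 70 is sent as 6; a received activity of 66 then yields a recovered difference of -4 even though the upper bits were never transmitted.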
Ken UENO Tetsuya HIROSE Tetsuya ASAI Yoshihito AMEMIYA
A voltage-controlled oscillator (VCO) tolerant to process variations at a low supply voltage is proposed. The circuit consists of an on-chip threshold-voltage-monitoring circuit, a current-source circuit, a body-biasing control circuit, and the delay cells of the VCO. Because variations in the low-voltage VCO frequency are mainly determined by variations in the delay-cell current, a current-compensation technique is adopted that uses on-chip threshold-voltage-monitoring and body-biasing circuit techniques. Monte Carlo SPICE simulations demonstrated that the proposed techniques suppress variations in the oscillation frequency by about 65% at a 1-V supply voltage, relative to the same VCO without the techniques.
Hyo J. LEE In Hwan DOH Eunsam KIM Sam H. NOH
Conventional kernel prefetching schemes have focused on taking advantage of sequential access patterns that are easy to detect. However, on random and even sequential references, they may degrade performance because of inaccurate pattern prediction and overshooting. To address these problems, we propose a novel approach that works with existing kernel prefetching schemes, called Reference Pattern based kernel Prefetching (RPP). RPP reduces the negative effects of existing schemes by identifying one more reference pattern, i.e., looping, in addition to random and sequential patterns, and by delaying the start of prefetching until patterns are confirmed to be sequential or looping.
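A toy classifier shows how prefetching might be withheld until a reference pattern is confirmed as sequential or looping. The detection rules, the minimum history length of three, and the function names are hypothetical; the kernel-level detection in RPP is not described in the abstract.

```python
def classify_references(blocks):
    """Classify a history of block numbers as sequential, looping, random, or unconfirmed."""
    if len(blocks) < 3:
        return "unconfirmed"  # too little history to decide (assumed threshold)
    if all(b - a == 1 for a, b in zip(blocks, blocks[1:])):
        return "sequential"
    # Looping: the history repeats itself with some period.
    for period in range(1, len(blocks) // 2 + 1):
        if all(blocks[i] == blocks[i - period] for i in range(period, len(blocks))):
            return "looping"
    return "random"


def should_prefetch(blocks):
    """Start prefetching only once the pattern is confirmed sequential or looping."""
    return classify_references(blocks) in ("sequential", "looping")
```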
Koichi KOBAYASHI Kunihiko HIRAISHI Nguyen Van TANG
In this paper, we propose a new approximate algorithm for the model predictive control (MPC) problem with a time-varying reference for hybrid systems. The proposed algorithm consists of an offline computation and an online computation. In the offline computation, candidate mode sequences are derived. In the online computation, after the mode sequence is uniquely selected from the candidates, the finite-time optimal control problem, i.e., a quadratic programming problem, is solved. Applying the proposed algorithm thus reduces the amount of online computation. First, the MPC problem with a time-varying reference is formulated. Next, the proposed algorithm is explained, and the accuracy of the obtained approximate solution is discussed. Finally, the effectiveness of the proposed method is shown by a numerical example.
Dajiang ZHOU Jinjia ZHOU Satoshi GOTO
In the latest video coding frameworks, the efficiency of motion vector (MV) coding is becoming increasingly important because motion information accounts for a growing portion of the bit rate. However, neither the conventional median predictor nor newer schemes such as the minimum bit rate prediction (MBP) scheme and the hybrid scheme can effectively eliminate the local redundancy of motion vectors. In this paper, we present the prioritized reference decision scheme for efficient motion vector coding, based on the H.264/AVC framework. This scheme uses a boolean indicator to specify whether the median predictor is to be used for the current MV. If not, the median prediction is considered unsuitable for the current MV, and this information is used to refine the possible space of a group of reference MVs comprising 4 neighboring MVs and the zero MV. This group of MVs is organized as a prioritized list so that the reference MV with the highest priority is selected as the prediction value. Furthermore, the boolean indicators are coded into modified code words of mb_type and sub_mb_type to reduce the overhead. By applying the proposed scheme, the structural and applicability problems of the state-of-the-art MBP scheme are overcome. Experimental results show that the proposed scheme achieves a considerable reduction in bits for MVDs compared with the conventional median prediction algorithm. It also achieves better and much more stable performance than MBP-based MV coding.
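The decision flow described above can be sketched roughly as follows. The component-wise median of three neighboring MVs is the usual H.264/AVC median predictor; the refinement rule used here (skip candidates equal to the rejected median, then take the highest-priority survivor) is only an assumed stand-in, since the abstract does not detail how the candidate space is refined.

```python
def median_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three neighboring motion vectors."""
    xs = sorted([mv_a[0], mv_b[0], mv_c[0]])
    ys = sorted([mv_a[1], mv_b[1], mv_c[1]])
    return (xs[1], ys[1])


def predict_mv(use_median, median_neighbors, priority_list):
    """Pick the MV prediction from a boolean indicator and a prioritized candidate list.

    use_median:       the boolean indicator from the bitstream
    median_neighbors: the three MVs feeding the median predictor
    priority_list:    candidate reference MVs, highest priority first (hypothetical order)
    """
    median = median_predictor(*median_neighbors)
    if use_median:
        return median
    # Assumed refinement: the median was rejected, so equal candidates are skipped.
    for candidate in priority_list:
        if candidate != median:
            return candidate
    return (0, 0)  # fall back to the zero MV
```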
In this paper, a novel 800 mV beta-multiplier reference current source circuit is presented. To cope with the narrow input common-mode range of the Opamp in the reference circuit, a resistive voltage divider is employed. A high-gain Opamp is designed to compensate for the intrinsically low output resistance of the MOS transistors. The proposed reference circuit is designed in a standard 0.18 µm CMOS process with nominal Vth of 420 mV and -450 mV for n-MOS and p-MOS transistors, respectively. The total power consumption, including the Opamp, is less than 50 µW.
Hidekazu TAOKA Yoshihisa KISHIYAMA Kenichi HIGUCHI Mamoru SAWAHASHI
This paper presents comparisons between common and dedicated reference signals (RSs) for channel estimation in MIMO multiplexing using codebook-based precoding for orthogonal frequency division multiplexing (OFDM) radio access in the Evolved UTRA downlink with frequency division duplexing (FDD). We clarify the best RS structure for precoding-based MIMO multiplexing based on comparisons of the structures in terms of the achievable throughput taking into account the overhead of the common and dedicated RSs and the precoding matrix indication (PMI) signal. Based on extensive simulations on the throughput in 2-by-2 and 4-by-4 MIMO multiplexing with precoding, we clarify that channel estimation based on common RSs multiplied with the precoding matrix indicated by the PMI signal achieves higher throughput compared to that using dedicated RSs irrespective of the number of spatial multiplexing streams when the number of available precoding matrices, i.e., the codebook size, is less than approximately 16 and 32 for 2-by-2 and 4-by-4 MIMO multiplexing, respectively.
Sangwon HAN Jongsik KIM Kwang-Ho WON Hyunchol SHIN
In a low dropout (LDO) linear regulator whose reference voltage is supplied by a bandgap reference, double stacked diodes increase the effective junction area ratio in the bandgap reference, which significantly lowers the output spectral noise of the LDO. A low noise LDO with the area-efficient bandgap reference is implemented in 0.18 µm CMOS. An effective diode area ratio of 105 is obtained while the actual silicon area is saved by a factor of 4.77. As a result, a remarkably low output noise of 186 nV/sqrt(Hz) is achieved at 1 kHz. Moreover, the dropout voltage, line regulation, and load regulation of the LDO are measured to be 0.3 V, 0.04%/V, and 0.46%, respectively.
Shu-Ling SHIEH I-En LIAO Kuo-Feng HWANG Heng-Yu CHEN
This paper proposes an efficient self-organizing map algorithm based on a reference point and filters. A strategy called Reference Point SOM (RPSOM) is proposed to improve SOM execution time by means of filtering with two thresholds, T1 and T2. We use one threshold, T1, to define the search boundary parameter used to search for the Best-Matching Unit (BMU) with respect to input vectors. The other threshold, T2, is used as the search boundary within which the BMU finds its neighbors. The proposed algorithm reduces the time complexity of finding the initial neurons from O(n^2) to O(n) as compared to the algorithm proposed by Su et al. [16]. The RPSOM dramatically reduces the time complexity, especially in computations on large data sets. From the experimental results, we find that it is better to construct a good initial map and then to use unsupervised learning to make small subsequent adjustments.
Chihiro ONO Yasuhiro TAKISHIMA Yoichi MOTOMURA Hideki ASOH Yasuhide SHINAGAWA Michita IMAI Yuichiro ANZAI
This paper proposes a novel approach to constructing statistical preference models for context-aware personalized applications such as recommender systems. In constructing context-aware statistical preference models, one of the most important but difficult problems is acquiring a large amount of training data in various contexts/situations. In particular, some situations require a heavy workload to set up or to recruit subjects capable of answering inquiries under those situations. Because of this difficulty, the usual practice is either to collect a small amount of data in a real situation or to collect a large amount of data in a supposed situation, i.e., one in which the subject pretends to be in the specific situation when answering inquiries. However, both approaches have problems. With the former approach, the performance of the constructed preference model is likely to be poor because the amount of data is small. With the latter approach, the data acquired in the supposed situation may differ from that acquired in the real situation. Nevertheless, this difference has not been taken seriously in existing research. In this paper, we propose methods of obtaining a better preference model by integrating a small amount of real-situation data with a large amount of supposed-situation data. The methods are evaluated using data regarding food preferences. The experimental results show that the precision of the preference model can be improved significantly.
This paper presents an analysis of in-band interference caused by pulse-based ultra-wideband (UWB) systems. The analysis covers both plain Impulse Radio UWB (IR-UWB) and Transmitted Reference UWB (TR-UWB) systems as sources of interference. The supposed victim is a narrowband BPSK system with a band-pass filter. The effect of pulse-based UWB systems is analyzed in terms of bit error rate. The analysis is given for specific combinations of the pulse repetition frequency and the center frequency of the narrowband band-pass filter. In those situations, the UWB interference cannot be modeled as Gaussian noise. The analysis also identifies situations in which the victim is under the severest or the slightest interference from TR-UWB. The analytical results are validated via simulation.
The market and users' requirements have been rapidly changing and diversifying. Under these heterogeneous and dynamic situations, not only the system structure itself but also the accessible information services change constantly. To cope with the continuously changing conditions of service provision and utilization, the Faded Information Field (FIF), an agent-based distributed information service system architecture, has been proposed. In the case of a mono-service request, the system is designed to improve users' access time and preserve load balancing through the information structure. However, as interdependent multi-service requests increase, the system must also assure adaptability and timeliness. In this paper, the relationship among correlated services and the users' preferences for separate and integrated services is clarified. Based on these factors, an autonomous preference-aware information services integration technology that provides one-stop service for users' multi-service requests is proposed. We show that the proposed technology reduces the total access time compared with the conventional system.
Qin LIU Yiqing HUANG Satoshi GOTO Takeshi IKENAGA
Compared with previous standards, H.264/AVC adopts variable block size motion estimation (VBSME) and multiple reference frames (MRF) to improve the video quality. The full search motion estimation algorithm (FS), which evaluates every search candidate in the search window for 7 block types with multiple reference frames, consumes massive computation power. Mathematical analysis reveals that the aliasing problem of the subsampling algorithm comes from high frequency signal components. Moreover, high frequency signal components are also the main reason that the MRF algorithm is essential. A picture rich in texture necessarily contains many high frequency signals. Based on these mathematical investigations, two fast VBSME algorithms are proposed in this paper, namely an edge block detection based subsampling method and a motion vector based MRF early termination algorithm. Experiments show that strong correlation exists among the motion vectors of blocks belonging to the same macroblock. By exploiting this feature, a dynamic adjustment of the search ranges of integer motion estimation is also proposed. Combining our proposed algorithms with UMHS saves almost 96-98% of the Integer Motion Estimation (IME) time compared with the exhaustive search algorithm. The induced coding quality loss is less than a 0.8% bitrate increase or a 0.04 dB PSNR decline on average.
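The edge-aware subsampling idea can be illustrated with a small sketch: blocks with strong edges (high frequency content, where subsampling aliases) use the full sum of absolute differences (SAD), while other blocks use a 2:1 subsampled SAD in each direction. The gradient-based edge test, its threshold, and the function names are hypothetical and are not taken from the paper.

```python
def sad(current, reference, step=1):
    """Sum of absolute differences, visiting every `step`-th pixel in each direction."""
    return sum(
        abs(current[r][c] - reference[r][c])
        for r in range(0, len(current), step)
        for c in range(0, len(current[0]), step)
    )


def is_edge_block(block, threshold):
    """Hypothetical edge test: total horizontal gradient magnitude above a threshold."""
    gradient = sum(abs(row[c] - row[c + 1]) for row in block for c in range(len(row) - 1))
    return gradient > threshold


def matching_cost(current, reference, threshold=16):
    """Full SAD for edge blocks, subsampled SAD otherwise."""
    step = 1 if is_edge_block(current, threshold) else 2
    return sad(current, reference, step)
```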
Zhenyu LIU Satoshi GOTO Takeshi IKENAGA
The key to high performance in video coding lies in efficiently reducing temporal redundancies. For this purpose, the H.264/AVC coding standard has adopted variable block size motion estimation on multiple reference frames to improve the coding gain. However, the computational complexity of motion estimation also increases in proportion to the product of the number of reference frames and the number of inter modes. The mathematical analysis in this paper reveals that the prediction errors mainly depend on the image edge gradient amplitude and the quantization parameter. Consequently, this paper proposes an image content based early termination algorithm, which outperforms the original method adopted by the JVT reference software, especially at high and moderate bit rates. In light of rate-distortion theory, this paper also relates the homogeneity of the image to the quantization parameter. For a homogeneous block, the search computation for futile reference frames and inter modes can be efficiently discarded. Therefore, the computation savings increase with the value of the quantization parameter. These content based fast algorithms were integrated with the Unsymmetrical-cross Multihexagon-grid Search (UMHexagonS) algorithm to demonstrate their performance. Compared to the original UMHexagonS fast matching algorithm, 26.14-54.97% of the search time can be saved with an average coding quality degradation of 0.0369 dB.
Xuewen LIAO Shihua ZHU Erlin ZENG
A multiple-antenna receiving and combining scheme is proposed for high-data-rate transmitted-reference (TR) Ultra-Wideband (UWB) systems. The nonlinearity of the inter-symbol interference (ISI) model is alleviated via simple antenna combining. Under the simplified ISI model, frequency domain equalization (FDE) is adopted and greatly reduces the complexity of the equalizer. A simple estimation algorithm for the simplified ISI model is presented. Simulation results demonstrate that compared to the single receive antenna scheme, the proposed method can obtain a significant diversity gain and eliminate the BER floor effect. Moreover, compared to the complex second-order time domain equalizer, FDE showed better performance robustness in the case of imperfect model estimation.
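For readers unfamiliar with frequency domain equalization, the following generic sketch shows a one-tap zero-forcing FDE for a linear channel acting by circular convolution. It does not model the TR-UWB ISI structure or the antenna combining of the abstract; the channel taps, block length, and function names are assumptions, and the channel is assumed known and free of spectral nulls.

```python
import cmath


def dft(x, inverse=False):
    """Direct O(n^2) discrete Fourier transform (the inverse includes the 1/n scaling)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_convolve(x, h):
    """Channel model: circular convolution of block x with impulse response h."""
    n = len(x)
    return [sum(h[j] * x[(i - j) % n] for j in range(len(h))) for i in range(n)]


def fde_zero_forcing(y, h):
    """Equalize with one complex division per frequency bin: X = Y / H."""
    n = len(y)
    H = dft(list(h) + [0.0] * (n - len(h)))
    Y = dft(y)
    return [v.real for v in dft([yk / hk for yk, hk in zip(Y, H)], inverse=True)]
```

The per-bin division replaces a multi-tap time domain filter, which is the source of the complexity reduction usually attributed to FDE.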
Image quality assessment measures the difference in quality between a reference image and its distorted version. In this paper, we propose a novel reduced-reference (RR) quality assessment method for JPEG-2000 compressed images, which exploits the statistical characteristics of context information extracted through partial entropy decoding or decoding. These statistical features, obtained in the process of JPEG-2000 encoding, are transmitted to the receiver as side information and used to estimate the quality of images transmitted over various noisy channels at the decompression side. In the framework of JPEG-2000, the context of a current coefficient is determined by the pattern of the significance and/or the sign of its neighbors in three bit-plane coding passes and four coding modes. As the context information represents the local properties of images, it can efficiently describe textured patterns and edge orientation. The quality of transmitted images is measured by the difference in the entropy of context information between the received and original images. Moreover, the proposed quality assessment method can directly process images in the JPEG-2000 compressed domain without full decompression. Therefore, our proposed method can accelerate image quality assessment. Through simulations, we demonstrate that our method achieves fairly good performance in terms of both quality measurement accuracy and computational complexity.
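The entropy-difference measure mentioned above can be sketched as follows, treating the context labels of each image as a plain sequence of integers. The extraction of contexts from a JPEG-2000 codestream is not shown, and the use of the absolute difference of Shannon entropies as the score is an assumption for illustration.

```python
import math
from collections import Counter


def context_entropy(contexts):
    """Shannon entropy (in bits) of a non-empty sequence of context labels."""
    counts = Counter(contexts)
    total = len(contexts)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_difference(original_contexts, received_contexts):
    """Hypothetical quality indicator: absolute difference of the two context entropies."""
    return abs(context_entropy(original_contexts) - context_entropy(received_contexts))
```

Only the entropy value of the original image needs to be sent as side information in such a scheme, which keeps the reduced-reference overhead small.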