Asami OHTAKE Seiko UCHINO Kunio AKEDO Masanao ERA Koichi SAKAGUCHI
A numerical dispersibility measurement system based on optical transparency was fabricated to objectively evaluate differences in dispersibility that are undetectable by the naked eye. The system simultaneously elucidated small differences in dispersibility and the dispersibility behaviors. The abundance of octadecyl groups was also discussed based on the observed dispersibility behaviors. Objective numerical evaluation is needed for precise analysis of the dispersibility of graphene, graphene derivatives, and graphene-related materials.
Masanori FURUTA Hidenori OKUNI Masahiro HOSOYA Akihide SAI Junya MATSUNO Shigehito SAIGUSA Tetsuro ITAKURA
This paper presents an analog front-end circuit for a 60-GHz proximity wireless communication receiver. The key feature of the proposed analog front-end circuit is a bandwidth of more than 1 GHz. To expand the bandwidth of the low-pass filter and the voltage gain amplifier, a technique to reduce the parasitic capacitance of a transconductance amplifier is proposed. Since the bandwidth is also limited by the on-resistance of the ADC sampling switch, a switch separation technique for reducing the on-resistance is also proposed. In a high-speed ADC, the SNDR is limited by the sampling jitter. The developed high-resolution VCO auto-tuning effectively reduces the jitter of the PLL. The prototype is fabricated in 65-nm CMOS. The analog front-end circuit achieves over 1-GHz bandwidth and 27.2-dB SNDR with 224 mW of power consumption.
Toshihiro SAKANO Yosuke KOBAYASHI Kazuhiro KONDO
We proposed and evaluated a speech intelligibility estimation method that does not require a clean speech reference signal. The proposed method uses the features defined in the ITU-T standard P.563, which estimates the overall quality of speech without the reference signal. We selected two sets of features from the P.563 features: the basic 9-feature set, which includes basic features that characterize both speech and background noise, e.g., cepstrum skewness and LPC kurtosis, and the extended 31-feature set with 22 additional features for a more accurate description of the degraded speech and noise, e.g., SNR, average pitch, and spectral clarity, among others. Four hundred noise samples were added to speech, and about 70% of these samples were used to train a support vector regression (SVR) model. The trained models were used to estimate the intelligibility of speech degraded by added noise. The proposed method showed a root mean square error (RMSE) of about 10% and a correlation with subjective intelligibility of about 0.93 for speech distorted with known noise types, and an RMSE of about 16% and a correlation of about 0.84 for speech distorted with unknown noise types, with either the 9- or the 31-dimension feature set. This accuracy was higher than that of estimation using frequency-weighted SNR calculated in critical frequency bands, which requires the clean reference signal for its calculation. We believe this level of accuracy shows the proposed method to be applicable to real-time speech quality monitoring in the field.
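The two evaluation figures quoted in the abstract above, RMSE and correlation with subjective intelligibility, can be computed as in the following minimal sketch. The function names `rmse` and `pearson` and the sample scores are illustrative assumptions, not taken from the paper, and the correlation is assumed here to be the Pearson coefficient.

```python
import math


def rmse(estimated, subjective):
    """Root mean square error between estimated and subjective scores."""
    if len(estimated) != len(subjective) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - s) ** 2 for e, s in zip(estimated, subjective))
    return math.sqrt(squared / len(estimated))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intelligibility scores in percent (made up for illustration).
estimated_scores = [82.0, 64.0, 45.0, 91.0]
subjective_scores = [80.0, 70.0, 40.0, 95.0]
print(rmse(estimated_scores, subjective_scores))
print(pearson(estimated_scores, subjective_scores))
```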
Xiaoyun WANG Jinsong ZHANG Masafumi NISHIDA Seiichi YAMAMOTO
This paper describes a novel method to improve the performance of second language speech recognition when the mother tongue of users is known. Considering that second language speech usually includes less fluent pronunciation and more frequent pronunciation mistakes, the authors propose using a reduced phoneme set generated by a phonetic decision tree (PDT)-based top-down sequential splitting method instead of the canonical one of the second language. The authors verify the efficacy of the proposed method using second language speech collected with a translation game type dialogue-based English CALL system. Experiments show that a speech recognizer achieved higher recognition accuracy with the reduced phoneme set than with the canonical phoneme set.
This letter studies the problem of cooperative spectrum sensing in wideband cognitive radio networks. Based on the basis expansion model (BEM), the problem of estimation of power spectral density (PSD) is transformed to estimation of BEM coefficients. The sparsity both in frequency domain and space domain is used to construct a sparse estimation structure. The theory of L1/2 regularization is used to solve the compressed sensing problem. Simulation results demonstrate the effectiveness of the proposed method.
This paper analyzes the correlation between various acoustic features and perceptual voice quality similarity, and proposes a perceptually similar speaker selection technique based on distance metric learning. To analyze the relationship between acoustic features and voice quality similarity, we first conduct a large-scale subjective experiment using the voices of 62 female speakers and perceptual voice quality similarity scores between all pairs of speakers are acquired. Next, multiple linear regression analysis is carried out; it shows that four acoustic features are highly correlated to voice quality similarity. The proposed speaker selection technique first trains a transform matrix based on distance metric learning using the perceptual voice quality similarity acquired in the subjective experiment. Given an input speech, acoustic features of the input speech are transformed using the trained transform matrix, after which speaker selection is performed based on the Euclidean distance on the transformed acoustic feature space. We perform speaker selection experiments and evaluate the performance of the proposed technique by comparing it to speaker selection without feature space transformation. The results indicate that transformation based on distance metric learning reduces the error rate by 53.9%.
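The selection step described above, transforming acoustic features with a trained matrix and then choosing the nearest speaker by Euclidean distance, can be sketched as follows. The matrix values, feature vectors, and the names `transform` and `select_speaker` are hypothetical; in the paper the matrix is obtained by distance metric learning, which is not reproduced here.

```python
import math


def transform(matrix, features):
    """Apply a linear transform (list of rows) to a feature vector."""
    return [sum(m * f for m, f in zip(row, features)) for row in matrix]


def select_speaker(matrix, input_features, candidates):
    """Return the candidate whose transformed features are closest to the
    transformed input features in Euclidean distance."""
    target = transform(matrix, input_features)

    def distance(name):
        return math.dist(target, transform(matrix, candidates[name]))

    return min(candidates, key=distance)


# Toy data: the (assumed) learned matrix down-weights the second feature.
learned = [[1.0, 0.0], [0.0, 0.1]]
identity = [[1.0, 0.0], [0.0, 1.0]]
speakers = {"A": [1.0, 0.0], "B": [0.0, 5.0]}

print(select_speaker(identity, [0.0, 0.0], speakers))  # selects "A"
print(select_speaker(learned, [0.0, 0.0], speakers))   # selects "B"
```

With the identity matrix the sketch reduces to selection without feature space transformation, which is the baseline the abstract compares against.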
Thai-Mai Thi DINH Quoc-Tuan NGUYEN Dinh-Thong NGUYEN
Most recent work on cooperative spectrum sensing using cognitive radios (CRs) has focused on issues involving the sensing channels and seems to have ignored those involving the reporting channels. Furthermore, no research has treated the effect of correlated composite Rayleigh-lognormal fading, also known as Suzuki fading, in cognitive radio. This paper proposes a technique for reusing shadowed CRs, discarded during the sensing phase, as amplify-and-forward (AF) diversity relays for other surviving CRs to mitigate the effects of such fading in the reporting channels. A thorough analysis of, and a closed-form expression for, the outage probability of the resulting cooperative AF diversity network in correlated composite Rayleigh-lognormal fading channels are presented. In particular, an efficient solution to the “PDF of sum-of-powers” of correlated Suzuki-distributed random variables using the moment generating function (MGF) is proposed.
This paper proposes a speech watermarking method based on the concept of formant tuning. The characteristic that formant tuning can improve the sound quality of synthesized speech was exploited to achieve inaudibility for watermarking. In the proposed method, formants were first extracted with linear prediction (LP) analysis, and watermarks were then embedded by symmetrically controlling a pair of line spectral frequencies (LSFs) as formant tuning. We evaluated the proposed method in two kinds of experiments on inaudibility and robustness, in comparison with other methods. Inaudibility was evaluated with objective and subjective tests, and robustness was evaluated against speech codecs and speech processing. The results revealed that the proposed method could satisfy both the inaudibility and the robustness that are required for speech watermarking.
This paper proposes a method to reduce playback suspension in a Video-on-Demand system based on Peer-to-Peer technology (P2P VoD). Our main contribution is twofold. The first is the proposal of a hierarchical P2P architecture with the notion of dynamic swarms. A swarm is a group of peers that have similar playback positions, and the swarms are connected by an overlay so that requested pieces are forwarded from one swarm to another in a bucket brigade manner, where the forwarding of pieces is regulated by the super-peer (SP) of each swarm. The second contribution is the proposal of a matchmaking scheme between requests and uploaders. The simulation results indicate that, compared with a randomized scheme, the proposed scheme reduces the total waiting time by 24% and the load of the media server by 76%.
Masanori MORISE Satoshi TSUZUKI Hideki BANNO Kenji OZAWA
This research deals with muffled speech as the evaluation target and introduces a criterion for evaluating the auditory impression in muffled speech. It focuses on the vocal tract area function (VTAF) to evaluate the auditory impression, and the criterion uses temporal differentiation of this function to track the temporal variation of the shape of the mouth. The experimental results indicate that the proposed criterion can be used to evaluate the auditory impression as well as the subjective impression.
Masanori HIROTOMO Masakatu MORII
In this paper, we propose an efficient method for computing the weight spectrum of LDPC convolutional codes based on circulant matrices of quasi-cyclic codes. In the proposed method, we reduce the memory size of their parity-check matrices with the same distance profile as the original codes, and apply a forward and backward tree search algorithm to the parity-check matrices of reduced memory. We show numerical results of computing the free distance and the low-part weight spectrum of LDPC convolutional codes of memory about 130.
Nozomi MIYA Tota SUKO Goki YASUDA Toshiyasu MATSUSHIMA
In this paper, sequential prediction is studied. The typical assumptions about the probabilistic model in sequential prediction fall into the following two cases. In one, a certain probabilistic model is given and its parameters are unknown. In the other, not a single probabilistic model but a class of probabilistic models is given and the parameters are unknown. If there exist some parameters and some models such that the distributions identified by them equal the source distribution, the assumed model or class of models can represent the source distribution. In this case, the specifiable condition is said to be satisfied. In this study, the decision based on the Bayesian principle is made for a class of probabilistic models (not for a single probabilistic model), and the case in which the specifiable condition is not satisfied is studied. The asymptotic behaviors of the cumulative logarithmic loss for an individual sequence in the sense of almost sure convergence and of the expected loss, i.e., the redundancy, are then analyzed, and the constant terms of the asymptotic equations are identified.
Tsutomu SASAO Yuta URANO Yukihiro IGUCHI
This paper shows a method to find a linear transformation that reduces the number of variables to represent a given incompletely specified index generation function. It first generates the difference matrix, and then finds a minimal set of variables using a covering table. Linear transformations are used to modify the covering table to produce a smaller solution. Reduction of the difference matrix is also considered.
Quang Thang DUONG Shinsuke IBI Seiichi SAMPEI
This paper proposes an adaptive band activity ratio control (ABC) with cascaded energy allocation (CEA) scheme to improve end-to-end spectral efficiency for two-hop amplify-and-forward orthogonal frequency division multiplexing relay systems under transmit energy constraint. Subchannel pairing (SP) based spectrum mapping maps spectral components transmitted over high gain subchannels in the source-to-relay link onto high gain subchannels of the relay-to-destination link to improve the spectral efficiency. However, SP suffers from a frame efficiency reduction due to the notification of information of spectral component order. To compensate for the deficiency of SP, the proposed scheme employs dynamic spectrum control with ABC in which spectral components are mapped onto subchannels having high channel gain in each link, while band activity ratio (BAR) is controlled to an optimal value, which is smaller than 1, so that all spectral components are transmitted over relatively high gain subchannels of the two links. To further improve the performance, energy allocation at the source node and the relay node is serially conducted based on convex optimization, and BAR is controlled to improve discrete-input continuous-output memoryless channel capacity at the relay node. In the proposed scheme, since only information of BAR needs to be notified, the notification overhead is drastically reduced compared to that in SP based spectrum mapping. Numerical analysis confirms that the proposed ABC combined with CEA significantly reduces the required notification overhead while achieving almost the same frame error rate performance compared with the SP based scheme.
Takashi SUDO Hirokazu TANAKA Chika SUGIMOTO Ryuji KOHNO
Hands-free communications between cellular phones must be robust enough to withstand echo-path variation, and highly nonlinear echoes must be suppressed at low cost, when acoustic echo cancellation or suppression is applied to them. This paper proposes a spectrum-selective nonlinear echo suppression (SS-ES) approach as a solution to these issues. SS-ES is characterized by the selection of either a spectrum of the residual signal from an adaptive filter or a spectrum of the sending input signal depending on the amount of linear echo cancellation in an adaptive filter. Compared to conventional methods, the objective evaluation results of the SS-ES approach show an improvement of approximately 0.8-2.2dB, 0.23-2.39dB, and 0.26-0.50 in average echo return loss enhancement (ERLE), average root-mean-square log-spectral distortion (RMS-LSD), and the perceptual evaluation of speech quality (PESQ) value, respectively, under echo-path variation and double-talk conditions.
In this paper, we propose a parameter estimation method using Volterra kernels for the nonlinear IIR filters that are used for the linearization of closed-box loudspeaker systems. The nonlinear IIR filter, which originates from a mirror filter, employs the nonlinear parameters of the loudspeaker system. Hence, an appropriate estimation method for the nonlinear parameters is very important for increasing the ability to compensate for nonlinear distortions. However, it is difficult to obtain exact nonlinear parameters using the conventional parameter estimation method for the nonlinear IIR filter, which uses the displacement characteristic of the diaphragm. The conventional method has two problems. First, it requires the displacement characteristic of the diaphragm, but such tiny displacements are difficult to measure, and a laser displacement gauge is required as an extra measurement instrument. Second, it limits the excitation signals that can be used to measure the displacement of the diaphragm. In the proposed estimation method, by contrast, the parameters are updated using simulated annealing (SA) according to a cost function that represents the amount of compensation, and these procedures are repeated until a given iteration count is reached. The amount of compensation is calculated through computer simulation, in which the Volterra kernels of a target loudspeaker system are used as the loudspeaker model and the loudspeaker model is then compensated by the nonlinear IIR filter with the present parameters. Hence, the proposed method requires only an ordinary microphone and can use any excitation signal to estimate the nonlinear parameters. Experimental results demonstrate that the proposed method can estimate the parameters more accurately than the conventional estimation method.
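The update loop described above, in which parameters are perturbed, scored by a cost function, and accepted or rejected by simulated annealing until a fixed iteration count is reached, can be illustrated with the generic sketch below. Everything specific is an assumption: the geometric cooling schedule, the Gaussian perturbation, the default settings, and above all `toy_cost`, a simple quadratic standing in for the paper's Volterra-kernel-based simulation of the amount of compensation.

```python
import math
import random


def simulated_annealing(cost, initial, iterations=2000, step=0.1,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Minimize `cost` over a list of real parameters with basic SA."""
    rng = random.Random(seed)
    current = list(initial)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for i in range(iterations):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        fraction = i / max(iterations - 1, 1)
        temperature = t_start * (t_end / t_start) ** fraction
        # Propose a small random perturbation of every parameter.
        candidate = [p + rng.gauss(0.0, step) for p in current]
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best, best_cost


def toy_cost(params):
    """Hypothetical stand-in cost with its minimum at (3, -1)."""
    return (params[0] - 3.0) ** 2 + (params[1] + 1.0) ** 2


best_params, best_value = simulated_annealing(toy_cost, [0.0, 0.0])
print(best_params, best_value)
```

In the setting of the abstract, `cost` would instead run the loudspeaker model with the nonlinear IIR filter and return a measure of the remaining distortion.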
Keunseok CHO Sangbae JEONG Minsoo HAHN
This paper proposes a new algorithm to encode the spectral envelope for G.729.1 more accurately. It applies the normalized least-mean-square (NLMS) algorithm to each subband energy of the modified discrete cosine transform (MDCT) in the time-domain alias cancellation (TDAC) of G.729.1. By utilizing the estimation error of subband energies by means of NLMS, allocated bit reduction for spectral envelope coding is achieved. The saved bits are then reused to improve the spectral envelope estimation and thus enhance the sound quality. Experimental results confirm that the proposed algorithm improves the sound quality under both clean and packet loss conditions.
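For readers unfamiliar with NLMS, the following minimal sketch shows the standard normalized update used as a one-step predictor on a sequence of values, with the prediction error as its output. It is not the G.729.1 integration described above: the function name `nlms_predict`, the filter order, the step size `mu`, and the toy input are assumptions for illustration only.

```python
def nlms_predict(sequence, order=1, mu=0.5, eps=1e-8):
    """Predict each value from the previous `order` values with NLMS.

    Returns the final weights and the list of prediction errors.
    """
    weights = [0.0] * order
    errors = []
    for n in range(order, len(sequence)):
        # Most recent sample first.
        x = sequence[n - order:n][::-1]
        prediction = sum(w * xi for w, xi in zip(weights, x))
        error = sequence[n] - prediction
        # Normalize the step by the input energy (eps avoids division by 0).
        energy = eps + sum(xi * xi for xi in x)
        weights = [w + mu * error * xi / energy for w, xi in zip(weights, x)]
        errors.append(error)
    return weights, errors


# Toy input: a constant sequence, for which the error shrinks each step.
weights, errors = nlms_predict([1.0] * 50)
print(weights[0], errors[0], errors[-1])
```

When the predictor tracks the input well, the errors become small, which is the property the abstract relies on to reduce the bits allocated to spectral envelope coding.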
Woo KYEONG SEONG Ji HUN PARK Hong KOOK KIM
Dysarthric speech results from damage to the central nervous system involving the articulator, which can mainly be characterized by poor articulation due to irregular sub-glottal pressure, loudness bursts, phoneme elongation, and unexpected pauses during utterances. Since dysarthric speakers have physical disabilities due to the impairment of their nervous system, they cannot easily control electronic devices. For this reason, automatic speech recognition (ASR) can be a convenient interface for dysarthric speakers to control electronic devices. However, the performance of dysarthric ASR severely degrades when there is background noise. Thus, in this paper, we propose a noise reduction method that improves the performance of dysarthric ASR. The proposed method selectively applies either a Wiener filtering algorithm or a Kalman filtering algorithm according to the result of voiced or unvoiced classification. Then, the performance of the proposed method is compared to a conventional Wiener filtering method in terms of ASR accuracy.
Jun-Sang PARK Sung-Ho YOON Youngjoon WON Myung-Sup KIM
Internet traffic classification is an essential step for stable service provision. The payload signature classifier is considered a reliable method for Internet traffic classification but is prohibitively computationally expensive for real-time handling of large amounts of traffic on high-speed networks. In this paper, we describe several design techniques to minimize the search space of traffic classification and improve the processing speed of the payload signature classifier. Our suggestions are (1) selective matching algorithms based on signature type, (2) signature reorganization using hierarchical structure and traffic locality, and (3) early packet sampling in flow. Each can be applied individually, or in any combination in sequence. The feasibility of our selections is proved via experimental evaluation on traffic traces of our campus and a commercial ISP. We observe 2 to 5 times improvement in processing speed against the untuned classification system and Snort Engine, while maintaining the same level of accuracy.
Akihiro TATENO Shimpei AKIMOTO Tomoaki NAGAOKA Kazuyuki SAITO Soichi WATANABE Masaharu TAKAHASHI Koichi ITO
As the electromagnetic (EM) environment is becoming increasingly diverse, it is essential to estimate specific absorption rates (SARs) and temperature elevations of pregnant females and their fetuses under various exposure situations. This study presents calculated SARs and temperature elevations in a fetus exposed to EM waves. The calculations involved numerical models for the anatomical structures of a pregnant Japanese woman at gestational stages of 13, 18, and 26 weeks; the EM source was a wireless portable terminal placed close to the abdomen of the pregnant female model. The results indicate that fetal SARs and temperature elevations are closely related to the position of the fetus relative to the EM source. We also found that, although the fetal SAR caused by a half-wavelength dipole antenna is sometimes comparable to or slightly higher than the level in the International Commission on Non-Ionizing Radiation Protection guidelines, it is lower than the guideline level in more realistic situations, such as when a planar inverted-F antenna is used. Furthermore, temperature elevations were significantly below the threshold set to prevent the child from being born with developmental disabilities.