Bing BU Changchun BAO Maoshen JIA
This letter proposes an extended image-source model to simulate the room transfer function for a rotatable complex source in a three-dimensional reverberant room. The proposed model uses spherical harmonic decomposition to describe the exterior sound field of the complex source. Based on the “axis flip” concept, the mirroring relations between the source and its images are summarized by a unified mirroring operator that acts on the sound-field coefficients. The rotation of the source is taken into account by exploiting the rotation property of spherical harmonics. The accuracy of the proposed model is verified through simulation examples.
Chai Eu GUAN Ahmed I.A. GALAL Nagamitsu MIZOGUCHI Akira ISHIKAWA Shugo FUKAGAWA Ryuji KITAYA Haruichi KANAYA
The analysis and design of a full 360-degree hybrid coupler phase shifter with integrated distributed-element low-pass filters are presented. A pi-section filter is incorporated into the hybrid coupler phase shifter for harmonic suppression. The physical size of the proposed structure is close to that of the conventional hybrid coupler phase shifter. The maximum phase shift range is bounded by the port impedance ratio of the hybrid coupler phase shifter. Furthermore, the phase shift range is reduced if the series inductance in the reflective load deviates from the optimum value. Numerical and parametric analyses are used to find the equivalent circuit of the pi-section filter for consistent relative phase shift. To validate our analysis, a phase shifter operating at 8.6GHz was fabricated and measured. Over the frequency range of interest, the fabricated phase shifter suppresses the second harmonic and achieves an analog phase shift of 0 to 360 degrees in the passband, in agreement with the theoretical and simulation results.
Automatic music transcription from audio has long been one of the most intriguing and challenging problems in the field of music information retrieval, because it requires a series of low-level tasks such as onset/offset detection and F0 estimation, followed by high-level post-processing for symbolic representation. In this paper, a comprehensive transcription system for monophonic singing voice based on harmonic structure analysis is proposed. Given a precise tracking of the fundamental frequency, a novel acoustic feature is derived to signify the harmonic structure in singing voice signals, regardless of the loudness and pitch. It is then used to generate a parametric mixture model based on the von Mises-Fisher distribution, so that the model represents the intrinsic harmonic structures within a region of smoothly connected notes. To identify the note boundaries, the local homogeneity in the harmonic structure is exploited by two different methods: self-similarity analysis and a hidden Markov model. The proposed system identifies the note attributes, including the onset time, duration, and note pitch. Evaluations are conducted from various aspects to verify the performance improvement of the proposed system and its robustness, using the latest evaluation methodology for singing transcription. The results show that the proposed system significantly outperforms other systems, including the state-of-the-art systems.
Periodic interference frequently affects the measurement of small signals and causes problems in clinical diagnostics. Adaptive filters can be used as potential tools for cancelling such interference. However, when the interference has a frequency fluctuation, the ideal adaptive-filter coefficients for cancelling the interference also fluctuate. When the adaptation of the algorithm is slow compared with the frequency fluctuation, the interference-cancelling performance is degraded. However, if the adaptation is too quick, the performance is degraded owing to the target signal. To overcome this problem, we propose an adaptive filter that suppresses the fluctuation of the ideal coefficients by utilizing a $\frac{\pi}{2}$ phase-delay device. This method assumes that the frequency response characterizing the transmission path from the interference source to the main input signal is sufficiently smooth. In the numerical examples, the proposed method exhibits good performance in the presence of a frequency fluctuation when the forgetting factor is large. Moreover, we show that the proposed method reduces the calculation cost.
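As background for the abstract above, the following is a minimal sketch of the classic two-weight adaptive interference canceller that combines a reference sinusoid with a copy shifted by $\frac{\pi}{2}$. It is not the authors' method: it uses a plain LMS update rather than a forgetting-factor algorithm, assumes a fixed 50 Hz interference with no frequency fluctuation, and all names (`cancel_periodic_interference`, `mu`, the signal parameters) are hypothetical.

```python
import numpy as np

def cancel_periodic_interference(primary, ref_i, ref_q, mu=0.01):
    """Two-weight LMS canceller (illustrative, not the paper's algorithm).

    primary : measured signal = small target + periodic interference
    ref_i   : reference sinusoid at the interference frequency
    ref_q   : the same reference shifted by pi/2
    """
    w = np.zeros(2)                      # weights for in-phase / quadrature refs
    out = np.empty(len(primary))
    for n in range(len(primary)):
        x = np.array([ref_i[n], ref_q[n]])
        e = primary[n] - w @ x           # error = interference-cancelled output
        w += 2.0 * mu * e * x            # LMS weight update
        out[n] = e
    return out, w

# Synthetic example (all values are assumptions for illustration).
fs = 1000.0
t = np.arange(0, 5, 1 / fs)
rng = np.random.default_rng(0)
target = 0.1 * rng.standard_normal(t.size)          # small signal of interest
interference = np.sin(2 * np.pi * 50 * t + 0.7)     # unknown phase of 0.7 rad
primary = target + interference
ref_i = np.sin(2 * np.pi * 50 * t)
ref_q = np.cos(2 * np.pi * 50 * t)                  # pi/2-shifted reference

cleaned, w = cancel_periodic_interference(primary, ref_i, ref_q)
residual_power = np.mean((cleaned[-1000:] - target[-1000:]) ** 2)
print("weights:", w, "residual interference power:", residual_power)
```

With two quadrature references, a pair of real weights can match any amplitude and phase of the interference. A larger `mu` tracks changes faster but lets the target signal disturb the weights more, which is the trade-off the abstract describes.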
Po-Yi SHIH Po-Chuan LIN Jhing-Fa WANG
This paper describes a novel harmonic-based robust voice activity detection (H-RVAD) method that uses a harmonic spectral local peak (HSLP) feature. HSLP is extracted by spectral amplitude analysis between adjacent formants, and this characteristic can be used to accurately identify and verify audio streams containing meaningful human speech in low-SNR environments. In addition, an enhanced framework for low-SNR noisy speech recognition is proposed, comprising a wakeup module, a speech recognition module, and a confirmation module. In this framework, users can confirm or reject the system feedback when a recognition result is given, which prevents voiced noise from misleading the recognition result. The H-RVAD method is evaluated on the AURORA2 corpus with eight types of noise and three SNR levels, and it increases the overall average performance from 4% to 20%. In home noise, the H-RVAD method improves the sentence recognition rate from 4% to 14% on average.
Ryo ISHIKAWA Yoichiro TAKAYAMA Kazuhiko HONJO
A novel experimental design method based on a low-frequency active load-pull technique that includes harmonic tuning has been proposed for high-efficiency microwave power amplifiers. The intrinsic core component of a transistor with a maximum oscillation frequency of more than several tens of gigahertz can be approximately assumed as the nonlinear current source with no frequency dependence at an operation frequency of several gigahertz. In addition, the reactive parasitic elements in a transistor can be omitted at a frequency of much less than 1GHz. Therefore, the optimum impedance condition including harmonics for obtaining high efficiency in a nonlinear current source can be directly investigated based on a low-frequency active harmonic load-pull technique in the low-frequency region. The optimum load condition at the operation frequency for an external load circuit can be estimated by considering the properties of the reactive parasitic elements and the nonlinear current source. For an InGaAs/GaAs pHEMT, active harmonic load-pull considering up to the fifth-order harmonic frequency was experimentally carried out at the fundamental frequency of 20MHz. By using the estimated optimum impedance condition for an equivalent nonlinear current source, high-frequency amplifiers were designed and fabricated at the 1.9-GHz, 2.45-GHz, and 5.8-GHz bands. The fabricated amplifiers exhibited maximum drain efficiency values of 79%, 80%, and 74% at 1.9GHz, 2.47GHz, and 5.78GHz, respectively.
Asahi TAKAOKA Shingo OKUMA Satoshi TAYU Shuichi UENO
A harmonious coloring of an undirected simple graph is a vertex coloring such that adjacent vertices are assigned different colors and each pair of colors appears together on at most one edge. The harmonious chromatic number of a graph is the least number of colors used in such a coloring. The harmonious chromatic number of a path is known, whereas the problem of finding the harmonious chromatic number is NP-hard even for trees with pathwidth at most 2. Hence, we consider the harmonious coloring of trees with pathwidth 1, which are also known as caterpillars. This paper shows the harmonious chromatic number of a caterpillar with at most one vertex of degree more than 2. We also show an upper bound on the harmonious chromatic number of a 3-regular caterpillar.
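The definition in the abstract above can be checked mechanically. The sketch below tests whether a given coloring is harmonious and finds the harmonious chromatic number of very small graphs by exhaustive search. It only illustrates the definition and is not the paper's construction for caterpillars. The function names are hypothetical, and the brute-force search is exponential, so it is practical only for a handful of vertices.

```python
from itertools import product

def is_harmonious(edges, coloring):
    """Return True if `coloring` (vertex -> color) is a harmonious coloring."""
    seen_pairs = set()
    for u, v in edges:
        cu, cv = coloring[u], coloring[v]
        if cu == cv:                      # adjacent vertices must differ
            return False
        pair = frozenset((cu, cv))
        if pair in seen_pairs:            # each color pair on at most one edge
            return False
        seen_pairs.add(pair)
    return True

def harmonious_chromatic_number(n_vertices, edges):
    """Smallest k admitting a harmonious coloring (exhaustive search)."""
    for k in range(1, n_vertices + 1):
        for colors in product(range(k), repeat=n_vertices):
            if is_harmonious(edges, colors):
                return k
    return n_vertices  # unreachable: distinct colors are always harmonious

# A path on four vertices and a star with three leaves (both are caterpillars).
path_edges = [(0, 1), (1, 2), (2, 3)]
star_edges = [(0, 1), (0, 2), (0, 3)]

path_ok = is_harmonious(path_edges, [0, 1, 2, 0])
path_bad = is_harmonious(path_edges, [0, 1, 0, 1])
h_path = harmonious_chromatic_number(4, path_edges)
h_star = harmonious_chromatic_number(4, star_edges)
print(path_ok, path_bad, h_path, h_star)
```

The path needs three colors because its three edges require three distinct color pairs, and two colors provide only one pair. The star needs four because every leaf must differ from the center and from every other leaf.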
Jonggyun LIM Wonshil KANG Kang-Yoon LEE Hyunchul KU
A class-E power amplifier (PA) with a novel dynamic biasing scheme is proposed to enhance power added efficiency (PAE) over a wide power range. A look-up table (LUT) adjusts the input power and drain supply voltage simultaneously to keep the power transistor in switch-mode operation and to optimize the PAE. Experimental results show that the class-E PA using the proposed scheme with a harmonic suppression filter gives a PAE higher than 80% over an 8.5 dB range with less than 40 dBc harmonic suppression.
Xiaoxiong XING Yoshinori DOBASHI Tsuyoshi YAMAMOTO Yosuke KATSURA Ken ANJYO
We present an algorithm for efficient rendering of animated hair under a dynamic, low-frequency lighting environment. We use spherical harmonics (SH) to represent the environmental light. The transmittances between a point on a hair strand and the light sources are also represented by SH functions. Then, a convolution of SH functions and the scattering function of a hair strand is precomputed. This allows us to efficiently compute the intensity at a point on the hair. However, the computation of the transmittance is very time-consuming. We address this problem by using a voxel-based approach: the transmittance is computed by using a voxelized hair model. We further accelerate the computation by sampling the voxels. By using our method, we can render a hair model consisting of tens of thousands of hair strands at interactive frame rates.
Jorge TREVINO Takuma OKAMOTO Yukio IWAYA Yôiti SUZUKI
Sound field reproduction systems seek to realistically convey 3D spatial audio by re-creating the sound pressure inside a region enclosing the listener. High-order Ambisonics (HOA), a sound field reproduction technology, is notable for defining a scalable encoding format that characterizes the sound field in a system-independent way. Sound fields sampled with a particular microphone array and encoded into the HOA format can be reproduced using any sound presentation device, typically a loudspeaker array, by using a HOA decoder. The HOA encoding format is based on the spherical harmonic decomposition; this makes it easier to design a decoder for large arrays of loudspeakers uniformly distributed over all directions. In practice, it is seldom possible to cover all directions with loudspeakers placed at regular angular intervals. An irregular array, one where the angular separation between adjacent loudspeakers is not constant, does not perform as well as a regular one when reproducing HOA due to the uneven sampling of the spherical harmonics. This paper briefly introduces the techniques used in HOA and advances a new approach to design HOA decoders for irregular loudspeaker arrays. The main difference between conventional methods and our proposal is the use of a new error metric: the radial derivative of the reconstruction error. Minimizing this metric leads to a smooth reproduction, accurate over a larger region than that achieved by conventional HOA decoders. We evaluate our proposal using the computer simulation of two 115-channel loudspeaker arrays: a regular and an irregular one. We find that our proposal results in a larger listening region when used to decode HOA for reproduction using the irregular array. On the other hand, applying our method matches the high-quality reproduction that can be attained with the regular array and conventional HOA decoders.
Pil-Ho LEE Hyun Bae LEE Young-Chan JANG
A 125MHz 64-phase delay-locked loop (DLL) is implemented for time recovery in a digital wire-line system. The architecture of the proposed DLL comprises a coarse-locking circuit added to a conventional DLL circuit, which consists of a delay line including a bias circuit, phase detector, charge pump, and loop filter. The proposed coarse-locking circuit reduces the locking time of the DLL and prevents harmonic locking, regardless of the duty cycle of the clock. In order to verify the performance of the proposed coarse-locking circuit, a 64-phase DLL with an operating frequency range of 40 to 200MHz is fabricated using a 0.18-µm 1-poly 6-metal CMOS process with a 1.8V supply. The measured rms and peak-to-peak jitter of the output clock are 3.07ps and 21.1ps, respectively. The DNL and INL of the 64-phase output clock are measured to be -0.338/+0.164 LSB and -0.464/+0.171 LSB, respectively, at an operating frequency of 125MHz. The area and power consumption of the implemented DLL are 0.3mm² and 12.7mW, respectively.
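To put the LSB-based figures above in context, the sketch below works out the ideal phase spacing of a 64-phase clock at 125MHz and shows one common way to compute DNL and INL from phase edge times. The DNL/INL definitions used here are an assumption, since the abstract does not state the authors' measurement procedure. The edge times are made-up values and the function names are hypothetical.

```python
def ideal_phase_step(freq_hz, n_phases):
    """Ideal spacing (1 LSB) between adjacent phases of a multiphase clock."""
    return 1.0 / (freq_hz * n_phases)

def dnl_inl(edge_times, lsb):
    """DNL and INL in LSB units, under the assumed definitions:
    DNL[i] = (t[i+1] - t[i]) / LSB - 1
    INL[i] = (t[i] - t[0]) / LSB - i
    """
    dnl = [(b - a) / lsb - 1.0 for a, b in zip(edge_times, edge_times[1:])]
    inl = [(t - edge_times[0]) / lsb - i for i, t in enumerate(edge_times)]
    return dnl, inl

lsb = ideal_phase_step(125e6, 64)       # 8 ns period / 64 phases = 125 ps
print(f"1 LSB = {lsb * 1e12:.1f} ps")

# Hypothetical edge times for the first four phases (not measured data).
edges = [0.0, 125e-12, 260e-12, 375e-12]
dnl, inl = dnl_inl(edges, lsb)
print("DNL:", dnl)
print("INL:", inl)
```

Under these definitions, the third edge arriving 10 ps late appears as a DNL of +0.08 LSB followed by -0.08 LSB, and as a single INL excursion of +0.08 LSB.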
Alexandros KORDONIS Takashi HIKIHARA
AC conversion has a huge variety of applications, and so there are many ongoing research topics, as in every type of power electronic conversion. New semiconductors allow the switching frequency to be increased, which opens up new prospects for improving converter operation. Many other possible nonlinear operation regimes, including period doubling and chaotic oscillations, appear besides the conventional steady-state operation. In this work, a nonlinear discrete-time model of an AC/AC buck-type converter is proposed. A discrete-time iterative map is derived to highlight the sensitive switching dynamics. The model is able to capture fast-scale phenomena and short transient effects. It offers more information than other methods such as averaging. According to Electro-Magnetic Compatibility (EMC) regulations, low wide-band noise is more acceptable than high narrow-band noise. Therefore, the goal of this work is to spread the harmonic noise into a wide frequency spectrum with lower amplitudes than the conventional comb-like spectrum, which has distinctive amplitudes at multiples of the switching frequency. Numerical and experimental results show that the converter can operate in a chaotic regime, and the advantages in terms of performance improvement are also discussed.
This paper proposes a novel robust speech F0 estimation method using Summation Residual Harmonics (SRH) based on TV-CAR (Time-Varying Complex AR) analysis. The SRH-based F0 estimation was proposed by A. Alwan, in which the criterion is calculated from LP residual signals. The criterion is the summation of the residual spectrum values at the harmonics. In this paper, we propose SRH-based F0 estimation based on the TV-CAR analysis, in which the criterion is calculated from the complex AR residual. Since the complex AR residual provides higher spectral resolution, the criterion can be expected to be effective for F0 estimation. The experimental results demonstrate that the proposed method performs better than conventional methods, namely weighted autocorrelation and YIN.
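The sketch below illustrates the harmonic-summation idea behind the SRH criterion mentioned above, in the form it is commonly stated: SRH(f) = E(f) + Σ_{k=2..N} [E(k·f) − E((k−1/2)·f)], where E is the amplitude spectrum of the residual. It is a simplified illustration, not the proposed method: no LP or TV-CAR analysis is performed, a synthetic flat-spectrum harmonic signal stands in for the residual, and the function name, search range, and number of harmonics are assumptions.

```python
import numpy as np

def srh_f0(residual, fs, f0_min=80, f0_max=400, n_harm=5):
    """Pick the F0 candidate (integer Hz) that maximises the SRH criterion."""
    n_fft = int(fs)                                   # 1 Hz per FFT bin
    windowed = residual * np.blackman(len(residual))
    spec = np.abs(np.fft.rfft(windowed, n=n_fft))
    spec = spec / np.linalg.norm(spec)                # normalised amplitude spectrum
    best_f0, best_score = None, -np.inf
    for f in range(f0_min, f0_max + 1):
        score = spec[f]
        for k in range(2, n_harm + 1):
            # reward energy at harmonics, penalise energy between them
            score += spec[k * f] - spec[int(round((k - 0.5) * f))]
        if score > best_score:
            best_f0, best_score = f, score
    return best_f0, best_score

# Synthetic stand-in for a residual frame: five equal harmonics of 140 Hz.
fs = 16000
t = np.arange(1600) / fs                              # 100 ms frame
frame = sum(np.cos(2 * np.pi * 140 * k * t) for k in range(1, 6))
f0_est, _ = srh_f0(frame, fs)
print("estimated F0:", f0_est, "Hz")
```

The subtracted inter-harmonic terms are what discourage sub-harmonic (octave-down) candidates, whose odd multiples would otherwise fall on true harmonics.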
Teerachot SIRIBURANON Takahiro SATO Ahmed MUSA Wei DENG Kenichi OKADA Akira MATSUZAWA
This paper presents a 20 GHz push-push VCO realized by a 10 GHz super-harmonic coupled quadrature oscillator for a quadrature 60 GHz frequency synthesizer. The output nodes are peaked by a tunable second harmonic resonator. The proposed VCO is implemented in a 65 nm CMOS process. It achieves a tuning range of 3.5 GHz from 16.1 GHz to 19.6 GHz with a phase noise of -106 dBc/Hz at 1 MHz offset. The power consumption of the core oscillators is 10.3 mW and an FoM of -181.3 dBc/Hz is achieved.
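For readers unfamiliar with the oscillator figure of merit, the sketch below evaluates the commonly used definition FoM = L(Δf) − 20·log10(f0/Δf) + 10·log10(P/1 mW) with the numbers quoted above. The abstract does not say at which carrier frequency the FoM was evaluated or which exact definition was used, so both are assumptions here, and the function name is hypothetical.

```python
import math

def vco_fom(phase_noise_dbc_hz, f0_hz, offset_hz, power_mw):
    """Common oscillator figure of merit in dBc/Hz (assumed definition)."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f0_hz / offset_hz)
            + 10.0 * math.log10(power_mw))

# Evaluate at both ends of the reported tuning range (carrier choice assumed).
fom_low = vco_fom(-106.0, 16.1e9, 1e6, 10.3)
fom_high = vco_fom(-106.0, 19.6e9, 1e6, 10.3)
print(f"FoM at 16.1 GHz: {fom_low:.1f} dBc/Hz")
print(f"FoM at 19.6 GHz: {fom_high:.1f} dBc/Hz")
```

Under this definition the two ends of the tuning range give roughly -180.0 and -181.7 dBc/Hz, which brackets the reported -181.3 dBc/Hz.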
Dai-Kyung HYUN Dae-Jin JUNG Hae-Yeoun LEE Heung-Kyu LEE
In this paper, we propose a novel camera identification method based on photo-response non-uniformity (PRNU), which performs well even with rotated videos. One of the disadvantages of PRNU-based camera identification methods is that they are very sensitive to de-synchronization. If a video under investigation is slightly rotated, the identification process fails without synchronization. The proposed method solves this kind of out-of-sync problem by achieving rotation tolerance using an Optimal Tradeoff Circular Harmonic Function (OTCHF) correlation filter. The experimental results show that the proposed method identifies the source device with high accuracy from rotated videos.
Sang Ha PARK Seokjin LEE Koeng-Mo SUNG
Non-negative matrix factorization (NMF) is widely used for music transcription because of its efficiency. However, the conventional NMF-based music transcription algorithm often causes harmonic confusion errors or time split-up errors, because the NMF decomposes the time-frequency data according to the activated frequency in its time. To solve these problems, we propose an NMF with temporal continuity and harmonicity constraints. The temporal continuity constraint prevents the time split-up of the continuous time components, and the harmonicity constraint helps to bind the fundamental with harmonic frequencies by reducing the additional octave errors. The transcription performance of the proposed algorithm was compared with that of the conventional algorithms, which showed that the proposed method helps to reduce additional false errors and increases the overall transcription performance.
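As a point of reference for the abstract above, the sketch below shows conventional NMF with the standard multiplicative updates for the Euclidean cost, applied to a toy "spectrogram" built from two note templates. It deliberately omits the temporal continuity and harmonicity constraints that the paper adds. The templates, activations, and function name are assumptions for illustration only.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Plain NMF, V ~ W @ H, via multiplicative updates (Euclidean cost)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps    # spectral templates
    H = rng.random((rank, V.shape[1])) + eps    # activations over time
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data: two notes with overlapping partials (hypothetical templates).
note_a = np.array([0, 1.0, 0, 0.5, 0, 0.25, 0, 0])
note_b = np.array([0, 0, 1.0, 0, 0, 0.5, 0, 0.25])
W_true = np.stack([note_a, note_b], axis=1)          # 8 frequency bins x 2 notes
H_true = np.zeros((2, 10))
H_true[0, 0:6] = 1.0                                 # note A active in frames 0-5
H_true[1, 4:10] = 1.0                                # note B active in frames 4-9
V = W_true @ H_true

W, H = nmf(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_error)
```

Because nothing in this cost function ties a template to a harmonic series or keeps an activation smooth over time, the factorization is free to split or merge notes, which is the behavior the added constraints are meant to curb.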
Jie PAN Kazuki HAYANO Masayuki MORI Koichi MAEZAWA
Oscillators based on an active transmission line periodically loaded with RTD pairs are studied using circuit simulation, with special attention to the behavior of harmonics. Generation of a strong high-order (9th) harmonic was observed. This is caused by frequency locking in the high-frequency passband. Harmonic oscillators based on this phenomenon are promising for high-performance THz sources.
Makoto NAKASHIZUKA Hiroyuki OKUMURA Youji IIGUNI
In this paper, we propose a method for supervised single-channel speech separation through sparse decomposition using periodic signal models. The proposed separation method employs sparse decomposition, which decomposes a signal into a set of periodic signals under a sparsity penalty. In order to achieve separation through sparse decomposition, the decomposed periodic signals have to be assigned to the corresponding sources. For the assignment of the periodic signals, we introduce clustering using a K-means algorithm to group the decomposed periodic signals into as many clusters as the number of speakers. After the clustering, each cluster is assigned to its corresponding speaker using preliminarily learnt codebooks. Through separation experiments, we compare our method with MaxVQ, which performs separation in the frequency-spectrum domain. The experimental results in terms of signal-to-distortion ratio show that the proposed sparse decomposition method is comparable to the frequency-domain approach and has a lower computational cost for the assignment of speech components.
Shoichi OSHIMA Mamoru UGAJIN Mitsuru HARADA
A new low-power feedback structure for a power amplifier (PA) reduces signal distortion while keeping the power efficiency of the PA high. The feedback structure injects the envelope of the third-order harmonics into the input signal. In adopting this method for a class-A amplifier, we obtain over 10% higher efficiency while maintaining the same adjacent channel power ratio (ACPR). The power consumption of the additional circuit is 200 µW.
Kengo MURASAWA Koki SATO Takehiko HIDAKA
A generation effect of higher harmonics for an externally applied signal in a photoconductive (PC) terahertz (THz)-wave emitter has been found. This effect is applicable to accurate measurement of the frequencies of THz waves. This paper describes the reasons why higher harmonics are generated in a PC device. The dependence of the photoconductance on the applied voltage in the PC device consists of a flat range and a negative slant range, and one sharply bending point is formed at the boundary between the flat and slant ranges. When the PC device is irradiated by two laser beams with slightly different optical frequencies, the photoconductance is strongly modulated at the optical beat frequency in the THz region by photomixing the two laser beams. As a result, three bending points are formed in the average photoconductance (introduced as the average of the temporal photoconductance varying at the THz frequency). The slants delimited by the three bending points differ from each other. When the variation range of the applied voltage, driven by the signal input superimposed on the bias voltage, covers the voltage of one of the bending points, the photoconductance (or the average photoconductance in optical beating) varies along two different slopes, the resultant temporal photocurrent is largely distorted, and the harmonics of the signal input are then generated in the photocurrent. The following features are clarified: (1) the harmonics of the signal input are generated by appropriately adjusting the bias voltage and the amplitude of the signal input, regardless of the presence/absence of optical beating; (2) the efficiency of the harmonic generation is about $10^{-4}$ to $10^{-5}$; and (3) harmonics over the 35th order with almost flat amplitudes (-3.8 dB/octave) are generated.