IEICE global.ieice.org Site

Keyword Search Result

[Keyword] music(98hit)

21-40hit(98hit)

A Local Feature Aggregation Method for Music Retrieval
Jin S. SEO

LETTER

Publicized:
2017/10/16
Vol:
E101-D No:1
Page(s):
64-67
The song-level feature summarization is an essential building block for browsing, retrieval, and indexing of digital music. This paper proposes a local pooling method to aggregate the feature vectors of a song over the universal background model. Two types of local activation patterns of feature vectors are derived; one representation is derived in the form of histogram, and the other is given by a binary vector. Experiments over three publicly-available music datasets show that the proposed local aggregation of the auditory features is promising for music-similarity computation.
Robust Singing Transcription System Using Local Homogeneity in the Harmonic Structure
Hoon HEO Kyogu LEE

PAPER-Music Information Processing

Publicized:
2017/02/18
Vol:
E100-D No:5
Page(s):
1114-1123
Automatic music transcription from audio has long been one of the most intriguing problems and a challenge in the field of music information retrieval, because it requires a series of low-level tasks such as onset/offset detection and F0 estimation, followed by high-level post-processing for symbolic representation. In this paper, a comprehensive transcription system for monophonic singing voice based on harmonic structure analysis is proposed. Given a precise tracking of the fundamental frequency, a novel acoustic feature is derived to signify the harmonic structure in singing voice signals, regardless of the loudness and pitch. It is then used to generate a parametric mixture model based on the von Mises-Fisher distribution, so that the model represents the intrinsic harmonic structures within a region of smoothly connected notes. To identify the note boundaries, the local homogeneity in the harmonic structure is exploited by two different methods: the self-similarity analysis and hidden Markov model. The proposed system identifies the note attributes including the onset time, duration and note pitch. Evaluations are conducted from various aspects to verify the performance improvement of the proposed system and its robustness, using the latest evaluation methodology for singing transcription. The results show that the proposed system significantly outperforms other systems including the state-of-the-art systems.
Fast Lyric Area Extraction from Images of Printed Korean Music Scores
Cong Minh DINH Hyung Jeong YANG Guee Sang LEE Soo Hyung KIM

PAPER-Artificial Intelligence, Data Mining

Publicized:
2016/02/23
Vol:
E99-D No:6
Page(s):
1576-1584
In recent years, optical music recognition (OMR) has been extensively developed, particularly for use with mobile devices that require fast processing to recognize and play live the notes in images captured from sheet music. However, most techniques that have been developed thus far have focused on playing back instrumental music and have ignored the importance of lyric extraction, which is time consuming and affects the accuracy of the OMR tools. The text of the lyrics adds complexity to the page layout, particularly when lyrics touch or overlap musical symbols, in which case it is very difficult to separate them from each other. In addition, the distortion that appears in captured musical images makes the lyric lines curved or skewed, making the lyric extraction problem more complicated. This paper proposes a new approach in which lyrics are detected and extracted quickly and effectively. First, in order to resolve the distortion problem, the image is undistorted by a method using information of stave lines and bar lines. Then, through the use of a frequency count method and heuristic rules based on projection, the lyric areas are extracted, the cases where symbols touch the lyrics are resolved, and most of the information from the musical notation is kept even when the lyrics and music notes are overlapping. Our algorithm demonstrated a short processing time and remarkable accuracy on two test datasets of images of printed Korean musical scores: the first set included three hundred scanned musical images; the second set had two hundred musical images that were captured by a digital camera.
An Extension of MUSIC Exploiting Higher-Order Moments via Nonlinear Mapping
Yuya SUGIMOTO Shigeki MIYABE Takeshi YAMADA Shoji MAKINO Biing-Hwang JUANG

PAPER-Engineering Acoustics

Vol:
E99-A No:6
Page(s):
1152-1162
MUltiple SIgnal Classification (MUSIC) is a standard technique for direction of arrival (DOA) estimation with high resolution. However, MUSIC cannot estimate DOAs accurately in the case of underdetermined conditions, where the number of sources exceeds the number of microphones. To overcome this drawback, an extension of MUSIC using cumulants called 2q-MUSIC has been proposed, but this method greatly suffers from the variance of the statistics, given as the temporal mean of the observation process, and requires long observation. In this paper, we propose a new approach for extending MUSIC that exploits higher-order moments of the signal for the underdetermined DOA estimation with smaller variance. We propose an estimation algorithm that nonlinearly maps the observed signal onto a space with expanded dimensionality and conducts MUSIC-based correlation analysis in the expanded space. Since the dimensionality of the noise subspace is increased by the mapping, the proposed method enables the estimation of DOAs in the case of underdetermined conditions. Furthermore, we describe the class of mapping that allows us to analyze the higher-order moments of the observed signal in the original space. We compare 2q-MUSIC and the proposed method through an experiment assuming that the true number of sources is known as prior information to evaluate in terms of the bias-variance tradeoff of the statistics and computational complexity. The results clarify that the proposed method has advantages for both computational complexity and estimation accuracy in short-time analysis, i.e., the time duration of the analyzed data is short.
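For readers unfamiliar with the baseline that this abstract extends, the following is a minimal sketch of conventional MUSIC, not the proposed nonlinear-mapping method and not 2q-MUSIC. It assumes a uniform linear array with half-wavelength spacing, narrowband uncorrelated sources, and fewer sources than sensors; the function names, the simulated signals, and all parameter values are illustrative assumptions.

```python
import numpy as np

def steering_vector(theta_deg, n_sensors, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths; assumed geometry)."""
    k = np.arange(n_sensors)
    return np.exp(-2j * np.pi * spacing * k * np.sin(np.deg2rad(theta_deg)))

def music_spectrum(snapshots, n_sources, grid_deg, spacing=0.5):
    """Conventional MUSIC pseudo-spectrum from an (n_sensors, n_snapshots) complex array."""
    n_sensors, n_snap = snapshots.shape
    # Sample spatial correlation matrix.
    R = snapshots @ snapshots.conj().T / n_snap
    # eigh returns eigenvalues in ascending order, so the first columns
    # of eigvecs span the noise subspace.
    _, eigvecs = np.linalg.eigh(R)
    En = eigvecs[:, : n_sensors - n_sources]
    spectrum = np.empty(len(grid_deg))
    for i, theta in enumerate(grid_deg):
        a = steering_vector(theta, n_sensors, spacing)
        proj = En.conj().T @ a
        # Peaks appear where the steering vector is orthogonal to the noise subspace.
        spectrum[i] = 1.0 / np.real(proj.conj() @ proj)
    return spectrum

def simulate_and_estimate(true_deg=(-20.0, 30.0), n_sensors=8, n_snap=400, seed=0):
    """Simulate two sources in noise (hypothetical test scene) and return estimated DOAs."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([steering_vector(t, n_sensors) for t in true_deg])
    shape_s = (len(true_deg), n_snap)
    S = (rng.standard_normal(shape_s) + 1j * rng.standard_normal(shape_s)) / np.sqrt(2)
    shape_n = (n_sensors, n_snap)
    noise = 0.1 * (rng.standard_normal(shape_n) + 1j * rng.standard_normal(shape_n))
    X = A @ S + noise
    grid = np.arange(-90.0, 90.5, 0.5)
    P = music_spectrum(X, len(true_deg), grid)
    # Keep the strongest local maxima of the pseudo-spectrum.
    peaks = [i for i in range(1, len(grid) - 1) if P[i] > P[i - 1] and P[i] >= P[i + 1]]
    peaks.sort(key=lambda i: P[i], reverse=True)
    return sorted(float(grid[i]) for i in peaks[: len(true_deg)])
```

Calling `simulate_and_estimate()` returns the two grid angles with the largest pseudo-spectrum peaks. The noise subspace here has dimension `n_sensors - n_sources`, which is why this baseline breaks down in the underdetermined case the abstract addresses.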
Continuous Music-Emotion Recognition Based on Electroencephalogram
Nattapong THAMMASAN Koichi MORIYAMA Ken-ichi FUKUI Masayuki NUMAO

PAPER-Music Information Processing

Publicized:
2016/01/22
Vol:
E99-D No:4
Page(s):
1234-1241
Research on emotion recognition using electroencephalogram (EEG) of subjects listening to music has become more active in the past decade. However, previous works did not consider emotional oscillations within a single musical piece. In this research, we propose a continuous music-emotion recognition approach based on brainwave signals. While considering the subject-dependent and changing-over-time characteristics of emotion, our experiment included self-reporting and continuous emotion annotation in the arousal-valence space. Fractal dimension (FD) and power spectral density (PSD) approaches were adopted to extract informative features from raw EEG signals, and we then applied emotion classification algorithms to discriminate binary classes of emotion. According to our experimental results, FD slightly outperformed the PSD approach in both arousal and valence classification, and FD was found to have a higher correlation with emotion reports than PSD. In addition, continuous emotion recognition during music listening based on EEG was found to be an effective method for tracking oscillations in reported emotion and provides an opportunity to better understand human emotional processes.
Improved Estimation of Direction-of-Arrival by Adaptive Selection of Algorithms in Angular Spread Environments
Tomomi AOKI Yasuhiko TANABE

PAPER-Antennas and Propagation

Vol:
E98-B No:12
Page(s):
2454-2462
This paper proposes a novel direction-of-arrival (DOA) estimation method that can reduce performance degradation due to angular spread. Some algorithms previously proposed for such estimation make assumptions about the distributions of amplitude and phase for incident waves because most DOA estimation algorithms are sensitive to angular spread. However, when the assumptions are inaccurate, these algorithms perform poorly as compared with algorithms without countermeasures against angular spread. In this paper, we propose a method for selecting an appropriate DOA estimation algorithm according to the channel vector of each source signal as estimated by independent component analysis. Computer simulations show that the proposed method can robustly estimate DOA in environments with angular spread.
DOA Estimation Based on GSA for CDMA Signals
Chao-Li MENG Shiaw-Wu CHEN Ann-Chen CHANG

LETTER-Digital Signal Processing

Vol:
E98-A No:10
Page(s):
2182-2186
This letter deals with the direction-of-arrival (DOA) estimation problem based on the gravitational search algorithm (GSA) with the multiple signal classification (MUSIC) criterion for code-division multiple access (CDMA) signals. It has been shown that the estimation accuracy of the search-based MUSIC estimator strictly depends on the number of search grids used during the search process, which is time consuming, and the required number of search grids is difficult to determine. In conjunction with GSA-based optimization, high-resolution DOA estimation can be obtained without knowing the search grid size in advance. In this letter, we first present a GSA-based DOA estimator with the MUSIC criterion under high interferer-to-noise ratio circumstances. Second, to increase the estimation accuracy, we also propose an improved GSA with adaptive multiple accelerations, which depend on the Newton-Raphson method. Several computer simulations are provided for illustration and comparison.
An Energy Efficient Time-Frequency Transformation of Chirp Signals in Multipath Channels for MUSIC-Based TOA Estimation
Sangdeok KIM Jong-Wha CHONG

PAPER-Digital Signal Processing

Vol:
E98-A No:8
Page(s):
1769-1776
Range estimation based on time of arrival (TOA) is becoming increasingly important with the emergence of location-based applications and next-generation location-aware wireless sensor networks. For radar and positioning systems, chirp signals have primarily been used due to their inborn signal properties for decomposition. Recently, chirp signal has been selected as the baseline standard of ISO/IEC 24730-5 and IEEE 802.15.4a in 2.4GHz, organized for the development of a real-time accurate positioning system. When estimating the TOA of the received signals in multipath channel, the super-resolution algorithms, known as estimation of signal parameters via rotational invariance techniques (ESPRIT), multiple signal classification method (MUSIC) and matrix pencil (MP), are preferred due to their superiority in decomposing the received paths. For the super-resolution algorithm-based TOA estimation of chirp signals, the received chirp signals must be transformed into a sinusoidal form for the super-resolution algorithm. The conventional transformation, the de-chirping technique, changes the received chirp signals to sinusoids so that the super-resolution algorithms can estimate the TOA of the received chirp signals through a frequency estimation of the transformed sinusoids. In practice, the initial timing synchronizer at receiver tries to find the maximum energy point at which the received paths are overlapped maximally. At this time, the conventional de-chirping yields lossy transformed sinusoids for the first arrival path from the received samples synchronized to the maximum energy point. The first arrival path is not involved in the transformed sinusoids with the conventional transformation, leading to performance degradation. However, the proposed energy efficient time-frequency transformation achieves lossless transformation by using the extended reference chirp signals. The proposed transformation is incorporated with MUSIC-based TOA estimation. 
The effectiveness of the proposed transformation is analyzed and verified. The root mean squared error (RMSE) of the proposed transformation is compared with Cramer-Rao lower bound and those for the conventional algorithms such as super-resolution, ESPRIT and matrix pencil algorithm in multipath channel.
Estimation of a Received Signal at an Arbitrary Remote Location Using MUSIC Method
Makoto TANAKA Hisato IWAI Hideichi SASAOKA

PAPER

Vol:
E98-B No:5
Page(s):
806-813
In recent years, various applications based on propagation characteristics have been developed. They generally utilize the locality of the fading characteristics of multipath environments. On the other hand, if a received signal at a remote location can be estimated beyond the correlation distance of the multipath fading environment, a wide variety of new applications become possible. In this paper, we attempt to estimate a received signal at a remote location using the MUSIC method and the least squares method. Based on the plane wave assumption for each arriving wave, the multipath environment is analyzed by estimating the directions of arrival with the MUSIC method and the complex amplitudes of the received signals with the least squares method. We present evaluation results on the estimation performance of the method by computer simulations.
Supporting Jogging at an Even Pace by Synchronizing Music Playback Speed with Runner's Pace
Tetsuro KITAHARA Shunsuke HOKARI Tatsuya NAGAYASU

LETTER-Human-computer Interaction

Publicized:
2015/01/09
Vol:
E98-D No:4
Page(s):
968-971
In this paper, we propose a jogging support system that plays back background music while synchronizing its tempo with the user's jogging pace. Keeping an even pace is important in jogging but it is not easy due to tiredness. Our system indicates the variation of the runner's pace by changing the playback speed of music according to the user's pace variation. Because this requires the runner to keep an even pace in order to enjoy the music at its normal speed, the runner will be spontaneously influenced to keep an even pace. Experimental results show that our system reduced the variation of jogging pace.
Music Signal Separation Based on Supervised Nonnegative Matrix Factorization with Orthogonality and Maximum-Divergence Penalties
Daichi KITAMURA Hiroshi SARUWATARI Kosuke YAGI Kiyohiro SHIKANO Yu TAKAHASHI Kazunobu KONDO

LETTER-Engineering Acoustics

Vol:
E97-A No:5
Page(s):
1113-1118
In this letter, we address monaural source separation based on supervised nonnegative matrix factorization (SNMF) and propose a new penalized SNMF. Conventional SNMF often degrades the separation performance owing to the basis-sharing problem. Our penalized SNMF forces nontarget bases to become different from the target bases, which increases the separated sound quality.
Speech/Music Classification Enhancement for 3GPP2 SMV Codec Based on Deep Belief Networks
Ji-Hyun SONG Hong-Sub AN Sangmin LEE

LETTER-Speech and Hearing

Vol:
E97-A No:2
Page(s):
661-664
In this paper, we propose a robust speech/music classification algorithm to improve the performance of speech/music classification in the selectable mode vocoder (SMV) of 3GPP2 using deep belief networks (DBNs), which is a powerful hierarchical generative model for feature extraction and can determine the underlying discriminative characteristic of the extracted features. The six feature vectors selected from the relevant parameters of the SMV are applied to the visible layer in the proposed DBN-based method. The performance of the proposed algorithm is evaluated using the detection accuracy and error probability of speech and music for various music genres. The proposed algorithm yields better results when compared with the original SMV method and support vector machine (SVM) based method.
Predominant Melody Extraction from Polyphonic Music Signals Based on Harmonic Structure
Jea-Yul YOON Chai-Jong SONG Hochong PARK

LETTER-Music Information Processing

Vol:
E96-D No:11
Page(s):
2504-2507
A new method for predominant melody extraction from polyphonic music signals based on harmonic structure is proposed. The proposed method first extracts a set of fundamental frequency candidates by analyzing the distance between spectral peaks. Then, the predominant fundamental frequency is selected by pitch tracking according to the harmonic strength of the selected candidates. Finally, the method runs pitch smoothing on a large temporal scale for eliminating pitch doubling error, and conducts voicing frame detection. The proposed method shows the best overall performance for ADC 2004 DB in the MIREX 2011 audio melody extraction task.
A Music Similarity Function Based on the Centroid Model
Jin Soo SEO

LETTER-Music Information Processing

Vol:
E96-D No:7
Page(s):
1573-1576
Music-similarity computation is an essential building block for the browsing, retrieval, and indexing of digital music archives. This paper proposes a music similarity function based on the centroid model, which divides the feature space into non-overlapping clusters for the efficient computation of the timbre distance of two songs. We place particular emphasis on the centroid deviation as a feature for music-similarity computation. Experiments show that the centroid-model representation of the auditory features is promising for music-similarity computation.
Super Resolution TOA Estimation Algorithm with Maximum Likelihood ICA Based Pre-Processing
Tetsuhiro OKANO Shouhei KIDERA Tetsuo KIRIMOTO

PAPER-Sensing

Vol:
E96-B No:5
Page(s):
1194-1201
High-resolution time of arrival (TOA) estimation techniques have great promise for the high range resolution required in recently developed radar systems. A widely known super-resolution TOA estimation algorithm for such applications, the multiple-signal classification (MUSIC) in the frequency domain, has been proposed, which exploits an orthogonal relationship between signal and noise eigenvectors obtained by the correlation matrix of the observed transfer function. However, this method suffers severely from a degraded resolution when a number of highly correlated interference signals are mixed in the same range gate. As a solution for this problem, this paper proposes a novel TOA estimation algorithm by introducing a maximum likelihood independent component analysis (MLICA) approach, in which multiple complex sinusoidal signals are efficiently separated by the likelihood criteria determined by the probability density function (PDF) of a complex sinusoid. This MLICA scheme can decompose highly correlated interference signals, and the proposed method then incorporates the MLICA into the MUSIC method to enhance the range resolution in richly interfered situations. The results from numerical simulations and experimental investigation demonstrate that our proposed pre-processing method can enhance TOA estimation resolution compared with that obtained by the original MUSIC, particularly for lower signal-to-noise ratios.
A Novel High-Resolution Propagation Measurement Scheme for Indoor Terrestrial TV Signal Reception Based on Two-Dimensional Virtual Array Technique
Kazuo MOROKUMA Atsushi TAKEMOTO Yoshio KARASAWA

PAPER-Antennas and Propagation

Vol:
E96-B No:4
Page(s):
986-993
We propose a novel propagation measurement scheme for terrestrial TV signal indoor reception based on a virtual array technique. The system proposed in this paper carries out two-branch recording of target signals and the reference signal. By using the signal phase reference in the reference signal, we clarify the spatial propagation characteristics obtained from the two-dimensional virtual array outputs. Outdoor measurements were performed first to investigate the validity of the proposed measurement system. The results confirm its effectiveness in accurately determining the direction-of-arrival (DOA). We then investigated the propagation characteristics in an indoor environment. The angular spectrum obtained showed clear wave propagation structure. Thus, our proposed system is promising as a very accurate measurement tool for indoor propagation analysis.
Polyphonic Music Transcription by Nonnegative Matrix Factorization with Harmonicity and Temporality Criteria
Sang Ha PARK Seokjin LEE Koeng-Mo SUNG

LETTER-Engineering Acoustics

Vol:
E95-A No:9
Page(s):
1610-1614
Non-negative matrix factorization (NMF) is widely used for music transcription because of its efficiency. However, the conventional NMF-based music transcription algorithm often causes harmonic confusion errors or time split-up errors, because the NMF decomposes the time-frequency data according to the activated frequency in its time. To solve these problems, we proposed an NMF with temporal continuity and harmonicity constraints. The temporal continuity constraint prevented the time split-up of the continuous time components, and the harmonicity constraint helped to bind the fundamental with harmonic frequencies by reducing the additional octave errors. The transcription performance of the proposed algorithm was compared with that of the conventional algorithms, which showed that the proposed method helped to reduce additional false errors and increased the overall transcription performance.
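As background for the NMF-based entries in this list, here is a minimal sketch of plain NMF with the standard multiplicative updates for the Euclidean cost. It does not include the temporal continuity or harmonicity constraints described in the abstract; the function name `nmf`, the iteration count, and the random initialization are illustrative assumptions. In a transcription setting, `V` would be a magnitude spectrogram, the columns of `W` spectral bases, and the rows of `H` their activations over time.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n_freq x n_frames) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random non-negative initialization (assumed; other schemes are possible).
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative and do not
        # increase the Euclidean reconstruction error.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

A constrained variant of the kind the abstract describes would typically add penalty terms to the cost, which changes the update rules; that step is not attempted here.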
Time-Reversal MUSIC Imaging with Time-Domain Gating Technique
Heedong CHOI Yasutaka OGAWA Toshihiko NISHIMURA Takeo OHGANE

PAPER-Antennas and Propagation

Vol:
E95-B No:7
Page(s):
2377-2385
A time-reversal (TR) approach with multiple signal classification (MUSIC) provides super-resolution for detection and localization using multistatic data collected from an array antenna system. The theory of TR-MUSIC assumes that the number of antenna elements is greater than that of scatterers (targets). Furthermore, it requires many sets of frequency-domain data (snapshots) in seriously noisy environments. Unfortunately, these conditions are not practical for real environments due to the restriction of a reasonable antenna structure as well as limited measurement time. We propose an approach that treats both noise reduction and relaxation of the transceiver restriction by using a time-domain gating technique accompanied with the Fourier transform before applying the TR-MUSIC imaging algorithm. Instead of utilizing the conventional multistatic data matrix (MDM), we employ a modified MDM obtained from the gating technique. The resulting imaging functions yield more reliable images with only a few snapshots regardless of the limitation of the antenna arrays.
Efficient Generation of Dancing Animation Synchronizing with Music Based on Meta Motion Graphs
Jianfeng XU Koichi TAKAGI Shigeyuki SAKAZAWA

PAPER-Computer Graphics

Vol:
E95-D No:6
Page(s):
1646-1655
This paper presents a system for automatic generation of dancing animation that is synchronized with a piece of music by re-using motion capture data. Basically, the dancing motion is synthesized according to the rhythm and intensity features of music. For this purpose, we propose a novel meta motion graph structure to embed the necessary features including both rhythm and intensity, which is constructed on the motion capture database beforehand. In this paper, we consider two scenarios for non-streaming music and streaming music, where global search and local search are required respectively. In the case of the former, once a piece of music is input, an efficient dynamic programming algorithm can be employed to globally search for the best path in the meta motion graph, where an objective function is properly designed by measuring the quality of beat synchronization, intensity matching, and motion smoothness. In the case of the latter, the input music is stored in a buffer in a streaming mode, then an efficient search method is presented for a certain amount of music data (called a segment) in the buffer with the same objective function, resulting in a segment-based search approach. For streaming applications, we define an additional property in the above meta motion graph to deal with the unpredictable future music, which guarantees that there is some motion to match the unknown remaining music. A user study with a total of 60 subjects demonstrates that our system outperforms the state-of-the-art techniques in both scenarios. Furthermore, our system improves the synthesis speed greatly (maximal speedup is more than 500 times), which is essential for mobile applications. We have implemented our system on commercially available smart phones and confirmed that it works well on these mobile phones.
Clustering Algorithm for Unsupervised Monaural Musical Sound Separation Based on Non-negative Matrix Factorization
Sang Ha PARK Seokjin LEE Koeng-Mo SUNG

LETTER-Engineering Acoustics

Vol:
E95-A No:4
Page(s):
818-823
Non-negative matrix factorization (NMF) is widely used for monaural musical sound source separation because of its efficiency and good performance. However, an additional clustering process is required because the musical sound mixture is separated into more signals than the number of musical tracks during NMF separation. In the conventional method, manual clustering or training-based clustering is performed with an additional learning process. Recently, a clustering algorithm based on the mel-frequency cepstrum coefficient (MFCC) was proposed for unsupervised clustering. However, MFCC clustering supplies limited information for clustering. In this paper, we propose various timbre features for unsupervised clustering and a clustering algorithm with these features. Simulation experiments are carried out using various musical sound mixtures. The results indicate that the proposed method improves clustering performance, as compared to conventional MFCC-based clustering.
