
Keyword Search Result

[Keyword] MLP (6 hits)

Results 1-6 of 6
  • MDX-Mixer: Music Demixing by Leveraging Source Signals Separated by Existing Demixing Models Open Access

    Tomoyasu NAKANO  Masataka GOTO  

     
    PAPER-Music Information Processing

  Publicized:
    2024/04/05
      Vol:
    E107-D No:8
      Page(s):
    1079-1088

    This paper presents MDX-Mixer, which improves music demixing (MDX) performance by leveraging source signals separated by multiple existing MDX models. Deep-learning-based MDX models have improved their separation performance year by year for four kinds of sound sources: “vocals,” “drums,” “bass,” and “other.” Our research question is whether mixing (i.e., taking a weighted sum of) the signals separated by state-of-the-art MDX models can achieve either the best of each or even higher separation performance. Previous studies in singing voice separation and MDX have mixed separated signals of the same sound source with each other using time-invariant or time-varying positive mixing weights. In contrast, this study is novel in that it also allows negative weights and performs time-varying mixing using all of the separated source signals together with the music acoustic signal before separation. The time-varying weights are estimated by modeling the music acoustic signals and their separated signals divided into short segments. In this paper, we propose two new systems: one that estimates time-invariant weights using 1×1 convolution, and one that estimates time-varying weights by applying the MLP-Mixer layer, proposed in the computer vision field, to each segment. The latter model is called MDX-Mixer. Their performances were evaluated in terms of the source-to-distortion ratio (SDR) on the well-known MUSDB18-HQ dataset. The results show that MDX-Mixer achieved a higher SDR than the separated signals given by three state-of-the-art MDX models.
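
    As a hypothetical illustration of the core operation described above, the sketch below mixes separated source estimates with per-segment weights that may be negative; all names, shapes, and weight values are assumptions, and the weight-estimation networks (1×1 convolution or MLP-Mixer) are not implemented here.

```python
# Hypothetical sketch (not the authors' code): time-varying weighted mixing of
# separated source estimates, the core operation behind MDX-Mixer. Shapes and
# names are assumptions for illustration; weight estimation is omitted.
import numpy as np

def mix_segments(mixture, estimates, weights, seg_len):
    """Weighted sum of separated estimates with per-segment weights.

    mixture   : (T,) music acoustic signal before separation
    estimates : (K, T) source estimates from K existing MDX models
    weights   : (num_segments, K + 1) mixing weights per segment; the extra
                column weights the unseparated mixture. Weights may be negative.
    seg_len   : samples per segment
    """
    T = mixture.shape[0]
    signals = np.vstack([estimates, mixture[None, :]])       # (K + 1, T)
    out = np.zeros(T)
    for i, start in enumerate(range(0, T, seg_len)):
        end = min(start + seg_len, T)
        out[start:end] = weights[i] @ signals[:, start:end]  # weighted sum per segment
    return out

# Toy usage: 3 model outputs, 1-second segments at 44.1 kHz, random weights
T, K, seg_len = 44100 * 4, 3, 44100
mixture = np.random.randn(T)
estimates = np.random.randn(K, T)
weights = np.random.randn(int(np.ceil(T / seg_len)), K + 1) * 0.1
remixed = mix_segments(mixture, estimates, weights, seg_len)
```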

  • ECG-Based Heartbeat Classification Using Two-Level Convolutional Neural Network and RR Interval Difference

    Yande XIANG  Jiahui LUO  Taotao ZHU  Sheng WANG  Xiaoyan XIANG  Jianyi MENG  

     
    PAPER-Biological Engineering

  Publicized:
    2018/01/12
      Vol:
    E101-D No:4
      Page(s):
    1189-1198

    Arrhythmia classification based on the electrocardiogram (ECG) is crucial in automatic cardiovascular disease diagnosis. The classification methods used in current practice largely depend on hand-crafted features. However, extracting hand-crafted features may introduce significant computational complexity, especially in transform domains. In this study, an accurate method for patient-specific ECG beat classification is proposed that adopts morphological features and timing information. For the morphological features of a heartbeat, an attention-based two-level 1-D CNN is incorporated to automatically extract features at different granularities by focusing on various parts of the heartbeat. For the timing information, the difference between the previous and post RR intervals is computed as a dynamic feature. Both the extracted morphological features and the interval difference are fed to a multi-layer perceptron (MLP) for classifying ECG signals. In addition, to reduce the memory storage of ECG data and provide some denoising, an adaptive heartbeat normalization technique is adopted that includes amplitude unification, resolution modification, and signal differencing. On the MIT-BIH arrhythmia database, the proposed classification method achieved a sensitivity of Sen=93.4% and positive predictivity of Ppr=94.9% in ventricular ectopic beat (VEB) detection, a sensitivity of Sen=86.3% and positive predictivity of Ppr=80.0% in supraventricular ectopic beat (SVEB) detection, and an overall accuracy of OA=97.8% at 6-bit ECG signal resolution. Compared with state-of-the-art automatic ECG classification methods, these results show that the proposed method achieves comparable heartbeat classification accuracy even though the ECG signals are represented at a lower resolution.
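
    The following is a minimal sketch, under assumed feature dimensions and class count, of the fusion step described above: an MLP that concatenates the CNN-extracted morphological features with the RR-interval difference before classification. It is illustrative only and not the paper's implementation.

```python
# Minimal illustrative sketch (assumed shapes and class count, not the paper's
# implementation): an MLP that fuses CNN-extracted morphological features with
# the RR-interval difference before classifying the heartbeat.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mlp_classify(cnn_features, rr_diff, w1, b1, w2, b2):
    """cnn_features: (batch, F) morphological features from the two-level 1-D CNN.
    rr_diff: (batch, 1) difference between the previous and post RR intervals."""
    x = np.concatenate([cnn_features, rr_diff], axis=1)  # fuse both feature types
    h = relu(x @ w1 + b1)                                 # hidden layer
    return softmax(h @ w2 + b2)                           # class probabilities

# Toy usage: 64 assumed morphological features + 1 timing feature, 5 beat classes
F, H, C = 64, 32, 5
w1, b1 = rng.normal(size=(F + 1, H)) * 0.1, np.zeros(H)
w2, b2 = rng.normal(size=(H, C)) * 0.1, np.zeros(C)
probs = mlp_classify(rng.normal(size=(8, F)), rng.normal(size=(8, 1)), w1, b1, w2, b2)
```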

  • MLP/BP-Based Soft Decision Feedback Equalization with Bit-Interleaved TCM for Wireless Applications

    Terng-Ren HSU  Chien-Ching LIN  Terng-Yin HSU  Chen-Yi LEE  

     
    LETTER-Neural Networks and Bioengineering

      Vol:
    E90-A No:4
      Page(s):
    879-884

    For more efficient data transmission, a new MLP/BP-based channel equalizer is proposed to compensate for multi-path fading in wireless applications. In this work, for better system performance, we apply a soft output and soft feedback structure together with soft-decision channel decoding. Moreover, to improve the packet error rate (PER) and bit error rate (BER), we search for the optimal scaling factor of the transfer function in the output layer of the MLP/BP neural network and add small random disturbances to the training data. Compared with conventional MLP/BP-based DFEs and soft-output MLP/BP-based DFEs, the proposed MLP/BP-based soft DFEs under multi-path fading channels improve performance by 0.6 to 3 dB at PER = 10⁻¹ and by 0.8 to 3.3 dB at BER = 10⁻³.
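
    The sketch below illustrates the general structure implied above, under assumed tap counts and a BPSK-like signal: an MLP whose inputs combine received samples with past soft decisions, and whose output-layer tanh is scaled by a tunable factor. Training by backpropagation and the channel model are omitted, and nothing here is the authors' design.

```python
# Hedged sketch (assumed structure, not the authors' design): an MLP decision
# feedback equalizer whose inputs combine received samples with past soft
# decisions, with a scaled tanh at the output layer.
import numpy as np

def scaled_tanh(x, scale):
    return scale * np.tanh(x)

def mlp_soft_dfe(received, w1, b1, w2, b2, scale, n_ff=5, n_fb=3):
    """Equalize a BPSK-like sample stream one symbol at a time.

    received : (T,) received (channel-distorted) samples
    n_ff     : feedforward taps (current and past received samples)
    n_fb     : feedback taps (previous soft decisions fed back)
    """
    T = received.shape[0]
    soft_out = np.zeros(T)
    for t in range(T):
        ff = [received[t - i] if t - i >= 0 else 0.0 for i in range(n_ff)]
        fb = [soft_out[t - j] if t - j >= 0 else 0.0 for j in range(1, n_fb + 1)]
        x = np.array(ff + fb)
        h = np.tanh(x @ w1 + b1)                        # hidden layer
        soft_out[t] = scaled_tanh(h @ w2 + b2, scale)   # soft decision output
    return soft_out

# Toy usage with random, untrained weights (backpropagation training is omitted)
rng = np.random.default_rng(1)
H, n_in = 8, 5 + 3
w1, b1 = rng.normal(size=(n_in, H)) * 0.3, np.zeros(H)
w2, b2 = rng.normal(size=H) * 0.3, 0.0
soft_symbols = mlp_soft_dfe(rng.choice([-1.0, 1.0], size=50), w1, b1, w2, b2, scale=1.2)
```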

  • Adaptive Postprocessing Algorithm in Block-Coded Images Using Block Classification and MLP

    Kee-Koo KWON  Byung-Ju KIM  Suk-Hwan LEE  Seong-Geun KWON  Kuhn-Il LEE  

     
    LETTER-Image

      Vol:
    E86-A No:4
      Page(s):
    961-967

    A novel postprocessing algorithm for reducing the blocking artifacts in block-based coded images is proposed using block classification and an adaptive multi-layer perceptron (MLP). The algorithm exploits the nonlinearity of the neural network learning algorithm to reduce the blocking artifacts more accurately. Each block is classified into one of four classes, namely smooth, horizontal edge, vertical edge, and complex blocks, based on the characteristics of its discrete cosine transform (DCT) coefficients. Then, according to the class information of the neighboring blocks, adaptive neural network filters (NNFs) are applied to the horizontal and vertical block boundaries; that is, a different two-layer NNF is used for each class to remove the blocking artifacts. Experimental results show that the proposed algorithm produces better results than conventional algorithms both subjectively and objectively.
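
    As a purely illustrative sketch, the code below classifies an 8×8 DCT block into the four classes named above using simple energy heuristics; the thresholds and decision rules are assumptions, not the paper's exact criteria, and the class-specific neural network filters are not shown.

```python
# Illustrative sketch only (thresholds and rules are assumptions, not the
# paper's exact criteria): classify an 8x8 DCT block into one of the four
# classes used to select a class-specific neural network filter.
import numpy as np

def classify_block(dct_block, t_smooth=10.0):
    """dct_block: (8, 8) DCT coefficients of one coded block."""
    ac = dct_block.copy()
    ac[0, 0] = 0.0                         # ignore the DC coefficient
    row_energy = np.abs(ac[0, 1:]).sum()   # first-row AC terms: horizontal frequencies (vertical edges)
    col_energy = np.abs(ac[1:, 0]).sum()   # first-column AC terms: vertical frequencies (horizontal edges)
    total = np.abs(ac).sum()
    if total < t_smooth:
        return "smooth"
    if row_energy > 2.0 * col_energy:
        return "vertical edge"
    if col_energy > 2.0 * row_energy:
        return "horizontal edge"
    return "complex"

# Toy usage on a random block
print(classify_block(np.random.randn(8, 8) * 5.0))
```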

  • Broadband Active Noise Control Using a Neural Network

    Casper K. CHEN  Tzi-Dar CHIUEH  Jyh-Horng CHEN  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E81-D No:8
      Page(s):
    855-861

    This paper presents a neural network-based control system for active noise control (ANC). The control system derives a secondary signal that destructively interferes with the original noise to reduce the noise power. The paper begins with an introduction to feedback ANC systems and then describes our adaptive algorithm in detail. Three types of noise signals, recorded in a destroyer, an F16 airplane, and an MR imaging room, respectively, were applied to our noise control system, which was implemented in software. We obtained an average noise power attenuation of about 20 dB. Our system performed as well as traditional DSP controllers for narrow-band noise and achieved better results for nonlinear broadband noise problems. We also present a hardware implementation method for the proposed algorithm. This hardware architecture allows fast and efficient field training in new environments and makes real-time, real-life applications possible.
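
    The sketch below conveys the feedback ANC idea in a few lines, under an assumed single-hidden-layer network and an ideal secondary path: a small network predicts an anti-noise signal from recent error samples, and the residual at the error microphone is the sum of the noise and that anti-noise. Training is omitted, so the random weights shown will not actually attenuate the noise.

```python
# Minimal conceptual sketch (assumed single-hidden-layer network and ideal
# secondary path, not the paper's design): a feedback ANC loop in which a small
# neural network predicts an anti-noise signal from recent error samples.
import numpy as np

def feedback_anc(noise, w1, b1, w2, taps=8):
    """noise: (T,) primary noise observed at the error microphone position."""
    T = noise.shape[0]
    error = np.zeros(T)
    for t in range(T):
        # Network input: the most recent error samples (feedback structure)
        x = np.array([error[t - i] if t - i >= 0 else 0.0 for i in range(1, taps + 1)])
        anti_noise = np.tanh(x @ w1 + b1) @ w2   # secondary (anti-noise) signal
        error[t] = noise[t] + anti_noise         # residual after destructive interference
    return error

# Toy usage with random, untrained weights (training would be needed for attenuation)
rng = np.random.default_rng(2)
H = 16
w1, b1, w2 = rng.normal(size=(8, H)) * 0.1, np.zeros(H), rng.normal(size=H) * 0.1
residual = feedback_anc(np.sin(0.1 * np.arange(200)), w1, b1, w2)
```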

  • Combining Multiple Classifiers in a Hybrid System for High Performance Chinese Syllable Recognition

    Liang ZHOU  Satoshi IMAI  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E79-D No:11
      Page(s):
    1570-1578

    A multiple classifier system can be a powerful solution for robust pattern recognition: the appropriate combination of multiple classifiers is expected to reduce errors, provide robustness, and achieve higher performance. In this paper, high-performance Chinese syllable recognition using combinations of multiple classifiers is presented. Chinese syllable recognition is divided into base syllable recognition (disregarding the tones) and recognition of the four tones. For base syllable recognition, we used a combination of two multisegment vector quantization (MSVQ) classifiers based on different features (instantaneous and transitional features of speech). For tone recognition, a vector quantization (VQ) classifier was used first and found to be comparable to a multilayer perceptron (MLP) classifier. To obtain more robust and better performance, a combination of a distortion-based classifier (VQ) and a discriminant-based classifier (MLP) is proposed. The evaluations were carried out on the standard syllable database CRDB in China, and the experimental results show that combining multiple classifiers with different features or different methodologies can improve recognition performance. The recognition accuracies for base syllables, tones, and tonal syllables are 96.79%, 99.82%, and 96.24%, respectively. Since these results were evaluated on a standard database, they can serve as a benchmark for direct comparison with other approaches.
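
    As a hypothetical illustration of combining a distortion-based and a discriminant-based classifier, the sketch below fuses VQ distortions (converted to scores) with MLP posteriors for the four tone classes; the weighted-sum combination rule and the weight value are assumptions, not necessarily the paper's scheme.

```python
# Hedged sketch (the combination rule and weight are assumptions for
# illustration, not necessarily the paper's scheme): fuse a distortion-based VQ
# classifier with a discriminant-based MLP classifier for 4-class tone recognition.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def combine_classifiers(vq_distortions, mlp_posteriors, alpha=0.5):
    """vq_distortions : (4,) codebook distortion per tone class (lower is better)
    mlp_posteriors : (4,) MLP output probabilities per tone class
    alpha          : weight given to the VQ classifier"""
    vq_scores = softmax(-vq_distortions)              # turn distortions into class scores
    combined = alpha * vq_scores + (1.0 - alpha) * mlp_posteriors
    return int(np.argmax(combined))                   # predicted tone index (0..3)

# Toy usage
tone = combine_classifiers(np.array([3.2, 1.1, 4.0, 2.5]),
                           np.array([0.10, 0.55, 0.05, 0.30]))
```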