The search functionality is under construction.

Keyword Search Result

[Keyword] EPA(260hit)

1-20hit(260hit)

  • MDX-Mixer: Music Demixing by Leveraging Source Signals Separated by Existing Demixing Models Open Access

    Tomoyasu NAKANO  Masataka GOTO  

     
    PAPER-Music Information Processing

      Publicized:
    2024/04/05
      Vol:
    E107-D No:8
      Page(s):
    1079-1088

    This paper presents MDX-Mixer, which improves music demixing (MDX) performance by leveraging source signals separated by multiple existing MDX models. Deep-learning-based MDX models have improved their separation performances year by year for four kinds of sound sources: “vocals,” “drums,” “bass,” and “other”. Our research question is whether mixing (i.e., weighted sum) the signals separated by state-of-the-art MDX models can obtain either the best of everything or higher separation performance. Previously, in singing voice separation and MDX, there have been studies in which separated signals of the same sound source are mixed with each other using time-invariant or time-varying positive mixing weights. In contrast to those, this study is novel in that it allows for negative weights as well and performs time-varying mixing using all of the separated source signals and the music acoustic signal before separation. The time-varying weights are estimated by modeling the music acoustic signals and their separated signals by dividing them into short segments. In this paper we propose two new systems: one that estimates time-invariant weights using 1×1 convolution, and one that estimates time-varying weights by applying the MLP-Mixer layer proposed in the computer vision field to each segment. The latter model is called MDX-Mixer. Their performances were evaluated based on the source-to-distortion ratio (SDR) using the well-known MUSDB18-HQ dataset. The results show that the MDX-Mixer achieved higher SDR than the separated signals given by three state-of-the-art MDX models.
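
    The basic operation described above, a weighted sum of separated signals in which weights may be negative, can be sketched as follows. This is only a minimal illustration of time-invariant mixing: the function name `mix_sources`, the toy signals, and the weights are hypothetical, and the learned weight estimators (1×1 convolution, MLP-Mixer layers) from the paper are not reproduced here.

```python
def mix_sources(signals, weights):
    """Return the weighted sum of equal-length signals (weights may be negative)."""
    if len(signals) != len(weights):
        raise ValueError("one weight per signal is required")
    length = len(signals[0])
    if any(len(s) != length for s in signals):
        raise ValueError("all signals must have the same length")
    # Sample-wise weighted sum across all input signals.
    return [sum(w * s[t] for w, s in zip(weights, signals)) for t in range(length)]


# Toy example (hypothetical values): two estimates of one source plus the
# unseparated mixture, which receives a negative weight.
estimate_a = [1.0, 2.0, 3.0]
estimate_b = [1.0, 2.0, 5.0]
mixture = [2.0, 2.0, 2.0]
refined = mix_sources([estimate_a, estimate_b, mixture], [0.5, 0.75, -0.25])
```

    A time-varying variant would apply a different weight vector to each short segment instead of one vector to the whole signal.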

  • On the First Separating Redundancy of Array LDPC Codes Open Access

    Haiyang LIU  Lianrong MA  

     
    LETTER-Coding Theory

      Publicized:
    2023/08/16
      Vol:
    E107-A No:4
      Page(s):
    670-674

    Given an odd prime q and an integer m ≤ q, a binary mq × q² quasi-cyclic parity-check matrix H(m, q) can be constructed for an array low-density parity-check (LDPC) code C(m, q). In this letter, we investigate the first separating redundancy of C(m, q). We prove that H(m, q) is 1-separating for any pair (m, q), from which we conclude that the first separating redundancy of C(m, q) is upper bounded by mq. Then we show that our upper bound on the first separating redundancy of C(m, q) is tighter than the general deterministic and constructive upper bounds in the literature. For m = 2, we further prove that the first separating redundancy of C(2, q) is 2q for any odd prime q. For m ≥ 3, we conjecture that the first separating redundancy of C(m, q) is mq for any fixed m and sufficiently large q.
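
    The abstract does not spell out how H(m, q) is built. The sketch below assumes the commonly used array-code layout, in which the block in block-row i and block-column j is the i·j-th power of a q × q cyclic permutation matrix. The function name is hypothetical, and the code does not check that q is an odd prime or that m ≤ q.

```python
def array_ldpc_parity_check(m, q):
    """Build an mq x q^2 binary matrix whose (i, j) block is P^(i*j).

    P is the q x q cyclic permutation matrix (an assumed, standard layout).
    """
    rows, cols = m * q, q * q
    H = [[0] * cols for _ in range(rows)]
    for i in range(m):              # block-row index
        for j in range(q):          # block-column index
            shift = (i * j) % q     # exponent of P for this block
            for r in range(q):
                H[i * q + r][j * q + (r + shift) % q] = 1
    return H


H = array_ldpc_parity_check(3, 5)
row_weights = {sum(row) for row in H}
column_weights = {sum(col) for col in zip(*H)}
```

    Under this layout every row has weight q and every column has weight m, which gives a quick sanity check on the construction.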

  • Inference Discrepancy Based Curriculum Learning for Neural Machine Translation

    Lei ZHOU  Ryohei SASANO  Koichi TAKEDA  

     
    PAPER-Natural Language Processing

      Publicized:
    2023/10/18
      Vol:
    E107-D No:1
      Page(s):
    135-143

    In practice, even a well-trained neural machine translation (NMT) model can still make biased inferences on the training set due to distribution shifts. In human learning, if we cannot reproduce something correctly after learning it multiple times, we consider it to be more difficult. Likewise, a training example causing a large discrepancy between inference and reference implies higher learning difficulty for the MT model. Therefore, we propose to adopt the inference discrepancy of each training example as the difficulty criterion and to rank training examples from easy to hard accordingly. In this way, a trained model can guide the curriculum learning process of an initial model identical to itself. This training scheme is analogous to guiding the learning process of a curriculum NMT model with a pretrained vanilla model. In this paper, we assess the effectiveness of the proposed training scheme and examine the influence of translation direction, evaluation metrics and different curriculum schedules. Experimental results on translation benchmarks WMT14 English ⇒ German, WMT17 Chinese ⇒ English and Multitarget TED Talks Task (MTTT) English ⇔ German, English ⇔ Chinese, English ⇔ Russian demonstrate that our proposed method consistently improves the translation performance against the advanced Transformer baseline.
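
    The ranking step described above can be sketched as follows. The discrepancy measure used here (one minus unigram F1 on whitespace tokens) is a stand-in chosen for brevity, not the metric used in the paper, and `translate` is a hypothetical callable standing in for the pretrained model.

```python
from collections import Counter


def unigram_discrepancy(hypothesis, reference):
    """Return 1 - unigram F1 between two whitespace-tokenized sentences."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 1.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 1.0 - 2 * precision * recall / (precision + recall)


def rank_easy_to_hard(examples, translate):
    """Sort (source, reference) pairs by the model's inference discrepancy."""
    return sorted(
        examples,
        key=lambda ex: unigram_discrepancy(translate(ex[0]), ex[1]),
    )


# Toy data: a lookup table plays the role of the pretrained model's output.
toy_outputs = {"a": "x y w", "b": "p q", "c": "k l"}
examples = [("a", "x y z"), ("b", "p q"), ("c", "m n")]
ranked = rank_easy_to_hard(examples, toy_outputs.get)
```

    A curriculum schedule would then feed the ranked examples to the new model in stages, starting from the front of the list.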

  • Optimal (r, δ)-Locally Repairable Codes from Reed-Solomon Codes

    Lin-Zhi SHEN  Yu-Jie WANG  

     
    LETTER-Coding Theory

      Publicized:
    2023/05/30
      Vol:
    E106-A No:12
      Page(s):
    1589-1592

    For an [n, k, d] (r, δ)-locally repairable code ((r, δ)-LRC), the minimum distance d satisfies the Singleton-like bound. The construction of optimal (r, δ)-LRCs, which attain this Singleton-like bound, has been an important research problem in recent years because of their applications in distributed storage systems. In this letter, we use Reed-Solomon codes to construct two classes of optimal (r, δ)-LRCs. The optimal LRCs are given by the evaluations of multiple polynomials of degree at most r - 1 at some points in F_q. The first class gives the [(r + δ - 1)t, rt - s, δ + s] optimal (r, δ)-LRC over F_q provided that r + δ + s - 1≤q, s≤δ, s

  • Unsupervised Techniques for Identifying the Mode of a Multi-Functional Radar with Varying Pulse Sequences

    Jayson ROOK  Chi-Hao CHENG  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/08/01
      Vol:
    E106-D No:11
      Page(s):
    1822-1830

    A multifunctional radar (MFR) with varying pulse sequences can change its signal characteristics and/or pattern based on the presence of targets and to avoid being jammed. To take countermeasures against an MFR, it is crucial for an electronic warfare (EW) system to be able to identify and separate the MFR's modes by analyzing intercepted radar signals, without a priori knowledge. In this article, two correlation-based methods, one taking the signal's order into account and the other ignoring it, are proposed and investigated for this task. The results demonstrate their great potential.

  • Construction of Singleton-Type Optimal LRCs from Existing LRCs and Near-MDS Codes

    Qiang FU  Buhong WANG  Ruihu LI  Ruipan YANG  

     
    PAPER-Coding Theory

      Publicized:
    2023/01/31
      Vol:
    E106-A No:8
      Page(s):
    1051-1056

    Modern large-scale distributed storage systems play a central role in data centers and cloud storage, and node failures in data centers are common. The data lost on a failed node must be recovered efficiently. Locally repairable codes (LRCs) are designed to solve this problem. The locality of an LRC is the number of nodes that participate in recovering the data lost in a node failure, which characterizes the repair efficiency. An LRC is called optimal if its minimum distance attains the Singleton-type upper bound [1]. In this paper, using basic techniques of linear algebra over finite fields, infinitely many optimal LRCs over extension fields are derived from a given optimal LRC over a base field (or small field). Next, this paper investigates the relation between near-MDS codes with some constraints and LRCs, and proposes an algorithm to determine the locality of the dual of a given linear code. Finally, based on near-MDS codes and the proposed algorithm, the resulting optimal LRCs are presented.

  • Deep Multiplicative Update Algorithm for Nonnegative Matrix Factorization and Its Application to Audio Signals

    Hiroki TANJI  Takahiro MURAKAMI  

     
    PAPER-Digital Signal Processing

      Publicized:
    2023/01/19
      Vol:
    E106-A No:7
      Page(s):
    962-975

    The design and adjustment of the divergence in audio applications using nonnegative matrix factorization (NMF) is still an open problem. In this study, to deal with this problem, we explore a representation of the divergence using neural networks (NNs). Instead of the divergence, our approach extends the multiplicative update algorithm (MUA), which estimates the NMF parameters, using NNs. The design of the extended MUA incorporates NNs, and the new algorithm is referred to as the deep MUA (DeMUA) for NMF. While the DeMUA represents the algorithm for the NMF, interestingly, the divergence is obtained from the incorporated NN. In addition, we propose theoretical guides to design the incorporated NN such that it can be interpreted as a divergence. By appropriately designing the NN, MUAs based on existing divergences with a single hyper-parameter can be represented by the DeMUA. To train the DeMUA, we applied it to audio denoising and supervised signal separation. Our experimental results show that the proposed architecture can learn the MUA and the divergences in sparse denoising and speech separation tasks and that the MUA based on generalized divergences with multiple parameters shows favorable performances on these tasks.
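
    For context, the classical multiplicative update algorithm that the DeMUA extends can be sketched as below. This is the standard Euclidean-distance update for V ≈ WH, written in plain Python for illustration; it is not the neural extension proposed in the paper, and the helper names, the small constant `eps`, and the random initialization are assumptions of this sketch.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def nmf_mua(V, rank, iterations=200, eps=1e-12, seed=0):
    """Factorize nonnegative V as W @ H with Euclidean multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    initial_error = frobenius_error(V, W, H)
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H, initial_error


V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
W, H, initial_error = nmf_mua(V, rank=2)
final_error = frobenius_error(V, W, H)
```

    Because every update multiplies by a ratio of nonnegative quantities, W and H stay nonnegative without any explicit projection step.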

  • An Identifier Locator Separation Protocol for the Shared Prefix Model over IEEE WAVE IPv6 Networks Open Access

    Sangjin NAM  Sung-Gi MIN  

     
    PAPER-Network

      Publicized:
    2022/10/21
      Vol:
    E106-B No:4
      Page(s):
    317-330

    As the active safety of vehicles has become essential, vehicular communication has been gaining attention. The IETF IPWAVE working group has proposed the shared prefix model-based vehicular link model. In the shared prefix model, a prefix is shared among RSUs to prevent changes in the IPv6 addresses of a vehicle within a shared prefix domain. However, vehicle movement must be tracked to deliver packets to the serving RSU of the vehicle within a shared prefix domain. The Identifier/Locator Separation Protocol (ILSP) is one of the techniques used to handle vehicle movement. It has several drawbacks, such as the inability to communicate with a standard IPv6 module without special components and the requirement to pass signaling messages between end hosts. Such drawbacks severely limit the service availability for a vehicle on the Internet. We propose an ILSP for a shared prefix model over IEEE WAVE IPv6 networks. The proposed protocol supports IPv6 communication between a standard IPv6 node on the Internet and a vehicle supporting the proposed protocol. In addition, the protocol hides vehicle movement within a shared prefix domain from peer hosts, eliminating the signaling between end hosts. The proposed protocol introduces a special NDP module based on IETF IPWAVE vehicular NDP to support vehicular mobility management within a shared prefix domain and minimize link-level multicast in WAVE networks.

  • Constructions of Optimal Single-Parity Locally Repairable Codes with Multiple Repair Sets

    Yang DING  Qingye LI  Yuting QIU  

     
    LETTER-Coding Theory

      Publicized:
    2022/08/03
      Vol:
    E106-A No:1
      Page(s):
    78-82

    Locally repairable codes (LRCs) have attracted considerable interest in distributed storage systems. If a symbol of a code can be repaired by each of t disjoint groups of other symbols, each group having size at most r, we say that the code symbol has (r, t)-locality. In this paper, we employ parity-check matrices to construct information single-parity (r, t)-locality LRCs. All our codes attain the Singleton-like bound of LRCs where each repair group contains a single parity symbol and thus are optimal.

  • Comparison of Value- and Reference-Based Memory Page Compaction in Virtualized Systems

    Naoki AOYAMA  Hiroshi YAMADA  

     
    PAPER-Software System

      Publicized:
    2022/08/31
      Vol:
    E105-D No:12
      Page(s):
    2075-2084

    The issue of copying values or references has historically been studied for managing memory objects, especially in distributed systems. In this paper, we explore a new topic on copying values vs. references for memory page compaction on virtualized systems. Memory page compaction moves target physical pages to a contiguous memory region at the operating system kernel level to create huge pages. Memory virtualization provides an opportunity to perform memory page compaction by copying the references of the physical pages. That is, instead of copying pages' values, we can move guest physical pages by changing the mappings of guest-physical to machine-physical pages. The goal of this paper is a quantitative comparison between value- and reference-based memory page compaction. To do so, we developed a software mechanism that achieves memory page compaction by appropriately updating the references of guest-physical pages. We prototyped the mechanism on Linux 4.19.29, and the experimental results show that the prototype's page compaction is up to 78% faster and achieves up to 17% higher performance on memory-intensive real-world applications compared to the default value-copy compaction scheme.

  • A 16/32Gbps Dual-Mode SerDes Transmitter with Linearity Enhanced SST Driver

    Li DING  Jing JIN  Jianjun ZHOU  

     
    PAPER

      Publicized:
    2022/05/13
      Vol:
    E105-A No:11
      Page(s):
    1443-1449

    This brief presents a 16/32 Gb/s dual-mode transmitter including a linearity calibration loop to maintain amplitude linearity of the SST driver. Linearity detection and corresponding master-slave power supply circuits are designed to implement the proposed architecture. The proposed transmitter is manufactured in a 22 nm FD-SOI process. The linearity calibration loop reduces the peak INL errors of the transmitter by 50%, and the RLM rises from 92.4% to 98.5% when the transmitter is in PAM4 mode. The chip area of the transmitter is 0.067 mm², while the proposed linearity-enhanced part is 0.05 × 0.02 mm², and the total power consumption is 64.6 mW with a 1.1 V power supply. The linearity calibration loop can be detached from the circuit without consuming extra power.

  • End-to-End Object Separation for Threat Detection in Large-Scale X-Ray Security Images

    Joanna Kazzandra DUMAGPI  Yong-Jin JEONG  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2022/07/25
      Vol:
    E105-D No:10
      Page(s):
    1807-1811

    Fine-grained image analysis, such as pixel-level approaches, improves threat detection in x-ray security images. In the practical setting, the cost of obtaining complete pixel-level annotations increases significantly, which can be reduced by partially labeling the dataset. However, handling partially labeled datasets can lead to training complicated multi-stage networks. In this paper, we propose a new end-to-end object separation framework that trains a single network on a partially labeled dataset while also alleviating the inherent class imbalance at the data and object proposal level. Empirical results demonstrate significant improvement over existing approaches.

  • Blind Signal Separation for Array Radar Measurement Using Mathematical Model of Pulse Wave Propagation Open Access

    Takuya SAKAMOTO  

     
    PAPER-Sensing

      Publicized:
    2022/02/18
      Vol:
    E105-B No:8
      Page(s):
    981-989

    This paper presents a novel blind signal separation method for the measurement of pulse waves at multiple body positions using an array radar system. The proposed method is based on a mathematical model of pulse wave propagation. The model relies on three factors: (1) a small displacement approximation, (2) beam pattern orthogonality, and (3) an impulse response model of pulse waves. The separation of radar echoes is formulated as an optimization problem, and the associated objective function is established using the mathematical model. We evaluate the performance of the proposed method using measured radar data from participants lying in a prone position. The accuracy of the proposed method, in terms of estimating the body displacements, is measured using reference data taken from laser displacement sensors. The average estimation errors are found to be 10-21% smaller than those of conventional methods. These results indicate the effectiveness of the proposed method for achieving noncontact measurements of the displacements of multiple body positions.

  • Supervised Audio Source Separation Based on Nonnegative Matrix Factorization with Cosine Similarity Penalty Open Access

    Yuta IWASE  Daichi KITAMURA  

     
    PAPER-Engineering Acoustics

      Publicized:
    2021/12/08
      Vol:
    E105-A No:6
      Page(s):
    906-913

    In this study, we aim to improve the performance of audio source separation for monaural mixture signals. For monaural audio source separation, semisupervised nonnegative matrix factorization (SNMF) can achieve higher separation performance by employing small supervised signals. In particular, penalized SNMF (PSNMF) with orthogonality penalty is an effective method. PSNMF forces two basis matrices for target and nontarget sources to be orthogonal to each other and improves the separation accuracy. However, the conventional orthogonality penalty is based on an inner product and does not affect the estimation of the basis matrix properly because of the scale indeterminacy between the basis and activation matrices in NMF. To cope with this problem, a new PSNMF with cosine similarity between the basis matrices is proposed. The experimental comparison shows the efficacy of the proposed cosine similarity penalty in supervised audio source separation.
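
    The scale issue mentioned above can be illustrated numerically. The sketch below compares a penalty built from inner products of basis vectors with one built from their cosine similarities. The exact penalty terms used in the paper are not given in the abstract, so the sum-of-squares form, the function names, and the toy matrices are assumptions; basis vectors are stored as matrix columns and are assumed to be nonzero.

```python
import math


def columns(M):
    """Return the columns of a matrix given as a list of rows."""
    return [list(c) for c in zip(*M)]


def inner_product_penalty(F, G):
    """Sum of squared inner products between columns of F and columns of G."""
    return sum(
        sum(a * b for a, b in zip(f, g)) ** 2
        for f in columns(F)
        for g in columns(G)
    )


def cosine_penalty(F, G):
    """Sum of squared cosine similarities between columns of F and G."""
    total = 0.0
    for f in columns(F):
        for g in columns(G):
            dot = sum(a * b for a, b in zip(f, g))
            norm = math.sqrt(sum(a * a for a in f)) * math.sqrt(sum(b * b for b in g))
            total += (dot / norm) ** 2
    return total


F = [[1.0, 0.0], [1.0, 2.0]]   # target basis, columns (1, 1) and (0, 2)
G = [[1.0], [0.0]]             # nontarget basis, column (1, 0)
F_scaled = [[10.0 * x for x in row] for row in F]  # same directions, larger scale
```

    Rescaling the basis (which NMF can absorb into the activations) changes the inner-product penalty by a factor of 100 here, while the cosine-based value stays the same.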

  • Speaker-Independent Audio-Visual Speech Separation Based on Transformer in Multi-Talker Environments

    Jing WANG  Yiyu LUO  Weiming YI  Xiang XIE  

     
    PAPER-Speech and Hearing

      Publicized:
    2022/01/11
      Vol:
    E105-D No:4
      Page(s):
    766-777

    Speech separation is the task of extracting target speech while suppressing background interference components. In applications like video telephones, visual information about the target speaker is available, which can be leveraged for multi-speaker speech separation. Most previous multi-speaker separation methods are based on convolutional or recurrent neural networks. Recently, Transformer-based Seq2Seq models have achieved state-of-the-art performance in various tasks, such as neural machine translation (NMT) and automatic speech recognition (ASR). Transformer has shown an advantage in modeling audio-visual temporal context by multi-head attention blocks through explicitly assigning attention weights. Moreover, Transformer does not have any recurrent sub-networks, thus supporting parallelization of sequence computation. In this paper, we propose a novel speaker-independent audio-visual speech separation method based on Transformer, which can be flexibly applied to an unknown number and identity of speakers. The model receives both audio and visual streams, including the noisy spectrogram and speaker lip embeddings, and predicts a complex time-frequency mask for the corresponding target speaker. The model is made up of three main components: an audio encoder, a visual encoder and a Transformer-based mask generator. Two different encoder structures, ResNet-based and Transformer-based, are investigated and compared. The performance of the proposed method is evaluated in terms of source separation and speech quality metrics. The experimental results on the benchmark GRID dataset show the effectiveness of the method on the speaker-independent separation task in multi-talker environments. The model generalizes well to unseen speaker identities and noise types. Though only trained on 2-speaker mixtures, the model achieves reasonable performance when tested on 2-speaker and 3-speaker mixtures. In addition, the model still shows an advantage compared with previous audio-visual speech separation works.

  • Five Cells and Tilepaint are NP-Complete

    Chuzo IWAMOTO  Tatsuya IDE  

     
    PAPER

      Publicized:
    2021/10/18
      Vol:
    E105-D No:3
      Page(s):
    508-516

    Five Cells and Tilepaint are Nikoli's pencil puzzles. We study the computational complexity of Five Cells and Tilepaint puzzles. It is shown that deciding whether a given instance of each puzzle has a solution is NP-complete.

  • Two-Stage Fine-Grained Text-Level Sentiment Analysis Based on Syntactic Rule Matching and Deep Semantic

    Weizhi LIAO  Yaheng MA  Yiling CAO  Guanglei YE  Dongzhou ZUO  

     
    PAPER

      Publicized:
    2021/04/28
      Vol:
    E104-D No:8
      Page(s):
    1274-1280

    To address the problem that traditional text-level sentiment analysis methods usually ignore the emotional tendency corresponding to the object or attribute, this paper proposes a novel two-stage fine-grained text-level sentiment analysis model based on syntactic rule matching and deep semantics. Based on an analysis of the characteristics and difficulties of fine-grained sentiment analysis, a two-stage fine-grained sentiment analysis algorithm framework is constructed. In the first stage, objects and their corresponding opinions are extracted by syntactic rule matching to obtain preliminary objects and opinions. The second stage uses a deep semantic network to extract more accurate objects and opinions. To handle extraction results that contain multiple objects and opinions to be matched, an object-opinion matching algorithm based on the minimum lexical separation distance is proposed to achieve accurate pairwise matching. Finally, the proposed algorithm is evaluated on several public datasets to demonstrate its practicality and effectiveness.
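
    The matching idea described above can be sketched as follows, assuming that "lexical separation distance" means the absolute difference between token positions in the sentence. That reading, the function name, and the toy sentence are assumptions of this sketch; the paper's actual algorithm may differ.

```python
def match_objects_to_opinions(object_positions, opinion_positions):
    """Pair each opinion word with the object at the smallest token distance."""
    pairs = []
    for opinion, op_pos in opinion_positions.items():
        nearest = min(
            object_positions,
            key=lambda obj: abs(object_positions[obj] - op_pos),
        )
        pairs.append((nearest, opinion))
    return pairs


tokens = "the screen is bright but the battery is weak".split()
# Positions of already extracted objects and opinions (hypothetical extraction).
objects = {word: tokens.index(word) for word in ("screen", "battery")}
opinions = {word: tokens.index(word) for word in ("bright", "weak")}
pairs = match_objects_to_opinions(objects, opinions)
```

    Ties are resolved in favor of the object listed first, which is a simplification of this sketch.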

  • Cyclic LRCs with Availability from Linearized Polynomials

    Pan TAN  Zhengchun ZHOU  Haode YAN  Yong WANG  

     
    LETTER-Coding Theory

      Publicized:
    2021/01/18
      Vol:
    E104-A No:7
      Page(s):
    991-995

    Locally repairable codes (LRCs) with availability have received considerable attention in recent years since they are able to solve many problems in distributed storage systems such as repairing multiple node failures and managing hot data. Constructing LRCs with locality r and availability t (also called (r, t)-LRCs) with new parameters has become an interesting research subject in coding theory. The objective of this paper is to propose two generic constructions of cyclic (r, t)-LRCs via linearized polynomials over finite fields. These two constructions include two earlier constructions of cyclic LRCs from trace functions and truncated trace functions as special cases and lead to LRCs with new parameters that cannot be produced by the earlier ones.

  • Deep Clustering for Improved Inter-Cluster Separability and Intra-Cluster Homogeneity with Cohesive Loss

    Byeonghak KIM  Murray LOEW  David K. HAN  Hanseok KO  

     
    LETTER-Artificial Intelligence, Data Mining

      Publicized:
    2021/01/28
      Vol:
    E104-D No:5
      Page(s):
    776-780

    To date, many studies have employed clustering for the classification of unlabeled data. Deep separate clustering applies several deep learning models to conventional clustering algorithms to more clearly separate the distribution of the clusters. In this paper, we employ a convolutional autoencoder to learn the features of input images. Following this, k-means clustering is conducted using the encoded layer features learned by the convolutional autoencoder. A center loss function is then added to aggregate the data points into clusters to increase the intra-cluster homogeneity. Finally, we calculate and increase the inter-cluster separability. We combine all loss functions into a single global objective function. Our new deep clustering method surpasses the performance of existing clustering approaches when compared in experiments under the same conditions.

  • On the Separating Redundancy of Ternary Golay Codes

    Haiyang LIU  Lianrong MA  Hao ZHANG  

     
    LETTER-Coding Theory

      Publicized:
    2020/09/17
      Vol:
    E104-A No:3
      Page(s):
    650-655

    Let G_11 (resp., G_12) be the ternary Golay code of length 11 (resp., 12). In this letter, we investigate the separating redundancies of G_11 and G_12. In particular, we determine the values of s_l(G_11) for l = 1, 3, 4 and s_l(G_12) for l = 1, 4, 5, where s_l(G_11) (resp., s_l(G_12)) is the l-th separating redundancy of G_11 (resp., G_12). We also provide lower and upper bounds on s_2(G_11), s_2(G_12), and s_3(G_12).
