Keyword Search Results

[Keyword] x-vector (4 hits)

Results 1-4 of 4
  • Blind Bandwidth Extension with a Non-Linear Function and Its Evaluation on Automatic Speaker Verification

    Ryota KAMINISHI  Haruna MIYAMOTO  Sayaka SHIOTA  Hitoshi KIYA  

     
    PAPER

    Publicized: 2019/10/25
    Vol: E103-D No:1
    Page(s): 42-49

    This study evaluates the effects of several non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. Recently, a non-linear bandwidth extension (N-BWE) method has been proposed as a blind, non-learning, and lightweight BWE approach, and other non-learning BWE methods have also been developed in recent years. Most of the data available for training ASV systems is narrowband (NB) telephone speech, whereas wideband (WB) data have been used to train state-of-the-art ASV systems such as i-vector, d-vector, and x-vector. This can cause sampling rate mismatches when all of these datasets are used together. In this paper, we investigate the influence of sampling rate mismatches on x-vector-based ASV systems and how non-learning BWE methods perform against them. The results showed that the N-BWE method improved the equal error rate (EER) of x-vector-based ASV systems when such mismatches were present. We also investigated the relationship between objective measurements and EERs. The N-BWE method produced the lowest EERs on both ASV systems and obtained a lower RMS-LSD value and a higher STOI score.
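
    To make the general idea concrete, the following is a minimal Python sketch of memoryless non-linear blind BWE: upsample narrowband speech, then regenerate high-band energy with a non-linearity. It is illustrative only; the |x| non-linearity, the 8 kHz to 16 kHz rates, the filter order, and the mixing gain are assumptions, not the paper's N-BWE method.

      # Minimal sketch of non-linear blind bandwidth extension (illustrative;
      # not the N-BWE method evaluated above). Assumes 8 kHz narrowband input.
      import numpy as np
      from scipy.signal import resample_poly, butter, sosfilt

      def nonlinear_bwe(nb_speech, nb_rate=8000, wb_rate=16000):
          # 1. Upsample to the wideband rate; the band above nb_rate/2
          #    is still empty after interpolation.
          wb = resample_poly(nb_speech, wb_rate // nb_rate, 1)
          # 2. A memoryless non-linearity (|x|, an assumption) creates
          #    harmonics that extend into the missing high band.
          harmonics = np.abs(wb)
          # 3. Keep only the regenerated high band above the old Nyquist.
          sos = butter(8, nb_rate / 2, btype="highpass", fs=wb_rate,
                       output="sos")
          high_band = sosfilt(sos, harmonics)
          # 4. Mix the high band back in at a reduced, tunable level.
          gain = 0.3 * np.std(wb) / (np.std(high_band) + 1e-12)
          return wb + gain * high_band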

  • Auto-Tuning of Thread Assignment for Matrix-Vector Multiplication on GPUs

    Jinwei WANG  Xirong MA  Yuanping ZHU  Jizhou SUN  

     
    PAPER-Fundamentals of Information Systems

    Vol: E96-D No:11
    Page(s): 2319-2326

    Modern GPUs have evolved into more general processors capable of executing scientific and engineering computations. They provide a highly parallel computing environment with a large number of computing cores, which is well suited to data-parallel arithmetic, particularly linear algebra operations. Matrix-vector multiplication is one of the most important dense linear algebra operations; it appears in a diverse set of applications across many fields and must therefore be fully optimized to achieve high performance. In this paper, we propose a novel auto-tuning method for matrix-vector multiplication on GPUs, in which the number of threads assigned to compute one element of the result vector is auto-tuned according to the size of the matrix. On an NVIDIA GTX 650 GPU with the Kepler architecture, we developed an auto-tuner that automatically selects the optimal number of assigned threads. Based on the auto-tuner's result, we developed a versatile generic matrix-vector multiplication kernel with the CUDA programming model. A series of experiments on matrices of different shapes and sizes compared the performance of our kernel with that of the kernels from CUBLAS 5.0, MAGMA 1.3, and a warp method. The experimental results show that the performance of our kernel approaches optimal behavior as the matrix size increases and depends very little on the shape of the matrix, a significant improvement over the other three kernels, which exhibit unstable performance for different matrix shapes.
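
    The auto-tuning loop itself is simple to illustrate. The Python sketch below mirrors only the tuning logic: each candidate "threads per row" count is simulated with chunked partial dot products and the fastest is kept. The real method benchmarks CUDA kernels; all function names and candidate values here are hypothetical.

      # Sketch of the auto-tuning idea for y = A @ x: benchmark several
      # "threads per row" candidates and keep the fastest. Chunked NumPy
      # partial sums stand in for the per-thread partial products on a GPU.
      import time
      import numpy as np

      def matvec_chunked(A, x, threads_per_row):
          # Each "thread" computes a partial dot product over a slice of
          # the row; the partials are then reduced, as on the GPU.
          n_cols = A.shape[1]
          bounds = np.linspace(0, n_cols, threads_per_row + 1, dtype=int)
          partials = [A[:, lo:hi] @ x[lo:hi]
                      for lo, hi in zip(bounds, bounds[1:])]
          return np.sum(partials, axis=0)

      def autotune(A, x, candidates=(1, 2, 4, 8, 16, 32), trials=5):
          best, best_time = None, float("inf")
          for t in candidates:
              start = time.perf_counter()
              for _ in range(trials):
                  matvec_chunked(A, x, t)
              elapsed = time.perf_counter() - start
              if elapsed < best_time:
                  best, best_time = t, elapsed
          return best  # best thread count for this matrix size and shape

      A = np.random.rand(4096, 128)   # a tall, narrow matrix
      x = np.random.rand(128)
      print("best threads per row:", autotune(A, x))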

  • Scalable and Systolic Montgomery Multipliers over GF(2^m)

    Chin-Chin CHEN  Chiou-Yng LEE  Erl-Huei LU  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E91-A No:7
    Page(s): 1763-1771

    This work presents a novel scalable and systolic Montgomery multiplication algorithm over GF(2^m). The proposed algorithm is based on the Toeplitz matrix-vector representation, which yields a scalable and systolic Montgomery multiplier in a flexible manner that can adapt to the required precision. Analytical results indicate that the proposed multiplier over the generic field GF(2^m) has a latency of d + n(2n+1) clock cycles, where n = ⌈m/d⌉ and d denotes the selected digit size. The latency is reduced to d + n(n+1) clock cycles when the field is constructed from generalized equally-spaced polynomials. For a selected digit size of d ≥ 5 bits, the proposed architectures have lower time-space complexity than traditional digit-serial multipliers. Moreover, the proposed architectures have regularity, modularity, and local interconnection, making them very suitable for VLSI implementation.
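
    For reference, the sketch below shows the classical bit-serial software formulation of Montgomery multiplication over GF(2^m) with Montgomery radix r = x^m, the textbook operation that digit-serial systolic designs parallelize. It is not the paper's architecture; the example field and operands are chosen for illustration.

      # Bit-serial Montgomery multiplication over GF(2^m), Koc-Acar style.
      # Polynomials are Python ints: bit i is the coefficient of x^i.

      def mont_mul_gf2m(a, b, f, m):
          """Return a * b * x^(-m) mod f, with Montgomery radix r = x^m."""
          c = 0
          for i in range(m):
              if (a >> i) & 1:   # coefficient a_i selects whether to add b
                  c ^= b
              if c & 1:          # f has constant term 1, so this clears bit 0
                  c ^= f
              c >>= 1            # exact division by x
          return c

      # Example in GF(2^8) with f(x) = x^8 + x^4 + x^3 + x + 1.
      m, f = 8, 0b100011011
      a, b = 0b01010111, 0b10000011
      print(bin(mont_mul_gf2m(a, b, f, m)))   # a * b * x^(-8) mod f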

  • Low-Complexity Parallel Systolic Montgomery Multipliers over GF(2^m) Using Toeplitz Matrix-Vector Representation

    Chiou-Yng LEE  

     
    PAPER-Circuit Theory

    Vol: E91-A No:6
    Page(s): 1470-1477

    In this paper, a generalized Montgomery multiplication algorithm in GF(2^m) using the Toeplitz matrix-vector representation is presented. The hardware architectures derived from this algorithm provide low-complexity bit-parallel systolic multipliers for trinomials and pentanomials. The results reveal that our proposed multipliers reduce the space complexity by approximately 15% compared with an existing systolic Montgomery multiplier for trinomials. Moreover, the proposed architectures have the features of regularity, modularity, and local interconnection. Accordingly, they are well suited to VLSI implementation.
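
    The Toeplitz matrix-vector product (TMVP) trick that such multipliers exploit is easy to state in software. The Python sketch below stores an n-by-n Toeplitz matrix by its 2n-1 defining entries and applies one level of the two-way split, computing the product over GF(2) with three half-size TMVPs instead of four. This is a textbook formulation of the split, not the paper's systolic architecture.

      # TMVP over GF(2): T[i][j] = t[n-1 + i - j], so T is stored by the
      # 2n-1 entries t. Addition is XOR; multiplication of bits is AND.
      import numpy as np

      def tmvp_naive(t, v):
          n = len(v)
          return np.array([np.bitwise_xor.reduce(
              t[n - 1 + i - np.arange(n)] & v) for i in range(n)])

      def tmvp_split(t, v):
          # One two-way split: T = [[T1, T0], [T2, T1]], V = [V0, V1],
          # computed with 3 half-size products instead of 4.
          n = len(v)
          if n % 2:
              return tmvp_naive(t, v)
          h = n // 2
          t0, t1, t2 = t[:2*h-1], t[h:3*h-1], t[2*h:4*h-1]
          v0, v1 = v[:h], v[h:]
          p0 = tmvp_naive(t0 ^ t1, v1)   # (T0 + T1) V1
          p1 = tmvp_naive(t1, v0 ^ v1)   # T1 (V0 + V1)
          p2 = tmvp_naive(t1 ^ t2, v0)   # (T1 + T2) V0
          return np.concatenate([p0 ^ p1, p1 ^ p2])

      rng = np.random.default_rng(0)
      t, v = rng.integers(0, 2, 15), rng.integers(0, 2, 8)
      assert np.array_equal(tmvp_naive(t, v), tmvp_split(t, v))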