
Keyword Search Result

[Keyword] MPO (945 hits)

81-100 of 945 hits

  • Tensor Factor Analysis for Arbitrary Speaker Conversion

    Daisuke SAITO  Nobuaki MINEMATSU  Keikichi HIROSE  

     
    PAPER-Speech and Hearing
    Publicized: 2020/03/13
    Vol: E103-D No:6
    Page(s): 1395-1405

    This paper describes a novel approach to flexible control of speaker characteristics using a tensor representation of multiple Gaussian mixture models (GMMs). In voice conversion studies, realizing conversion from/to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice GMM (EV-GMM) was proposed. In EVC, a speaker space is constructed from GMM supervectors, which are high-dimensional vectors derived by concatenating the mean vectors of each speaker GMM. In the speaker space, each speaker is represented by a small number of weight parameters over eigen-supervectors. In this paper, we revisit construction of the speaker space by introducing tensor factor analysis of the training data set. In our approach, each speaker is represented as a matrix whose rows and columns correspond to the dimensions of the mean vector and the Gaussian components, respectively. The speaker space is derived by tensor factor analysis of the set of these matrices. Our approach solves an inherent problem of the supervector representation and improves the performance of voice conversion. In addition, the effects of speaker adaptive training before factorization are also investigated. Experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach.
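
    As a rough illustration of the matrix-based speaker representation described above (a minimal sketch with made-up shapes, not the authors' formulation), the following Python snippet stacks per-speaker GMM mean matrices into a third-order tensor and extracts a low-rank speaker basis via a speaker-mode unfolding and SVD; each speaker then becomes a handful of weights over "eigen mean matrices" rather than a single long supervector.

        import numpy as np

        # Hypothetical sizes: S speakers, D-dimensional mean vectors, K Gaussian components.
        S, D, K, r = 20, 24, 64, 5
        rng = np.random.default_rng(0)
        speaker_mats = rng.normal(size=(S, D, K))   # stand-in for adapted GMM mean matrices

        # Speaker-mode unfolding: one flattened D*K row per speaker.
        unfold = speaker_mats.reshape(S, D * K)
        mean = unfold.mean(axis=0)
        U, s, Vt = np.linalg.svd(unfold - mean, full_matrices=False)

        weights = U[:, :r] * s[:r]          # (S, r): compact per-speaker representation
        basis = Vt[:r].reshape(r, D, K)     # rank-r basis of mean matrices

        # Reconstruct one speaker's mean matrix from its r weights.
        recon = mean.reshape(D, K) + np.tensordot(weights[0], basis, axes=1)
        print(recon.shape)                  # (D, K)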

  • Temporally Forward Nonlinear Scale Space for High Frame Rate and Ultra-Low Delay A-KAZE Matching System

    Songlin DU  Yuan LI  Takeshi IKENAGA  

     
    PAPER
    Publicized: 2020/03/06
    Vol: E103-D No:6
    Page(s): 1226-1235

    High frame rate and ultra-low delay are the most essential requirements for building excellent human-machine-interaction systems. As a state-of-the-art local keypoint detection and feature extraction algorithm, A-KAZE shows high accuracy and robustness. The nonlinear scale space is one of the most important modules in A-KAZE, but it not only introduces at least one frame of delay but also is not hardware friendly. This paper proposes a hardware-oriented nonlinear scale space for a high frame rate and ultra-low delay A-KAZE matching system. In the proposed matching system, one part of the nonlinear scale space is moved temporally forward and calculated in the previous frame (proposal #1), so that the processing delay is reduced to less than 1 ms. To compensate for the accuracy loss caused by proposal #1, pre-adjustment of the nonlinear scale (proposal #2) is proposed: the previous two frames are used for motion estimation to predict the motion vector between the previous frame and the current frame. For further improvement of matching accuracy, pixel-level pre-adjustment (proposal #3) is proposed: the pre-adjustment changes from block level to pixel level, and each pixel is assigned a unique motion vector. Experimental results show that the proposed matching system achieves an average matching accuracy higher than 95%, which is 5.88% higher than the existing high frame rate and ultra-low delay matching system. As for hardware performance, the proposed matching system processes VGA video (640×480 pixels/frame) at a speed of 784 frames/second (fps) with a delay of 0.978 ms/frame.
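
    For readers unfamiliar with the term, the "nonlinear scale space" in KAZE/A-KAZE is a stack of images produced by nonlinear diffusion. The following minimal Python sketch (an explicit Perona-Malik step on a [0,1]-normalized image; A-KAZE itself uses faster FED-style solvers, and the parameters here are made up) shows the kind of computation that proposal #1 moves forward to the previous frame.

        import numpy as np

        def pm_diffusion_step(img, k=0.02, dt=0.2):
            # One explicit Perona-Malik step: smooth the image while preserving
            # strong edges (conductivity drops where the gradient is large).
            gy, gx = np.gradient(img)                    # central-difference gradients
            g = 1.0 / (1.0 + (gx**2 + gy**2) / k**2)     # g2-type conductivity
            div = np.gradient(g * gy, axis=0) + np.gradient(g * gx, axis=1)
            return img + dt * div

        def toy_scale_space(img, n_levels=4, steps_per_level=20):
            # Progressively diffused copies standing in for the nonlinear scale space.
            levels, cur = [], img.astype(np.float64)
            for _ in range(n_levels):
                levels.append(cur)
                for _ in range(steps_per_level):
                    cur = pm_diffusion_step(cur)
            return levels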

  • Temporal Constraints and Block Weighting Judgement Based High Frame Rate and Ultra-Low Delay Mismatch Removal System

    Songlin DU  Zhe WANG  Takeshi IKENAGA  

     
    PAPER
    Publicized: 2020/03/18
    Vol: E103-D No:6
    Page(s): 1236-1246

    A high frame rate and ultra-low delay matching system plays an increasingly important role in human-machine interactions, because it guarantees high-quality experiences for users. Existing image matching algorithms always generate mismatches, which heavily weaken the performance of human-machine-interactive systems. Although many mismatch removal algorithms have been proposed, few of them achieve real-time speed with high frame rate and low delay, because of complicated arithmetic operations and iterations. This paper proposes a temporal constraints and block weighting judgement based high frame rate and ultra-low delay mismatch removal system. The proposed method first finds some true matches based on two temporal constraints (proposals #1 and #2), and then uses these true matches to generate block weightings (proposal #3). Proposal #1 finds correct matches by checking a triangle route formed by three adjacent frames. Proposal #2 further reduces the mismatch risk by adding one more matching pass in the opposite matching direction. Finally, proposal #3 judges the remaining unverified matches as correct or incorrect through the weighting of each block. Software experiments show that the proposed mismatch removal system achieves state-of-the-art accuracy in mismatch removal. Hardware experiments indicate that the designed image processing core achieves real-time processing of 784 fps VGA (640×480 pixels/frame) video on a field programmable gate array (FPGA), with a delay of 0.858 ms/frame.
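
    As a hedged sketch of the triangle-route idea in proposal #1 (illustrative only; the indices, coordinates and tolerance below are made up, and the real system operates on hardware match tables), the following Python function keeps a match between frame t-2 and frame t only when it agrees with the two-hop route through frame t-1.

        import math

        def triangle_consistent(kp_curr, m_direct, m_back, m_fwd, tol=1.5):
            # kp_curr : {index: (x, y)} keypoints of the current frame t
            # m_direct: {i: j} matches frame t-2 -> t
            # m_back  : {i: k} matches frame t-2 -> t-1
            # m_fwd   : {k: j} matches frame t-1 -> t
            kept = []
            for i, j_direct in m_direct.items():
                k = m_back.get(i)
                j_via = m_fwd.get(k) if k is not None else None
                if j_via is None:
                    continue
                (x1, y1), (x2, y2) = kp_curr[j_direct], kp_curr[j_via]
                if math.hypot(x1 - x2, y1 - y2) <= tol:
                    kept.append((i, j_direct))
            return kept

        kp = {0: (10.0, 5.0), 1: (40.0, 22.0)}
        print(triangle_consistent(kp, {7: 0, 8: 1}, {7: 3, 8: 4}, {3: 0, 4: 0}))  # [(7, 0)]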

  • Composition Proposal Generation for Manga Creation Support

    Hironori ITO  Yasuhito ASANO  

     
    PAPER
    Publicized: 2019/12/27
    Vol: E103-D No:5
    Page(s): 949-957

    In recent years, manga has become widely recognized and used, and more and more people use it for various purposes such as entertainment, study, and marketing. However, when people who do not specialize in manga create it for these purposes, they can write plots expressing what they want to convey, but the composition technique, which arranges elements such as characters and speech balloons according to the plot, becomes an obstacle to exploiting manga's merit of comprehensibility, which stems from the high flexibility of its expression. Therefore, we consider that support for this composition technique is necessary for amateurs to use manga while taking advantage of its benefits. We propose a method of generating composition proposals to support manga creation by amateurs. For this method, we also define a new manga metadata model which summarizes and extends the metadata models of earlier studies; it represents both the composition and the plot in manga. We apply a neural machine translation mechanism to learn the relation between the composition and the plot: the plot annotation is treated as the source and the composition annotation as the target, and the model learns from an annotation dataset based on the metadata model. We conducted experiments to evaluate how the composition proposals generated by our method help amateur manga creation, and demonstrated that the method is useful.

  • Optimization Problems for Consecutive-k-out-of-n:G Systems

    Lei ZHOU  Hisashi YAMAMOTO  Taishin NAKAMURA  Xiao XIAO  

     
    PAPER-Reliability, Maintainability and Safety Analysis
    Vol: E103-A No:5
    Page(s): 741-748

    A consecutive-k-out-of-n:G system consists of n components arranged in a line, and the system works if and only if at least k consecutive components work. This paper discusses optimization problems for a consecutive-k-out-of-n:G system. We first focus on the optimal number of components at the system design phase. Then, we focus on the optimal replacement time at the system operation phase by considering a preventive replacement policy, in which the system is replaced at the planned time or at the time of system failure, whichever occurs first. The expected cost rates of the two optimization problems are taken as the objective functions to be minimized. Finally, we give case studies for the proposed optimization problems and evaluate the feasibility of the policies.
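
    For concreteness (a minimal sketch, not code from the paper; the reliabilities and k below are made up), the system reliability that underlies both optimization problems can be evaluated by a simple dynamic program over the trailing run of working components:

        def consecutive_k_out_of_n_G(p, k):
            # Probability that at least k consecutive components work,
            # given independent component reliabilities p[0..n-1].
            # state[j] = P(no working run of length k yet, trailing run length = j)
            state = [1.0] + [0.0] * (k - 1)
            for pi in p:
                new = [0.0] * k
                new[0] = (1.0 - pi) * sum(state)   # component fails: run resets
                for j in range(k - 1):
                    new[j + 1] += pi * state[j]    # component works: run grows
                state = new                        # runs reaching length k leave the DP
            return 1.0 - sum(state)

        print(consecutive_k_out_of_n_G([0.9] * 10, 3))   # 10 components, k = 3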

  • Successive Interference Cancellation of ICA-Aided SDMA for GFSK Signaling in BLE Systems

    Masahiro TAKIGAWA  Shinsuke IBI  Seiichi SAMPEI  

     
    PAPER-Fundamental Theories for Communications
    Publicized: 2019/11/12
    Vol: E103-B No:5
    Page(s): 495-503

    This paper proposes successive interference cancellation (SIC) for independent component analysis (ICA) aided spatial division multiple access (SDMA) with Gaussian filtered frequency shift keying (GFSK) signaling in Bluetooth low energy (BLE) systems. A typical SDMA scheme requires estimation of channel state information (CSI) using orthogonal pilot sequences. However, no orthogonal pilot is embedded in the BLE packet. This fact motivates us to add an ICA detector to BLE systems. In this paper, by focusing on the covariance matrix of the ICA outputs, SIC can be applied with the aid of Cholesky decomposition. Then, in order to address the phase ambiguity introduced by the ICA process, we propose a differential detection scheme based on the maximum a posteriori (MAP) algorithm. In practical scenarios, the system is subject to carrier frequency offset (CFO) as well as symbol timing offset (STO) induced by hardware impairments in the BLE peripherals. The packet error rate (PER) performance is evaluated by computer simulations in which BLE peripherals communicate simultaneously in the presence of CFO and STO.

  • A Retrieval Method for 3D CAD Assembly Models Using 3D Radon Transform and Spherical Harmonic Transform

    Kaoru KATAYAMA  Takashi HIRASHIMA  

     
    PAPER
    Publicized: 2020/02/20
    Vol: E103-D No:5
    Page(s): 992-1001

    We present a retrieval method for 3D CAD assemblies consisting of multiple components. The proposed method distinguishes not only the shapes of 3D CAD assemblies but also the layouts of their components. The similarity between two assemblies is computed from feature quantities of the components constituting the assemblies. In order to make the similarity robust to translation and rotation of an assembly in 3D space, we use the 3D Radon transform and the spherical harmonic transform. Experimental evaluation shows that this method achieves better retrieval precision and efficiency than the methods compared against.

  • Ergodic Capacity of Composite Fading Channels in Cognitive Radios with Series Formula for Product of κ-µ and α-µ Fading Distributions

    He HUANG  Chaowei YUAN  

     
    PAPER-Antennas and Propagation
    Publicized: 2019/10/08
    Vol: E103-B No:4
    Page(s): 458-466

    In this study, the product of two independent and non-identically distributed (i.n.i.d.) random variables (RVs) following the κ-µ fading distribution and the α-µ fading distribution is considered. The statistics of products of RVs have been broadly applied in many communications fields, such as cascaded fading channels, multiple input multiple output (MIMO) systems, radar communications and cognitive radios (CR). Exact closed-form expressions of the probability density function (PDF) and cumulative distribution function (CDF), with exact series formulas, are derived for the product of the two i.n.i.d. κ-µ and α-µ fading distributions, accurately representing the product expressions and generalized composite multipath/shadowing models. Furthermore, the ergodic channel capacity (ECC) is obtained to measure the maximum fading channel capacity. Finally, unlike the κ-µ, η-µ, and α-µ results in [9], [17], [18], these analytical results are validated with Monte Carlo simulations, which show that, for the provided κ-µ/α-µ model, the non-linearity parameter has a stronger influence on the PDF and CDF than the multipath component, and that when the ratio between the total power of the dominant components and the total power of the scattered waves is the same, a higher α can significantly improve the channel capacity over composite fading channels.
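
    As a hedged illustration of the Monte Carlo validation mentioned above (illustrative parameters, integer µ assumed for the κ-µ branch, unit-power normalization; not the authors' series expressions), the ergodic capacity of the cascaded κ-µ × α-µ channel can be estimated as follows:

        import numpy as np

        rng = np.random.default_rng(1)
        N = 1_000_000
        kappa, mu = 2.0, 2        # kappa-mu: dominant-to-scattered power ratio, clusters
        alpha, mu_a = 3.0, 1.5    # alpha-mu: non-linearity parameter, clusters
        gbar = 10.0               # average SNR (linear)

        # kappa-mu envelope: R1^2 is a scaled noncentral chi-square with 2*mu
        # degrees of freedom and noncentrality 2*mu*kappa, normalized to E[R1^2] = 1.
        sigma2 = 1.0 / (2 * mu * (1 + kappa))
        r1_sq = sigma2 * rng.noncentral_chisquare(2 * mu, 2 * mu * kappa, N)

        # alpha-mu envelope: R2^alpha ~ Gamma(mu_a, 1/mu_a), so E[R2^alpha] = 1.
        r2 = rng.gamma(mu_a, 1.0 / mu_a, N) ** (1.0 / alpha)

        z_sq = r1_sq * r2**2                  # squared product envelope
        snr = gbar * z_sq / z_sq.mean()       # instantaneous SNR with mean gbar
        print(np.log2(1.0 + snr).mean())      # ergodic capacity estimate, bit/s/Hz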

  • Low Delay 4K 120fps HEVC Decoder with Parallel Processing Architecture

    Ken NAKAMURA  Daisuke KOBAYASHI  Yuya OMORI  Tatsuya OSAWA  Takayuki ONISHI  Koyo NITTA  Hiroe IWASAKI  

     
    PAPER
    Vol: E103-C No:3
    Page(s): 77-84

    In this paper, we describe a novel low-delay 4K 120-fps real-time HEVC decoder with a parallel processing architecture that conforms to the HEVC main 4:2:2 10 profile. It supports the hierarchical temporal scalable streams required for Ultra High Definition high-frame-rate broadcasting and also supports low-delay and high-bitrate decoding for video transmission uses. To achieve this support, the decoding processes are parallelized and pipelined at the frame level, slice level, and coding tree unit row level. The proposed decoder was implemented on three FPGAs operated at 133 and 150 MHz, and it achieved 300-Mbps stream decoding and 37-msec end-to-end delay with our concurrently developed 4K 120-fps encoder.

  • Local Memory Mapping of Multicore Processors on an Automatic Parallelizing Compiler

    Yoshitake OKI  Yuto ABE  Kazuki YAMAMOTO  Kohei YAMAMOTO  Tomoya SHIRAKAWA  Akimasa YOSHIDA  Keiji KIMURA  Hironori KASAHARA  

     
    PAPER
    Vol: E103-C No:3
    Page(s): 98-109

    Utilization of local memory, from real-time embedded systems to high-performance systems with multi-core processors, has become an important factor for satisfying hard deadline constraints. However, challenges lie in efficiently managing the memory hierarchy, such as decomposing large data into small blocks that fit onto local memory and transferring blocks for reuse and replacement. To address this issue, this paper presents a compiler optimization method that automatically manages the local memory of multi-core processors. The method selects and maps multi-dimensional data onto software-specified memory blocks called Adjustable Blocks, which are hierarchically divisible with varying sizes determined by the features of the input application. Moreover, the method introduces mapping structures called Template Arrays to maintain the indices of the decomposed multi-dimensional data. The proposed work is implemented in the OSCAR automatic parallelizing compiler, and evaluations were performed on the Renesas RP2 8-core processor. Experimental results from the NAS Parallel Benchmarks, the SPEC benchmarks, and multimedia applications show the effectiveness of the method, obtaining a maximum speed-up of 20.44 with 8 cores utilizing local memory over single-core sequential versions that use off-chip memory.
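
    To make the block-decomposition idea concrete (a minimal sketch under assumed sizes; Adjustable Blocks and Template Arrays are compiler-internal structures and are not reproduced here), the following Python fragment picks a tile size that fits an assumed local-memory budget and processes an array tile by tile, mimicking a DMA-in / compute / DMA-out loop:

        import numpy as np

        LOCAL_MEM_WORDS = 16 * 1024              # assumed local-memory capacity in words

        def choose_block(shape, words_per_elem=1, budget=LOCAL_MEM_WORDS):
            # Halve the longer tile dimension until the tile fits the budget.
            bh, bw = shape
            while bh * bw * words_per_elem > budget:
                if bh >= bw:
                    bh = (bh + 1) // 2
                else:
                    bw = (bw + 1) // 2
            return bh, bw

        def tiled_scale(a, alpha):
            bh, bw = choose_block(a.shape)
            out = np.empty_like(a)
            for i in range(0, a.shape[0], bh):
                for j in range(0, a.shape[1], bw):
                    tile = a[i:i+bh, j:j+bw].copy()      # "DMA" the block in
                    out[i:i+bh, j:j+bw] = alpha * tile   # compute on local data
            return out

        print(tiled_scale(np.ones((512, 512)), 2.0).sum())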

  • Temporal Domain Difference Based Secondary Background Modeling Algorithm

    Guowei TENG  Hao LI  Zhenglong YANG  

     
    LETTER-Communication Theory and Signals
    Vol: E103-A No:2
    Page(s): 571-575

    This paper proposes a temporal domain difference based secondary background modeling algorithm for surveillance video coding. The proposed algorithm makes three key technical contributions. Firstly, the LDBCBR (Long Distance Block Composed Background Reference) algorithm is proposed, which exploits IBBS (interval of background blocks searching) to weaken the temporal correlation of the foreground. Secondly, both BCBR (Block Composed Background Reference) and LDBCBR are exploited at the same time to generate a temporary background reference frame; the secondary modeling algorithm then utilizes the temporary background blocks generated by BCBR and LDBCBR to obtain the final background frame. Thirdly, monitoring the background reference frame after it is generated is also important: background blocks are updated immediately when they change significantly, the modeling period is shortened in areas where the foreground moves frequently, and stable background regions are checked regularly. The proposed algorithm is implemented on the IEEE 1857 platform, and the experimental results demonstrate a significant improvement in coding efficiency. On surveillance test sequences recommended by the China AVS (Advanced Audio Video Standard) working group, our method achieves BD-rate gains of 6.81% and 27.30% compared with BCBR and the baseline profile, respectively.
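
    As a hedged, toy illustration of temporal-difference background modeling (block size, interval and threshold are made up; the actual BCBR/LDBCBR logic inside IEEE 1857 is more elaborate), a block can be committed to the background once two frames a long interval apart agree on it:

        import numpy as np

        def block_background(frames, block=16, interval=8, thresh=4.0):
            # frames: list of equally sized grayscale frames (2-D arrays)
            h, w = frames[0].shape
            bg = np.zeros((h, w), dtype=np.float32)
            filled = np.zeros((h // block, w // block), dtype=bool)
            for t in range(len(frames) - interval):
                diff = np.abs(frames[t + interval].astype(np.float32) - frames[t])
                for bi in range(h // block):
                    for bj in range(w // block):
                        if filled[bi, bj]:
                            continue
                        ys, xs = bi * block, bj * block
                        if diff[ys:ys+block, xs:xs+block].mean() < thresh:
                            # frames far apart agree: take the block as background
                            bg[ys:ys+block, xs:xs+block] = frames[t + interval][ys:ys+block, xs:xs+block]
                            filled[bi, bj] = True
            return bg, filled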

  • Recurrent Neural Network Compression Based on Low-Rank Tensor Representation

    Andros TJANDRA  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Music Information Processing
    Publicized: 2019/10/17
    Vol: E103-D No:2
    Page(s): 435-449

    Recurrent Neural Networks (RNNs) have achieved many state-of-the-art results on various complex tasks related to temporal and sequential data. However, most of these RNNs require much computational power and a huge number of parameters for both the training and inference stages. In this work, several tensor decomposition methods, including CANDECOMP/PARAFAC (CP), Tucker decomposition and Tensor Train (TT), are used to re-parameterize the Gated Recurrent Unit (GRU) RNN. First, we evaluate the performance of all tensor-based RNNs on sequence modeling tasks with various numbers of parameters. Based on our experimental results, TT-GRU achieved the best results across various numbers of parameters compared to the other decomposition methods. Later, we evaluate the proposed TT-GRU on a speech recognition task by compressing the bidirectional GRU layers inside the DeepSpeech2 architecture. Based on our experimental results, the proposed TT-format GRU is able to preserve the performance while significantly reducing the number of GRU parameters compared to the uncompressed GRU.
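
    The parameter savings from a Tensor Train re-parameterization are easy to quantify. The sketch below (a made-up factorization of a 256×1024 weight matrix with TT-ranks (1,4,4,1); not the paper's exact configuration) compares the dense parameter count with the TT-core count:

        def tt_params(dims_in, dims_out, ranks):
            # TT-matrix cores have shape ranks[k] x dims_in[k] x dims_out[k] x ranks[k+1].
            assert len(dims_in) == len(dims_out) == len(ranks) - 1
            return sum(ranks[k] * dims_in[k] * dims_out[k] * ranks[k + 1]
                       for k in range(len(dims_in)))

        dense = 256 * 1024                                   # 262144 parameters
        tt = tt_params((4, 8, 8), (8, 8, 16), (1, 4, 4, 1))  # 1664 parameters
        print(dense, tt, dense / tt)                         # compression ratio ~157x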

  • Fully Homomorphic Encryption Scheme Based on Decomposition Ring Open Access

    Seiko ARITA  Sari HANDA  

     
    PAPER
    Vol: E103-A No:1
    Page(s): 195-211

    In this paper, we propose the decomposition ring homomorphic encryption scheme, that is, a homomorphic encryption scheme built on the decomposition ring, which is a subring of a cyclotomic ring. By using the decomposition ring, the structure of the plaintext slots becomes ℤ_p^l, instead of GF(p^d) as in conventional schemes on the cyclotomic ring. For homomorphic multiplication of integers, one can use the full ℤ_p^l slot structure in the proposed scheme, whereas in conventional schemes one can use only the one-dimensional subspace GF(p) in each GF(p^d) slot. This allows us to realize fast and compact homomorphic encryption for integer plaintexts. In fact, our benchmark results indicate that our decomposition ring homomorphic encryption scheme is several times faster than HElib for integer plaintexts, owing to its higher degree of parallel computation.
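
    For context on the slot structure (standard cyclotomic-ring arithmetic, not code from the paper): with plaintext prime p coprime to the conductor m, conventional schemes obtain l = φ(m)/d slots, each isomorphic to GF(p^d), where d is the multiplicative order of p modulo m. A small Python helper computing d and l:

        from math import gcd

        def euler_phi(m):
            result, n, q = m, m, 2
            while q * q <= n:
                if n % q == 0:
                    while n % q == 0:
                        n //= q
                    result -= result // q
                q += 1
            if n > 1:
                result -= result // n
            return result

        def slot_structure(m, p):
            # d = multiplicative order of p mod m; number of slots l = phi(m) / d.
            assert gcd(m, p) == 1
            d, x = 1, p % m
            while x != 1:
                x = (x * p) % m
                d += 1
            return d, euler_phi(m) // d

        print(slot_structure(257, 2))   # (16, 16): 16 slots of GF(2^16) on the cyclotomic ring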

  • Unconventional Jamming Scheme for Multiple Quadrature Amplitude Modulations Open Access

    Shaoshuai ZHUANSUN  Jun-an YANG  Cong TANG  

     
    PAPER-Transmission Systems and Transmission Equipment for Communications
    Publicized: 2019/04/05
    Vol: E102-B No:10
    Page(s): 2036-2044

    It is generally believed that jamming signals similar to communication signals tend to demonstrate better jamming effects. We believe that this conclusion only holds in certain situations. To select the correct jamming scheme for a multi-level quadrature amplitude modulation (MQAM) signal in a complex environment, an optimal jamming method based on orthogonal decomposition (OD) is proposed. The method solves the jamming problem from the perspective of the in-phase dimension and the quadrature dimension, and exhibits a better jamming effect than conventional methods. The method can construct various unconventional jamming schemes to cope with a complex environment, and it can also be used to verify existing jamming schemes. Experimental results demonstrate that when the jammer ideally knows the received power at the receiver, the proposed method always achieves the optimal jamming effect, and the constructed unconventional jamming scheme has an excellent jamming effect compared with conventional schemes in the case of constellation distortion.

  • Hardware-Based Principal Component Analysis for Hybrid Neural Network Trained by Particle Swarm Optimization on a Chip

    Tuan Linh DANG  Yukinobu HOSHINO  

     
    PAPER-Neural Networks and Bioengineering
    Vol: E102-A No:10
    Page(s): 1374-1382

    This paper presents a hybrid architecture for a neural network (NN) trained by a particle swarm optimization (PSO) algorithm. The NN is implemented on the hardware side while the PSO is executed by a processor on the software side. In addition, principal component analysis (PCA) is applied to reduce correlated information. The PCA module is implemented in hardware in the SystemVerilog language to increase operating speed. Experimental results showed that the proposed architecture was successfully implemented. In addition, the hardware-based NN trained by PSO (NN-PSO) was faster than the software-based NN trained by PSO. The proposed NN-PSO with PCA also obtained better recognition rates than the NN-PSO without PCA.
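
    As a plain reference for what the PCA module computes (a minimal NumPy sketch with random data; the hardware implementation in SystemVerilog obviously differs), PCA here amounts to mean removal, covariance estimation, eigendecomposition, and projection onto the leading components before the features reach the NN:

        import numpy as np

        def pca_project(X, n_components):
            Xc = X - X.mean(axis=0)                   # remove the mean
            cov = Xc.T @ Xc / (len(X) - 1)            # sample covariance
            vals, vecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
            W = vecs[:, ::-1][:, :n_components]       # keep the top components
            return Xc @ W, W

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 8))                 # 8-D raw feature vectors
        Z, W = pca_project(X, 3)                      # 3-D decorrelated features
        print(Z.shape)                                # (100, 3)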

  • A Note on the Zero-Difference Balanced Functions with New Parameters

    Shanding XU  Xiwang CAO  Jian GAO  

     
    LETTER-Cryptography and Information Security
    Vol: E102-A No:10
    Page(s): 1402-1405

    As a generalization of perfect nonlinear (PN) functions, zero-difference balanced (ZDB) functions play an important role in coding theory, cryptography and communications engineering. Inspired by earlier work of Liu et al. [1], we present a class of ZDB functions with new parameters based on cyclotomy in finite fields. Employing these ZDB functions, we simultaneously obtain optimal constant composition codes and perfect difference systems of sets.
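
    The defining property is easy to check by brute force on small examples. The sketch below (a generic checker on the additive group Z_n, not the cyclotomic construction of the letter) verifies that f(x) = x^2 mod p is zero-difference balanced with λ = 1 on Z_p:

        def is_zdb(f, n):
            # f is ZDB on (Z_n, +) iff #{x : f(x + a) = f(x)} is the same
            # constant lambda for every nonzero shift a.
            counts = {sum(f((x + a) % n) == f(x) for x in range(n))
                      for a in range(1, n)}
            return len(counts) == 1, counts

        # f(x + a) = f(x) mod p forces 2ax + a^2 = 0, i.e. exactly one x per a != 0.
        print(is_zdb(lambda x: (x * x) % 7, 7))   # (True, {1})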

  • Adaptive Multi-Scale Tracking Target Algorithm through Drone

    Qiusheng HE  Xiuyan SHAO  Wei CHEN  Xiaoyun LI  Xiao YANG  Tongfeng SUN  

     
    PAPER
    Publicized: 2019/04/26
    Vol: E102-B No:10
    Page(s): 1998-2005

    To address the influence of scale change on target tracking from a drone, a multi-scale target tracking algorithm based on color features is proposed. The algorithm realizes adaptive scale tracking by training position and scale correlation filters. It first obtains the target center position in the next frame by computing the maximum of the response, where the position correlation filter is learned by a least-squares classifier and the dimensionality of the color features is reduced by principal component analysis. The scale correlation filter is obtained from the color characteristics of 33 rectangular areas set by the scale factor around the central location, with their dimensionality reduced by orthogonal triangular decomposition. Finally, the location and size of the target are updated by the maximum of the response. Tests on 13 challenging video sequences taken by a drone show that the algorithm adapts to changes in the target scale, and that its robustness, along with many other performance indicators, is better than that of the most state-of-the-art methods under illumination variation, fast motion, motion blur and other complex situations.
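
    As a small, hedged illustration of the multi-scale search (the 1.02 step and size handling are assumptions; only the "33 candidate areas set by the scale factor around the current location" structure comes from the abstract), the candidate scales and corresponding target sizes can be generated as follows:

        import numpy as np

        def scale_factors(n_scales=33, step=1.02):
            # Symmetric geometric ladder of scales around the current target size.
            exps = np.arange(n_scales) - (n_scales - 1) // 2
            return step ** exps

        def candidate_sizes(w, h, factors):
            return [(max(1, int(round(w * s))), max(1, int(round(h * s)))) for s in factors]

        sizes = candidate_sizes(64, 48, scale_factors())
        print(sizes[0], sizes[16], sizes[-1])   # smallest, current, largest candidate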

  • On the Competitive Analysis for the Multi-Objective Time Series Search Problem

    Toshiya ITOH  Yoshinori TAKEI  

     
    PAPER-Optimization
    Vol: E102-A No:9
    Page(s): 1150-1158

    For the multi-objective time series search problem, Hasegawa and Itoh [Theoretical Computer Science, Vol.78, pp.58-66, 2018] presented the best possible online algorithm, the balanced price policy, for any monotone function f: R^k → R. Specifically, the competitive ratio with respect to the monotone function f(c_1, ..., c_k) = (c_1 + ... + c_k)/k is referred to as the arithmetic mean component competitive ratio. Hasegawa and Itoh derived the explicit representation of the arithmetic mean component competitive ratio for k=2, but it has not been known for any integer k≥3. In this paper, we derive the explicit representations of the arithmetic mean component competitive ratio for k=3 and k=4, respectively. On the other hand, we show that it is computationally difficult to derive the explicit representation of the arithmetic mean component competitive ratio for arbitrary integer k in a way similar to the cases for k=2, 3, and 4.

  • Differences among Summation Polynomials over Various Forms of Elliptic Curves

    Chen-Mou CHENG  Kenta KODERA  Atsuko MIYAJI  

     
    PAPER-Cryptography and Information Security
    Vol: E102-A No:9
    Page(s): 1061-1071

    The security of elliptic curve cryptography is closely related to the computational complexity of the elliptic curve discrete logarithm problem (ECDLP). Today, the best practical attacks against ECDLP are exponential-time generic discrete logarithm algorithms such as Pollard's rho method. A recent line of inquiry in index calculus for ECDLP started by Semaev, Gaudry, and Diem has shown that, under certain heuristic assumptions, such algorithms could lead to subexponential attacks on ECDLP. In this study, we investigate the computational complexity of ECDLP for elliptic curves in various forms — including Hessian, Montgomery, (twisted) Edwards, and Weierstrass representations — using index calculus, aiming to determine whether there is any significant difference in the computational complexity of ECDLP among these curve forms. We provide empirical evidence and insight showing an affirmative answer in this paper.
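
    As a concrete aside (a toy sanity check over a small prime field, unrelated to the parameter sizes studied in the paper), Semaev's third summation polynomial for a short Weierstrass curve y^2 = x^3 + Ax + B vanishes exactly on x-coordinates of points summing to the identity, which is the relation that index calculus exploits:

        # Toy check: S3(x_P, x_Q, x_{P+Q}) == 0 on y^2 = x^3 + A*x + B over F_p.
        p, A, B = 10007, 2, 3

        def add(P, Q):
            # Affine addition (P, Q assumed finite and not mutual inverses here).
            (x1, y1), (x2, y2) = P, Q
            if x1 == x2:
                lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p
            else:
                lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
            x3 = (lam * lam - x1 - x2) % p
            return x3, (lam * (x1 - x3) - y1) % p

        def S3(x1, x2, x3):
            # Semaev's third summation polynomial for short Weierstrass curves.
            return ((x1 - x2) ** 2 * x3 ** 2
                    - 2 * ((x1 + x2) * (x1 * x2 + A) + 2 * B) * x3
                    + (x1 * x2 - A) ** 2 - 4 * B * (x1 + x2)) % p

        def lift(x):
            # Return a point with this x-coordinate, if one exists (p = 3 mod 4 here).
            y2 = (x ** 3 + A * x + B) % p
            y = pow(y2, (p + 1) // 4, p)
            return (x, y) if y * y % p == y2 else None

        pts = [q for q in (lift(x) for x in range(2, 200)) if q]
        P, Q = pts[0], pts[1]
        R = add(P, Q)
        print(S3(P[0], Q[0], R[0]))   # 0: the three x-coordinates satisfy S3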

  • TDCTFIC: A Novel Recommendation Framework Fusing Temporal Dynamics, CNN-Based Text Features and Item Correlation

    Meng Ting XIONG  Yong FENG  Ting WU  Jia Xing SHANG  Bao Hua QIANG  Ya Nan WANG  

     
    PAPER-Data Engineering, Web Information Systems
    Publicized: 2019/05/14
    Vol: E102-D No:8
    Page(s): 1517-1525

    A traditional recommendation system (RS) can learn the potential personal preferences of users and the potential attribute characteristics of items from the rating records between users and items, and use them to make recommendations. However, for new items with no historical rating records, a traditional RS usually suffers from the typical cold start problem. Additional auxiliary information has usually been used in item cold start recommendation; we further bring temporal dynamics, text, and relevance into our models to relieve the item cold start problem. Two new cold start recommendation models, TmTx (Time, Text) and TmTI (Time, Text, Item correlation), are proposed to solve the item cold start problem for different cold start scenarios. Well-known methods such as TimeSVD++ and CoFactor partially take temporal dynamics, comments, and item correlations into consideration to address the cold start problem, but none of them combines these kinds of information together. The two models proposed in this paper fuse features such as time, text, and relevance, and can effectively improve performance under item cold start. We select a convolutional neural network (CNN) to extract features from the item description text, which gives the models the ability to deal with cold start items. Both proposed models can effectively improve performance under item cold start. Experimental results on three real-world data sets show that our proposed models lead to significant improvement compared with the baseline methods.
