The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] Z(5900hit)

1141-1160hit(5900hit)

  • A New Algorithm for Reducing Components of a Gaussian Mixture Model

    Naoya YOKOYAMA  Daiki AZUMA  Shuji TSUKIYAMA  Masahiro FUKUI  

     
    PAPER

      Vol:
    E99-A No:12
      Page(s):
    2425-2434

    In statistical methods, such as statistical static timing analysis, Gaussian mixture model (GMM) is a useful tool for representing a non-Gaussian distribution and handling correlation easily. In order to repeat various statistical operations such as summation and maximum for GMMs efficiently, the number of components should be restricted around two. In this paper, we propose a method for reducing the number of components of a given GMM to two (2-GMM). Moreover, since the distribution of each component is represented often by a linear combination of some explanatory variables, we propose a method to compute the covariance between each explanatory variable and the obtained 2-GMM, that is, the sensitivity of 2-GMM to each explanatory variable. In order to evaluate the performance of the proposed methods, we show some experimental results. The proposed methods minimize the normalized integral square error of probability density function of 2-GMM by the sacrifice of the accuracy of sensitivities of 2-GMM.

  • A Highly-Adaptable and Small-Sized In-Field Power Analyzer for Low-Power IoT Devices

    Ryosuke KITAYAMA  Takashi TAKENAKA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E99-A No:12
      Page(s):
    2348-2362

    Power analysis for IoT devices is strongly required to protect attacks from malicious attackers. It is also very important to reduce power consumption itself of IoT devices. In this paper, we propose a highly-adaptable and small-sized in-field power analyzer for low-power IoT devices. The proposed power analyzer has the following advantages: (A) The proposed power analyzer realizes signal-averaging noise reduction with synchronization signal lines and thus it can reduce wide frequency range of noises; (B) The proposed power analyzer partitions a long-term power analysis process into several analysis segments and measures voltages and currents of each analysis segment by using small amount of data memories. By combining these analysis segments, we can obtain long-term analysis results; (C) The proposed power analyzer has two amplifiers that amplify current signals adaptively depending on their magnitude. Hence maximum readable current can be increased with keeping minimum readable current small enough. Since all of (A), (B) and (C) do not require complicated mechanisms nor circuits, the proposed power analyzer is implemented on just a 2.5cm×3.3cm board, which is the smallest size among the other existing power analyzers for IoT devices. We have measured power and energy consumption of the AES encryption process on the IoT device and demonstrated that the proposed power analyzer has only up to 1.17% measurement errors compared to a high-precision oscilloscope.

  • Beamforming Optimization via Max-Min SINR in MU-MISO SWIPT Systems under Bounded Channel Uncertainty

    Zhengyu ZHU  Zhongyong WANG  Zheng CHU  Di ZHANG  

     
    LETTER-Digital Signal Processing

      Vol:
    E99-A No:12
      Page(s):
    2576-2580

    In this letter, we consider robust beamforming optimization for a multiuser multiple-input single-output system with simultaneous wireless information and power transmission (SWIPT) for the case of imperfect channel state information. Adopting the ellipsoidal uncertainty on channel vector, the robust beamforming design are reformulated as convex semi-definite programming (SDP) by rank-one relaxation. To reduce the complexity, an ellipsoidal uncertainty on channel covariance is studied to derive the equivalent form of original problem. Simulation results are provided to demonstrate the effectiveness of the proposed schemes.

  • On the Topological Entropy of the Discretized Markov β-Transformations

    Hiroshi FUJISAKI  

     
    PAPER-Fundamentals of Information Theory

      Vol:
    E99-A No:12
      Page(s):
    2238-2247

    We define the topological entropy of the discretized Markov transformations. Previously, we obtained the topological entropy of the discretized dyadic transformation. In this research, we obtain the topological entropy of the discretized golden mean transformation. We also generalize this result and give the topological entropy of the discretized Markov β-transformations with the alphabet Σ={0,1,…,k-1} and the set F={(k-1)c,…,(k-1)(k-1)}(1≤c≤k-1) of (k-c) forbidden blocks, whose underlying transformations exhibit a wide class of greedy β-expansions of real numbers.

  • Cache-Aware GPU Optimization for Out-of-Core Cone Beam CT Reconstruction of High-Resolution Volumes

    Yuechao LU  Fumihiko INO  Kenichi HAGIHARA  

     
    PAPER-Computer System

      Pubricized:
    2016/09/05
      Vol:
    E99-D No:12
      Page(s):
    3060-3071

    This paper proposes a cache-aware optimization method to accelerate out-of-core cone beam computed tomography reconstruction on a graphics processing unit (GPU) device. Our proposed method extends a previous method by increasing the cache hit rate so as to speed up the reconstruction of high-resolution volumes that exceed the capacity of device memory. More specifically, our approach accelerates the well-known Feldkamp-Davis-Kress algorithm by utilizing the following three strategies: (1) a loop organization strategy that identifies the best tradeoff point between the cache hit rate and the number of off-chip memory accesses; (2) a data structure that exploits high locality within a layered texture; and (3) a fully pipelined strategy for hiding file input/output (I/O) time with GPU execution and data transfer times. We implement our proposed method on NVIDIA's latest Maxwell architecture and provide tuning guidelines for adjusting the execution parameters, which include the granularity and shape of thread blocks as well as the granularity of I/O data to be streamed through the pipeline, which maximizes reconstruction performance. Our experimental results show that it took less than three minutes to reconstruct a 20483-voxel volume from 1200 20482-pixel projection images on a single GPU; this translates to a speedup of approximately 1.47 as compared to the previous method.

  • A Peer-to-Peer Content-Distribution Scheme Resilient to Key Leakage

    Tatsuyuki MATSUSHITA  Shinji YAMANAKA  Fangming ZHAO  

     
    PAPER-Distributed system

      Pubricized:
    2016/08/25
      Vol:
    E99-D No:12
      Page(s):
    2956-2967

    Peer-to-peer (P2P) networks have attracted increasing attention in the distribution of large-volume and frequently accessed content. In this paper, we mainly consider the problem of key leakage in secure P2P content distribution. In secure content distribution, content is encrypted so that only legitimate users can access the content. Usually, users (peers) cannot be fully trusted in a P2P network because malicious ones might leak their decryption keys. If the redistribution of decryption keys occurs, copyright holders may incur great losses caused by free riders who access content without purchasing it. To decrease the damage caused by the key leakage, the individualization of encrypted content is necessary. The individualization means that a different (set of) decryption key(s) is required for each user to access content. In this paper, we propose a P2P content distribution scheme resilient to the key leakage that achieves the individualization of encrypted content. We show the feasibility of our scheme by conducting a large-scale P2P experiment in a real network.

  • Fast Live Migration for IO-Intensive VMs with Parallel and Adaptive Transfer of Page Cache via SAN

    Soramichi AKIYAMA  Takahiro HIROFUCHI  Ryousei TAKANO  Shinichi HONIDEN  

     
    PAPER-Operating system

      Pubricized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    3024-3034

    Live migration plays an important role on improving efficiency of cloud data centers by enabling dynamically replacing virtual machines (VMs) without disrupting services running on them. Although many studies have proposed acceleration mechanisms of live migration, IO-intensive VMs still suffer from long total migration time due to a large amount of page cache. Existing studies for this problem either force the guest OS to delete the page cache before a migration, or they do not consider dynamic characteristics of cloud data centers. We propose a parallel and adaptive transfer of page cache for migrating IO-intensive VMs which (1) does not delete the page cache and is still fast by utilizing the storage area network of a data center, and (2) achieves the shortest total migration time without tuning hand-crafted parameters. Experiments showed that our method reduces total migration time of IO-intensive VMs up to 33.9%.

  • A Runtime Optimization Selection Framework to Realize Energy Efficient Networks-on-Chip

    Yuan HE  Masaaki KONDO  Takashi NAKADA  Hiroshi SASAKI  Shinobu MIWA  Hiroshi NAKAMURA  

     
    PAPER-Architecture

      Pubricized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    2881-2890

    Networks-on-Chip (or NoCs, for short) play important roles in modern and future multi-core processors as they are highly related to both performance and power consumption of the entire chip. Up to date, many optimization techniques have been developed to improve NoC's bandwidth, latency and power consumption. But a clear answer to how energy efficiency is affected with these optimization techniques is yet to be found since each of these optimization techniques comes with its own benefits and overheads while there are also too many of them. Thus, here comes the problem of when and how such optimization techniques should be applied. In order to solve this problem, we build a runtime framework to throttle these optimization techniques based on concise performance and energy models. With the help of this framework, we can successfully establish adaptive selections over multiple optimization techniques to further improve performance or energy efficiency of the network at runtime.

  • Montgomery Multiplier Design for ECDSA Signature Generation Processor

    Masato TAMURA  Makoto IKEDA  

     
    PAPER

      Vol:
    E99-A No:12
      Page(s):
    2444-2452

    This paper presents the optimal implementation methods for 256-bit elliptic curve digital signature algorithm (ECDSA) signature generation processors with high speed Montgomery multipliers. We have explored the radix of the data path of the Montgomery multiplier from 2-bit to 256-bit operation and proposed the use of pipelined Montgomery multipliers for signature generation speed, area, and energy optimization. The key factor in the design optimization is how to perform modular multiplication. The high radix Montgomery multiplier is known to be an efficient implementation for high-speed modular multiplication. We have implemented ECDSA signature generation processors with high radix Montgomery multipliers using 65-nm SOTB CMOS technology. Post-layout results show that the fastest ECDSA signature generation time of 63.5µs with radix-256-bit, a two-module four-streams pipeline architecture, and an area of 0.365mm2 (which is the smallest) with a radix-16-bit zero-pipeline architecture, and the smallest signature generation energy of 9.51µJ with radix-256-bit zero-pipeline architecture.

  • Comparison of Two Signature Schemes Based on the MQ Problem and Quartz

    Routo TERADA  Ewerton R. ANDRADE  

     
    PAPER-Cryptography and Information Security

      Vol:
    E99-A No:12
      Page(s):
    2527-2538

    Patarin proposed a crytographic trapdoor called Hidden Field Equation (HFE), a trapdoor based on the Multivariate Quadratic (MQ) and the Isomorphism of Polynomials (IP) problems. The MQ problem was proved by Patarin et al.'s to be NP-complete. Although the basic HFE has been proved to be vulnerable to attacks, its variants obtained by some modifications have been proved to be stronger against attacks. The Quartz digital signature scheme based on the HFEv- trapdoor (a variant of HFE) with particular choices of parameters, has been shown to be stronger against algebraic attacks to recover the private key. Furthermore, it generates reasonably short signatures. However, Joux et al. proved (based on the Birthday Paradox Attack) that Quartz is malleable in the sense that, if an adversary gets a valid pair of message and signature, a valid signature to another related message is obtainable with 250 computations and 250 queries to the signing oracle. Currently, the recommended minimum security level is 2112. Our signature scheme is also based on Quartz but we achieve a 2112 security level against Joux et al.'s attack. It is also more efficient in signature verification and vector initializations. Furthermore, we implemented both the original and our improved Quartz signature and run empirical comparisons.

  • Equivalent Circuit Modeling of a Semiconductor-Integrated Bow-Tie Antenna for the Physical Interpretation of the Radiation Characteristics in the Terahertz Region

    Hirokazu YAMAKURA  Michihiko SUHARA  

     
    PAPER-Semiconductor Materials and Devices

      Vol:
    E99-C No:12
      Page(s):
    1312-1322

    We have derived the physics-based equivalent circuit model of a semiconductor-integrated bow-tie antenna (BTA) for expressing its impedance and radiation characteristics as a terahertz transmitter. The equivalent circuit branches and components, consisting of 16 RLC parameters are determined based on electromagnetic simulations. All the values of the circuit elements are identified using the particle swarm optimization (PSO) that is one of the modern multi-purpose optimization methods. Moreover, each element value can also be explained by the structure of the semiconductor-integrated BTA, the device size, and the material parameters.

  • Performance Optimization of Light-Field Applications on GPU

    Yuttakon YUTTAKONKIT  Shinya TAKAMAEDA-YAMAZAKI  Yasuhiko NAKASHIMA  

     
    PAPER-Computer System

      Pubricized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    3072-3081

    Light-field image processing has been widely employed in many areas, from mobile devices to manufacturing applications. The fundamental process to extract the usable information requires significant computation with high-resolution raw image data. A graphics processing unit (GPU) is used to exploit the data parallelism as in general image processing applications. However, the sparse memory access pattern of the applications reduced the performance of GPU devices for both systematic and algorithmic reasons. Thus, we propose an optimization technique which redesigns the memory access pattern of the applications to alleviate the memory bottleneck of rendering application and to increase the data reusability for depth extraction application. We evaluated our optimized implementations with the state-of-the-art algorithm implementations on several GPUs where all implementations were optimally configured for each specific device. Our proposed optimization increased the performance of rendering application on GTX-780 GPU by 30% and depth extraction application on GTX-780 and GTX-980 GPUs by 82% and 18%, respectively, compared with the original implementations.

  • RFS: An LSM-Tree-Based File System for Enhanced Microdata Performance

    Lixin WANG  Yutong LU  Wei ZHANG  Yan LEI  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2016/09/06
      Vol:
    E99-D No:12
      Page(s):
    3035-3046

    File system workloads are increasing write-heavy. The growing capacity of RAM in modern nodes allows many reads to be satisfied from memory while writes must be persisted to disk. Today's sophisticated local file systems like Ext4, XFS and Btrfs optimize for reads but suffer from workloads dominated by microdata (including metadata and tiny files). In this paper we present an LSM-tree-based file system, RFS, which aims to take advantages of the write optimization of LSM-tree to provide enhanced microdata performance, while offering matching performance for large files. RFS incrementally partitions the namespace into several metadata columns on a per-directory basis, preserving disk locality for directories and reducing the write amplification of LSM-trees. A write-ordered log-structured layout is used to store small files efficiently, rather than embedding the contents of small files into inodes. We also propose an optimization of global bloom filters for efficient point lookups. Experiments show our library version of RFS can handle microwrite-intensive workloads 2-10 times faster than existing solutions such as Ext4, Btrfs and XFS.

  • An Efficient Algorithm of Discrete Particle Swarm Optimization for Multi-Objective Task Assignment

    Nannan QIAO  Jiali YOU  Yiqiang SHENG  Jinlin WANG  Haojiang DENG  

     
    PAPER-Distributed system

      Pubricized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    2968-2977

    In this paper, a discrete particle swarm optimization method is proposed to solve the multi-objective task assignment problem in distributed environment. The objectives of optimization include the makespan for task execution and the budget caused by resource occupation. A two-stage approach is designed as follows. In the first stage, several artificial particles are added into the initialized swarm to guide the search direction. In the second stage, we redefine the operators of the discrete PSO to implement addition, subtraction and multiplication. Besides, a fuzzy-cost-based elite selection is used to improve the computational efficiency. Evaluation shows that the proposed algorithm achieves Pareto improvement in comparison to the state-of-the-art algorithms.

  • A Low-Power VLSI Architecture for HEVC De-Quantization and Inverse Transform

    Heming SUN  Dajiang ZHOU  Shuping ZHANG  Shinji KIMURA  

     
    PAPER

      Vol:
    E99-A No:12
      Page(s):
    2375-2387

    In this paper, we present a low-power system for the de-quantization and inverse transform of HEVC. Firstly, we present a low-delay circuit to process the coded results of the syntax elements, and then reduce the number of multipliers from 16 to 4 for the de-quantization process of each 4x4 block. Secondly, we give two efficient data mapping schemes for the memory between de-quantization and inverse transform, and the memory for transpose. Thirdly, the zero information is utilized through the whole system. For two memory parts, the write and read operation of zero blocks/ rows/ coefficients can all be skipped to save the power consumption. The results show that up to 86% power consumption can be saved for the memory part under the configuration of “Random-access” and common QPs. For the logical part, the proposed architecture for de-quantization can reduce 77% area consumption. Overall, our system can support real-time coding for 8K x 4K 120fps video sequences and the normalized area consumption can be reduced by 68% compared with the latest work.

  • Bi-Direction Interaural Matching Filter and Decision Weighting Fusion for Sound Source Localization in Noisy Environments

    Hong LIU  Mengdi YUE  Jie ZHANG  

     
    LETTER-Speech and Hearing

      Pubricized:
    2016/09/12
      Vol:
    E99-D No:12
      Page(s):
    3192-3196

    Sound source localization is an essential technique in many applications, e.g., speech enhancement, speech capturing and human-robot interaction. However, the performance of traditional methods degrades in noisy or reverberant environments, and it is sensitive to the spatial location of sound source. To solve these problems, we propose a sound source localization framework based on bi-direction interaural matching filter (IMF) and decision weighting fusion. Firstly, bi-directional IMF is put forward to describe the difference between binaural signals in forward and backward directions, respectively. Then, a hybrid interaural matching filter (HIMF), which is obtained by the bi-direction IMF through decision weighting fusion, is used to alleviate the affection of sound locations on sound source localization. Finally, the cosine similarity between the HIMFs computed from the binaural audio and transfer functions is employed to measure the probability of the source location. Constructing the similarity for all the spatial directions as a matrix, we can determine the source location by Maximum A Posteriori (MAP) estimation. Compared with several state-of-the-art methods, experimental results indicate that HIMF is more robust in noisy environments.

  • Cognition-Aware Summarization of Photos Representing Events

    Bei LIU  Makoto P. KATO  Katsumi TANAKA  

     
    PAPER-Image Processing and Video Processing

      Pubricized:
    2016/09/01
      Vol:
    E99-D No:12
      Page(s):
    3140-3153

    The use of photo summarization technology to summarize a photo collection is often oriented to users who own the photo collection. However, people's interest in sharing photos with others highlights the importance of cognition-aware summarization of photos by which viewers can easily recognize the exact event those photos represent. In this research, we address the problem of cognition-aware summarization of photos representing events, and propose to solve this problem and to improve the perceptual quality of a photo set by proactively preventing misrecognization that a photo set might bring. Three types of neighbor events that can possibly cause misrecognizations are discussed in this paper, namely sub-events, super-events and sibling-events. We analyze the reasons for these misrecognitions and then propose three criteria to prevent from them. A combination of the criteria is used to generate summarization of photos that can represent an event with several photos. Our approach was empirically demonstrated with photos from Flickr by utilizing their visual features and related tags. The results indicated the effectiveness of our proposed methods in comparison with a baseline method.

  • Efficient Multiplication Based on Dickson Bases over Any Finite Fields

    Sun-Mi PARK  Ku-Young CHANG  Dowon HONG  Changho SEO  

     
    PAPER-Algorithms and Data Structures

      Vol:
    E99-A No:11
      Page(s):
    2060-2074

    We propose subquadratic space complexity multipliers for any finite field $mathbb{F}_{q^n}$ over the base field $mathbb{F}_q$ using the Dickson basis, where q is a prime power. It is shown that a field multiplication in $mathbb{F}_{q^n}$ based on the Dickson basis results in computations of Toeplitz matrix vector products (TMVPs). Therefore, an efficient computation of a TMVP yields an efficient multiplier. In order to derive efficient $mathbb{F}_{q^n}$ multipliers, we develop computational schemes for a TMVP over $mathbb{F}_{q}$. As a result, the $mathbb{F}_{2^n}$ multipliers, as special cases of the proposed $mathbb{F}_{q^n}$ multipliers, have lower time complexities as well as space complexities compared with existing results. For example, in the case that n is a power of 3, the proposed $mathbb{F}_{2^n}$ multiplier for an irreducible Dickson trinomial has about 14% reduced space complexity and lower time complexity compared with the best known results.

  • Improved Half-Pixel Generation for Image Enlargement

    Keita KOBAYASHI  Hiroyuki TSUJI  Tomoaki KIMURA  

     
    LETTER-Image

      Vol:
    E99-A No:11
      Page(s):
    2012-2015

    In this paper, we propose a digital image enlargement method based on a fuzzy technique that improves half-pixel generation, especially for convex and concave signals. The proposed method is a modified version of the image enlargement scheme previously proposed by the authors, which achieves accurate half-pixel interpolation and enlarges the original image by convolution with the Lanczos function. However, the method causes impulse-like artifacts in the enlarged image. In this paper, therefore, we introduce a fuzzy set and fuzzy rule for generating half-pixels to improve the interpolation of convex and concave signals. Experimental results demonstrate that, in terms of image quality, the proposed method shows superior performance compared to bicubic interpolation and our previous method.

  • Nonlinear Acoustic Echo Cancellation by Exact-Online Adaptive Alternating Minimization

    Hiroki KURODA  Masao YAMAGISHI  Isao YAMADA  

     
    PAPER-Digital Signal Processing

      Vol:
    E99-A No:11
      Page(s):
    2027-2036

    For the nonlinear acoustic echo cancellation, we present an algorithm to estimate the threshold of the clipping effect and the room impulse response vector by suppressing their time-varying cost function. A common way to suppress the time-varying cost function of a pair of parameters is to alternatingly minimize the function with respect to each parameter while keeping the other fixed, which we refer to as adaptive alternating minimization. However, since the cost function for the threshold is nonconvex, the conventional methods approximate the exact minimizations by gradient descent updates, which causes serious degradation of the estimation accuracy in some occasions. In this paper, by exploring the fact that the cost function for the threshold becomes piecewise quadratic, we propose to exactly minimize the cost function for the threshold in a closed form while suppressing the cost function for the impulse response vector in an online manner, which we call exact-online adaptive alternating minimization. The proposed method is expected to approximate more efficiently the adaptive alternating minimization strategy than the conventional methods. Numerical experiments demonstrate the efficacy of the proposed method.

1141-1160hit(5900hit)