The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] EDR(34hit)

1-20hit(34hit)

  • A Class of Left Dihedral Codes Over Rings $mathbb{F}_q+umathbb{F}_q$

    Yuan CAO  Yonglin CAO  Jian GAO  

     
    PAPER-Coding Theory and Techniques

      Vol:
    E100-A No:12
      Page(s):
    2585-2593

    Let $mathbb{F}_q$ be a finite field of q elements, $R=mathbb{F}_q+umathbb{F}_q$ (u2=0) and D2n= be a dihedral group of order n. Left ideals of the group ring R[D2n] are called left dihedral codes over R of length 2n, and abbreviated as left D2n-codes over R. Let n be a positive factor of qe+1 for some positive integer e. In this paper, any left D2n-code over R is uniquely decomposed into a direct sum of concatenated codes with inner codes Ai and outer codes Ci, where Ai is a cyclic code over R of length n and Ci is a linear code of length 2 over a Galois extension ring of R. More precisely, a generator matrix for each outer code Ci is given. Moreover, a formula to count the number of these codes is obtained, the dual code for each left D2n-code is determined and all self-dual left D2n-codes over R are presented, respectively.

  • Mapping Multi-Level Loop Nests onto CGRAs Using Polyhedral Optimizations

    Dajiang LIU  Shouyi YIN  Leibo LIU  Shaojun WEI  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1419-1430

    The coarse-grained reconfigurable architecture (CGRA) is a promising computing platform that provides both high performance and high power-efficiency. The computation-intensive portions of an application (e.g. loop nests) are often mapped onto CGRA for acceleration. However, mapping loop nests onto CGRA efficiently is quite a challenge due to the special characteristics of CGRA. To optimize the mapping of loop nests onto CGRA, this paper makes three contributions: i) Establishing a precise performance model of mapping loop nests onto CGRA, ii) Formulating the loop nests mapping as a nonlinear optimization problem based on polyhedral model, iii) Extracting an efficient heuristic algorithm and building a complete flow of mapping loop nests onto CGRA (PolyMAP). Experiment results on most kernels of the PolyBench and real-life applications show that our proposed approach can improve the performance of the kernels by 27% on average, as compared to the state-of-the-art methods. The runtime complexity of our approach is also acceptable.

  • Battery-Aware Loop Nests Mapping for CGRAs

    Yu PENG  Shouyi YIN  Leibo LIU  Shaojun WEI  

     
    PAPER-Architecture

      Vol:
    E98-D No:2
      Page(s):
    230-242

    Coarse-grained Reconfigurable Architecture (CGRA) is a promising mobile computing platform that provides both high performance and high energy efficiency. In an application, loop nests are usually mapped onto CGRA for further acceleration, so optimizing the mapping is an important goal for design of CGRAs. Moreover, obviously almost all of mobile devices are powered by batteries, how to reduce energy consumption also becomes one of primary concerns in using CGRAs. This paper makes three contributions: a) Proposing an energy consumption model for CGRA; b) Formulating loop nests mapping problem to minimize the battery charge loss; c) Extract an efficient heuristic algorithm called BPMap. Experiment results on most kernels of the benchmarks and real-life applications show that our methods can improve the performance of the kernels and lower the energy consumption.

  • Nested Loop Parallelization Using Polyhedral Optimization in High-Level Synthesis

    Akihiro SUDA  Hideki TAKASE  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E97-A No:12
      Page(s):
    2498-2506

    We propose a synthesis method of nested loops into parallelized circuits by integrating the polyhedral optimization, which is a state-of-the-art technique in the field of software, into high-level synthesis. Our method constructs circuits equipped with multiple processing elements (PEs), using information generated by the polyhedral optimizing compiler. Since multiple PEs cannot concurrently access the off-chip RAM, a method for constructing on-chip buffers is also proposed. Our buffering method reduces the off-chip RAM access conflicts and further enables burst accesses and data reuses. In our experimental result, the buffered circuits generated by our method are 8.2 times on average and 26.5 times at maximum faster than the sequential non-buffered ones, when each of the parallelized circuits is configured with eight PEs.

  • High-Throughput Partially Parallel Inter-Chip Link Architecture for Asynchronous Multi-Chip NoCs

    Naoya ONIZAWA  Akira MOCHIZUKI  Hirokatsu SHIRAHAMA  Masashi IMAI  Tomohiro YONEDA  Takahiro HANYU  

     
    PAPER-Dependable Computing

      Vol:
    E97-D No:6
      Page(s):
    1546-1556

    This paper introduces a partially parallel inter-chip link architecture for asynchronous multi-chip Network-on-Chips (NoCs). The multi-chip NoCs that operate as a large NoC have been recently proposed for very large systems, such as automotive applications. Inter-chip links are key elements to realize high-performance multi-chip NoCs using a limited number of I/Os. The proposed asynchronous link based on level-encoded dual-rail (LEDR) encoding transmits several bits in parallel that are received by detecting the phase information of the LEDR signals at each serial link. It employs a burst-mode data transmission that eliminates a per-bit handshake for a high-speed operation, but the elimination may cause data-transmission errors due to cross-talk and power-supply noises. For triggering data retransmission, errors are detected from the embedded phase information; error-detection codes are not used. The throughput is theoretically modelled and is optimized by considering the bit-error rate (BER) of the link. Using delay parameters estimated for a 0.13 µm CMOS technology, the throughput of 8.82 Gbps is achieved by using 10 I/Os, which is 90.5% higher than that of a link using 9 I/Os without an error-detection method operating under negligible low BER (<10-20).

  • Effective Frame Selection for Blind Source Separation Based on Frequency Domain Independent Component Analysis

    Yusuke MIZUNO  Kazunobu KONDO  Takanori NISHINO  Norihide KITAOKA  Kazuya TAKEDA  

     
    PAPER-Engineering Acoustics

      Vol:
    E97-A No:3
      Page(s):
    784-791

    Blind source separation is a technique that can separate sound sources without such information as source location, the number of sources, and the utterance content. Multi-channel source separation using many microphones separates signals with high accuracy, even if there are many sources. However, these methods have extremely high computational complexity, which must be reduced. In this paper, we propose a computational complexity reduction method for blind source separation based on frequency domain independent component analysis (FDICA) and examine temporal data that are effective for source separation. A frame with many sound sources is effective for FDICA source separation. We assume that a frame with a low kurtosis has many sound sources and preferentially select such frames. In our proposed method, we used the log power spectrum and the kurtosis of the magnitude distribution of the observed data as selection criteria and conducted source separation experiments using speech signals from twelve speakers. We evaluated the separation performances by the signal-to-interference ratio (SIR) improvement score. From our results, the SIR improvement score was 24.3dB when all the frames were used, and 23.3dB when the 300 frames selected by our criteria were used. These results clarified that our proposed selection criteria based on kurtosis and magnitude is effective. Furthermore, we significantly reduced the computational complexity because it is proportional to the number of selected frames.

  • Affine Transformations for Communication and Reconfiguration Optimization of Mapping Loop Nests on CGRAs

    Shouyi YIN  Dajiang LIU  Leibo LIU  Shaojun WEI  

     
    PAPER-Design Methodology

      Vol:
    E96-D No:8
      Page(s):
    1582-1591

    A coarse-grained reconfigurable architecture (CGRA) is typically hybrid architecture, which is composed of a reconfigurable processing unit (RPU) and a host microprocessor. Many computation-intensive kernels (e.g., loop nests) are often mapped onto RPUs to speed up the execution of programs. Thus, mapping optimization of loop nests is very important to improve the performance of CGRA. Processing element (PE) utilization rate, communication volume and reconfiguration cost are three crucial factors for the performance of RPUs. Loop transformations can affect these three performance influencing factors greatly, and would be of much significance when mapping loops onto RPUs. In this paper, a joint loop transformation approach for RPUs is proposed, where the PE utilization rate, communication cost and reconfiguration cost are under a joint consideration. Our approach could be integrated into compilers for CGRAs to improve the operating performance. Compared with the communication-minimal approach, experimental results show that our scheme can improve 5.8% and 13.6% of execution time on motion estimation (ME) and partial differential equation (PDE) solvers kernels, respectively. Also, run-time complexity is acceptable for the practical cases.

  • Mapping Optimization of Affine Loop Nests for Reconfigurable Computing Architecture

    Dajiang LIU  Shouyi YIN  Chongyong YIN  Leibo LIU  Shaojun WEI  

     
    PAPER-Computer Architecture

      Vol:
    E95-D No:12
      Page(s):
    2898-2907

    Reconfigurable computing system is a class of parallel architecture with the ability of computing in hardware to increase performance, while remaining much of flexibility of a software solution. This architecture is particularly suitable for running regular and compute-intensive tasks, nevertheless, most compute-intensive tasks spend most of their running time in nested loops. Polyhedron model is a powerful tool to give a reasonable transformation on such nested loops. In this paper, a number of issues are addressed towards the goal of optimization of affine loop nests for reconfigurable cell array (RCA), such as approach to make the most use of processing elements (PE) while minimizing the communication volume by loop transformation in polyhedron model, determination of tilling form by the intra-statement dependence analysis and determination of tilling size by the tilling form and the RCA size. Experimental results on a number of kernels demonstrate the effectiveness of the mapping optimization approaches developed. Compared with DFG-based optimization approach, the execution performances of 1-d jacobi and matrix multiplication are improved by 28% and 48.47%. Lastly, the run-time complexity is acceptable for the practical cases.

  • Finding the Minimum Number of Face Guards is NP-Hard

    Chuzo IWAMOTO  Yusuke KITAGAKI  Kenichi MORITA  

     
    LETTER-Fundamentals of Information Systems

      Vol:
    E95-D No:11
      Page(s):
    2716-2719

    We study the complexity of finding the minimum number of face guards which can observe the whole surface of a polyhedral terrain. Here, a face guard is allowed to be placed on the faces of a terrain, and the guard can walk around on the allocated face. It is shown that finding the minimum number of face guards is NP-hard.

  • Blind Source Separation Using Dodecahedral Microphone Array under Reverberant Conditions

    Motoki OGASAWARA  Takanori NISHINO  Kazuya TAKEDA  

     
    PAPER-Engineering Acoustics

      Vol:
    E94-A No:3
      Page(s):
    897-906

    The separation and localization of sound source signals are important techniques for many applications, such as highly realistic communication and speech recognition systems. These systems are expected to work without such prior information as the number of sound sources and the environmental conditions. In this paper, we developed a dodecahedral microphone array and proposed a novel separation method with our developed device. This method refers to human sound localization cues and uses acoustical characteristics obtained by the shape of the dodecahedral microphone array. Moreover, this method includes an estimation method of the number of sound sources that can operate without prior information. The sound source separation performances were evaluated under simulated and actual reverberant conditions, and the results were compared with the conventional method. The experimental results showed that our separation performance outperformed the conventional method.

  • A Low Power SOC Architecture for the V2.0+EDR Bluetooth Using a Unified Verification Platform

    Jeonghun KIM  Suki KIM  Kwang-Hyun BAEK  

     
    PAPER-Computer System

      Vol:
    E93-D No:9
      Page(s):
    2500-2508

    This paper presents a low-power System on Chip (SOC) architecture for the v2.0+EDR (Enhanced Data Rate) Bluetooth and its applications. Our design includes a link controller, modem, RF transceiver, Sub-Band Codec (SBC), Expanded Instruction Set Computer (ESIC) processor, and peripherals. To decrease power consumption of the proposed SOC, we reduce data transfer using a dual-port memory, including a power management unit, and a clock gated approach. We also address some of issues and benefits of reusable and unified environment on a centralized data structure and SOC verification platform. This includes flexibility in meeting the final requirements using technology-independent tools wherever possible in various processes and for projects. The other aims of this work are to minimize design efforts by avoiding the same work done twice by different people and to reuse the similar environment and platform for different projects. This chip occupies a die size of 30 mm2 in 0.18 µm CMOS, and the worst-case current of the total chip is 54 mA.

  • An Asynchronous FPGA Based on LEDR/4-Phase-Dual-Rail Hybrid Architecture

    Shota ISHIHARA  Yoshiya KOMATSU  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-Electronic Circuits

      Vol:
    E93-C No:8
      Page(s):
    1338-1348

    This paper presents an asynchronous FPGA that combines 4-phase dual-rail encoding and LEDR (Level-Encoded Dual-Rail) encoding. 4-phase dual-rail encoding is employed to achieve small area and low power for function units, while LEDR encoding is employed to achieve high throughput and low power for the data transfer using programmable interconnection resources. Area-efficient protocol converters and their control circuits are also proposed in transistor-level implementation. The proposed FPGA is designed using the e-Shuttle 65nm CMOS process. Compared to the 4-phase-dual-rail-based FPGA, the throughput is increased by 69% with almost the same transistor count. Compared to the LEDR-based FPGA, the transistor count is reduced by 47% with almost the same throughput. In terms of power consumption, the proposed FPGA achieves the lowest power compared to the 4-phase-dual-rail-based and the LEDR-based FPGAs. Compared to the synchronous FPGA, the proposed FPGA has lower power consumption when the workload is below 35%.

  • Evaluation of Trihedral Corner Reflector for SAR Polarimetric Calibration

    Shunichi KUSANO  Motoyuki SATO  

     
    LETTER

      Vol:
    E92-C No:1
      Page(s):
    112-115

    A trihedral corner reflector is often used for SAR polarimetric calibration. However, the scattering property of the reflector used for the calibration may not be correct if the high frequency approximation is not satisfied or if an incident angle deviates from the symmetric axis of the reflector. In order to know the conditions for precise SAR polarimetric calibration, we evaluated the polarimetric response of the reflector by a numerical simulation using the method of moment (MoM). It is found that allowable incident angle deviation is 5 degree to azimuth direction and 4 degree to elevation direction for precise SAR polarimetric calibration when the size of the reflector is 7.5 times larger than the wavelength of an incident wave.

  • Assisting Pictogram Selection with Categorized Semantics

    Heeryon CHO  Toru ISHIDA  Satoshi OYAMA  Rieko INABA  Toshiyuki TAKASAKI  

     
    PAPER-Knowledge Applications and Intelligent User Interfaces

      Vol:
    E91-D No:11
      Page(s):
    2638-2646

    Since participants at both end of the communication channel must share common pictogram interpretation to communicate, the pictogram selection task must consider both participants' pictogram interpretations. Pictogram interpretation, however, can be ambiguous. To assist the selection of pictograms more likely to be interpreted as intended, we propose a categorical semantic relevance measure which calculates how relevant a pictogram is to a given interpretation in terms of a given category. The proposed measure defines similarity measurement and probability of interpretation words using pictogram interpretations and frequencies gathered from a web survey. Moreover, the proposed measure is applied to categorized pictogram interpretations to enhance pictogram retrieval performance. Five pictogram categories used for categorizing pictogram interpretations are defined based on the five first-level classifications defined in the Concept Dictionary of the EDR Electronic Dictionary. Retrieval performances among not-categorized interpretations, categorized interpretations, and categorized and weighted interpretations using semantic relevance measure were compared, and the categorized semantic relevance approaches showed more stable performances than the not-categorized approach.

  • Evaluation of a Field-Programmable VLSI Based on an Asynchronous Bit-Serial Architecture

    Masanori HARIYAMA  Shota ISHIHARA  Michitaka KAMEYAMA  

     
    PAPER

      Vol:
    E91-C No:9
      Page(s):
    1419-1426

    This paper presents a novel asynchronous architecture of Field-programmable gate arrays (FPGAs) to reduce the power consumption. In the dynamic power consumption of the conventional FPGAs, the power consumed by the switch blocks and clock distribution is dominant since FPGAs have complex switch blocks and the large number of registers for high programmability. To reduce the power consumption of switch blocks and clock distribution, asynchronous bit-serial architecture is proposed. To ensure the correct operation independent of data-path lengths, we use the level-encoded dual-rail encoding and propose its area-efficient implementation. The proposed field-programmable VLSI is implemented in a 90 nm CMOS technology. The delay and the power consumption of the proposed FPVLSI are respectively 61% and 58% of those of 4-phase dual-rail encoding which is the most common encoding in delay insensitive encoding.

  • Dihedral Butterfly Digraph and Its Cayley Graph Representation

    Haruaki ONISHI  Yuuki TANAKA  Yukio SHIBATA  

     
    PAPER-Graphs and Networks

      Vol:
    E91-A No:2
      Page(s):
    613-622

    In this paper, we present a new extension of the butterfly digraph, which is known as one of the topologies used for interconnection networks. The butterfly digraph was previously generalized from binary to d-ary. We define a new digraph by adding a signed label to each vertex of the d-ary butterfly digraph. We call this digraph the dihedral butterfly digraph and study its properties. Furthermore, we show that this digraph can be represented as a Cayley graph. It is well known that a butterfly digraph can be represented as a Cayley graph on the wreath product of two cyclic groups [1]. We prove that a dihedral butterfly digraph can be represented as a Cayley graph in two ways.

  • Memory Size Computation for Real-Time Multimedia Applications Based on Polyhedral Decomposition

    Hongwei ZHU  Ilie I. LUICAN  Florin BALASA  

     
    PAPER-System Level Design

      Vol:
    E89-A No:12
      Page(s):
    3378-3386

    In real-time multimedia processing systems a very large part of the power consumption is due to the data storage and data transfer. Moreover, the area cost is often largely dominated by the memory modules. In deriving an optimized (for area and/or power) memory architecture, memory size computation is an important step in the exploration of the possible algorithmic specifications of multimedia applications. This paper presents a novel non-scalar approach for computing exactly the memory size in real-time multimedia algorithms. This methodology uses both algebraic techniques specific to the data-flow analysis used in modern compilers and, also, more recent advances in the theory of polyhedra. In contrast with all the previous works which are only estimation methods, this approach performs exact memory computations even for applications significantly large in terms of the code size, number of scalars, and number of array references.

  • Polarimetric Scattering Analysis for a Finite Dihedral Corner Reflector

    Kei HAYASHI  Ryoichi SATO  Yoshio YAMAGUCHI  Hiroyoshi YAMADA  

     
    PAPER-Sensing

      Vol:
    E89-B No:1
      Page(s):
    191-195

    This paper examines polarimetric scattering characteristics caused by a dihedral corner reflector of finite size. The dihedral corner reflector is a basic model of double-bounce structure in urban area. The detailed scattering information serves the interpretation of Polarimetric Synthetic Aperture Radar (POLSAR) data analysis. The Finite-Difference Time-Domain (FDTD) method is utilized for the scattering calculation because of its simplicity and flexibility in the target shape modeling. This paper points out that there exists a stable double-bounce squint angle region both for perfect electric conductor (PEC) and dielectric corner reflectors. Beyond this stable squint angular region, the scattering characteristics become completely different from the assumed response. A criterion on the double-bounce scattering is proposed based on the physical optics (PO) approximation. The detailed analyses on the polarimetric index (co-polarization ratio) with respect to squint angle and an experimental result measured in an anechoic chamber are shown.

  • Optimizing a Triangular Mesh for Shape Reconstruction from Images

    Atsutada NAKATSUJI  Yasuyuki SUGAYA  Kenichi KANATANI  

     
    PAPER

      Vol:
    E88-D No:10
      Page(s):
    2269-2276

    In reconstructing 3-D from images based on feature points, one usually defines a triangular mesh that has these feature points as vertices and displays the scene as a polyhedron. If the scene itself is a polyhedron, however, some of the displayed edges may be inconsistent with the true shape. This paper presents a new technique for automatically eliminating such inconsistencies by using a special template. We also present a technique for removing spurious occluding edges. All the procedures do not require any thresholds to be adjusted. Using real images, we demonstrate that our method has high capability to correct inconsistencies.

  • Polarimetric Radar Calibration Method Using Polarization-Preserving and Polarization-Selective Reflectors

    Masaharu FUJITA  Chikage MURAKAMI  

     
    PAPER-Sensing

      Vol:
    E88-B No:8
      Page(s):
    3428-3435

    Polarimetric calibration of radar is indispensable for using radar data effectively. This paper proposes a polarimetric radar calibration algorithm using polarization-preserving and polarization-selective reflectors as reference targets. The algorithm assumes radar antenna reciprocity but allows different co-polarization transmission characteristics between horizontal and vertical polarization channels. In processing, the second order terms of small cross-talk factors in antenna polarization transfer characteristics are ignored. The major advantage of the present algorithm is that it does not need assumptions on the scattering characteristics of the background natural surface and is independent of external phase calibration. The results of error analysis show that the present algorithm has sufficient tolerance against errors of reference targets. The validity of the present algorithm was evaluated by analyzing the Spaceborne Imaging Radar C (SIR-C) data and the results were satisfactory.

1-20hit(34hit)