Author Search Result

[Author] Takao ONOYE(65hit)

1-20hit(65hit)

  • Quantitative Prediction of On-Chip Capacitive and Inductive Crosstalk Noise and Tradeoff between Wire Cross-Sectional Area and Inductive Crosstalk Effect

    Yasuhiro OGASAHARA  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER

      Vol:
    E90-A No:4
      Page(s):
    724-731

    Capacitive and inductive crosstalk noises are expected to be more serious in advanced technologies. However, capacitive and inductive crosstalk noises in the future have not been concurrently and sufficiently discussed quantitatively, though capacitive crosstalk noise has been intensively studied solely as a primary factor of interconnect delay variation. This paper quantitatively predicts the impact of capacitive and inductive crosstalk in prospective processes, and reveals that interconnect scaling strategies strongly affect relative dominance between capacitive and inductive coupling. Our prediction also makes the point that the interconnect resistance significantly influences both inductive coupling noise and propagation delay. We then evaluate a tradeoff between wire cross-sectional area and worst-case propagation delay focusing on inductive coupling noise, and show that an appropriate selection of wire cross-section can reduce delay uncertainty at the small sacrifice of propagation delay.

  • 3D Acoustic Image Localization Algorithm by Embedded DSP

    Wataru KOBAYASHI  Noriaki SAKAMOTO  Takao ONOYE  Isao SHIRAKAWA  

     
    PAPER

      Vol:
    E84-A No:6
      Page(s):
    1423-1430

    This paper describes a realtime 3D sound localization algorithm to be implemented with the use of a low power embedded DSP. A distinctive feature of this implementation approach is that the audible frequency band is divided into three, in accordance with the analysis of the sound reflection and diffraction effects through different media from a certain sound source to human ears. In the low, intermediate, and high frequency subbands, different schemes of the 3D sound localization are devised by means of an IIR filter, parametric equalizers, and a comb filter, respectively, so as to be run realtime on a low power embedded DSP. This algorithm aims at providing a listener with the 3D sound effects through headphones at low cost and low power consumption.

  • A Low-Power DSP Core Architecture for Low Bitrate Speech Codec

    Hiroyuki OKUHATA  Morgan H. MIKI  Takao ONOYE  Isao SHIRAKAWA  

     
    PAPER

      Vol:
    E81-A No:8
      Page(s):
    1616-1621

    A VLSI implementation of a low-power DSP core is described, which is dedicated to the G.723.1 low bitrate speech codec. A number of sophisticated DSP microarchitectures are devised mainly on dual multiply accumulators, rounding and saturation mechanisms, and two-banked on-chip memory. The main attempt is focused on lowering the clock frequency, and therefore on reducing the total power consumption, at the cost of a fairly small increase of chip area. The proposed DSP architecture has been integrated in the total area of 7.75 mm² by using a 0.35 µm CMOS technology, which can operate at 10 MHz with the dissipation of 44.9 mW from a single 3 V supply.

  • Stress Probability Computation for Estimating NBTI-Induced Delay Degradation

    Hiroaki KONOURA  Yukio MITSUYAMA  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E94-A No:12
      Page(s):
    2545-2553

    PMOS stress (ON) probability has a strong impact on circuit timing degradation due to NBTI effect. This paper evaluates how the granularity of stress probability calculation affects NBTI prediction using a state-of-the-art long term prediction model. Experimental evaluations show that the stress probability should be estimated at transistor level to accurately predict the increase in delay, especially when the circuit operation and/or inputs are highly biased. We then devise and evaluate two annotation methods of stress probability to gate-level timing analysis; one guarantees the pessimism desirable for timing analysis and the other aims to obtain the result close to transistor-level timing analysis. Experimental results show that gate-level timing analysis with transistor-level stress probability calculation estimates the increase in delay with 12.6% error.

  • Implementation of Multi-Agent Object Attention System Based on Biologically Inspired Attractor Selection

    Ryoji HASHIMOTO  Tomoya MATSUMURA  Yoshihiro NOZATO  Kenji WATANABE  Takao ONOYE  

     
    PAPER-Video Processing Systems

      Vol:
    E91-A No:10
      Page(s):
    2909-2917

    A multi-agent object attention system is proposed, which is based on a biologically inspired attractor selection model. Object attention is facilitated by using a video sequence and a depth map obtained through a compound-eye image sensor TOMBO. Robustness of the multi-agent system over environmental changes is enhanced by utilizing the biological model of adaptive response by attractor selection. To implement the proposed system, an efficient VLSI architecture is employed that reduces the enormous computational costs and memory accesses required for depth map processing and the multi-agent attractor selection process. According to the FPGA implementation result of the proposed object attention system, which is accomplished by using 7,063 slices, 640×512 pixel input images can be processed in real-time with three agents at a rate of 9 fps in 48 MHz operation.

  • Design of Realtime 3-D Sound Processing System

    Kosuke TSUJINO  Kazuhiko FURUYA  Wataru KOBAYASHI  Tomonori IZUMI  Takao ONOYE  Yukihiro NAKAMURA  

     
    PAPER

      Vol:
    E88-D No:5
      Page(s):
    954-962

    An interactive 3-D sound processing system and its implementation are described, which provide virtual auditory environments to listeners. While conventional 3-D sound processing systems require high performance workstations or large DSP arrays, the proposed system is reduced in hardware size for practical applications. The proposed system is implemented using a prevailing IBM-compatible PC and a single DSP. Since the organization of the proposed system is independent of implementation details such as operation precision and number of audio tracks, the proposed system can be ported to various hardware entities. In addition, an easy-to-use user interface is also implemented on PC software for realtime input of 3-D sound movement. Owing to these features, the presented system is valuable as a prototype for various implementations of 3-D sound processing systems, while the current implementation is useful as a 3-D sound content production system.

  • An Experimental Study on Body-Biasing Layout Style Focusing on Area Efficiency and Speed Controllability

    Koichi HAMAMOTO  Hiroshi FUKETA  Masanori HASHIMOTO  Yukio MITSUYAMA  Takao ONOYE  

     
    LETTER-Integrated Electronics

      Vol:
    E92-C No:2
      Page(s):
    281-285

    Body-biasing is expected to become a common design technique, so an area-efficient layout implementation is in demand. Body-biasing outside standard cells is one possible layout. However, in this case body-bias controllability, especially when forward bias is applied, is a concern. To investigate the controllability, we fabricated and measured a ring oscillator in a 90 nm technology. Our measurement result and evaluation of area efficiency reveal that body-biased circuits can be implemented with area overhead of less than 1% yet with sufficient speed controllability.

  • Performance Evaluation of Software-Based Error Detection Mechanisms for Supply Noise Induced Timing Errors

    Yutaka MASUDA  Takao ONOYE  Masanori HASHIMOTO  

     
    PAPER

      Vol:
    E100-A No:7
      Page(s):
    1452-1463

    Software-based error detection techniques, which include error detection mechanism (EDM) transformation, are used for error localization in post-silicon validation. This paper evaluates the performance of EDM for timing error localization with a noise-aware logic simulator and 65-nm test chips, assuming the following two EDM usage scenarios: (1) localizing a timing error that occurred in the original program, and (2) localizing as many potential timing errors as possible. Simulation results show that the EDM transformation customized for quick error detection cannot locate electrical timing errors in the original program in the first scenario, but it detects 86% of non-masked errors potential bugs in the second scenario, which means the EDM performance of detecting electrical timing errors affecting execution results is high. Hardware measurement results show that the EDM detects 25% of original timing errors and 56% of non-masked errors. Here, these hardware measurement results are not consistent with the simulation results. To investigate the reason, we focus on the following two differences between hardware and simulation: (1) design of the power distribution network, and (2) definition of timing error occurrence frequency. We update the simulation setup to fill the difference and re-execute the simulation. We confirm that the simulation and the chip measurement results are consistent.

  • Embedded System Implementation of Sound Localization in Proximal Region

    Nobuyuki IWANAGA  Tomoya MATSUMURA  Akihiro YOSHIDA  Wataru KOBAYASHI  Takao ONOYE  

     
    PAPER-Engineering Acoustics

      Vol:
    E91-A No:3
      Page(s):
    763-771

    A sound localization method in the proximal region is proposed, which is based on a low-cost 3D sound localization algorithm with the use of head-related transfer functions (HRTFs). The auditory parallax model is applied to the current algorithm so that more accurate HRTFs can be used for sound localization in the proximal region. In addition, head-shadowing effects based on rigid-sphere model are reproduced in the proximal region by means of a second-order IIR filter. A subjective listening test demonstrates the effectiveness of the proposed method. Embedded system implementation of the proposed method is also described claiming that the proposed method improves sound effects in the proximal region only with 5.1% increase of memory capacity and 8.3% of computational costs.

  • 3D Sound Rendering for Multiple Sound Sources Based on Fuzzy Clustering

    Masashi OKADA  Nobuyuki IWANAGA  Tomoya MATSUMURA  Takao ONOYE  Wataru KOBAYASHI  

     
    PAPER

      Vol:
    E93-A No:11
      Page(s):
    2163-2172

    In this paper, we propose a new 3D sound rendering method for multiple sound sources with limited computational resources. The method is based on fuzzy clustering, which achieves dual benefits of two general methods based on amplitude-panning and hard clustering. In embedded systems where the number of reproducible sound sources is restricted, the general methods suffer from localization errors and/or serious quality degradation, whereas the proposed method settles the problems by executing clustering-process and amplitude-panning simultaneously. Computational cost evaluation based on DSP implementation and subjective listening test have been performed to demonstrate the applicability for embedded systems and the effectiveness of the proposed method.

  • Clock Skew Evaluation Considering Manufacturing Variability in Mesh-Style Clock Distribution

    Shinya ABE  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E91-A No:12
      Page(s):
    3481-3487

    Influence of manufacturing variability on circuit performance has been increasing because of finer manufacturing process and lowered supply voltage. In this paper, we focus on mesh-style clock distribution which is believed to be effective for reducing clock skew, and we evaluate clock skew considering manufacturing and design variabilities. Considering MOS transistor variation -- random and spatially-correlated variation -- and non-uniform flip-flop (FF) placement, we demonstrate that spatially-correlated variation and severe non-uniform FF distribution can be major sources of clock skew. We also examine the dependency of clock skew on design parameters, and reveal that finer clock mesh does not necessarily reduce clock skew.

  • Performance Estimation at Architecture Level for Embedded Systems

    Hiroshi MIZUNO  Hiroyuki KOBAYASHI  Takao ONOYE  Isao SHIRAKAWA  

     
    PAPER-Performance Estimation

      Vol:
    E85-A No:12
      Page(s):
    2636-2644

    This paper devises a sophisticated approach to the performance estimation of an embedded hardware-software codesign system at the architecture level, which intends to optimize the hardware-software configuration in terms of processing time, power dissipation, and hardware cost. A distinctive feature of this approach consists in constructing a performance estimation model proper to each component of an embedded system, such as CPU core, RAM/ROM, cache memory, and application-specific hardware, by taking account of not only the functional performance but also the data transfer. The proposed estimation schemes are incorporated into an existing instruction set simulator, so that the actual performance can be estimated accurately at the architecture level. The experimental results demonstrate that the performance estimation approach enables the precise design decision at the architecture level, which greatly contributes toward enhancing the design ability dedicatedly for mobile appliances.

  • Trade-Off Analysis between Timing Error Rate and Power Dissipation for Adaptive Speed Control with Timing Error Prediction

    Hiroshi FUKETA  Masanori HASHIMOTO  Yukio MITSUYAMA  Takao ONOYE  

     
    PAPER-Logic Synthesis, Test and Verification

      Vol:
    E92-A No:12
      Page(s):
    3094-3102

    Timing margin of a chip varies chip by chip due to manufacturing variability, and depends on operating environment and aging. Adaptive speed control with timing error prediction is promising to mitigate the timing margin variation, whereas it inherently has a critical risk of timing error occurrence when a circuit is slowed down. This paper presents how to evaluate the relation between timing error rate and power dissipation in self-adaptive circuits with timing error prediction. The discussion is experimentally validated using adders in subthreshold operation in a 90 nm CMOS process. We show a trade-off between timing error rate and power dissipation, and reveal the dependency of the trade-off on design parameters.

  • A Process and Temperature Tolerant Oscillator-Based True Random Number Generator

    Takehiko AMAKI  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Circuit Design

      Vol:
    E97-A No:12
      Page(s):
    2393-2399

    This paper presents an oscillator-based true random number generator (TRNG) that dynamically unbiases 0/1 probability. The proposed TRNG automatically adjusts the duty cycle of a fast oscillator to 50%, and generates unbiased random numbers tolerating process variation and dynamic temperature fluctuation. A prototype chip of the proposed TRNG was fabricated with a 65 nm CMOS process. Measurement results show that the developed duty cycle monitor obtained the probability of ‘1’ 4,100 times faster than the conventional output bit observation, or estimated the probability with 70 times higher accuracy. The proposed TRNG adjusted the probability of ‘1’ to within 50±0.07% in five chips in the temperature range of 0°C to 75°C. Consequently, the proposed TRNG passed the NIST and DIEHARD tests at 7.5 Mbps with 6,670 µm² area.

  • FOREWORD

    Hirokazu TANAKA  Takao ONOYE  

     
    FOREWORD

      Vol:
    E98-A No:11
      Page(s):
    2209-2210

  • Measurement Circuits for Acquiring SET Pulse Width Distribution with Sub-FO1-Inverter-Delay Resolution

    Ryo HARADA  Yukio MITSUYAMA  Masanori HASHIMOTO  Takao ONOYE  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E93-A No:12
      Page(s):
    2417-2423

    This paper presents two circuits to measure pulse width distribution of single event transients (SETs). We first review requirements for SET measurement in accelerated neutron radiation test and point out problems of previous works, in terms of time resolution, time/area efficiency for obtaining large samples and certainty in absolute values of pulse width. We then devise two measurement circuits and a pulse generator circuit that satisfy all the requirements and attain sub-FO1-inverter-delay resolution, and propose a measurement procedure for assuring the absolute width values. Operation of one of the proposed circuits was confirmed by a radiation experiment of alpha particles with a fabricated test chip.

  • Voice Communication on Multimedia ATM Network Using Shared VCI Cell

    Toshihiro MASAKI  Yasuhiro NAKATANI  Takao ONOYE  Nariyoshi YAMAI  Koso MURAKAMI  

     
    PAPER-ATM switch interworking

      Vol:
    E81-B No:2
      Page(s):
    340-346

    This paper presents novel multimedia ATM networks which are capable of transmitting voice data efficiently and unify the switching methods among heterogeneous traffic. Fully ATMized multimedia networks use fellow cell switches. The proposed assembly method can pack plural calls which have different virtual channel connections (VCCs) into one cell. Every call in a cell can be dynamically rearranged by the fellow cell switch to achieve an efficient use of network resources. The switching functions are supported by shared virtual channel identifier (VCI) cells and the fellow cells in them. The fellow cell switch for 622 Mbps links is integrated into a single chip. The multimedia ATM networks including voice transmission can be constructed by attaching the fellow cell switches to the standard ATM switches.

  • Implementation of Java Accelerator for High-Performance Embedded Systems

    Motoki KIMURA  Morgan Hirosuke MIKI  Takao ONOYE  Isao SHIRAKAWA  

     
    PAPER-Simulation Accelerator

      Vol:
    E86-A No:12
      Page(s):
    3079-3088

    A Java execution environment is implemented, in which a hardware engine is operated in parallel with an embedded processor. This pair of hardware facilities together with an additional software kernel are devised for existing embedded systems, so as to execute Java applications more efficiently in such a way that 39 instructions are added to the original Java Virtual Machine to implement the software kernel. The exploration of design parameters is also attempted to attain a low hardware cost and high performance. The proposed hardware engine of a 6-stage pipeline can be integrated in a single chip using 30 k gates together with the instruction and data cache memories. The proposed approach improves the execution speed by a factor of 5 in comparison with the J2ME software implementation.

  • Implementation of Viterbi Decoder toward GPU-Based SDR Receiver

    Kosuke TOMITA  Masahide HATANAKA  Takao ONOYE  

     
    PAPER

      Vol:
    E98-A No:11
      Page(s):
    2246-2253

    Viterbi decoding is commonly used for several protocols, but its computational cost is quite high and thus it is necessary to implement it effectively. This paper describes a GPU implementation of a Viterbi decoder utilizing the three-point Viterbi decoding algorithm (TVDA), in which the received bits are divided into multiple chunks and several chunks are decoded simultaneously. Coalesced access and Warp Shuffle, a newly introduced instruction, are also utilized in order to improve decoder performance. In addition, iterative execution of parallel chunk decoding reduces the latency of the proposed Viterbi decoder so that the decoder can be utilized as a part of a GPU-based SDR transceiver. As a result, the throughput of the proposed Viterbi decoder is improved by 23.1%.

  • VLSI Architecture of Switching Control for AAL Type2 Switch

    Masahide HATANAKA  Toshihiro MASAKI  Takao ONOYE  Koso MURAKAMI  

     
    PAPER

      Vol:
    E83-A No:3
      Page(s):
    435-441

    This paper presents the switching control and VLSI architecture for the AAL2 switch. The ATM network with the AAL2 switch can efficiently transmit low-bit-rate data, even if the network has many endpoints. The switch is capable of not only switching AAL2 cells but also converting the header of other types of ATMs. The AAL2 switch is integrated into a single chip. The proposed ATM network is constructed by AAL2 switches attached to the ATM switches.
