The search functionality is under construction.

Keyword Search Result

[Keyword] digital signal processor(19hit)

1-19hit
  • Feasibility Study for Computer-Aided Diagnosis System with Navigation Function of Clear Region for Real-Time Endoscopic Video Image on Customizable Embedded DSP Cores

    Masayuki ODAGAWA  Tetsushi KOIDE  Toru TAMAKI  Shigeto YOSHIDA  Hiroshi MIENO  Shinji TANAKA  

     
    LETTER-VLSI Design Technology and CAD

      Publicized:
    2021/07/08
      Vol:
    E105-A No:1
      Page(s):
    58-62

    This paper examines the feasibility of automatic unclear-region detection in a CAD system for colorectal tumors using real-time endoscopic video images. We confirmed that a CAD system with a clear-region navigation function, consisting of unclear-region detection by YOLO2 and classification by AlexNet and SVMs, can be realized on customizable embedded DSP cores. Moreover, we confirmed that the real-time CAD system can be built as a low-power ASIC using customizable embedded DSP cores.

  • Robust Voice Activity Detection Algorithm Based on Feature of Frequency Modulation of Harmonics and Its DSP Implementation

    Chung-Chien HSU  Kah-Meng CHEONG  Tai-Shih CHI  Yu TSAO  

     
    PAPER-Speech and Hearing

      Publicized:
    2015/07/10
      Vol:
    E98-D No:10
      Page(s):
    1808-1817

    This paper proposes a voice activity detection (VAD) algorithm based on an energy related feature of the frequency modulation of harmonics. A multi-resolution spectro-temporal analysis framework, which was developed to extract texture features of the audio signal from its Fourier spectrogram, is used to extract frequency modulation features of the speech signal. The proposed algorithm labels the voice active segments of the speech signal by comparing the energy related feature of the frequency modulation of harmonics with a threshold. Then, the proposed VAD is implemented on one of Texas Instruments (TI) digital signal processor (DSP) platforms for real-time operation. Simulations conducted on the DSP platform demonstrate the proposed VAD performs significantly better than three standard VADs, ITU-T G.729B, ETSI AMR1 and AMR2, in non-stationary noise in terms of the receiver operating characteristic (ROC) curves and the recognition rates from a practical distributed speech recognition (DSR) system.

  • Active Noise Control System for Reducing MR Noise

    Masafumi KUMAMOTO  Masahiro KIDA  Ryotaro HIRAYAMA  Yoshinobu KAJIKAWA  Toru TANI  Yoshimasa KURUMI  

     
    PAPER-Engineering Acoustics

      Vol:
    E94-A No:7
      Page(s):
    1479-1486

    We propose an active noise control (ANC) system for reducing periodic noise generated in a high magnetic field such as noise generated from magnetic resonance imaging (MRI) devices (MR noise). The proposed ANC system utilizes optical microphones and piezoelectric loudspeakers, because specific acoustic equipment is required to overcome the high-field problem, and consists of a head-mounted structure to control noise near the user's ears and to compensate for the low output of the piezoelectric loudspeaker. Moreover, internal model control (IMC)-based feedback ANC is employed because the MR noise includes some periodic components and is predictable. Our experimental results demonstrate that the proposed ANC system (head-mounted structure) can significantly reduce MR noise by approximately 30 dB in a high field in an actual MRI room even if the imaging mode changes frequently.

  • Low Power MAC Design with Variable Precision Support

    Young-Geun LEE  Han-Sam JUNG  Ki-Seok CHUNG  

     
    PAPER-Digital Signal Processing

      Vol:
    E92-A No:7
      Page(s):
    1623-1632

    Many DSP applications such as FIR filtering and DCT (discrete cosine transformation) require multiplication with constants. Therefore, optimizing the performance of constant multiplication improves the overall performance of these applications. It is well-known that shifting can replace a constant multiplication if the constant is a power of two. In this paper, we extend this idea in such a way that by employing more than two barrel shifters, we can design highly efficient constant multipliers. We have found that by using two or three shifters, we can generate a large set of constants. Using these constants, we can execute a typical set of FIR or DCT applications with few errors. Furthermore, with variable precision support, we can carry out a fairly large class of DSP applications with high computational efficiency. Compared to conventional multipliers, we can achieve power savings of up to 56% with negligible computational errors.
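The shift-based idea in this abstract can be illustrated with a minimal sketch. It assumes each shifter contributes one signed power of two and that the shifter outputs are summed; the function names, the `max_shift` limit, and the brute-force search are illustrative assumptions, not the authors' design.

```python
from itertools import combinations, product

def best_shift_approx(constant, num_shifters=2, max_shift=8):
    """Search for signed shifts whose sum is closest to `constant`.

    Returns (terms, approx), where terms is a tuple of (sign, shift) pairs
    and approx is the constant those terms actually represent.
    """
    best_terms, best_value = None, None
    for chosen in combinations(range(max_shift + 1), num_shifters):
        for signs in product((1, -1), repeat=num_shifters):
            value = sum(s * (1 << k) for s, k in zip(signs, chosen))
            if best_value is None or abs(value - constant) < abs(best_value - constant):
                best_terms, best_value = tuple(zip(signs, chosen)), value
    return best_terms, best_value

def shift_multiply(x, terms):
    """Multiply integer x by the represented constant using shifts and adds only."""
    return sum(sign * (x << shift) for sign, shift in terms)

# 7 = 8 - 1 is exact with two shifters; 21 can only be approximated with two.
terms7, approx7 = best_shift_approx(7, num_shifters=2)
terms21, approx21 = best_shift_approx(21, num_shifters=2)
```

In this sketch, constants that two signed shifts cannot represent exactly (such as 21) carry a small error, which matches the abstract's remark that the approach trades a few errors for efficiency.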

  • ODiN: A 32-Bit High Performance VLIW DSP for Software Defined Radio Applications

    Seung Eun LEE  Yong Mu JEONG  

     
    PAPER

      Vol:
    E87-C No:11
      Page(s):
    1780-1786

    A very long instruction word (VLIW) digital signal processor (DSP), called ODiN, which can execute six instructions simultaneously in a single cycle, is designed and fabricated using a 0.25 µm 1-poly 5-metal standard cell static CMOS process. The ODiN core delivers a maximum of 600 MIPS with a 100 MHz system clock. In order to achieve high performance operation, the designed core includes compact register files, an orthogonal instruction set, single cycle operations for most instructions, and parallel processing based on software scheduling. In addition, an embedded Viterbi decoder processor and an embedded FFT processor make it possible to implement software defined radio (SDR) applications efficiently.

  • Loop and Address Code Optimization for Digital Signal Processors

    Jong-Yeol LEE  In-Cheol PARK  

     
    LETTER-Digital Signal Processing

      Vol:
    E85-A No:6
      Page(s):
    1408-1415

    This paper presents a new DSP-oriented code optimization method to enhance performance by exploiting the specific architectural features of digital signal processors. In the proposed method, a source code is translated into the static single assignment form while preserving the high-level information related to loops and the address computation of array accesses. The information is used in generating hardware loop instructions and parallel instructions provided by most digital signal processors. In addition to the conventional control-data flow graph, a new graph is employed to make it easy to find auto-modification addressing modes efficiently. Experimental results on benchmark programs show that the proposed method is effective in improving performance.

  • Asynchronous Multirate Real-Time Scheduling for Programmable DSPs

    Ichiro KURODA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E85-A No:1
      Page(s):
    241-247

    A novel scheduling method for asynchronous multirate/multi-task processing by programmable digital signal processors (DSPs) has been developed. This mixed scheduling method combines static and dynamic scheduling, and avoids runtime overheads due to interrupts in context switching to realize asynchronous multirate systems. The processing delay introduced when using static scheduling with static buffering is avoided by introducing deadline scheduling in the static schedule design. In the developed software design system, a block-diagram description language is extended to describe asynchronous multi-task processing. The scheduling method enables asynchronous multirate processing, such as arbitrary-sampling-ratio rate conversion, asynchronous interface, and multimedia applications, to be efficiently realized by programmable DSPs.

  • A New Hardware/Software Partitioning Algorithm for DSP Processor Cores with Two Types of Register Files

    Nozomu TOGAWA  Takashi SAKURAI  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    LETTER-Hardware/Software Codesign

      Vol:
    E84-A No:11
      Page(s):
    2802-2807

    This letter proposes a hardware/software partitioning algorithm for digital signal processor cores with two register files. Given a compiled assembly code and a timing constraint on execution time, the proposed algorithm generates a processor core configuration with a new assembly code running on the generated processor core. The proposed algorithm considers two register files and determines the number of registers in each of the register files. Moreover, the algorithm considers two or more types of functional units for each arithmetic or logical operation and assigns functional units with small area to a processor core without causing a performance penalty. A generated processor core will have a small area compared with processor cores which have a single register file or those which consider only one type of functional unit for each operation. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.

  • Area and Delay Estimation in Hardware/Software Cosynthesis for Digital Signal Processor Cores

    Nozomu TOGAWA  Yoshiharu KATAOKA  Yuichiro MIYAOKA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-Hardware/Software Codesign

      Vol:
    E84-A No:11
      Page(s):
    2639-2647

    Hardware/software partitioning is one of the key processes in a hardware/software cosynthesis system for digital signal processor cores. In hardware/software partitioning, area and delay estimation of a processor core plays an important role since the hardware/software partitioning process must determine which part of a processor core should be realized by hardware units and which part should be realized by a sequence of instructions based on execution time of an input application program and area of a synthesized processor core. This paper proposes area and delay estimation equations for digital signal processor cores. For area estimation, we show that total area for a processor core can be derived from the sum of area for a processor kernel and area for additional hardware units. Area for a processor kernel can be mainly obtained by minimum area for a processor kernel and overheads for adding hardware units and registers. Area for a hardware unit can be mainly obtained by its type and operation bit width. For delay estimation, we show that critical path delay for a processor core can be derived from the delay of a hardware unit which is on the critical path in the processor core. Experimental results demonstrate that errors of area estimation are less than 2% and errors of delay estimation are less than 2 ns when comparing estimated area and delay with logic-synthesized area and delay.

  • Memory Access Estimation of Filter Bank Implementation on Different DSP Architectures

    Naoki MIZUTANI  Shogo MURAMATSU  Hisakazu KIKUCHI  

     
    PAPER-Implementations of Signal Processing Systems

      Vol:
    E84-A No:8
      Page(s):
    1951-1959

    A unified polyphase representation of analysis and synthesis filter banks is introduced in this paper, and then the efficient implementation on digital signal processors (DSPs) is investigated. In particular, the number of memory accesses, power consumption, processing accuracy and the required instruction cycles are discussed. First, a unified representation is given, and then two types of procedures, SIMO system-based and MISO system-based procedures, are shown, where SIMO and MISO are abbreviations for single-input/multiple-output and multiple-input/single-output, respectively. These procedures are compared to each other. It is shown that the number of data loads in the SIMO system-based procedure is half of that in the MISO system-based procedure for two-channel filter banks. The implementation of M-channel filter banks is also discussed.
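As background for the polyphase representation mentioned in this abstract, the sketch below shows the generic polyphase identity for one decimating analysis branch: filtering followed by downsampling by M gives the same output as summing M short sub-filters that run at the low rate. It is not the paper's SIMO or MISO procedure; the function names and the zero-padded boundary handling are assumptions made for illustration.

```python
def filter_then_decimate(h, x, M=2):
    """Reference: full-rate FIR filtering, then keep every M-th output."""
    y = [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
         for n in range(len(x))]
    return y[::M]

def polyphase_decimate(h, x, M=2):
    """Same result computed from the M polyphase components of h."""
    n_out = (len(x) + M - 1) // M
    out = [0] * n_out
    for p in range(M):
        e_p = h[p::M]  # p-th polyphase component: h[p], h[p+M], h[p+2M], ...
        for m in range(n_out):
            for r, coeff in enumerate(e_p):
                idx = (m - r) * M - p  # input sample paired with h[r*M + p]
                if 0 <= idx < len(x):
                    out[m] += coeff * x[idx]
    return out

h = [1, 2, 3, 4]
x = [1, 2, 3, 4, 5, 6, 7]
same = filter_then_decimate(h, x) == polyphase_decimate(h, x)
```

The polyphase form never computes the outputs that decimation would discard, which is why it is the usual starting point for counting loads and multiply-accumulate operations on a DSP.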

  • Fast Implementation Technique for Improving Throughput of RLS Adaptive Filters

    Kiyoshi NISHIKAWA  Hitoshi KIYA  

     
    PAPER-Adaptive Signal Processing

      Vol:
    E83-A No:8
      Page(s):
    1545-1550

    This paper proposes a fast implementation technique for RLS adaptive filters. The technique has an adjustable parameter to trade off the throughput against the rate of convergence of the filter according to the application. The conventional methods for improving the throughput do not have this kind of adjustability, so the proposed technique will expand the area of applications for the RLS algorithm. We show that the improvement of the throughput can be easily achieved by rearranging the formula of the RLS algorithm and that there is no need for faster PEs for the improvement.
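For reference, the sketch below implements the conventional exponentially weighted RLS recursion that such techniques start from. It does not reproduce the rearranged formula or the adjustable parameter proposed in the paper; the function name, the forgetting factor `lam`, and the initialization constant `delta` are assumptions chosen for illustration.

```python
import random

def rls_identify(inputs, desired, order=2, lam=0.99, delta=100.0):
    """Conventional RLS: estimate FIR weights w so that w . u[n] tracks d[n]."""
    w = [0.0] * order
    # P approximates the inverse input correlation matrix; start at delta * I.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    u = [0.0] * order  # tap-delay line, newest sample first
    for x, d in zip(inputs, desired):
        u = [x] + u[:-1]
        Pu = [sum(P[i][j] * u[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(u[i] * Pu[i] for i in range(order))
        k = [v / denom for v in Pu]                          # gain vector
        e = d - sum(w[i] * u[i] for i in range(order))       # a priori error
        w = [w[i] + k[i] * e for i in range(order)]
        # P <- (P - k u^T P) / lam; u^T P equals Pu^T because P is symmetric.
        P = [[(P[i][j] - k[i] * Pu[j]) / lam for j in range(order)]
             for i in range(order)]
    return w

# Identify a hypothetical 2-tap system from noiseless data.
rng = random.Random(0)
true_h = [0.5, -0.3]
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ds = [true_h[0] * xs[n] + (true_h[1] * xs[n - 1] if n > 0 else 0.0)
      for n in range(len(xs))]
w_est = rls_identify(xs, ds)
```

Each iteration depends on the `P` and `w` produced by the previous one, and that serial dependency is the kind of bottleneck a throughput-oriented rearrangement has to address.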

  • Simulation of Series-Parallel Resonant DC-DC Converter System with DSP-Based Digital Control Scheme

    Ulhaqsyed MOBIN  Eiji HIRAKI  Hiroshi TAKANO  Mutsuo NAKAOKA  

     
    PAPER-General Fundamentals and Boundaries

      Vol:
    E83-A No:7
      Page(s):
    1458-1466

    This paper describes an efficient simulation approach for a DSP-controlled series-parallel resonant high-frequency DC-DC power converter system. The proposed power conversion circuit simulation approach is based on a circuit equation, modeled by substituting a time-varying switched resistor circuit in place of all the controllable and uncontrollable power semiconductor switching blocks of power converter circuits. An algebraic algorithm transforms the matrices of the circuit equation into the matrices of the state vector equation. The state equation is solved by the 3rd-order Runge-Kutta numerical integration method. Simulation results are illustrated and discussed together with experimental results.
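The numerical integration step can be sketched as follows. This is a minimal example that assumes a linear state equation dx/dt = A x + B u(t) with constant matrices and uses Kutta's classical third-order scheme; the abstract does not say which third-order variant the authors use, and the function names and the one-state test system are hypothetical.

```python
import math

def rk3_step(f, t, x, h):
    """One step of Kutta's third-order Runge-Kutta method for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * a for xi, a in zip(x, k1)])
    k3 = f(t + h, [xi - h * a + 2 * h * b for xi, a, b in zip(x, k1, k2)])
    return [xi + h / 6 * (a + 4 * b + c) for xi, a, b, c in zip(x, k1, k2, k3)]

def simulate(A, B, u, x0, h, steps):
    """Integrate dx/dt = A x + B u(t) from t = 0 with fixed step h."""
    def f(t, x):
        return [sum(a * xj for a, xj in zip(row, x)) + b * u(t)
                for row, b in zip(A, B)]
    x, t = list(x0), 0.0
    for _ in range(steps):
        x = rk3_step(f, t, x, h)
        t += h
    return x

# Hypothetical one-state decay dx/dt = -x, whose exact solution is exp(-t).
x_end = simulate(A=[[-1.0]], B=[0.0], u=lambda t: 0.0, x0=[1.0], h=0.01, steps=100)
```

In a switched-converter simulation the matrices would change whenever a switch changes state; this sketch keeps them constant for brevity.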

  • Software Radio Base and Personal Station Prototypes

    Yasuo SUZUKI  Kazuhiro UEHARA  Masashi NAKATSUGAWA  Yushi SHIRATO  Shuji KUBOTA  

     
    PAPER

      Vol:
    E83-B No:6
      Page(s):
    1261-1268

    Software radio base and personal station prototypes are proposed and implemented. The prototypes are composed of RF/IF, A/D and D/A, pre- and post-processors, CPU, and DSP parts. System software is partitioned into CPU program and DSP program to use processor resources effectively. They support various air interfaces, some of which are equivalent to the 384 kbit/s transmission rate PHS (personal handy phone system) and a 96 kbit/s transmission rate system. The base station can also be used as a communication bridge between two systems. In order to ease IF filter requirements, the zero-stuff method is employed. Basic transmission and receiving performances are evaluated in an experiment and their results agree well with those expected.
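Zero-stuffing is commonly understood as raising the sample rate by inserting zeros between samples. The sketch below assumes that reading; the abstract does not specify the stuffing factor or where in the signal chain the operation is applied, and the function name is hypothetical.

```python
def zero_stuff(samples, factor):
    """Insert (factor - 1) zeros after each sample, raising the rate by `factor`."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0] * (factor - 1))
    return out

stuffed = zero_stuff([1, 2, 3], 3)
```

Zero insertion leaves the baseband spectrum unchanged while spreading its images across the wider band of the higher sample rate, which is the property usually exploited when working with intermediate-frequency signals.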

  • Performance Enhancement on Digital Signal Processors with Complex Arithmetic Capability

    Yoshimasa NEGISHI  Eiji WATANABE  Akinori NISHIHARA  Takeshi YANAGISAWA  

     
    PAPER

      Vol:
    E82-A No:2
      Page(s):
    238-245

    Digital Signal Processors with complex arithmetic capability (DSP-C) are useful for various applications. In this paper, we propose a method for the effective implementation of specific circuits with real coefficients on DSP-C. DSP-C has special hardware such as a complex multiplier so that a complex calculation can be performed with only one instruction. First, we show that nodes with two real coefficient input branches can be implemented by complex multiplications. We apply this implementation to 2D circuits and transversal circuits with real coefficients. Next, we introduce a new computational mode (Advanced mode) and a new multiplier into PSI, a kind of DSP-C which has been proposed already, in order to process the circuits effectively. The effectiveness of the proposed method is shown by simulation in the last part.
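One way to see how a node with two real-coefficient input branches maps onto a single complex multiplication: the weighted sum a*x + b*y equals the real part of (x + jy)(a - jb). The sketch below shows only this identity; the actual mapping onto DSP-C instructions is not given in the abstract, and the function name is hypothetical.

```python
def two_branch_node(x, y, a, b):
    """Compute a*x + b*y with a single complex multiplication.

    (x + jy)(a - jb) = (a*x + b*y) + j(a*y - b*x), so the real part is the
    weighted sum of the two inputs.
    """
    return (complex(x, y) * complex(a, -b)).real

node_out = two_branch_node(2, 3, 4, 5)  # 4*2 + 5*3
```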

  • DSP Code Optimization Methods Utilizing Addressing Operations at the Codes without Memory Accesses

    Nobuhiko SUGINO  Hironobu MIYAZAKI  Akinori NISHIHARA  

     
    PAPER-Digital Signal Processing

      Vol:
    E80-A No:12
      Page(s):
    2562-2571

    Many digital signal processors (DSPs) employ indirect addressing using address registers (ARs) to indicate their memory addresses, which often leads to overhead. This paper presents methods to efficiently allocate addresses for variables in a given program so that overhead in AR update operations is reduced. The memory addressing model is generalized in such a way that an AR can be updated at the codes without memory accesses. An efficient memory address allocation is obtained by a method based on the graph linearization algorithm, which takes account of the number of possible AR update operations for every memory access. In order to utilize multiple ARs, methods to assign variables to ARs are also investigated. The proposed methods are applied to the compiler for the µPD77230 (NEC), and the codes generated for several examples demonstrate the effectiveness of these methods.
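The overhead being minimized can be made concrete with a small cost model. The sketch assumes a single AR that can move by at most `auto_range` addresses for free between consecutive accesses (as with post-increment or post-decrement addressing) and counts every larger jump as one explicit update instruction. The model, the function name, and the sample access sequence are illustrative assumptions and not the paper's generalized addressing model.

```python
def ar_update_overhead(layout, accesses, auto_range=1):
    """Count explicit AR updates needed for an access sequence under a layout.

    layout   -- variable names in memory order (index = address)
    accesses -- sequence of variable names in program order
    """
    addr = {name: i for i, name in enumerate(layout)}
    overhead = 0
    for prev, cur in zip(accesses, accesses[1:]):
        if abs(addr[cur] - addr[prev]) > auto_range:
            overhead += 1  # jump too far for auto-increment/decrement
    return overhead

accesses = ["a", "c", "a", "c", "b", "c"]
cost_naive = ar_update_overhead(["a", "b", "c"], accesses)
cost_tuned = ar_update_overhead(["a", "c", "b"], accesses)
```

Under this model, placing frequently adjacent variables next to each other in memory removes the explicit updates, which is the effect an address allocation method aims for.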

  • A Built-In Self-Test for ADC and DAC in a Single-Chip Speech CODEC

    Eiichi TERAOKA  Toru KENGAKU  Ikuo YASUI  Kazuyuki ISHIKAWA  Takahiro MATSUO  Hideyuki WAKADA  Narumi SAKASHITA  Yukihiko SHIMAZU  Takeshi TOKUDA  

     
    PAPER

      Vol:
    E80-A No:2
      Page(s):
    339-345

    Built-in self-test (BIST) has been applied to test an analog to digital converter (ADC) and a digital to analog converter (DAC) embedded in a DSP-core ASIC. The eight performance characteristics of the ADC and the DAC designed in accordance with the ITU-T recommendations are measured using the BIST. Three of the eight characteristics - the attenuation/frequency distortion, the variation of gain with input level, and the signal-to-total distortion - have been evaluated and the measured results have shown good agreement with measured results by conventional tests. In the BIST operation, the DSP-core generates input stimulus and analyzes output response by control of the self-test program. The sizes of the self-test program and coefficient data are 822 words of the IROM and 384 words of the data ROM, respectively. This area overhead is less than 0.5% of total chip area. Test-time by the BIST is reduced to approximately 3.2 seconds, which is one-tenth that of conventional testing. The mixed-signal DSP-core ASIC is testable with only logic test equipment, and as a result, test-cost - that is test investment and test-time - is reduced compared with conventional test methods.

  • A 28 mW 16-bit Digital Signal Processor for the PDC Half-Rate CODEC

    Taketora SHIRAISI  Koji KAWAMOTO  Kazuyuki ISHIKAWA  Eiichi TERAOKA  Hidehiro TAKATA  Takeshi TOKUDA  Kouichi NISHIDA  

     
    PAPER

      Vol:
    E79-C No:12
      Page(s):
    1679-1685

    A low power consumption 16-bit fixed point Digital Signal Processor (DSP) has been developed to realize a half-rate CODEC for the Personal Digital Cellular (PDC) system. Dual datapath architecture has been employed to execute multiply-accumulate (MAC) operations with a high degree of efficiency. With this architecture, 86.3% of total MAC operations in the Pitch Synchronous Innovation Code Excited Linear Prediction (PSI-CELP) program are executed in parallel, so that total instruction cycles are reduced by 23.1%. The area overhead for the dual datapath architecture is only 3.0% of the total area. Furthermore, in order to reduce power consumption, circuit design techniques are also extensively applied to RAMs, ROMs, and clock circuits, which consume the great majority of power. By reducing the number of precharging bit lines, a power reduction of 49.8% is achieved in RAMs, and above 40% in ROMs. By applying gated clock to clock lines, a power reduction of 5.0% is achieved in the DSP that performs the PSI-CELP algorithm. The DSP is fabricated in 0.5 µm single-poly, double-metal CMOS technology. The PSI-CELP algorithm for the PDC half-rate CODEC can operate at 22.5 MHz instruction frequency and 1.6 V supply voltage, resulting in a low power consumption of 28 mW.

  • A 16-bit Digital Signal Processor with Specially Arranged Multiply-Accumulator for Low Power Consumption

    Katsuhiko UEDA  Toshio SUGIMURA  Toshihiro ISHIKAWA  Minoru OKAMOTO  Mikio SAKAKIHARA  Shinichi MARUI  

     
    PAPER

      Vol:
    E78-C No:12
      Page(s):
    1709-1716

    This paper describes a new, low power 16-bit Digital Signal Processor (DSP). The DSP has a double-speed MAC mechanism, an accelerator for Viterbi decoding, and a block floating section which contribute to lower power consumption. The double-speed MAC can perform two multiply and accumulate operations in one instruction cycle. Since MAC operations are so common in digital signal processing, this mechanism can reduce the average clock frequency of the DSP, resulting in lower power consumption. The Viterbi accelerator and block floating circuitry also reduce the clock frequency by minimizing the number of cycles that need to be executed. The DSP was fabricated using a 0.8 µm CMOS 2-aluminum layer process technology to integrate 644 K transistors on a 9.30 mm × 9.09 mm die. It can realize an 11.2 kbps VSELP speech CODEC while consuming only 70 mW at 3.5 V Vdd.

  • VIRGO: Hierarchical DSP Code Generator Based on Vectorized Signal Flow Graph Description

    Norichika KUMAMOTO  Keiji AOKI  Hiroaki KUNIEDA  

     
    PAPER

      Vol:
    E75-A No:8
      Page(s):
    1004-1013

    This paper proposes a hierarchical Digital Signal Processor (DSP) Code Generator VIRGO for large scale general signal processing algorithms. A hierarchically structured Vectorized Signal Flow Graph (V-SFG) description is used as the input specification. The DSP-independent optimization procedure for both the program size and the execution time is performed hierarchically, module by module, with regard to operation order, memory assignment and register allocation. The efficient code generation is demonstrated by comparing both instruction steps and dynamic steps of a practical ADPCM encoder/decoder with a conventional method.