
Keyword Search Result

[Keyword] DESIGN (888 hits)

441-460 hits (888 hits)

  • An Energy-Efficient Partitioned Instruction Cache Architecture for Embedded Processors

    CheolHong KIM  SungWoo CHUNG  ChuShik JHON  

     
    PAPER-Computer Systems

    Vol: E89-D No:4  Page(s): 1450-1458

    Energy efficiency of cache memories is crucial in designing embedded processors. Reducing energy consumption in the instruction cache is especially important, since the instruction cache consumes a significant portion of total processor energy. This paper proposes a new instruction cache architecture, named Partitioned Instruction Cache (PI-Cache), that reduces dynamic energy consumption in the instruction cache by partitioning it into smaller (less power-consuming) sub-caches. When the proposed PI-Cache is accessed, only one sub-cache is accessed, exploiting the temporal/spatial locality of applications; the other sub-caches are not accessed, reducing dynamic energy. The PI-Cache also reduces dynamic energy consumption by eliminating the energy consumed in tag lookup and comparison. Moreover, the performance gap between the conventional instruction cache and the proposed PI-Cache becomes small when the physical cache access time is considered. We evaluated the energy efficiency by running a cycle-accurate simulator, SimpleScalar, with power parameters obtained from CACTI. Simulation results show that the PI-Cache improves the energy-delay product by 20%-54% compared to a conventional direct-mapped instruction cache.
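
    As a rough illustration of the energy argument (a toy Python model with made-up parameters, not the paper's SimpleScalar/CACTI evaluation), accessing one small sub-cache per reference costs less than probing a monolithic cache:

        # Toy model: per-access dynamic energy of a partitioned vs. monolithic cache.
        def access_energy(size_kb):
            # Hypothetical per-access energy (nJ) growing with cache size.
            return 0.05 * size_kb ** 0.5

        def conventional_energy(accesses, size_kb):
            # Every access probes the full cache (tag + data arrays).
            return accesses * access_energy(size_kb)

        def pi_cache_energy(accesses, size_kb, n_subcaches):
            # Only one sub-cache is probed per access; locality keeps consecutive
            # accesses inside the same sub-cache most of the time.
            return accesses * access_energy(size_kb / n_subcaches)

        print(pi_cache_energy(10**6, 16, 4) / conventional_energy(10**6, 16))  # 0.5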

  • Low-Voltage Analog Switch in Deep Submicron CMOS: Design Technique and Experimental Measurements

    Christian Jesus B. FAYOMI  Mohamad SAWAN  Gordon W. ROBERTS  

     
    PAPER-Analog Signal Processing

    Vol: E89-A No:4  Page(s): 1076-1087

    This paper concerns the design, implementation, and subsequent experimental validation of a low-voltage analog CMOS switch based on a gate-bootstrapping method. The main part of the proposed circuit is a new low-voltage, low-stress CMOS clock voltage doubler. Through the use of a dummy switch, the charge injection induced by the bootstrapped switch is greatly reduced, resulting in improved sample-and-hold accuracy. An important attribute of the design is that the ON-resistance is nearly constant. A test chip was designed and fabricated in a TSMC 0.18 µm CMOS process (single poly, n-well) to confirm the operation of the circuit for supply voltages down to 0.65 V.
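
    The near-constant ON-resistance follows from the textbook triode-region expression (a standard relation, not a formula quoted from the paper):

        R_{on} \approx \frac{1}{\mu_n C_{ox} (W/L)\,(V_{GS} - V_{th})}

    Bootstrapping pins V_GS near V_DD regardless of the input level, so the denominator, and hence R_on, no longer tracks the input voltage; this is what improves sampling linearity at low supply voltages.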

  • Thermal-Aware Placement Based on FM Partition Scheme and Force-Directed Heuristic

    Jing LI  Hiroshi MIYASHITA  

     
    PAPER

    Vol: E89-A No:4  Page(s): 989-995

    Temperature tracking is becoming of paramount importance in modern electronic design automation tools. In this paper, we present a deterministic thermal placement algorithm for standard-cell-based layout that can lead to a smooth temperature distribution over the die. It is based mainly on the Fiduccia-Mattheyses partitioning scheme and a previously published substrate thermal model that converts known temperature constraints into the corresponding power distribution constraints. Moreover, a force-directed heuristic based on cells' power consumption is introduced into this process. Experimental results demonstrate a comparatively uniform temperature distribution and show a reduction of the maximum temperature on the die.
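
    A minimal sketch of the force-directed ingredient (the generic heuristic in Python; the paper's coupling to FM partitioning and the substrate thermal model is not reproduced, and the power weighting shown is an assumption):

        def target_position(neighbors):
            """Zero-force target location of a cell.

            neighbors: (x, y, weight) triples for connected cells/pads; weighting
            by power consumption tends to spread high-power cells apart.
            """
            wsum = sum(w for _, _, w in neighbors)
            x = sum(xi * w for xi, _, w in neighbors) / wsum
            y = sum(yi * w for _, yi, w in neighbors) / wsum
            return x, y

        print(target_position([(0.0, 0.0, 2.0), (10.0, 4.0, 1.0)]))  # (3.33, 1.33)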

  • Variability: Modeling and Its Impact on Design

    Hidetoshi ONODERA  

     
    INVITED PAPER

    Vol: E89-C No:3  Page(s): 342-348

    As technology scaling approaches the nanometer regime, variability in device performance is becoming a major issue in the design of integrated circuits. Besides the growing amount of variability, the statistical nature of the variability is changing as technology generations progress. In the past, die-to-die variability, which is well managed by worst-case design techniques, dominated over within-die variability. At present and in the future, the amount of within-die variability is increasing, and it poses a challenge to design methodology. This paper first shows measured results of variability in three different processes of 0.35, 0.18, and 0.13 µm technologies and explains the above-mentioned trend. An example of modeling the within-die variability is explained. The impact of within-die random variability on circuit performance is demonstrated using a simple numerical example: a circuit that is designed optimally under the assumption of deterministic delay is the most susceptible to random fluctuation in delay, which clearly indicates the need for a statistical design methodology.
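
    The susceptibility argument can be reproduced with a few lines of Monte Carlo (an illustrative Python experiment with assumed numbers, not the paper's data): a delay-balanced circuit takes the max over many equally critical paths, so independent within-die fluctuations push its expected delay upward.

        import random

        def circuit_delay(nominal_delays, sigma):
            # Each path gets an independent Gaussian (within-die random) offset.
            return max(d + random.gauss(0.0, sigma) for d in nominal_delays)

        def mean_delay(paths, sigma, trials=10000):
            return sum(circuit_delay(paths, sigma) for _ in range(trials)) / trials

        balanced = [1.0] * 100       # "optimal" design: 100 equal critical paths
        skewed = [1.0] + [0.7] * 99  # one dominant path, the rest have slack
        print(mean_delay(balanced, 0.05))  # noticeably above 1.0
        print(mean_delay(skewed, 0.05))    # stays close to 1.0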

  • A Fast Elliptic Curve Cryptosystem LSI Embedding Word-Based Montgomery Multiplier

    Jumpei UCHIDA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-System LSIs and Microprocessors

    Vol: E89-C No:3  Page(s): 243-249

    Elliptic curve cryptosystems are expected to become a next-generation standard for public-key cryptosystems. The security level of an elliptic curve cryptosystem depends on the difficulty of the discrete logarithm problem on elliptic curves: an elliptic curve cryptosystem with a 160-bit public key is as secure as an RSA system with a 1024-bit public key. We propose an elliptic curve cryptosystem LSI architecture embedding word-based Montgomery multipliers. Montgomery multiplication is an efficient method for finite-field multiplication, and a scalable architecture for an elliptic curve cryptosystem can be designed by selecting the structure of the word-based Montgomery multipliers. Experimental results demonstrate the effectiveness and efficiency of the proposed architecture. In a hardware evaluation using a 0.18 µm CMOS library, the high-speed design using 126 Kgates with 208-bit multipliers achieved an operation time of 3.6 ms for a 160-bit point multiplication.
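
    For reference, the arithmetic being accelerated is standard Montgomery reduction (the textbook software form in Python; the LSI implements a word-serial hardware variant, and the modulus below is only an example):

        def mont_mul(a, b, n, r_bits, n_prime):
            # Computes a * b * R^{-1} mod n (R = 2^r_bits) without dividing by n.
            t = a * b
            m = (t * n_prime) & ((1 << r_bits) - 1)   # m = t * n' mod R
            u = (t + m * n) >> r_bits                 # exact division by R
            return u - n if u >= n else u

        n = 0xFFFFFFFB                  # example odd modulus, n < 2^32
        r_bits = 32
        n_prime = -pow(n, -1, 1 << r_bits) % (1 << r_bits)  # n * n' = -1 mod R
        a, b = 123456789, 987654321
        a_m = (a << r_bits) % n         # to Montgomery domain: aR mod n
        b_m = (b << r_bits) % n
        ab_m = mont_mul(a_m, b_m, n, r_bits, n_prime)       # abR mod n
        assert mont_mul(ab_m, 1, n, r_bits, n_prime) == (a * b) % n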

  • Low-Power Hybrid Turbo Decoding Based on Reverse Calculation

    Hye-Mi CHOI  Ji-Hoon KIM  In-Cheol PARK  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E89-A No:3  Page(s): 782-789

    As turbo decoding is a highly memory-intensive algorithm that consumes considerable power, a major issue to be solved in practical implementations is reducing power consumption. This paper presents an efficient reverse calculation method that lowers power consumption by reducing the number of memory accesses required in turbo decoding. The reverse calculation method is proposed for the Max-log-MAP algorithm and is combined with a scaling technique to obtain a new decoding algorithm, called hybrid log-MAP, whose BER performance is similar to that of the log-MAP algorithm. For the W-CDMA standard, experimental results show that 80% of memory accesses are eliminated by the proposed reverse calculation method. A hybrid log-MAP turbo decoder based on the proposed reverse calculation reduces power consumption and memory size by 34.4% and 39.2%, respectively.
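
    The log-MAP/Max-log-MAP distinction that the hybrid algorithm bridges is the max* operator (textbook definitions in Python; the reverse-calculation memory scheme itself is not reproduced here):

        import math

        def max_star(a, b):
            # log-MAP: exact Jacobian logarithm, ln(e^a + e^b).
            return max(a, b) + math.log1p(math.exp(-abs(a - b)))

        def max_star_approx(a, b):
            # Max-log-MAP: drop the correction term; cheaper, slightly worse BER.
            # Scaling the resulting extrinsic values recovers most of the loss.
            return max(a, b)

        print(max_star(2.0, 1.5), max_star_approx(2.0, 1.5))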

  • A Pruning Algorithm for Training Cooperative Neural Network Ensembles

    Md. SHAHJAHAN  Kazuyuki MURASE  

     
    PAPER-Biocybernetics, Neurocomputing

    Vol: E89-D No:3  Page(s): 1257-1269

    We present a training algorithm that creates a neural network (NN) ensemble for classification tasks. It employs a competitive decay of hidden nodes in the component NNs as well as a selective deletion of NNs from the ensemble, and is thus named the pruning algorithm for NN ensembles (PNNE). A node cooperation function over the hidden nodes of each NN is introduced to support the decaying process. Training is based on negative correlation learning, which ensures diversity among the component NNs of the ensemble. The less important networks are deleted by a criterion that indicates over-fitting. The PNNE has been tested extensively on a number of standard benchmark problems in machine learning, including the Australian credit card assessment, breast cancer, circle-in-the-square, diabetes, glass identification, ionosphere, iris identification, and soybean identification problems. The results show that the classification performance of NN ensembles produced by the PNNE is better than or competitive with that of conventional constructive and fixed-architecture algorithms. Furthermore, compared to the constructive algorithm, the NN ensembles produced by the PNNE consist of a smaller number of component NNs, and they are more diverse owing to the uniform training of all component NNs.
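
    The negative correlation learning underlying the training looks as follows (the classic formulation in Python; the PNNE's node decay, cooperation function, and deletion criterion are not reproduced):

        import numpy as np

        def ncl_output_gradients(outputs, target, lam):
            """Gradient of each component NN's error w.r.t. its own output F_i.

            Error: e_i = (F_i - d)^2 / 2 + lam * p_i, with penalty
            p_i = (F_i - F_bar) * sum_{j != i} (F_j - F_bar), which rewards
            networks whose errors are negatively correlated around the mean.
            """
            f_bar = outputs.mean()
            # sum_{j != i}(F_j - f_bar) = -(F_i - f_bar), so the penalty gradient
            # pulls each output away from the ensemble mean.
            return (outputs - target) - lam * (outputs - f_bar)

        print(ncl_output_gradients(np.array([0.8, 0.6, 0.9]), 0.7, lam=0.5))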

  • System LSI: Challenges and Opportunities

    Tadahiro KURODA  

     
    INVITED PAPER

    Vol: E89-C No:3  Page(s): 213-220

    Scaling of CMOS integrated circuits is becoming difficult, due mainly to the rapid increase in power dissipation. How will the semiconductor technology and industry develop? This paper discusses challenges and opportunities in system LSI from three levels of perspective: the transistor level (physics), the IC level (electronics), and the business level (economics).

  • A Plan-Generation-Evaluation Framework for Design Space Exploration of Digital Systems Design

    Jun Kyoung KIM  Tag Gon KIM  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E89-A No:3  Page(s): 772-781

    Modern digital systems design requires exploring a large and complex design space to find the best configuration that satisfies design requirements. Such exploration requires a sound representation of the design space from which design candidates can be efficiently generated and then evaluated. This paper proposes a plan-generation-evaluation framework that supports the complete process of such design space exploration. The plan phase constructs a design space of all possible design alternatives by means of a formally defined representation scheme, the attributed AND-OR graph. The generation phase produces a set of candidates by algorithmically pruning the attributed AND-OR graph with respect to design requirements as well as architectural constraints. Finally, the evaluation phase measures the performance of the design candidates in the pruned graph to select the best one. A complete cache design process is used as an example to show the effectiveness of the proposed framework.
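
    A minimal picture of the representation (an illustrative Python sketch in which OR nodes choose one child and AND nodes take all children; the paper's attributes and pruning rules are not modeled):

        from itertools import product

        design_space = ("AND", [
            ("OR", [("LEAF", "direct-mapped"), ("LEAF", "2-way"), ("LEAF", "4-way")]),
            ("OR", [("LEAF", "16B line"), ("LEAF", "32B line")]),
        ])

        def enumerate_designs(node):
            kind, payload = node
            if kind == "LEAF":
                return [[payload]]
            if kind == "OR":  # one alternative per child
                return [d for child in payload for d in enumerate_designs(child)]
            # AND: Cartesian product of the children's candidate lists.
            combos = product(*(enumerate_designs(child) for child in payload))
            return [sum(parts, []) for parts in combos]

        for design in enumerate_designs(design_space):
            print(design)  # 6 cache configurations in this toy space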

  • Quaternary Signal Sets for Digital Communications with Nonuniform Sources

    Ha H. NGUYEN  Tyler NECHIPORENKO  

     
    LETTER-Communication Theory and Signals

    Vol: E89-A No:3  Page(s): 832-835

    This letter considers signal design problems for quaternary digital communications with nonuniform sources. Designs are considered for both average and equal energy constraints in a two-dimensional signal space. A tight upper bound on the bit error probability (BEP) is employed as the design criterion. The optimal quaternary signal sets are presented, and their BEP performance is compared with that of standard QPSK and of the binary signal set previously designed for nonuniform sources. Results show that a considerable saving in transmitted power can be achieved by the proposed average-energy signal set for a highly nonuniform source.
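
    The kind of bound used as the criterion can be evaluated directly (a standard union bound in Python; the constellation below is plain QPSK with an assumed prior, not the optimized signal set from the letter):

        import math

        def q_func(x):
            return 0.5 * math.erfc(x / math.sqrt(2.0))

        def bep_union_bound(points, labels, priors, sigma):
            # Sum over ordered pairs: prior * (differing bits / bits per symbol)
            # * pairwise error probability Q(d / (2*sigma)).
            bep = 0.0
            for i, pi in enumerate(points):
                for j, pj in enumerate(points):
                    if i == j:
                        continue
                    nbits = bin(labels[i] ^ labels[j]).count("1")
                    d = math.dist(pi, pj)
                    bep += priors[i] * (nbits / 2.0) * q_func(d / (2.0 * sigma))
            return bep

        qpsk = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
        labels = [0b00, 0b01, 0b11, 0b10]      # Gray mapping
        priors = [0.7, 0.1, 0.1, 0.1]          # highly nonuniform source
        print(bep_union_bound(qpsk, labels, priors, sigma=0.5))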

  • Design Considerations for RC Polyphase Filters with Simultaneously Equal Ripple Both in Stopband and Passband

    Hiroaki TANABE  Hiroshi TANIMOTO  

     
    LETTER

    Vol: E89-A No:2  Page(s): 461-464

    This paper describes a numerical procedure for determining the element values of RC polyphase filters with equal minima in the stopband and equal ripple in the passband. To the best of the authors' knowledge, the determination of element values of RC polyphase filters with an equal-ripple characteristic has not been solved. One paper has tackled the problem; however, it gives only sub-optimal solutions via numerical calculation [3]. We propose a numerical element-value design procedure for RC polyphase filters with equi-ripple gain in both the stopband and the passband, using the coefficient matching method. Some design examples are given.

  • A Coarse-Grain Hierarchical Technique for 2-Dimensional FFT on Configurable Parallel Computers

    Xizhen XU  Sotirios G. ZIAVRAS  

     
    PAPER-Parallel/Distributed Algorithms

    Vol: E89-D No:2  Page(s): 639-646

    FPGAs (Field-Programmable Gate Arrays) have been widely used as coprocessors to boost the performance of data-intensive applications [1],[2]. However, several challenges remain in further boosting FPGA performance: the communication overhead between the host workstation and the FPGAs can be substantial; large-scale applications cannot fit in a single FPGA because of its limited capacity; and mapping an application algorithm to FPGAs remains a daunting job in configurable system design. To circumvent these problems, we propose the FPGA-based Hierarchical-SIMD (H-SIMD) machine with its codesigned Pyramidal Instruction Set Architecture (PISA). PISA comprises high-level instructions, implemented as FPGA functions of coarse-grain SIMD (Single-Instruction, Multiple-Data) tasks, to facilitate ease of program development, code portability across different H-SIMD implementations, and high performance. We assume a multi-FPGA board where each FPGA is configured as a separate SIMD machine. Multiple FPGA chips can work in unison at a higher SIMD level, if needed, controlled by the host. Additionally, by using a memory switching scheme and the high-level PISA to partition applications into coarse-grain tasks, host-FPGA communication overheads can be hidden. We use the two-dimensional Fast Fourier Transform (2D FFT) to test the effectiveness of H-SIMD. The test results show sustained high performance for this problem; the H-SIMD machine even outperforms a Xeon processor.
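
    The coarse-grain structure exploited here is the standard row-column decomposition of the 2D FFT (shown with NumPy for illustration; the H-SIMD machine executes these phases as PISA tasks in hardware):

        import numpy as np

        def fft2_row_column(x):
            # 1-D FFTs along all rows, then 1-D FFTs along all columns: each phase
            # is an independent batch of tasks, ideal for SIMD-style partitioning.
            return np.fft.fft(np.fft.fft(x, axis=1), axis=0)

        x = np.random.rand(8, 8)
        assert np.allclose(fft2_row_column(x), np.fft.fft2(x))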

  • Best Security Index for Digital Fingerprinting

    Kozo BANNO  Shingo ORIHARA  Takaaki MIZUKI  Takao NISHIZEKI  

     
    PAPER-Information Hiding

    Vol: E89-A No:1  Page(s): 169-177

    Digital watermarking used for fingerprinting may be subjected to a collusion attack: two or more users collude, compare their data, find parts of the embedded watermarks, and make an unauthorized copy by masking their identities. In this paper, assuming that at most c users collude, we give a characterization of the fingerprinting codes that have the best security index in the sense of the "(c,p/q)-secureness" proposed by Orihara et al. The characterization is expressed in terms of intersecting families of sets. Using a block design, we also show that a distributor of data can, asymptotically, only identify a set of c users including at least one culprit, no matter how good a fingerprinting code is used.
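
    The attack model is the usual marking assumption (a generic Python sketch, not the paper's construction): colluders can alter only the positions where their copies differ.

        import random

        def collusion_attack(codewords):
            forged = []
            for pos in range(len(codewords[0])):
                symbols = {cw[pos] for cw in codewords}
                # Agreeing positions are undetectable and stay fixed; detectable
                # ones can be set arbitrarily to mask the colluders' identities.
                forged.append(symbols.pop() if len(symbols) == 1
                              else random.choice("01"))
            return "".join(forged)

        print(collusion_attack(["101100", "100110"]))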

  • Scan Design for Two-Pattern Test without Extra Latches

    Kazuteru NAMBA  Hideo ITO  

     
    PAPER-Dependable Computing

    Vol: E88-D No:12  Page(s): 2777-2785

    There are three well-known approaches to two-pattern testing with scan design: broadside testing (functional justification), skewed-load testing, and enhanced scan testing. Broadside and skewed-load testing use the standard scan design, so their area overheads are not high; however, their fault coverage is low. Enhanced scan testing uses the enhanced scan design, which employs extra latches and allows any two-pattern test to be scanned in. While this method achieves high fault coverage, it incurs high area overhead because of the extra latches. This paper presents a new scan design in which two-pattern testing with high fault coverage can be performed with area overhead as low as that of the standard scan design. The proposed scan FFs are based on master-slave FFs: the input of each scan FF is connected to the output of the master latch, not the slave latch, of the previous FF, and every scan FF maintains its output value during scan-shift operations.

  • An Equivalence Checking Method for C Descriptions Based on Symbolic Simulation with Textual Differences

    Takeshi MATSUMOTO  Hiroshi SAITO  Masahiro FUJITA  

     
    PAPER-Simulation and Verification

    Vol: E88-A No:12  Page(s): 3315-3323

    In this paper, an efficient equivalence checking method for two C descriptions is described. The equivalence of the two descriptions is proved by symbolic simulation, which in our setting can prove the equivalence of all of the variables in the descriptions. However, verifying the equivalence of all variables takes a long time for large descriptions. Therefore, to speed up verification, our method identifies textual differences between the descriptions, and the identified differences are used to reduce the number of equivalence checks among variables. The proposed method has been implemented in the C language and evaluated with several C descriptions.
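
    One plausible way to obtain such textual differences is a line-based diff (sketched here with Python's difflib; the authors' tool is implemented in C, and the two sources below are invented):

        import difflib

        old_src = "int f(int a){ return a + 1; }\nint g(int b){ return b * 2; }\n"
        new_src = "int f(int a){ return a + 2; }\nint g(int b){ return b * 2; }\n"

        matcher = difflib.SequenceMatcher(None, old_src.splitlines(),
                                          new_src.splitlines())
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "equal":
                # Only variables touched by differing regions need a full symbolic
                # equivalence check; the rest can be matched directly.
                print(tag, "old lines", i1, i2, "<->", "new lines", j1, j2)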

  • Circuit Performance Prediction Considering Core Utilization with Interconnect Length Distribution Model

    Hidenari NAKASHIMA  Junpei INOUE  Kenichi OKADA  Kazuya MASU  

     
    PAPER-Prediction and Analysis

    Vol: E88-A No:12  Page(s): 3358-3366

    The interconnect length distribution (ILD) represents the correlation between the number of interconnects and their length, and it can be used to predict power consumption, clock frequency, chip size, etc. High core utilization and small circuit area have been reported to improve chip performance. We propose an ILD model that predicts the correlation between core utilization and chip performance; the model captures the influence of interconnect length and interconnect density on circuit performance. As core utilization increases, small and simple circuits show improved performance. In large, complex circuits, decreasing the wire coupling capacitance is more important than decreasing the total interconnect length for improving chip performance. The proposed ILD model matches actual ILDs more accurately than conventional models.
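
    To make the idea concrete, an ILD can be used like this (a generic power-law stand-in in Python with assumed constants; the paper's model is more detailed and additionally ties the distribution to core utilization and coupling):

        def ild(l, k=1e4, gamma=2.0):
            # i(l): number of wires of length l (in gate pitches); Rent's-rule-based
            # models predict a power-law-like decay over short/medium lengths.
            return k * l ** (-gamma)

        def wire_totals(l_max):
            lengths = range(1, l_max + 1)
            n_wires = sum(ild(l) for l in lengths)
            total_len = sum(l * ild(l) for l in lengths)  # drives capacitance/power
            return n_wires, total_len

        print(wire_totals(1000))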

  • Multiplier Energy Reduction by Dynamic Voltage Variation

    Vasily G. MOSHNYAGA  Tomoyuki YAMANAKA  

     
    PAPER-VLSI Circuit

    Vol: E88-A No:12  Page(s): 3548-3553

    The design of portable, battery-operated multimedia devices requires energy-efficient multiplication circuits. This paper proposes a novel architectural technique to reduce the power consumption of digital multipliers. Unlike related approaches, which focus on reducing the multiplier's transition activity, we concentrate on dynamically reducing the supply voltage. Two implementation schemes capable of dynamically adjusting a dual supply voltage to input data variation are presented. Simulations show that these schemes reduce the energy consumption of a 16×16-bit multiplier by 34% and 29% at peak and by 10% and 7% on average, with area overheads of 15% and 4%, respectively, while maintaining the performance of a traditional multiplier.
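
    The underlying energy argument is the quadratic supply dependence of dynamic energy, E ~ C*V^2 (a toy Python model with assumed voltages and a made-up operand-width rule; the paper's schemes make this selection in circuitry):

        V_HIGH, V_LOW = 1.8, 1.2   # hypothetical supply pair (volts)

        def multiply_energy(a, b, c_eff=1.0, narrow_bits=8):
            # Narrow operands finish early enough to tolerate the slower low supply.
            narrow = abs(a) < (1 << narrow_bits) and abs(b) < (1 << narrow_bits)
            v = V_LOW if narrow else V_HIGH
            return c_eff * v * v

        ops = [(3, 12), (40000, 7), (100, 200), (60000, 50000)]
        dual = sum(multiply_energy(a, b) for a, b in ops)
        fixed = len(ops) * 1.0 * V_HIGH ** 2
        print(f"dual-supply {dual:.2f} vs fixed {fixed:.2f}")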

  • Power-Supply Noise Reduction with Design for Manufacturability

    Hiroyuki TSUJIKAWA  Kenji SHIMAZAKI  Shozo HIRANO  Kazuhiro SATO  Masanori HIROFUJI  Junichi SHIMADA  Mitsumi ITO  Kiyohito MUKAI  

     
    PAPER-Power/Ground Network

    Vol: E88-A No:12  Page(s): 3421-3428

    With the move toward higher clock rates and advanced process technologies, designers of the latest electronic products are encountering increasing silicon failures due to noise. At the same time, the minimum dimension of patterns on LSIs is much smaller than the exposure wavelength, making it difficult for LSI manufacturers to obtain high yield. In this paper, we present a solution for reducing power-supply noise in LSI microchips. The proposed design methodology considers design for manufacturability (DFM) at the same time as power integrity. The method was successfully applied to the design of a system-on-chip (SOC), achieving a 13.1-13.2% reduction in power-supply noise and uniform pattern density for chemical mechanical polishing (CMP).

  • Tolerance Design of Passive Filter Circuits Using Genetic Programming

    Hao-Sheng HOU  Shoou-Jinn CHANG  Yan-Kuin SU  

     
    LETTER-Electronic Circuits

    Vol: E88-C No:12  Page(s): 2388-2390

    In this letter we extend our previous work, which applied genetic programming to passive filter synthesis. The extended method handles tolerance design considerations. Experimental results show that our method can effectively generate filters that outperform those generated by traditional methods. In addition, it provides filter designers with an effective CAD tool for managing the trade-off between manufacturing yield and circuit cost.
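
    A typical ingredient of such tolerance-aware synthesis is a Monte Carlo yield estimate inside the fitness function (a generic Python sketch for a single RC section with assumed tolerances and spec window; not the authors' GP system):

        import random
        from math import pi

        def rc_lowpass_cutoff(r, c):
            # -3 dB cutoff of a first-order RC low-pass section.
            return 1.0 / (2.0 * pi * r * c)

        def yield_estimate(r_nom, c_nom, tol, f_lo, f_hi, trials=10000):
            ok = 0
            for _ in range(trials):
                r = r_nom * random.uniform(1 - tol, 1 + tol)
                c = c_nom * random.uniform(1 - tol, 1 + tol)
                ok += f_lo <= rc_lowpass_cutoff(r, c) <= f_hi
            return ok / trials

        # 1 kOhm / 159 nF gives a ~1 kHz cutoff; spec window is +/-10%.
        print(yield_estimate(1e3, 159e-9, tol=0.05, f_lo=900.0, f_hi=1100.0))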

  • A Top-Down Approach to Quality Driven Architectural Engineering of Software Systems

    Kwanwoo LEE  

     
    PAPER-Software Engineering

    Vol: E88-D No:12  Page(s): 2757-2766

    Designing a software architecture that satisfies multiple quality requirements is a difficult undertaking, mainly because architects must be able to explore a broad range of architectural choices and analyze the tradeoffs among them in light of multiple quality requirements. As the size and complexity of a system increase, the architectural design space to be explored and analyzed becomes more complex. To manage this complexity systematically, this paper proposes a method that guides architects to explore and analyze architectural decisions in a top-down manner. In the method, architectural decisions that have global impacts on the given quality requirements are explored and analyzed first, and those that have local impacts are then considered in the context of the decisions made in the previous step. This approach copes with the complexity of large-scale architectural design systematically, as architectural decisions are analyzed and made following the abstraction hierarchy of quality requirements. To illustrate the concepts and applicability of the proposed method, we have applied it to the architectural design of the computer used for the continuous casting process by an iron and steel manufacturer.
