The search functionality is under construction.

Author Search Result

[Author] Tsutomu SASAO(42hit)

1-20hit(42hit)

  • On Optimizations of Edge-Valued MDDs for Fast Analysis of Multi-State Systems

    Shinobu NAGAYAMA  Tsutomu SASAO  Jon T. BUTLER  Mitchell A. THORNTON  Theodore W. MANIKAS  

     
    PAPER-Logic Design

      Vol:
    E97-D No:9
      Page(s):
    2234-2242

    In the optimization of decision diagrams, variable reordering approaches are often used to minimize the number of nodes. However, such approaches are less effective for analysis of multi-state systems given by monotone structure functions. Thus, in this paper, we propose algorithms to minimize the number of edges in an edge-valued multi-valued decision diagram (EVMDD) for fast analysis of multi-state systems. The proposed algorithms minimize the number of edges by grouping multi-valued variables into larger-valued variables. By grouping multi-valued variables, we can reduce the number of nodes as well. To show the effectiveness of the proposed algorithms, we compare the proposed algorithms with conventional optimization algorithms based on a variable reordering approach. Experimental results show that the proposed algorithms reduce the number of edges by up to 15% and the number of nodes by up to 47%, compared to the conventional ones. This results in a speed-up of the analysis of multi-state systems by about three times.

  • Head-Tail Expressions for Interval Functions

    Infall SYAFALNI  Tsutomu SASAO  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E97-A No:10
      Page(s):
    2043-2054

    This paper shows a method to represent interval functions by using head-tail expressions. The head-tail expressions represent greater-than GT(X:A) functions, less-than LT(X:B) functions, and interval functions IN0(X:A,B) more efficiently than sum-of-products expressions. Let n be the number of bits to represent the largest value in the interval (A,B). This paper proves that a head-tail expression (HT) represents an interval function with at most n words in a ternary content addressable memory (TCAM) realization. It also shows the average numbers of factors to represent interval functions by HTs for up to n=16, which were obtained by a computer simulation. It also conjectures that, for sufficiently large n, the average number of factors to represent n-variable interval functions by HTs is at most 2/3n-5/9. Experimental results also show that, for n≥10, to represent interval functions, HTs require at least 20% fewer factors than MSOPs, on the average.

  • Design Method for Numerical Function Generators Using Recursive Segmentation and EVBDDs

    Shinobu NAGAYAMA  Tsutomu SASAO  Jon T. BUTLER  

     
    PAPER-Logic Synthesis and Verification

      Vol:
    E90-A No:12
      Page(s):
    2752-2761

    Numerical function generators (NFGs) realize arithmetic functions, such as ex,sin(πx), and , in hardware. They are used in applications where high-speed is essential, such as in digital signal or graphics applications. We introduce the edge-valued binary decision diagram (EVBDD) as a means of reducing the delay and memory requirements in NFGs. We also introduce a recursive segmentation algorithm, which divides the domain of the function to be realized into segments, where the given function is realized as a polynomial. This design reduces the size of the multiplier needed and thus reduces delay. It is also shown that an adder can be replaced by a set of 2-input AND gates, further reducing delay. We compare our results to NFGs designed with multi-terminal BDDs (MTBDDs). We show that EVBDDs yield a design that has, on the average, only 39% of the memory and 58% of the delay of NFGs designed using MTBDDs.

  • BDD Representation for Incompletely Specified Multiple-Output Logic Functions and Its Applications to the Design of LUT Cascades

    Munehiro MATSUURA  Tsutomu SASAO  

     
    PAPER-Logic Synthesis and Verification

      Vol:
    E90-A No:12
      Page(s):
    2762-2769

    A multiple-output function can be represented by a binary decision diagram for characteristic function (BDD_for_CF). This paper presents a method to represent multiple-output incompletely specified functions using BDD_for_CFs. An algorithm to reduce the widths of BDD_for_CFs is presented. This method is useful for decomposition of incompletely specified multiple-output functions. Experimental results for radix converters, adders, a multiplier, and lists of English words show that this method is useful for the synthesis of LUT cascades. An implementation of English words list by LUT cascades and an auxiliary memory is also shown.

  • Representations of Multiple-Output Functions Using Binary Decision Diagrams for Characteristic Functions

    Hafiz Md. HASAN BABU  Tsutomu SASAO  

     
    PAPER

      Vol:
    E82-A No:11
      Page(s):
    2398-2406

    This paper proposes a method to construct smaller binary decision diagrams for characteristic functions (BDDs for CFs). A BDD for CF represents an n-input m-output function, and evaluates all the outputs in O(n+m) time. We derive an upper bound on the number of nodes of the BDD for CF of n-bit adders (adrn). We also compare complexities of BDDs for CFs with those of shared binary decision diagrams (SBDDs) and multi-terminal binary decision diagrams (MTBDDs). Our experimental results show: 1) BDDs for CFs are usually much smaller than MTBDDs; 2) for adrn and for some benchmark circuits, BDDs for CFs are the smallest among the three types of BDDs; and 3) the proposed method often produces smaller BDDs for CFs than an existing method.

  • Compact Representations of Logic Functions Using Heterogeneous MDDs

    Shinobu NAGAYAMA  Tsutomu SASAO  

     
    PAPER-Logic and High Level Synthesis

      Vol:
    E86-A No:12
      Page(s):
    3168-3175

    In this paper, we propose a compact representation of logic functions using Multi-valued Decision Diagrams (MDDs) called heterogeneous MDDs. In a heterogeneous MDD, each variable may take a different domain. By partitioning binary input variables and representing each partition as a single multi-valued variable, we can produce a heterogeneous MDD with 16% smaller memory size than a Reduced Ordered Binary Decision Diagram (ROBDD), and with comparable memory size to Free Binary Decision Diagrams (FBDDs). And also, heterogeneous MDDs have shorter Average Path Length (APL) than ROBDDs and FBDDs. We minimized a large number of benchmark functions to show the compactness of heterogeneous MDDs.

  • Realization of Multi-Terminal Universal Interconnection Networks Using Contact Switches

    Tsutomu SASAO  Takashi MATSUBARA  Katsufumi TSUJI  Yoshiaki KOGA  

     
    PAPER-Logic Design

      Pubricized:
    2021/04/01
      Vol:
    E104-D No:8
      Page(s):
    1068-1075

    A universal interconnection network implements arbitrary interconnections among n terminals. This paper considers a problem to realize such a network using contact switches. When n=2, it can be implemented with a single switch. The number of different connections among n terminals is given by the Bell number B(n). The Bell number shows the total number of methods to partition n distinct elements. For n=2, 3, 4, 5 and 6, the corresponding Bell numbers are 2, 5, 15, 52, and 203, respectively. This paper shows a method to realize an n terminal universal interconnection network with $ rac {3}{8}(n^2-1)$ contact switches when n=2m+1≥5, and $ rac {n}{8}(3n+2)$ contact switches, when n=2m≥6. Also, it shows that a lower bound on the number of contact switches to realize an n-terminal universal interconnection network is ⌈log 2B(n)⌉, where B(n) is the Bell number.

  • A Design Algorithm for Sequential Circuits Using LUT Rings

    Hiroki NAKAHARA  Tsutomu SASAO  Munehiro MATSUURA  

     
    PAPER-Logic Synthesis

      Vol:
    E88-A No:12
      Page(s):
    3342-3350

    This paper shows a design method for a sequential circuit by using a Look-Up Table (LUT) ring. The method consists of two steps: The first step partitions the outputs into groups. The second step realizes them by LUT cascades, and allocates the cells of the cascades into the memory. The system automatically finds a fast implementation by maximally utilizing available memory. With the presented algorithm, we can easily design sequential circuits satisfying given specifications. The paper also compares the LUT ring with logic simulator to realize sequential circuits: the LUT ring is 25 to 237 times faster than a logic simulator that uses the same amount of memory.

  • Heuristics to Minimize Multiple-Valued Decision Diagrams

    Hafiz Md. HASAN BABU  Tsutomu SASAO  

     
    PAPER-Logic Synthesis

      Vol:
    E83-A No:12
      Page(s):
    2498-2504

    In this paper, we propose a method to minimize multiple-valued decision diagrams (MDDs) for multiple-output functions. We consider the following: (1) a heuristic for encoding the 2-valued inputs; and (2) a heuristic for ordering the multiple-valued input variables based on sampling, where each sample is a group of outputs. We first generate a 4-valued input 2-valued multiple-output function from the given 2-valued input 2-valued functions. Then, we construct an MDD for each sample and find a good variable ordering. Finally, we generate a variable ordering from the orderings of MDDs representing the samples, and minimize the entire MDDs. Experimental results show that the proposed method is much faster, and for many benchmark functions, it produces MDDs with fewer nodes than sifting. Especially, the proposed method generates much smaller MDDs in a short time for benchmark functions when several 2-valued input variables are grouped to form multiple-valued variables.

  • Shared Multi-Terminal Binary Decision Diagrams for Multiple-Output Functions

    Hafiz Md. HASAN BABU  Tsutomu SASAO  

     
    PAPER-Logic Synthesis

      Vol:
    E81-A No:12
      Page(s):
    2545-2553

    This paper describes a method to represent m output functions using shared multi-terminal binary decision diagrams (SMTBDDs). The SMTBDD(k) consists of multi-terminal binary decision diagrams (MTBDDs), where each MTBDD represents k output functions. An SMTBDD(k) is the generalization of shared binary decision diagrams (SBDDs) and MTBDDs: for k=1, it is an SBDD, and for k=m, it is an MTBDD. The size of a BDD is the total number of nodes. The features of SMTBDD(k)s are: 1) they are often smaller than SBDDs or MTBDDs; and 2) they evaluate k outputs simultaneously. We also propose an algorithm for grouping output functions to reduce the size of SMTBDD(k)s. Experimental results show the compactness of SMTBDD(k)s. An SMTBDDmin denotes the smaller SMTBDD which is either an SMTBDD(2) or an SMTBDD(3) with fewer nodes. The average relative sizes for SBDDs, MTBDDs, and SMTBDDs are 1. 00, 152. 73, and 0. 80, respectively.

  • On Properties of Kleene TDDs

    Yukihiro IGUCHI  Tsutomu SASAO  Munehiro MATSUURA  

     
    PAPER-Logic Simulation and Logic Optimization

      Vol:
    E81-D No:7
      Page(s):
    716-723

    Three types of ternary decision diagrams (TDDs) are considered: AND -TDDs, EXOR-TDDs, and Kleene-TDDs. Kleene-TDDs are useful for logic simulation in the presence of unknown inputs. Let N(BDD:f), N(AND-TDD:f), and N(EXOR-TDD:f) be the number of non-terminal nodes in the BDD, the AND-TDD, and the EXOR-TDD for f, respectively. Let N(Kleene-TDD:) be the number of non-terminal nodes in the Kleene -TDD for , where is the regular ternary function corresponding to f. Then N(BDD:f) N(TDD:f). For parity functions, N(BDD:f)=N(AND-TDD:f)=N(EXOR-TDD:f)=N(Kleene-TDD:). For unate functions,N(BDD:f)=N(AND-TDD:f). The sizes of Kleene-TDDs are O(3n/n), and O(n3) for arbitrary functions, and symmetric functions, respectively. There exist a 2n-variable function, where Kleene-TDDs require O(n) nodes with the best order, while O(3n) nodes in the worst order.

  • Generalized Reed-Muller Expressions: Complexity and an Exact Minimization Algorithm

    Tsutomu SASAO  Debatosh DEBNATH  

     
    PAPER

      Vol:
    E79-A No:12
      Page(s):
    2123-2130

    A generalized Reed-Muller expression (GRM) is obtained by negating some of the literals in a positive polarity Reed-Muller expression (PPRM). There are at most 2(n2)^(n-1) different GRMs for an n-variable function. A minimum GRM is one with the fewest products. This paper presents certain properties and an exact minimization algorithm for GRMs. The minimization algorithm uses binary decision diagrams. Up to five variables, all the representative functions of NP-equivalence classes were generated and minimized. Tables compare the number of products necessary to represent four-and five-variable functions for four classes of expressions: PPRMs, FPRMs, GRMs and SOPs. GRMs require, on the average, fewer products than sum-of-products expressions (SOPs), and have easily testable realizations.

  • Time-Division Multiplexing Realizations of Multiple-Output Functions Based on Shared Multi-Terminal Multiple-Valued Decision Diagrams

    Hafiz Md. HASAN BABU  Tsutomu SASAO  

     
    PAPER-Logic Design

      Vol:
    E82-D No:5
      Page(s):
    925-932

    This paper considers methods to design multiple-output networks based on decision diagrams (DDs). TDM (time-division multiplexing) systems transmit several signals on a single line. These methods reduce: 1) hardware; 2) logic levels; and 3) pins. In the TDM realizations, we consider three types of DDs: shared binary decision digrams (SBDDs), shared multiple-valued decision diagrams (SMDDs), and shared multi-terminal multiple-valued decision diagrams (SMTMDDs). In the network, each non-terminal node of a DD is realized by a multiplexer (MUX). We propose heuristic algorithms to derive SMTMDDs from SBDDs. We compare the number of non-terminal nodes in SBDDs, SMDDs, and SMTMDDs. For nrm n, log n, and for many other benchmark functions, SMTMDD-based realizations are more economical than other ones, where nrm n is a (2n)-input (n1)-output function computing (X2+Y2)+0.5, log n is an n-input n-output function computing (2n1)log(x1)/nlog2, and a denotes the largest integer not greater than a.

  • A Parallel Branching Program Machine for Sequential Circuits: Implementation and Evaluation

    Hiroki NAKAHARA  Tsutomu SASAO  Munehiro MATSUURA  Yoshifumi KAWAMURA  

     
    PAPER-Logic Design

      Vol:
    E93-D No:8
      Page(s):
    2048-2058

    The parallel branching program machine (PBM128) consists of 128 branching program machines (BMs) and a programmable interconnection. To represent logic functions on BMs, we use quaternary decision diagrams. To evaluate functions, we use 3-address quaternary branch instructions. We realized many benchmark functions on the PBM128, and compared its memory size, computation time, and power consumption with the Intel's Core2Duo microprocessor. The PBM128 requires approximately a quarter of the memory for the Core2Duo, and is 21.4-96.1 times faster than the Core2Duo. It dissipates a quarter of the power of the Core2Duo. Also, we realized packet filters such as an access controller and a firewall, and compared their performance with software on the Core2Duo. For these packet filters, the PBM128 requires approximately 17% of the memory for the Core2Duo, and is 21.3-23.7 times faster than the Core2Duo.

  • A Design of AES Encryption Circuit with 128-bit Keys Using Look-Up Table Ring on FPGA

    Hui QIN  Tsutomu SASAO  Yukihiro IGUCHI  

     
    PAPER-Computer Components

      Vol:
    E89-D No:3
      Page(s):
    1139-1147

    This paper addresses a pipelined partial rolling (PPR) architecture for the AES encryption. The key technique is the PPR architecture. With the proposed architecture on the Altera Stratix FPGA, two PPR implementations achieve 6.45 Gbps throughput and 12.78 Gbps throughput, respectively. Compared with the unrolling implementation that achieves a throughput of 22.75 Gbps on the same FPGA, the two PPR implementations improve the memory efficiency (i.e., throughput divided by the size of memory for core) by 13.4% and 12.3%, respectively, and reduce the amount of the memory by 75% and 50%, respectively. Also, the PPR implementation has a up to 9.83% higher memory efficiency than the fastest previous FPGA implementation known to date. In terms of resource efficiency (i.e., throughput divided by the equivalent logic element or slice), one PPR implementation offers almost the same as the rolling implementation, and the other PPR implementation offers a medium value between the rolling implementation and the unrolling implementation that has the highest resource efficiency. However, the two PPR implementations can be implemented on the minimum-sized Stratix FPGA while the unrolling implementation cannot. The PPR architecture fills the gap between unrolling and rolling architectures and is suitable for small and medium-sized FPGAs.

  • Output Phase Optimization for AND-OR-EXOR PLAs with Decoders and Its Application to Design of Adders

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER-Digital Circuits and Computer Arithmetic

      Vol:
    E88-D No:7
      Page(s):
    1492-1500

    This paper presents a design method for three-level programmable logic arrays (PLAs), which have input decoders and two-input EXOR gates at the outputs. The PLA realizes an EXOR of two sum-of-products expressions (EX-SOP) for multiple-valued input two-valued output functions. We developed an output phase optimization method for EX-SOPs where some outputs of the function are minimized in the complemented form and presented techniques to minimize EX-SOPs for adders by using an extension of Dubrova-Miller-Muzio's AOXMIN algorithm. The proposed algorithm produces solutions with a half products of AOXMIN-like algorithm in 250 times shorter time for large adders with two-valued inputs. We also proved that an n-bit adder with two-valued inputs requires at most 32n-2+7n-5 products in an EX-SOP while it is known that a sum-of-products expression (SOP) requires 62n-4n-5 products.

  • A Design Method of a Regular Expression Matching Circuit Based on Decomposed Automaton

    Hiroki NAKAHARA  Tsutomu SASAO  Munehiro MATSUURA  

     
    PAPER-Design Methodology

      Vol:
    E95-D No:2
      Page(s):
    364-373

    This paper shows a design method for a regular expression matching circuit based on a decomposed automaton. To implement a regular expression matching circuit, first, we convert a regular expression into a non-deterministic finite automaton (NFA). Then, to reduce the number of states, we convert the NFA into a merged-states non-deterministic finite automaton with unbounded string transition (MNFAU) using a greedy algorithm. Next, to realize it by a feasible amount of hardware, we decompose the MNFAU into a deterministic finite automaton (DFA) and an NFA. The DFA part is implemented by an off-chip memory and a simple sequencer, while the NFA part is implemented by a cascade of logic cells. Also, in this paper, we show that the MNFAU based implementation has lower area complexity than the DFA and the NFA based ones. Experiments using regular expressions form SNORT shows that, as for the embedded memory size per a character, the MNFAU is 17.17-148.70 times smaller than DFA methods. Also, as for the number of LCs (Logic Cells) per a character, the MNFAU is 1.56-5.12 times smaller than NFA methods. This paper describes detail of the MEMOCODE2010 HW/SW co-design contest for which we won the first place award.

  • Exact Minimization of FPRMs for Incompletely Specified Functions by Using MTBDDs

    Debatosh DEBNATH  Tsutomu SASAO  

     
    PAPER-Logic Synthesis

      Vol:
    E88-A No:12
      Page(s):
    3332-3341

    Fixed polarity Reed-Muller expressions (FPRMs) exhibit several useful properties that make them suitable for many practical applications. This paper presents an exact minimization algorithm for FPRMs for incompletely specified functions. For an n-variable function with α unspecified minterms there are 2n+α distinct FPRMs, and a minimum FPRM is one with the fewest product terms. To find a minimum FPRM the algorithm requires to determine an assignment of the incompletely specified minterms. This is accomplished by using the concept of integer-valued functions in conjunction with an extended truth vector and a weight vector. The vectors help formulate the problem as an assignment of the variables of integer-valued functions, which are then efficiently manipulated by using multi-terminal binary decision diagrams for finding an assignment of the unspecified minterms. The effectiveness of the algorithm is demonstrated through experimental results for code converters, adders, and randomly generated functions.

  • Compact Numerical Function Generators Based on Quadratic Approximation: Architecture and Synthesis Method

    Shinobu NAGAYAMA  Tsutomu SASAO  Jon T. BUTLER  

     
    PAPER-Circuit Synthesis

      Vol:
    E89-A No:12
      Page(s):
    3510-3518

    This paper presents an architecture and a synthesis method for compact numerical function generators (NFGs) for trigonometric, logarithmic, square root, reciprocal, and combinations of these functions. Our NFG partitions a given domain of the function into non-uniform segments using an LUT cascade, and approximates the given function by a quadratic polynomial for each segment. Thus, we can implement fast and compact NFGs for a wide range of functions. Experimental results show that: 1) our NFGs require, on average, only 4% of the memory needed by NFGs based on the linear approximation with non-uniform segmentation; 2) our NFG for 2x-1 requires only 22% of the memory needed by the NFG based on a 5th-order approximation with uniform segmentation; and 3) our NFGs achieve about 70% of the throughput of the existing table-based NFGs using only a few percent of the memory. Thus, our NFGs can be implemented with more compact FPGAs than needed for the existing NFGs. Our automatic synthesis system generates such compact NFGs quickly.

  • A Balanced Decision Tree Based Heuristic for Linear Decomposition of Index Generation Functions

    Shinobu NAGAYAMA  Tsutomu SASAO  Jon T. BUTLER  

     
    PAPER-Logic Design

      Pubricized:
    2017/05/19
      Vol:
    E100-D No:8
      Page(s):
    1583-1591

    Index generation functions model content-addressable memory, and are useful in virus detectors and routers. Linear decompositions yield simpler circuits that realize index generation functions. This paper proposes a balanced decision tree based heuristic to efficiently design linear decompositions for index generation functions. The proposed heuristic finds a good linear decomposition of an index generation function by using appropriate cost functions and a constraint to construct a balanced tree. Since the proposed heuristic is fast and requires a small amount of memory, it is applicable even to large index generation functions that cannot be solved in a reasonable time by existing heuristics. This paper shows time and space complexities of the proposed heuristic, and experimental results using some large examples to show its efficiency.

1-20hit(42hit)