The search functionality is under construction.

Author Search Result

[Author] Masao YANAGISAWA(78hit)

1-20hit(78hit)

  • A Multiple Cyclic-Route Generation Method with Route Length Constraint Considering Point-of-Interests

    Tensei NISHIMURA  Kazuaki ISHIKAWA  Toshinori TAKAYAMA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-Intelligent Transport System

      Vol:
    E102-A No:4
      Page(s):
    641-653

    With the spread of map applications, route generation has become a familiar function. Most of route generation methods search a route from a starting point to a destination point with the shortest time or shortest length, but more enjoyable route generation is recently focused on. Particularly, cyclic-route generation for strolling requires to suggest to a user more than one route passing through several POIs (Point-of-Interests), to satisfy the user's preferences as much as possible. In this paper, we propose a multiple cyclic-route generation method with a route length constraint considering POIs. Firstly, our proposed method finds out a set of reference points based on the route length constraint. Secondly, we search a non-cyclic route from one reference point to the next one and finally generate a cyclic route by connecting these non-cyclic routes. Compared with previous methods, our proposed method generates a cyclic route closer to the route length constraint, reduces the number of the same points passing through by approximately 80%, and increases the number of POIs passed approximately 1.49 times.

  • A Locality-Aware Hybrid NoC Configuration Algorithm Utilizing the Communication Volume among IP Cores

    Seungju LEE  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E95-A No:9
      Page(s):
    1538-1549

    Network-on-chip (NoC) architectures have emerged as a promising solution to the lack of scalability in multi-processor systems-on-chips (MPSoCs). With the explosive growth in the usage of multimedia applications, it is expected that NoC serves as a multimedia server supporting multi-class services. In this paper, we propose a configuration algorithm for a hybrid bus-NoC architecture together with simulation results. Our target architecture is a hybrid bus-NoC architecture, called busmesh NoC, which is a generalized version of a hybrid NoC with local buses. In our BMNoC configuration algorithm, cores which have a heavy communication volume between them are mapped in a cluster node (CN) and connected by a local bus. CNs can have communication with each other via edge switches (ESes) and mesh routers (MRs). With this hierarchical communication network, our proposed algorithm can improve the latency as compared with conventional methods. Several realistic applications applied to our algorithm illustrate the better performance than earlier studies and feasibility of our proposed algorithm.

  • A High Performance Embedded Wavelet Video Coder

    Tingrong ZHAO  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER

      Vol:
    E83-A No:6
      Page(s):
    979-986

    This paper describes a highly performance scalable video coder. Wavelet transform is employed to decompose the video frame into different resolutions. Novel features of this coder are 1) a highly efficient multi-resolution motion estimation that requires minimum compuation and overhead motion information is embedded in this scheme; 2) the wavelet coefficients are organized in an extended zero tree (EZT) which is much more efficient than the simple zerotree. We show with experimental results that this video coder achieves good performances both in processing time and compression ratio when applied to typical test video sequences.

  • A Fast Elliptic Curve Cryptosystem LSI Embedding Word-Based Montgomery Multiplier

    Jumpei UCHIDA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-System LSIs and Microprocessors

      Vol:
    E89-C No:3
      Page(s):
    243-249

    Elliptic curve cryptosystems are expected to be a next standard of public-key cryptosystems. A security level of elliptic curve cryptosystems depends on a difficulty of a discrete logarithm problem on elliptic curves. The security level of a elliptic curve cryptosystem which has a public-key of 160-bit is equivalent to that of a RSA system which has a public-key of 1024-bit. We propose an elliptic curve cryptosystem LSI architecture embedding word-based Montgomery multipliers. A Montgomery multiplication is an efficient method for a finite field multiplication. We can design a scalable architecture for an elliptic curve cryptosystem by selecting structure of word-based Montgomery multipliers. Experimental results demonstrate effectiveness and efficiency of the proposed architecture. In the hardware evaluation using 0.18 µm CMOS library, the high-speed design using 126 Kgates with 208-bit multipliers achieved operation times of 3.6 ms for a 160-bit point multiplication.

  • A High-Level Synthesis Algorithm with Inter-Island Distance Based Operation Chainings for RDR Architectures

    Kotaro TERADA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1366-1375

    In deep-submicron era, interconnection delays are not negligible even in high-level synthesis and regular-distributed-register architectures (RDR architectures) have been proposed to cope with this problem. In this paper, we propose a high-level synthesis algorithm using operation chainings which reduces the overall latency targeting RDR architectures. Our algorithm consists of three steps: The first step enumerates candidate operations for chaining. The second step introduces maximal chaining distance (MCD), which gives the maximal allowable inter-island distance on RDR architecture between chaining candidate operations. The last step performs list-scheduling and binding simultaneously based on the results of the two preceding steps. Our algorithm enumerates feasible chaining candidates and selects the best ones for RDR architecture. Experimental results show that our proposed algorithm reduces the latency by up to 40.0% compared to the original approach, and by up to 25.0% compared to a conventional approach. Our algorithm also reduces the number of registers and the number of multiplexers compared to the conventional approaches in some cases.

  • A Floorplan-Driven High-Level Synthesis Algorithm for Multiplexer Reduction Targeting FPGA Designs

    Koichi FUJIWARA  Kazushi KAWAMURA  Shin-ya ABE  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1392-1405

    Recently, high-level synthesis (HLS) techniques for FPGA designs are required in various applications such as computerized stock tradings and reconfigurable network processings. In HLS for FPGA designs, we need to consider module floorplan and reduce multiplexer's cost concurrently. In this paper, we propose a floorplan-driven HLS algorithm for multiplexer reduction targeting FPGA designs. By utilizing distributed-register architectures called HDR, we can easily consider module floorplan in HLS. In order to reduce multiplexer's cost, we propose two novel binding methods called datapath-oriented scheduling/FU binding and datapath-oriented register binding. Experimental results demonstrate that our algorithm can realize FPGA designs which reduce the number of slices by up to 47% and latency by up to 22% compared with conventional approaches while the number of required control steps is almost the same.

  • ECC-Based Bit-Write Reduction Code Generation for Non-Volatile Memory

    Masashi TAWADA  Shinji KIMURA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E98-A No:12
      Page(s):
    2494-2504

    Non-volatile memory has many advantages such as high density and low leakage power but it consumes larger writing energy than SRAM. It is quite necessary to reduce writing energy in non-volatile memory design. In this paper, we propose write-reduction codes based on error correcting codes and reduce writing energy in non-volatile memory by decreasing the number of writing bits. When a data is written into a memory cell, we do not write it directly but encode it into a codeword. In our write-reduction codes, every data corresponds to an information vector in an error-correcting code and an information vector corresponds not to a single codeword but a set of write-reduction codewords. Given a writing data and current memory bits, we can deterministically select a particular write-reduction codeword corresponding to the data to be written, where the maximum number of flipped bits are theoretically minimized. Then the number of writing bits into memory cells will also be minimized. Experimental results demonstrate that we have achieved writing-bits reduction by an average of 51% and energy reduction by an average of 33% compared to non-encoded memory.

  • Scan-Based Side-Channel Attack on the Camellia Block Cipher Using Scan Signatures

    Huiqian JIANG  Mika FUJISHIRO  Hirokazu KODERA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-Logic Synthesis, Test and Verification

      Vol:
    E98-A No:12
      Page(s):
    2547-2555

    Camellia is a block cipher jointly developed by Mitsubishi and NTT of Japan. It is designed suitable for both software and hardware implementations. One of the design-for-test techniques using scan chains is called scan-path test, in which testers can observe and control the registers inside the LSI chip directly in order to check if the LSI chip correctly operates or not. Recently, a scan-based side-channel attack is reported which retrieves the secret information from the cryptosystem using scan chains. In this paper, we propose a scan-based attack method on the Camellia cipher using scan signatures. Our proposed method is based on the equivalent transformation of the Camellia algorithm and the possible key candidate reduction in order to retrieve the secret key. Experimental results show that our proposed method sucessfully retrieved its 128-bit secret key using 960 plaintexts even if the scan chain includes the Camellia cipher and other circuits and also sucessfully retrieves its secret key on the SASEBO-GII board, which is a side-channel attack standard evaluation board.

  • An FPGA Layout Reconfiguration Algorithm Based on Global Routes for Engineering Changes in System Design Specifications

    Nozomu TOGAWA  Kayoko HAGI  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER

      Vol:
    E81-A No:5
      Page(s):
    873-884

    Rapid system prototyping is one of the main applications for field-programmable gate arrays (FPGAs). At the stage of rapid system prototyping, design specifications can often be changed since they cannot be determined completely. In this paper, layout design change is focused on and a layout reconfiguration algorithm is proposed for FPGAs. The target FPGA architecture is developed for transport processing. In order to implement more various circuits flexibly, it has three-input lookup tables (LUTs) as minimum logic cells. Since its logic granularity is finer than that of conventional FPGAs, it requires more routing resources to connect them and minimization of routing congestion is indispensable. In layout reconfiguration, the main problem is to add LUTs to initial layouts. Our algorithm consists of two steps: For given placement and global routing of LUTs, in Step 1 an added LUT is placed with allowing that the position of the added LUT may overlap that of a preplaced LUT; Then in Step 2 preplaced LUTs are moved to their adjacent positions so that the overlap of the LUT positions can be resolved. Global routes are updated corresponding to reconfiguration of placement. The algorithm keeps routing congestion small by evaluating global routes directly both in Steps 1 and 2. Especially in Step 2, if the minimum number of preplaced LUTs are moved to their adjacent positions, our algorithm minimizes routing congestion. Experimental results demonstrate that, if the number of added LUTs is at most 20% of the number of initial LUTs, our algorithm generates the reconfigured layouts whose routing congestion is as small as that obtained by executing a conventional placement and global routing algorithm. Run time of our algorithm is within approximately one second.

  • A Fast Scheduling Algorithm Based on Gradual Time-Frame Reduction for Datapath Synthesis

    Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E81-A No:6
      Page(s):
    1231-1241

    This paper proposes a fast scheduling algorithm based on gradual time-frame reduction for datapath synthesis of digital signal processing hardwares. The objective of the algorithm is to minimize the costs for functional units and registers and to maximize connectivity under given computation time and initiation interval. Incorporating the connectivity in a scheduling stage can reduce multiplexer counts in resource binding. The algorithm maximizes connectivity with maintaining low time complexity and obtains datapath designs with totally small hardware costs in the high-level synthesis environment. The algorithm also resolves inter-iteration data dependencies and thus realizes pipelined datapaths. The experimental results demonstrate that the proposed algorithm reduces the multiplexer counts after resource binding with maintaining low costs for functional units and registers compared with eight conventional schedulers.

  • A Hardware-Trojans Identifying Method Based on Trojan Net Scoring at Gate-Level Netlists

    Masaru OYA  Youhua SHI  Noritaka YAMASHITA  Toshihiko OKAMURA  Yukiyasu TSUNOO  Satoshi GOTO  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-Logic Synthesis, Test and Verification

      Vol:
    E98-A No:12
      Page(s):
    2537-2546

    Outsourcing IC design and fabrication is one of the effective solutions to reduce design cost but it may cause severe security risks. Particularly, malicious outside vendors may implement Hardware Trojans (HTs) on ICs. When we focus on IC design phase, we cannot assume an HT-free netlist or a Golden netlist and it is too difficult to identify whether a given netlist is HT-free or not. In this paper, we propose a score-based hardware-trojans identifying method at gate-level netlists without using a Golden netlist. Our proposed method does not directly detect HTs themselves in a gate-level netlist but it detects a net included in HTs, which is called Trojan net, instead. Firstly, we observe Trojan nets from several HT-inserted benchmarks and extract several their features. Secondly, we give scores to extracted Trojan net features and sum up them for each net in benchmarks. Then we can find out a score threshold to classify HT-free and HT-inserted netlists. Based on these scores, we can successfully classify HT-free and HT-inserted netlists in all the Trust-HUB gate-level benchmarks and ISCAS85 benchmarks as well as HT-free and HT-inserted AES gate-level netlists. Experimental results demonstrate that our method successfully identify all the HT-inserted gate-level benchmarks to be “HT-inserted” and all the HT-free gate-level benchmarks to be “HT-free” in approximately three hours for each benchmark.

  • Hardware-Trojans Rank: Quantitative Evaluation of Security Threats at Gate-Level Netlists by Pattern Matching

    Masaru OYA  Noritaka YAMASHITA  Toshihiko OKAMURA  Yukiyasu TSUNOO  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E99-A No:12
      Page(s):
    2335-2347

    Since digital ICs are often designed and fabricated by third parties at any phases today, we must eliminate risks that malicious attackers may implement Hardware Trojans (HTs) on them. In particular, they can easily insert HTs during design phase. This paper proposes an HT rank which is a new quantitative analysis criterion against HTs at gate-level netlists. We have carefully analyzed all the gate-level netlists in Trust-HUB benchmark suite and found out several Trojan net features in them. Then we design the three types of Trojan points: feature point, count point, and location point. By assigning these points to every net and summing up them, we have the maximum Trojan point in a gate-level netlist. This point gives our HT rank. The HT rank can be calculated just by net features and we do not perform any logic simulation nor random test. When all the gate-level netlists in Trust-HUB, ISCAS85, ISCAS89 and ITC99 benchmark suites as well as several OpenCores designs, HT-free and HT-inserted AES netlists are ranked by our HT rank, we can completely distinguish HT-inserted ones (which HT rank is ten or more) from HT-free ones (which HT rank is nine or less). The HT rank is the world-first quantitative criterion which distinguishes HT-inserted netlists from HT-free ones in all the gate-level netlists in Trust-HUB, ISCAS85, ISCAS89, and ITC99.

  • Trojan-Net Feature Extraction and Its Application to Hardware-Trojan Detection for Gate-Level Netlists Using Random Forest

    Kento HASEGAWA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2857-2868

    It has been reported that malicious third-party IC vendors often insert hardware Trojans into their IC products. How to detect them is a critical concern in IC design process. Machine-learning-based hardware-Trojan detection gives a strong solution to tackle this problem. Hardware-Trojan infected nets (or Trojan nets) in ICs must have particular Trojan-net features, which differ from those of normal nets. In order to classify all the nets in a netlist designed by third-party vendors into Trojan nets and normal ones by machine learning, we have to extract effective Trojan-net features from Trojan nets. In this paper, we first propose 51 Trojan-net features which describe well Trojan nets. After that, we pick up random forest as one of the best candidates for machine learning and optimize it to apply to hardware-Trojan detection. Based on the importance values obtained from the optimized random forest classifier, we extract the best set of 11 Trojan-net features out of the 51 features which can effectively classify the nets into Trojan ones and normal ones, maximizing the F-measures. By using the 11 Trojan-net features extracted, our optimized random forest classifier has achieved at most 100% true positive rate as well as 100% true negative rate in several Trust-HUB benchmarks and obtained the average F-measure of 79.3% and the accuracy of 99.2%, which realize the best values among existing machine-learning-based hardware-Trojan detection methods.

  • A Robust Indoor/Outdoor Detection Method Based on Spatial and Temporal Features of Sparse GPS Measured Positions

    Sae IWATA  Kazuaki ISHIKAWA  Toshinori TAKAYAMA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    LETTER-Intelligent Transport System

      Vol:
    E102-A No:6
      Page(s):
    860-865

    Cell phones with GPS function as well as GPS loggers are widely used and we can easily obtain users' geographic information. Now classifying the measured GPS positions into indoor/outdoor positions is one of the major challenges. In this letter, we propose a robust indoor/outdoor detection method based on sparse GPS measured positions utilizing machine learning. Given a set of clusters of measured positions whose center position shows the user's estimated stayed position, we calculate the feature values composed of: positioning accuracy, spatial features, and temporal feature of measured positions included in every cluster. Then a random forest classifier learns these feature values of the known data set. Finally, we classify the unknown clusters of measured positions into indoor/outdoor clusters using the learned random forest classifier. The experiments demonstrate that our proposed method realizes the maximum F1 measure of 1.000, which classifies measured positions into indoor/outdoor ones with almost no errors.

  • An Energy-Efficient Floorplan Driven High-Level Synthesis Algorithm for Multiple Clock Domains Design

    Shin-ya ABE  Youhua SHI  Kimiyoshi USAMI  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1376-1391

    In this paper, we first propose an HDR-mcd architecture, which integrates periodically all-in-phase based multiple clock domains and multi-cycle interconnect communication into high-level synthesis. In HDR-mcd, an entire chip is divided into several huddles. Huddles can realize synchronization between different clock domains in which interconnection delay should be considered during high-level synthesis. Next, we propose a high-level synthesis algorithm for HDR-mcd, which can reduce energy consumption by optimizing configuration and placement of huddles. Experimental results show that the proposed method achieves 32.5% energy-saving compared with the existing single clock domain based methods.

  • High-Level Power Optimization Based on Thread Partitioning

    Jumpei UCHIDA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-System Level Design

      Vol:
    E87-A No:12
      Page(s):
    3075-3082

    This paper proposes a thread partitioning algorithm in low power high-level synthesis. The algorithm is applied to high-level synthesis systems. In the systems, we can describe parallel behaving circuit blocks (threads) explicitly. First it focuses on a local register file RF in a thread. It partitions a thread into two sub-threads, one of which has RF and the other does not have RF. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. Then we can synthesize a low power circuit with a low area overhead, compared to the original circuit. Experimental results demonstrate effectiveness and efficiency of the algorithm.

  • Sub-operation Parallelism Optimization in SIMD Processor Core Synthesis

    Hideki KAWAZU  Jumpei UCHIDA  Yuichiro MIYAOKA  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER

      Vol:
    E88-A No:4
      Page(s):
    876-884

    A b-bit SIMD functional unit has n k-bit sub-functional units in itself, where b = k n. It can execute n-parallel k-bit operations. However, all the b-bit functional units in a processor core do not necessarily execute n-parallel operations. Depending on an application program, some of them just execute n/2-parallel operations or even n/4-parallel operations. This means that we can modify a b-bit SIMD functional unit so that it has n/2 k-bit sub-functional units or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. Our proposed algorithm gradually reduces sub-operation parallelism of a SIMD functional unit while the timing constraint of execution time satisfied. Thereby, we can finally find a processor core with small area under the given timing constraint. We expect that we can obtain processor core configurations of smaller area in the same timing constraint rather than a conventional system. The promising experimental results are also shown.

  • A Bitwidth-Aware High-Level Synthesis Algorithm Using Operation Chainings for Tiled-DR Architectures

    Kotaro TERADA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2911-2924

    As application hardware designs and implementations in a short term are required, high-level synthesis is more and more essential EDA technique nowadays. In deep-submicron era, interconnection delays are not negligible even in high-level synthesis thus distributed-register and -controller architectures (DR architectures) have been proposed in order to cope with this problem. It is also profitable to take data-bitwidth into account in high-level synthesis. In this paper, we propose a bitwidth-aware high-level synthesis algorithm using operation chainings targeting Tiled-DR architectures. Our proposed algorithm optimizes bitwidths of functional units and utilizes the vacant tiles by adding some extra functional units to realize effective operation chainings to generate high performance circuits without increasing the total area. Experimental results show that our proposed algorithm reduces the overall latency by up to 47% compared to the conventional approach without area overheads by eliminating unnecessary bitwidths and adding efficient extra FUs for Tiled-DR architectures.

  • A Stayed Location Estimation Method for Sparse GPS Positioning Information Based on Positioning Accuracy and Short-Time Cluster Removal

    Sae IWATA  Tomoyuki NITTA  Toshinori TAKAYAMA  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-Intelligent Transport System

      Vol:
    E101-A No:5
      Page(s):
    831-843

    Cell phones with GPS function as well as GPS loggers are widely used and users' geographic information can be easily obtained. However, still battery consumption in these mobile devices is main concern and then obtaining GPS positioning data so frequently is not allowed. In this paper, a stayed location estimation method for sparse GPS positioning information is proposed. After generating initial clusters from a sequence of measured positions, the effective radius is set for every cluster based on positioning accuracy and the clusters are merged effectively using it. After that, short-time clusters are removed temporarily but measured positions included in them are not removed. Then the clusters are merged again, taking all the measured positions into consideration. This process is performed twice, in other words, two-stage short-time cluster removal is performed, and finally accurate stayed location estimation is realized even when the GPS positioning interval is five minutes or more. Experiments demonstrate that the total distance error between the estimated stayed location and the true stayed location is reduced by more than 33% and also the proposed method much improves F1 measure compared to conventional state-of-the-art methods.

  • An Algorithm and a Flexible Architecture for Fast Block-Matching Motion Estimation

    Jinku CHOI  Nozomu TOGAWA  Masao YANAGISAWA  Tatsuo OHTSUKI  

     
    PAPER-VLSI Design

      Vol:
    E85-A No:12
      Page(s):
    2603-2611

    The motion estimation can choose the most suitable algorithm for different kinds of motion types, formats, and characteristics. The video encoding system can be optimized for quality, speed, and power consumption. In this paper, we propose a reconfigurable approach to a motion estimation algorithm and hardware architecture. The proposed algorithm determines motion type and then selects adapted block-matching algorithm for different kinds of motion sequences. The quality of our algorithm is better than that of the TSS and the BBGDS algorithm, or comparable to the performance of the better of the two, and the computational complexity of our algorithm is significantly less than that of the TSS. We also propose hardware architecture for realizing two kinds of motion estimations in the same hardware. We implemented the flexible and reconfigurable hardware architecture by using address generator unit, delay unit, and parameters and by using the hardware description language (VHDL) and the SYNOPSYS synthesis design tools. We analyze the performance of the algorithm and present adapted algorithm for a low cost real time application.

1-20hit(78hit)