The search functionality is under construction.

Author Search Result

[Author] Takeshi YOSHIMURA(28hit)

1-20hit(28hit)

  • FOREWORD

    Takeshi YOSHIMURA  

     
    FOREWORD

      Vol:
    E79-A No:12
      Page(s):
    2085-2085
  • An Engineering Change Orders Design Method Based on Patchwork-Like Partitioning for High Performance LSIs

    Yuichi NAKAMURA  Ko YOSHIKAWA  Takeshi YOSHIMURA  

     
    PAPER-Logic Synthesis

      Vol:
    E88-A No:12
      Page(s):
    3351-3357

    This paper describes a novel engineering change order (ECO) design method for large-scale, high performance LSIs, based on a patchwork-like partitioning technique. In conventional design methods, even when only small changes are made to the design after the placement and routing process, a whole re-layout must be done, and this is very time consuming. Using the proposed method, we can partition the design into several parts after logic synthesis. When design changes occur in HDL, only the parts related to the changes need to be redesigned. The netlist for the changed design remains almost the same as the original, except for the small changed parts. For partitioning, we used multiple-fan-out-points as partition borders. An experimental evaluation of our method showed that when a small change was made in the RTL description, the revised circuit part had only about 87 gates on average. This greatly reduces the re-layout time required for implementing an ECO. In actual commercial designs in which several design changes are required, it takes only one day to redesign.

  • Multiple-Reference Compression of RTP/UDP/IP Headers for Mobile Multimedia Communications

    Takeshi YOSHIMURA  Toshiro KAWAHARA  Tomoyuki OHYA  Minoru ETOH  

     
    PAPER

      Vol:
    E85-A No:7
      Page(s):
    1491-1500

    In this paper, we propose an RTP/UDP/IP header compression method, Multiple-Reference Compression (MRC), which is designed for mobile multimedia communications. MRC is a compression method that calculates differences from multiple reference headers that have already been sent and inserts them into a compressed header. The receiver can decompress the compressed header as long as at least one of the reference headers is correctly received and decompressed. MRC improves robustness against packet losses compared with CRTP defined in IETF RFC2508, and imposes less overhead and computational burden than robust header compression (ROHC) defined in RFC3095. We also implemented MRC and other header compression algorithms on our mobile testbed, and conducted multimedia streaming experiments over the testbed. The results of the experiments show that MRC offers the same level of packet loss rate as Legacy RTP for both audio and video streams, and provides better media quality than Legacy RTP and CRTP on error-prone radio links. Header compression robust against packet losses is expected to be a key technology for VoIP and multimedia streaming services over 3G and future mobile networks.
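    The multiple-reference idea in this abstract can be illustrated with a toy sketch. It models a single integer header field (for example, an RTP sequence number) and is not the packet format of the paper; the function names and the dictionary layout are assumptions made for illustration only.

```python
from typing import Dict, Optional


def compress(value: int, refs: Dict[int, int]) -> Dict[int, int]:
    """Encode `value` as one delta per already-sent reference header.

    `refs` maps a (hypothetical) reference id to the field value it carried.
    """
    return {ref_id: value - ref_val for ref_id, ref_val in refs.items()}


def decompress(deltas: Dict[int, int], received: Dict[int, int]) -> Optional[int]:
    """Recover the value from any reference the receiver still holds."""
    for ref_id, delta in deltas.items():
        if ref_id in received:
            return received[ref_id] + delta
    return None  # every reference was lost, so decompression fails


# The sender has already sent headers 1 and 2; the receiver lost header 1.
deltas = compress(103, {1: 100, 2: 101})
recovered = decompress(deltas, {2: 101})
```

    Because each compressed header carries a delta against every reference, losing one reference does not prevent decompression, which is the robustness property the abstract describes.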

  • Mobility Overlap-Removal-Based Leakage Power and Register-Aware Scheduling in High-Level Synthesis

    Nan WANG  Song CHEN  Wei ZHONG  Nan LIU  Takeshi YOSHIMURA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E97-A No:8
      Page(s):
    1709-1719

    Scheduling is a key problem in high-level synthesis, as the scheduling results affect most of the important design metrics. In this paper, we propose a novel scheduling method to simultaneously optimize the leakage power of functional units with dual-Vth techniques and the number of registers under given timing and resource constraints. The mobility overlaps between operations are removed to eliminate data dependencies, and a simulated-annealing-based method is introduced to explore the mobility overlap removal solution space. Given the overlap-free mobilities, the resource usage and register usage in each control step can be accurately estimated. Meanwhile, operations are scheduled so as to optimize the leakage power of functional units with a minimal number of registers. Then, a set of operations is iteratively selected, reassigned as low-Vth, and rescheduled until the resource constraints are all satisfied. Experimental results show the efficiency of the proposed algorithm.

  • Unified Parameter Decoder Architecture for H.265/HEVC Motion Vector and Boundary Strength Decoding

    Shihao WANG  Dajiang ZHOU  Jianbin ZHOU  Takeshi YOSHIMURA  Satoshi GOTO  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1356-1365

    In this paper, the VLSI architecture design of a unified motion vector (MV) and boundary strength (BS) parameter decoder (PDec) for an 8K UHDTV HEVC decoder is presented. The adoption of new coding tools in PDec, such as Advanced Motion Vector Prediction (AMVP), increases the VLSI hardware realization overhead and memory bandwidth requirement, especially for 8K UHDTV applications. We propose four techniques for these challenges. Firstly, this work unifies MV and BS parameter decoders for line buffer memory sharing. Secondly, to support high throughput, we propose a top-level CU-adaptive pipeline scheme that trades off implementation complexity against performance. Thirdly, a PDec process engine with optimizations is adopted for 43.2k area reduction. Finally, a PU-based coding scheme is proposed for 30% DRAM bandwidth reduction. In a 90nm process, our design costs 93.3k logic gates with a 23.0kB line buffer. The proposed architecture can support real-time decoding for 7680x4320@60fps applications at 249MHz in the worst case.

  • High Performance VLSI Architecture of H.265/HEVC Intra Prediction for 8K UHDTV Video Decoder

    Jianbin ZHOU  Dajiang ZHOU  Shihao WANG  Takeshi YOSHIMURA  Satoshi GOTO  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E98-A No:12
      Page(s):
    2519-2527

    8K Ultra High Definition Television (UHDTV) requires extremely high throughput for video decoding based on H.265. In H.265, intra coding could significantly enhance video compression efficiency, at the expense of an increased computational complexity compared with H.264. For intra prediction of 8K UHDTV real-time H.265 decoding, the joint complexity and throughput issue is more difficult to solve. Therefore, based on the divide-and-conquer strategy, we propose a new VLSI architecture in this paper, including two techniques, in order to achieve 8K UHDTV H.265 intra prediction decoding. The first technique is the LUT based Reference Sample Fetching Scheme (LUT-RSFS), reducing the number of reference samples in the worst case from 99 to 13. It further reduces the circuit area and enhances the performance. The second one is the Hybrid Block Reordering and Data Forwarding (HBRDF), minimizing the idle time and eliminating the dependency between TUs by creating 3 Data Forwarding paths. It achieves the hardware utilization of 94%. Our design is synthesized using Synopsys Design Compiler in 40nm process technology. It achieves an operation frequency of 260MHz, with a gate count of 217.8K for 8-bit design, and 251.1K for 10-bit design. The proposed VLSI architecture can support 4320p@120fps H.265 intra decoding (8-bit or 10-bit), with all 35 intra prediction modes and prediction unit sizes ranging from 4×4 to 64×64.

  • An Efficient Multi-Level Algorithm for 3D-IC TSV Assignment

    Cong HAO  Takeshi YOSHIMURA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E100-A No:3
      Page(s):
    776-784

    The through-silicon via (TSV) assignment problem is one of the key design challenges of 3-D ICs and is crucial to the wire length and signal delay. In this work we formulate the 3-D IC TSV assignment as an Integer Minimum Cost Multi Commodity (IMCMC) problem on an IMCMC network, and propose a multi-level algorithm. It coarsens the IMCMC network level by level, applies a rough flow assignment on each level of the coarsened graph, and generates only promising edges to reduce the IMCMC network size. Benefiting from the multi-level structure, we propose a mixed single and multi commodity flow method to improve the TSV assignment solution quality. Moreover, given a TSV assignment, we propose an extended layer by layer algorithm to further optimize the TSV assignment. The experimental results demonstrate that our multi-level algorithm with mixed single and multi commodity flow achieves not only smaller wire length but also shorter runtime compared to existing works.

  • Framework and VLSI Architecture of Measurement-Domain Intra Prediction for Compressively Sensed Visual Contents

    Jianbin ZHOU  Dajiang ZHOU  Li GUO  Takeshi YOSHIMURA  Satoshi GOTO  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2869-2877

    This paper presents a measurement-domain intra prediction coding framework that is compatible with compressive sensing (CS)-based image sensors. In this framework, we propose a low-complexity intra prediction algorithm that can be directly applied to measurements captured by the image sensor. We propose a structural random 0/1 measurement matrix, embedding the block boundary information that can be extracted from the measurements for intra prediction. Furthermore, a low-cost Very Large Scale Integration (VLSI) architecture is implemented for the proposed framework, by substituting the matrix multiplication with shared adders and shifters. The experimental results show that our proposed framework can compress the measurements and increase coding efficiency, with 34.9% BD-rate reduction compared to the direct output of CS-based sensors. The VLSI architecture of the proposed framework is 9.1 K in area, and achieves an 83% reduction in memory bandwidth and storage for the line buffer. This could significantly reduce both the energy consumption and the communication bandwidth of wireless camera systems, which are expected to be massively deployed in the Internet of Things (IoT) era.

  • Approximate-DCT-Derived Measurement Matrices with Row-Operation-Based Measurement Compression and its VLSI Architecture for Compressed Sensing

    Jianbin ZHOU  Dajiang ZHOU  Takeshi YOSHIMURA  Satoshi GOTO  

     
    PAPER

      Vol:
    E101-C No:4
      Page(s):
    263-272

    The Compressed Sensing based CMOS image sensor (CS-CIS) is a new generation of CMOS image sensor that significantly reduces power consumption. For CS-CIS, the image quality and the data volume of the output are two important concerns. In this paper, we first propose an algorithm to generate a series of deterministic and ternary matrices, which improve the image quality, reduce the data volume, and are compatible with CS-CIS. The proposed matrices are derived from the approximate DCT and trimmed in 2D-zigzag order, thus preserving the energy compaction property as the DCT does. Moreover, we propose matrix row operations adapted to the proposed matrix to further compress data (measurements) without any image quality loss. Finally, a low-cost VLSI architecture for measurement compression with the proposed matrix row operations is implemented. Experimental results show that our proposed matrix significantly improves the coding efficiency, with a BD-PSNR increase of 4.2 dB compared with the random binary matrix used in the state-of-the-art CS-CIS. The proposed matrix row operations for measurement compression further increase the coding efficiency by 0.24 dB BD-PSNR (4.8% BD-rate reduction). The VLSI architecture is only 4.3 K gates in area and 0.3 mW in power consumption.

  • Floorplanning for High Utilization of Heterogeneous FPGAs

    Nan LIU  Song CHEN  Takeshi YOSHIMURA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E95-A No:9
      Page(s):
    1529-1537

    Heterogeneous resources such as configurable logic blocks (CLBs), multiplier blocks (MULs) and RAM blocks (RAMs), in which millions of logic gates are included, have been added to field programmable gate arrays (FPGAs). The fixed-outline floorplanning used by the existing methods always has a big penalty item in the objective function to ensure all the modules are placed in the specified chip region, which may greatly degrade the wirelength. This paper presents a three-phase floorplanning method for heterogeneous FPGAs. First, a non-slicing free-outline floorplanning method is used to optimize the wirelength; however, in this phase, the resource requirements of functional modules might not be satisfied. Second, a min-cost-max-flow algorithm is used to tune the assignment of CLBs to functional modules, and assign contiguous regions to each module so that all the functional modules satisfy CLB requirements. Finally, the MULs and RAMs are allocated to modules by a network flow model. CLBs are the most numerous of all the resources. Therefore, achieving a high utilization of them means an enhancement of the FPGA densities. The proposed method can improve the utilization of CLBs; hence, much larger circuits could be mapped to the same FPGA chip. The results show that about 7–85% wirelength reduction is obtained, and CLB utilization is improved by about 25%.

  • Max-Flow Scheduling in High-Level Synthesis

    Liangwei GE  Song CHEN  Kazutoshi WAKABAYASHI  Takashi TAKENAKA  Takeshi YOSHIMURA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E90-A No:9
      Page(s):
    1940-1948

    Scheduling, an essential step in high-level synthesis, is an intractable process. Traditional heuristic scheduling methods usually search schedules directly in the entire solution space. In this paper, we propose the idea of searching within an intermediate solution space (ISS). We put forward a max-flow scheduling method that heuristically prunes the solution space into a specific ISS and finds the optimum of ISS in polynomial time. The proposed scheduling algorithm has some unique features, such as the correction of previous scheduling decisions in a later stage, the simultaneous scheduling of all the operations, and the optimization of more complicated objectives. Aided by the max-flow scheduling method, we implement the optimization of the IC power-ground integrity problem at the behavior level conveniently. Experiments on well-known benchmarks show that without requiring additional resources or prolonging schedule latency, the proposed scheduling method can find a schedule that draws current more stably from a supply, which mitigates the voltage fluctuation in the on-chip power distribution network.

  • Score Sequence Pair Problems of (r11, r12, r22)-Tournaments--Determination of Realizability--

    Masaya TAKAHASHI  Takahiro WATANABE  Takeshi YOSHIMURA  

     
    PAPER-Graph Algorithms

      Vol:
    E90-D No:2
      Page(s):
    440-448

    Let G be any graph with property P (for example, general graph, directed graph, etc.) and S be nonnegative and non-decreasing integer sequence(s). The prescribed degree sequence problem is a problem to determine whether there is a graph G having S as the prescribed sequence(s) of degrees or outdegrees of the vertices. Since the 1950s, P has attracted wide attention, and many of its extensions have been considered. Let P be the property satisfying the following (1) and (2): (1) G is a directed graph with two disjoint vertex sets A and B. (2) There are r11 (r22, respectively) directed edges between every pair of vertices in A (B, respectively), and r12 directed edges between every pair of a vertex in A and a vertex in B. Then G is called an (r11, r12, r22)-tournament ("tournament", for short). The problem is called the score sequence pair problem of a "tournament" (realizable, for short). S is called a score sequence pair of a "tournament" if the answer to the problem is "yes." In this paper, we propose characterizations of a score sequence pair of a "tournament" and an algorithm for determining in linear time whether a pair of two integer sequences is realizable or not.

  • Acoustic OFDM System and Performance Analysis

    Hosei MATSUOKA  Yusuke NAKASHIMA  Takeshi YOSHIMURA  

     
    PAPER

      Vol:
    E91-A No:7
      Page(s):
    1652-1658

    This paper presents a technology for short-range communications using sound waves, in which the modulated data signal can be transmitted in parallel with regular audio without significantly degrading the quality of the sound. The technology, which we call Acoustic OFDM, replaces the high frequency band of the audio signal with OFDM carriers, each of which is power-controlled according to the spectrum envelope of the original audio signal. It can provide data transmission of several hundred bps. The implemented Acoustic OFDM system enables the transmission of short text messages from loudspeakers to mobile devices at a distance of around 3 m.

  • FOREWORD

    Winfried HAHN  Takeshi YOSHIMURA  

     
    FOREWORD

      Vol:
    E76-A No:10
      Page(s):
    1615-1616
  • Redundant via Insertion: Removing Design Rule Conflicts and Balancing via Density

    Song CHEN  Jianwei SHEN  Wei GUO  Mei-Fang CHIANG  Takeshi YOSHIMURA  

     
    PAPER-Physical Level Design

      Vol:
    E93-A No:12
      Page(s):
    2372-2379

    The occurrence of via defects increases due to the shrinking feature size in integrated circuit manufacturing. Redundant via insertion is an effective and recommended method to reduce the yield loss caused by via failures. In this paper, we introduce the redundant via allocation problem for layer partition-based redundant via insertion methods [1] and solve it using a genetic algorithm. At the same time, we use a convex-cost flow model to balance the via density, which helps satisfy the via density rules. The results of the layer partition-based model depend on the partition and processing order of metal layers. Furthermore, even if we try all partitions and processing orders, we might miss the optimal solutions. By introducing the redundant via allocation problem on partitioning boundaries, we can avoid the sub-optimality of the original layer partition-based method. The experimental results show that the proposed method inserted 12 more redundant vias on average and that the via density balance can be greatly improved.

  • A Minimum Bandwidth Guaranteed Service Model and Its Implementation on Wireless Packet Scheduler

    Mooryong JEONG  Takeshi YOSHIMURA  Hiroyuki MORIKAWA  Tomonori AOYAMA  

     
    PAPER

      Vol:
    E85-A No:7
      Page(s):
    1463-1471

    In this paper, we introduce a concept of minimum bandwidth guaranteed service model for mobile multimedia. In this service model, service is defined in the context of the guaranteed minimum bandwidth and the residual service share. Each flow under this service model is guaranteed with its minimum bandwidth and provided with more in proportion to the residual service share if there is leftover bandwidth. The guaranteed minimum bandwidth assures a flow to keep minimum tolerable quality regardless of the network load, while the leftover bandwidth enhances the quality of service according to the application's adaptivity and the user's interest. We show that the minimum bandwidth guaranteed service model could be implemented by a two-folded wireless packet scheduler consisting of a guaranteed scheduler and a sharing scheduler. Wireless channel condition of each flow is considered in scheduling so that wireless resource can be distributed only to the flows of good channel state, improving total wireless link utilization. We evaluate the service model and the scheduling method by simulation and implementation.
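    The bandwidth rule stated in this abstract (each flow gets its guaranteed minimum, plus a part of any leftover bandwidth in proportion to its residual service share) can be sketched as a small calculation. This is an illustrative assumption-laden sketch, not the two-fold scheduler of the paper: it ignores wireless channel state, and the function name `allocate` and the tuple layout are hypothetical.

```python
from typing import Dict, Tuple


def allocate(capacity: float, flows: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Split `capacity` among flows given as name -> (min_bandwidth, residual_share)."""
    guaranteed = sum(min_bw for min_bw, _ in flows.values())
    if guaranteed > capacity:
        raise ValueError("minimum guarantees exceed link capacity")
    leftover = capacity - guaranteed
    total_share = sum(share for _, share in flows.values())
    result = {}
    for name, (min_bw, share) in flows.items():
        # Leftover bandwidth is divided in proportion to the residual share.
        extra = leftover * share / total_share if total_share else 0.0
        result[name] = min_bw + extra
    return result


# 100 units of capacity; 50 are guaranteed, the remaining 50 are split 1:3.
alloc = allocate(100.0, {"audio": (20.0, 1.0), "video": (30.0, 3.0)})
```

    Under load the leftover term shrinks to zero, so every flow still receives its minimum, which matches the guarantee described above.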

  • Mobile Broadcast Streaming Service and Protocols on Unidirectional Radio Channels

    Takeshi YOSHIMURA  Tomoyuki OHYA  

     
    PAPER-Multicast/Broadcast

      Vol:
    E87-B No:9
      Page(s):
    2596-2604

    In this paper, we propose a set of broadcast streaming protocols designed for unidirectional radio channels. Considering the limited size and implementation overhead on a mobile terminal, the proposed protocol set is almost compliant with the current mobile streaming protocols, i.e. 3GPP PSS (Packet-switched Streaming Service), except that the proposed protocols are designed to work on a unidirectional downlink channel. This protocol set enables flexible layout rendering by SMIL (Synchronized Multimedia Integration Language) in combination with SDP (Session Description Protocol), and reliable and synchronized static media (including still image and text) delivery by RTP (Real-time Transport Protocol) carousel. We present the prototype of this protocol set and measure its performance in terms of video quality and waiting time for video presentation through a W-CDMA radio channel emulator and header compression nodes. From the experimental results, we show 1) the trade-off between video quality and waiting time, 2) the advantages and disadvantages of header compression, 3) the effectiveness of synchronized transmission of SDP, SMIL, and I-frames of video objects, and 4) the reliability of RTP-carousel. This protocol set is applicable to 3G MBMS (Multimedia Broadcast/Multicast Service) streaming service.

  • Real-Time UHD Background Modelling with Mixed Selection Block Updates

    Axel BEAUGENDRE  Satoshi GOTO  Takeshi YOSHIMURA  

     
    PAPER-IMAGE PROCESSING

      Vol:
    E100-A No:2
      Page(s):
    581-591

    The vast majority of foreground detection methods require heavy hardware optimization to process standard definition videos in real time. Indeed, those methods process the whole frame not only for detection but also for background modelling, which makes them resource-guzzlers (time, memory, etc.) that cannot be applied to Ultra High Definition (UHD) videos. This paper presents a real-time background modelling method called Mixed Block Background Modelling (MBBM). It is a spatio-temporal approach which updates the background model by carefully selecting blocks in linear and pseudo-random orders and updating the corresponding block parts of the model. The two block selection orders make sure that every block will be updated. For foreground detection purposes, the method is combined with a foreground detection method designed for UHD videos such as the Adaptive Block-Propagative Background Subtraction method. Experimental results show that the proposed MBBM can process 50 min of 4K UHD videos in less than 6 hours, while other methods are estimated to take from 8 days to more than 21 years. Compared to 10 state-of-the-art foreground detection methods, the proposed MBBM shows the best quality results with an average global quality score of 0.597 (1 being the maximum) on a dataset of 4K UHDTV sequences containing various situations such as illumination variation. Finally, the processing time per pixel of the MBBM is the lowest of all compared methods with an average of 3.18×10^-8 s.
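    One way to read the mixed selection described in this abstract is sketched below. The abstract does not give the actual selection rule, so everything here is an assumption for illustration: one block per frame is taken in linear (round-robin) order, a few more are drawn pseudo-randomly, and the name `select_blocks` and its parameters are hypothetical.

```python
import random
from typing import Set


def select_blocks(frame_idx: int, n_blocks: int, n_random: int,
                  rng: random.Random) -> Set[int]:
    """Return the indices of the background-model blocks to update for one frame."""
    # Linear order: frame k always updates block k mod n_blocks, so every
    # block is visited at least once every n_blocks frames.
    chosen = {frame_idx % n_blocks}
    # Pseudo-random order: a few extra blocks spread the updates over the frame.
    chosen.update(rng.sample(range(n_blocks), n_random))
    return chosen


rng = random.Random(0)
covered: Set[int] = set()
for frame in range(16):
    covered |= select_blocks(frame, n_blocks=16, n_random=2, rng=rng)
```

    In this sketch the linear pass alone guarantees full coverage, while the random picks only shorten the typical delay before a given block is refreshed.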

  • Content Delivery Network Architecture for Mobile Streaming Service Enabled by SMIL Modification

    Takeshi YOSHIMURA  Yoshifumi YONEMOTO  Tomoyuki OHYA  Minoru ETOH  Susie WEE  

     
    PAPER-CDN Architecture

      Vol:
    E86-B No:6
      Page(s):
    1778-1787

    In this paper, we present a CDN (Content Delivery Network) architecture for mobile streaming service in which content segmentation, request routing, pre-fetch scheduling, and session handoff are controlled by SMIL (Synchronized Multimedia Integration Language) modification. In this architecture, mobile clients simply follow modified SMIL files downloaded from a portal server; these modifications enable multimedia content to be delivered to the mobile clients from the best surrogates in the CDN. The key components of this architecture are 1) content segmentation with SMIL modification, 2) on-demand rewriting of URLs in SMIL, 3) pre-fetch scheduling based on timing information derived from SMIL, and 4) SMIL updates by SOAP (Simple Object Access Protocol) messaging for session handoffs due to client mobility. This architecture enhances streaming media quality for mobile clients while utilizing network resources efficiently and supporting client mobility in an integrated and practical way. The current status of our prototype on a mobile QoS testbed "MOBIQ" is also reported in this paper.

  • Lagrangian Relaxation Based Inter-Layer Signal Via Assignment for 3-D ICs

    Song CHEN  Liangwei GE  Mei-Fang CHIANG  Takeshi YOSHIMURA  

     
    PAPER

      Vol:
    E92-A No:4
      Page(s):
    1080-1087

    Three-dimensional integrated circuits (3-D ICs), i.e., stacked dies, can alleviate the interconnect problem that comes with decreasing feature size and increasing integration density, and promise a solution to heterogeneous integration. The vertical connection, which is generally implemented by the through-silicon via, is a key technology for 3-D ICs. In this paper, given 3-D circuit placement or floorplan results with white space reserved between blocks for inter-layer interconnections, we propose methods for assigning inter-layer signal via locations. By introducing a grid structure on the chip, the inter-layer via assignment of two-layer chips can be optimally solved by a convex-cost max-flow formulation with signal via congestion optimized. As for 3-D ICs with three or more layers, the inter-layer signal via assignment is modeled as an integral min-cost multi-commodity flow problem, which is solved by a heuristic method based on Lagrangian relaxation. Relaxing the capacity constraints in the grids, we transform the min-cost multi-commodity flow problem into a sequence of Lagrangian sub-problems, which are solved by finding a sequence of shortest paths. The complexity of solving a Lagrangian sub-problem is O(n_nt · n_g^2), where n_nt is the number of nets and n_g is the number of grids on one chip layer. The experimental results demonstrate the effectiveness of the method.
