Lihan TONG Weijia LI Qingxia YANG Liyuan CHEN Peng CHEN
Yinan YANG
Myung-Hyun KIM Seungkwang LEE
Shuoyan LIU Chao LI Yuxin LIU Yanqiu WANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Martin LUKAC Saadat NURSULTAN Georgiy KRYLOV Oliver KESZOCZE Abilmansur RAKHMETTULAYEV Michitaka KAMEYAMA
Zheqing ZHANG Hao ZHOU Chuan LI Weiwei JIANG
Liu ZHANG Zilong WANG Yindong CHEN
Wenxia Bao An Lin Hua Huang Xianjun Yang Hemu Chen
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Haruhiko KAIYA Shinpei OGATA Shinpei HAYASHI
Jiakai LI Jianyong DUAN Hao WANG Li HE Qing ZHANG
Yuxin HUANG Yuanlin YANG Enchang ZHU Yin LIANG Yantuan XIAN
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Na XING Lu LI Ye ZHANG Shiyi YANG
Zhe Wang Zhe-Ming Lu Hao Luo Yang-Ming Zheng
Rina TAGAMI Hiroki KOBAYASHI Shuichi AKIZUKI Manabu HASHIMOTO
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Hongzhi XU Binlian ZHANG
Weizhi WANG Lei XIA Zhuo ZHANG Xiankai MENG
Yuka KO Katsuhito SUDOH Sakriani SAKTI Satoshi NAKAMURA
Rinka KAWANO Masaki KAWAMURA
Zhishuo ZHANG Chengxiang TAN Xueyan ZHAO Min YANG
Peng WANG Guifen CHEN Zhiyao SUN
Zeyuan JU Zhipeng LIU Yu GAO Haotian LI Qianhang DU Kota YOSHIKAWA Shangce GAO
Ji WU Ruoxi YU Kazuteru NAMBA
Hao WANG Yao Ma Jianyong Duan Li HE Xin Li
Shijie WANG Xuejiao HU Sheng LIU Ming LI Yang LI Sidan DU
Arata KANEKO Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Qi LIU Bo WANG Shihan TAN Shurong ZOU Wenyi GE
HanYu Zhang Tomoji Kishi
Shinobu NAGAYAMA Tsutomu SASAO Jon T. BUTLER
Yoon Hak KIM
Takashi HIRAYAMA Rin SUZUKI Katsuhisa YAMANAKA Yasuaki NISHITANI
Yosuke IIJIMA Atsunori OKADA Yasushi YUMINAKA
Batnasan Luvaanjalba Elaine Yi-Ling Wu
KuanChao CHU Satoshi YAMAZAKI Hideki NAKAYAMA
Shenglei LI Haoran LUO Tengfei SHAO Reiko HISHIYAMA
Yasushi YUMINAKA Kazuharu NAKAJIMA Yosuke IIJIMA
Chunbo Liu Liyin Wang Zhikai Zhang Chunmiao Xiang Zhaojun Gu Zhi Wang Shuang Wang
Jia-ji JIANG Hai-bin WAN Hong-min SUN Tuan-fa QIN Zheng-qiang WANG
Yuhao LIU Zhenzhong CHU Lifei WEI
Ken ASANO Masanori NATSUI Takahiro HANYU
Shuto HASEGAWA Koichiro ENOMOTO Taeko MIZUTANI Yuri OKANO Takenori TANAKA Osamu SAKAI
Zhewei XU Mizuho IWAIHARA
Takao WAHO Akihisa KOYAMA Hitoshi HAYASHI
Taisei SAITO Kota ANDO Tetsuya ASAI
Shiyu YANG Tetsuya KANDA Daniel M. GERMAN Yoshiki HIGO
Tsutomu SASAO
Jiyeon LEE
Koichi MORIYAMA Akira OTSUKA
Hongliang FU Qianqian LI Huawei TAO Chunhua ZHU Yue XIE Ruxue GUO
Gao WANG Gaoli WANG Siwei SUN
Hua HUANG Yiwen SHAN Chuan LI Zhi WANG
Zhi LIU Heng WANG Yuan LI Hongyun LU Hongyuan JING Mengmeng ZHANG
Tomoyasu NAKANO Masataka GOTO
Hyebong CHOI Joel SHIN Jeongho KIM Samuel YOON Hyeonmin PARK Hyejin CHO Jiyoung JUNG
Xianglong LI Yuan LI Jieyuan ZHANG Xinhai XU Donghong LIU
Haoran LUO Tengfei SHAO Shenglei LI Reiko HISHIYAMA
Chang SUN Yitong LIU Hongwen YANG
Ji XI Yue XIE Pengxu JIANG Wei JIANG
Ming PAN
Barry SHACKLEFORD Mitsuhiro YASUDA Etsuko OKUSHI Hisao KOIZUMI Hiroyuki TOMIYAMA Hiroto YASUURA
Entire systems on a chip (SOCs) embodying a processor, memory, and system-specific peripheral hardware are now an everyday reality. The current generation of SOC designers is driven more than ever by the need to lower chip cost, while at the same time facing demands to get designs to market more quickly. It was to support this new community of designers that we developed Satsuki, an integrated processor synthesis and compiler generation system. By allowing the designer to tune the processor design to the bitwidth and performance required by the application, minimum-cost designs are achieved. Using synthesis to implement the processor in the same technology as the rest of the chip allows for global chip optimization from the perspective of the system as a whole and assures design portability. The integral compiler generator, driven by the same parameters used for processor synthesis, promotes high-level expression of application algorithms while at the same time isolating the application software from the processor implementation. Synthesis experiments incorporating a 0.8 micron CMOS gate array have produced designs ranging from a 45 MHz, 1,500 gate, 8-bit processor with a 4-word register file to a 31 MHz, 9,800 gate, 32-bit processor with a 16-word register file.
Raphael ROCHET Regis LEVEUGLE Gabriele SAUCIER
Synthesis tools are now extensively used in the VLSI circuit design process. They allow much higher design productivity, but the designer often does not directly control the circuit structure. Thus, when circuits are dedicated to dependable applications, designers have difficulty manually implementing the devices needed to obtain fault detection or tolerance capabilities. The ASYL-SdF System has been developed over the last few years in order to avoid this break in the design flow, and to facilitate the designer's work when dependability is targeted. This paper gives an overview of the resulting tool, its synthesis flow for fault detection and fault tolerance in Finite State Machines, its limitations, and current developments. Actual circuit implementation results are given in terms of area overheads, expected reliability, and experimental fault detection coverage.
Vasily G. MOSHNYAGA Keikichi TAMARU
As IC fabrication technology enters the deep-submicron region with device feature sizes <0.35µm, interconnect becomes the dominant factor in the design of high-speed Application Specific Integrated Circuits (ASICs). This paper proposes a novel methodology for automated data-path synthesis of such circuits and outlines algorithms to support it. In contrast to other approaches, we formulate interconnect area/delay optimizations as high-level synthesis transformations and use them during the synthesis to minimize the impact of wiring on circuit characteristics. Experiments with FIR filter implementations show that such formulation jointly with
Hsien-Ho CHUANG C. Bernard SHUNG
A new technology mapping algorithm is developed on a general model of FPGA with composite logic block architectures, consisting of different sizes of look-up tables (LUTs) and possibly different logic gates. In addition, the logic blocks may have hard-wired connections and limited accessible fanouts. Xilinx XC4000 is one example containing LUTs of different sizes, and AT&T ORCA is another example containing both LUTs and logic gates. We use a multiple-fanout pattern graph library to model the composite logic block and a premapping technique to generate the subject graph dynamically. A new matching algorithm and a new covering algorithm are also developed for the subject graph covering. The experimental results show that our algorithm is an effective technology mapper for FPGAs with composite logic block architectures, especially for larger circuits. Over a set of MCNC benchmarks, our algorithm requires on average 4.25% fewer CLBs than PPR, 6.79% fewer CLBs than TEMPT, and 2.79% fewer CLBs than ASYL when used as the XC4000 mapper. Over a set of larger benchmarks, our algorithm outperforms PPR by 13.70%. Very encouraging results were also obtained when our algorithm was used as an ORCA mapper, for which there were no previously published results.
Midori TAKANO Fumihiro MINAMI Naohito KOJIMA
This paper presents a novel clock routing method for constructing an optimal clock tree for embedded array chips by determining the route so as to minimize both delay and skew. The proposed method constructs the tree by optimal node-pair merging, predicting the upper-side balanced-tree structure on the basis of accurate global path or delay estimation. With this method, the delay estimation error was within 10% for a chip with large macro cells.
Tetsushi KOIDE Takeshi SUZUKI Shin'ichi WAKABAYASHI Noriyoshi YOSHIDA
This paper presents a new timing-driven global routing method for standard cell layout. The proposed method can explicitly consider the timing constraint between two registers and minimize the channel density under the given timing constraint. In the proposed method, we first determine the initial global routes. Next, we improve the global routes to satisfy the timing constraint between two registers as well as to minimize the channel density. Finally, for each cell row, the nets incident to terminals on the cell row are assigned to channels to minimize the channel density using 0-1 integer linear programming. Experimental results from an implementation on an engineering workstation show that the proposed method is quite promising.
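The final step above, assigning the nets of a cell row to channels with 0-1 variables, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: each net is reduced to a horizontal column interval, each row has exactly two candidate channels (upper = 0, lower = 1), the objective is the larger of the two channel densities, and exhaustive enumeration of the 0-1 vectors stands in for an integer linear programming solver.

```python
from itertools import product

def density(intervals):
    """Maximum number of intervals that cover any single column."""
    if not intervals:
        return 0
    first = min(left for left, _ in intervals)
    last = max(right for _, right in intervals)
    return max(
        sum(1 for left, right in intervals if left <= col <= right)
        for col in range(first, last + 1)
    )

def assign_nets(nets):
    """Enumerate every 0-1 assignment (0 = upper channel, 1 = lower channel).

    Returns (best_density, assignment), where best_density is the smallest
    achievable value of max(upper density, lower density). Brute force is
    used here only as a stand-in for a 0-1 ILP solver.
    """
    best = None
    for choice in product((0, 1), repeat=len(nets)):
        upper = [net for net, x in zip(nets, choice) if x == 0]
        lower = [net for net, x in zip(nets, choice) if x == 1]
        cost = max(density(upper), density(lower))
        if best is None or cost < best[0]:
            best = (cost, choice)
    return best

# Hypothetical nets of one cell row, given as (leftmost, rightmost) columns.
nets = [(0, 4), (1, 3), (2, 6), (5, 8)]
best_density, assignment = assign_nets(nets)
print(best_density, assignment)
```

The enumeration is exponential in the number of nets, so it is only usable on toy rows; a real formulation would hand the same 0-1 variables and density constraints to an ILP solver.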
Tetsushi KOIDE Shin'ichi WAKABAYASHI Noriyoshi YOSHIDA
This paper presents a three-layer over-the-cell (OTC) routing algorithm for standard cells with nonuniform OTC routing capacities in standard cell design. Since the number of available routing tracks on the second metal layer of OTC varies column by column, the proposed OTC routing method can effectively utilize the OTC regions. The proposed router performs two types of OTC routing. For the OTC regions near the channel, it performs planar routing. For the OTC regions far from the channel, it performs H-V routing on the second and third layers. Combining planar and H-V routing, the router can utilize the OTC regions effectively, which could hardly be achieved by existing algorithms. We also formulate the problem of selecting planar routable nets on the third layer as the maximum weighted planar routable net selection problem with nonuniform routing capacity, and propose an optimal algorithm. Experimental results show that the proposed router produces layouts of smaller height than those produced by routers based on the existing cell model with uniform OTC routing capacity.
In order to apply formal design verification, it is necessary to describe formally and correctly the specification of the circuit under verification. Especially when we apply conventional OBDD-based logic comparison method for verifying combinational circuits, another
Akira MOTOHARA Sadami TAKEOKA Mitsuyasu OHTA Michiaki MURAOKA
An approach to design for testability using register-transfer level (RTL) partial scan selection is described. We define an RTL circuit model which enables efficient description in an electronic system design automation (ESDA) tool, and a testability analysis which leads to effective partial scan selection for RTL designs including data path circuits and control circuits such as state machines. We also introduce a method of partial scan selection at RTL which selects critical registers and state machines based on RTL testability analysis. DFT techniques using gate level testability measures have been studied, and it has been concluded that they are not successful in achieving high fault coverage [15]. However, we started this work for the following reasons: 1) In the sequential ATPG procedure, more than two memory elements belonging to a functional unit such as a register or a state machine are often required to be justified at a time. At RTL, state machines and registers are explicitly described and recognized as functional units, while gate level memory elements are scattered over the circuit. 2) As discussed in [6], if the circuit is modified so that the test sequence which causes state transition between initial and final states of sequential ATPG can be easily obtained, ATPG results can also be improved. Complex state machines can be identified at RTL. According to the experimental results, our gate level DFT achieves high fault coverage comparable with the most successful previously published DFT methods, and DFT at RTL resulted in higher fault coverage than gate level DFT in much shorter CPU time.
Shuichi OIKAWA Hideyuki TOKUDA
In forthcoming multimedia environments, continuous-media data, such as video and audio data, will be used by a variety of multimedia applications. Multimedia applications require efficient and flexible support from real-time operating systems. This is because the changes in system and network loads require dynamic management of real-time thread behavior. If threads are implemented at the user level, operations on threads can be processed at the user level, and the efficient management of threads becomes possible by avoiding kernel interventions. Thus, we can provide an effective platform for multimedia applications. The goal of our work is to realize high-performance user-level real-time threads which satisfy the above requirements of multimedia systems. In this paper we describe the design and implementation of a user-level real-time threads package, called RTC-Threads, which is being developed on the RT-Mach microkernel. The results of performance evaluations show that our user-level real-time threads outperform real-time kernel-provided threads, which are implemented in the microkernel, in terms of efficiency and accuracy.
Naotake KAMIURA Yutaka HATA Kazuharu YAMATO
In this paper, we discuss problems in design and fault masking of multiple-valued cellular arrays where basic cells having simple switch functions are arranged iteratively. The stuck-at faults of switch cells are assumed to be fault models. First, we introduce a universal single-level array and derive the ratio of the number of single faults whose influence can be masked to the total number of single faults. Next, we propose a universal two-level array that outputs correct values even if single faults occur in it and derive the ratio of the number of double faults whose influence can be masked compared to the total number of double faults. By evaluating the universal single-level array and the universal two-level array from the viewpoints of design and fault masking, we show that the latter is superior to the former. Finally, we compare our universal two-level array with formerly presented arrays in order to demonstrate the advantages of our universal two-level array.
Accurate estimation of the model parameters is essential for an MRF image model to work well. Because the maximum likelihood estimate (MLE), which is statistically optimal, is too difficult to implement, the conventional estimates such as the maximum pseudo-likelihood estimate (MPLE), the coding method estimate (CME), and the least-squares estimate (LSE) are all based on the (conditional) pixel probabilities for simplicity. However, the conventional pixel-based estimators are not satisfactorily accurate, especially when the interactions between pixels are strong. We therefore propose two window-based estimators to improve the estimation accuracy: the adjoining-conditional-window (ACW) scheme and the separated-conditional-window (SCW) scheme. The replacement of the pixel probabilities by the joint probabilities of window pixels was inspired by the fact that the pixels in an image present information jointly; hence, the more pixels whose joint probabilities are used, the more accurate the estimate should be. The window-based estimators include the pixel-based ones as special cases. We present the relationship between the MLE and each of the two window-based estimates. Through these relationships we provide a unified view in which the conventional pixel-based estimates and our window-based estimates all approximate the MLE. The accuracy of all the estimates can be described by two types of superiority: the cross-scheme superiority, that an ACW estimate is more accurate than the SCW estimate with the same window size, and the in-scheme superiority, that an ACW (or SCW) estimate is more accurate than another ACW (or SCW) estimate that uses a smaller window size. The experimental results showed the two types of superiority and, in particular, the significant improvement in estimation accuracy due to using window probabilities instead of pixel probabilities.
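For orientation, the conventional pixel-based MPLE mentioned above can be sketched for a deliberately simple model. The sketch is not the ACW or SCW estimator of the paper. It assumes a binary image with pixel values -1 and +1, a 4-neighbour Ising-type MRF with a single interaction parameter beta, free boundaries, and a grid search over hypothetical candidate values in place of a numerical optimizer. Under those assumptions the conditional pixel probability is exp(beta * x * s) / (2 * cosh(beta * s)), where s is the sum of the neighbouring pixels.

```python
import math

def neighbour_sum(img, i, j):
    """Sum of the 4-connected neighbours of pixel (i, j), free boundary."""
    rows, cols = len(img), len(img[0])
    total = 0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            total += img[ni][nj]
    return total

def log_pseudo_likelihood(img, beta):
    """Sum over pixels of log P(x_ij | neighbours) for the assumed model."""
    total = 0.0
    for i in range(len(img)):
        for j in range(len(img[0])):
            s = neighbour_sum(img, i, j)
            total += beta * img[i][j] * s - math.log(2.0 * math.cosh(beta * s))
    return total

def mple(img, candidates):
    """Candidate beta with the largest log pseudo-likelihood (grid search)."""
    return max(candidates, key=lambda b: log_pseudo_likelihood(img, b))

# Hypothetical 4x4 test image and candidate grid, both chosen for illustration.
image = [
    [1, 1, 1, -1],
    [1, 1, -1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, -1],
]
candidates = [k / 10 for k in range(-10, 11)]
beta_hat = mple(image, candidates)
print(beta_hat)
```

Each term depends only on one pixel and its neighbours, which is what makes the pixel-based estimate cheap; a window-based estimate would replace the single-pixel conditional with the joint conditional probability of a block of pixels.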
Ronghua YAN Naoyuki TOKUDA Juichi MIYAMICHI
Unlike the time-consuming contour tracking method of snakes [5] which requires a considerable number of iterated computations before contours are successfully tracked down, we present a faster and accurate model-based