Jiaxin WU Bing LI Li ZHAO Xinzhou XU
Maaki SAKAI Kanon HOKAZONO Yoshiko HANADA
Xuecheng SUN Zheming LU
Yuanhe WANG Chao ZHANG
Jinfeng CHONG Niu JIANG Zepeng ZHUO Weiyu ZHANG
Xiangrun LI Qiyu SHENG Guangda ZHOU Jialong WEI Yanmin SHI Zhen ZHAO Yongwei LI Xingfeng LI Yang LIU
Meiting XUE Wenqi WU Jinfeng LUO Yixuan ZHANG Bei ZHAO
Rong WANG Changjun YU Zhe LYU Aijun LIU
Huijuan ZHOU Zepeng ZHUO Guolong CHEN
Feifei YAN Pinhui KE Zuling CHANG
Manabu HAGIWARA
Ziqin FENG Hong WAN Guan GUI
Sungryul LEE
Feng WANG Xiangyu WEN Lisheng LI Yan WEN Shidong ZHANG Yang LIU
Yanjun LI Jinjie GAO Haibin KAN Jie PENG Lijing ZHENG Changhui CHEN
Ho-Lim CHOI
Feng WEN Haixin HUANG Xiangyang YIN Junguang MA Xiaojie HU
Shi BAO Xiaoyan SONG Xufei ZHUANG Min LU Gao LE
Chen ZHONG Chegnyu WU Xiangyang LI Ao ZHAN Zhengqiang WANG
Izumi TSUNOKUNI Gen SATO Yusuke IKEDA Yasuhiro OIKAWA
Feng LIU Helin WANG Conggai LI Yanli XU
Hongtian ZHAO Hua YANG Shibao ZHENG
Kento TSUJI Tetsu IWATA
Yueying LOU Qichun WANG
Menglong WU Jianwen ZHANG Yongfa XIE Yongchao SHI Tianao YAO
Jiao DU Ziwei ZHAO Shaojing FU Longjiang QU Chao LI
Yun JIANG Huiyang LIU Xiaopeng JIAO Ji WANG Qiaoqiao XIA
Qi QI Liuyi MENG Ming XU Bing BAI
Nihad A. A. ELHAG Liang LIU Ping WEI Hongshu LIAO Lin GAO
Dong Jae LEE Deukjo HONG Jaechul SUNG Seokhie HONG
Tetsuya ARAKI Shin-ichi NAKANO
Shoichi HIROSE Hidenori KUWAKADO
Yumeng ZHANG
Jun-Feng Liu Yuan Feng Zeng-Hui Li Jing-Wei Tang
Keita EMURA Kaisei KAJITA Go OHTAKE
Xiuping PENG Yinna LIU Hongbin LIN
Yang XIAO Zhongyuan ZHOU Mingjie SHENG Qi ZHOU
Kazuyuki MIURA
Yusaku HIRAI Toshimasa MATSUOKA Takatsugu KAMATA Sadahiro TANI Takao ONOYE
Ryuta TAMURA Yuichi TAKANO Ryuhei MIYASHIRO
Nobuyuki TAKEUCHI Kosei SAKAMOTO Takuro SHIRAYA Takanori ISOBE
Shion UTSUMI Kosei SAKAMOTO Takanori ISOBE
You GAO Ming-Yue XIE Gang WANG Lin-Zhi SHEN
Zhimin SHAO Chunxiu LIU Cong WANG Longtan LI Yimin LIU Zaiyan ZHOU
Xiaolong ZHENG Bangjie LI Daqiao ZHANG Di YAO Xuguang YANG
Takahiro IINUMA Yudai EBATO Sou NOBUKAWA Nobuhiko WAGATSUMA Keiichiro INAGAKI Hirotaka DOHO Teruya YAMANISHI Haruhiko NISHIMURA
Takeru INOUE Norihito YASUDA Hidetomo NABESHIMA Masaaki NISHINO Shuhei DENZUMI Shin-ichi MINATO
Zhan SHI
Hakan BERCAG Osman KUKRER Aykut HOCANIN
Ryoto Koizumi Xiaoyan Wang Masahiro Umehira Ran Sun Shigeki Takeda
Hiroya Hachiyama Takamichi Nakamoto
Chuzo IWAMOTO Takeru TOKUNAGA
Changhui CHEN Haibin KAN Jie PENG Li WANG
Pingping JI Lingge JIANG Chen HE Di HE Zhuxian LIAN
Ho-Lim CHOI
Akira KITAYAMA Goichi ONO Hiroaki ITO
Koji NUIDA Tomoko ADACHI
Yingcai WAN Lijin FANG
Yuta MINAMIKAWA Kazumasa SHINAGAWA
Sota MORIYAMA Koichi ICHIGE Yuichi HORI Masayuki TACHI
Sendren Sheng-Dong XU Albertus Andrie CHRISTIAN Chien-Peng HO Shun-Long WENG
Zhikui DUAN Xinmei YU Yi DING
Hongbo LI Aijun LIU Qiang YANG Zhe LYU Di YAO
Yi XIONG Senanayake THILAK Yu YONEZAWA Jun IMAOKA Masayoshi YAMAMOTO
Feng LIU Qian XI Yanli XU
Yuling LI Aihuang GUO
Mamoru SHIBATA Ryutaroh MATSUMOTO
Haiyang LIU Xiaopeng JIAO Lianrong MA
Ruixiao LI Hayato YAMANA
Riaz-ul-haque MIAN Tomoki NAKAMURA Masuo KAJIYAMA Makoto EIKI Michihiro SHINTANI
Kundan LAL DAS Munehisa SEKIKAWA Tadashi TSUBONE Naohiko INABA Hideaki OKAZAKI
Takashi MITSUHASHI Toshiro AKINO
This paper presents a novel system-level design methodology, called quality-driven design, by which application-specific optimization can be achieved; furthermore the entire functionality can be shared to maximize design reuse. As a case of study, this paper focuses on quality-driven design for video applications and introduces an output quality adaptive approach based on variable bitwidth optimization to explore a new design space. MPEG2 video is used as the driver application to illustrate the potential of the presented methodology. Experimental results show the effectiveness of the methodology.
Hiroshi SAITO Alex KONDRATYEV Jordi CORTADELLA Luciano LAVAGNO Alex YAKOVLEV Takashi NANYA
Deep submicron technology calls for new design techniques, in which wire and gate delays are accounted to have equal or nearly equal effect on circuit behavior. Asynchronous speed-independent (SI) circuits, whose behavior is only robust to gate delay variations, may be too optimistic. On the other hand, building circuits totally delay-insensitive (DI), for both gates and wires, is impractical because of the lack of effective synthesis methods. The paper presents a new approach for synthesis of globally DI and locally SI circuits. The method, working in two possible design scenarios, either starts from a behavioral specification called Signal Transition Graph (STG) or from the SI implementation of the STG specification. The method locally modifies the initial model in such a way that the resultant behavior of the system does not depend on delays in the input wires. This guarantees delay-insensitivity of the system-environment interface. The suggested approach was successfully tested on a set of benchmarks. Experimental results show that DI interfacing is realized with a relatively moderate cost in area and speed (costs about 40% area penalty and 20% speed penalty).
Shinsuke KOBAYASHI Kentaro MITA Yoshinori TAKEUCHI Masaharu IMAI
This paper proposes a compiler generation method for PEAS-III (Practical Environment for ASIP development), which is a configurable processor development environment for application domain specific embedded systems. Using the PEAS-III system, not only the HDL description of a target processor but also its target compiler can be generated. Therefore, execution cycles and dynamic power consumption can be rapidly evaluated. Two processors and their derivatives were designed using the PEAS-III system in the experiment. Experimental results show that the trade-offs among area, performance and power consumption of processors were analyzed in about twelve hours and the optimal processor was selected under the design constraints by using generated compilers and processors.
Seiichiro ISHIJIMA Tetsuaki UTSUMI Tomohiro OTO Atsushi TAKAHASHI
A circuit in which the clock is assumed to be distributed periodically to each individual register though not necessarily to all registers simultaneously, called a semi-synchronous circuit, is expected to achieve higher frequency or a smaller clock tree compared with an ordinary synchronous circuit, called a complete-synchronous circuit. In this paper, we propose a circuit design method that realizes a semi-synchronous circuit with higher frequency by modifying the clock tree of a complete-synchronous circuit. We confirm that the proposed method is easy to incorporate with current practical design environment by designing a four stage pipelined processor compatible with MIPS operation code. The obtained processor circuit is the first semi-synchronous circuit designed systematically with theoretical background.
Jinku CHOI Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
The motion estimation can choose the most suitable algorithm for different kinds of motion types, formats, and characteristics. The video encoding system can be optimized for quality, speed, and power consumption. In this paper, we propose a reconfigurable approach to a motion estimation algorithm and hardware architecture. The proposed algorithm determines motion type and then selects adapted block-matching algorithm for different kinds of motion sequences. The quality of our algorithm is better than that of the TSS and the BBGDS algorithm, or comparable to the performance of the better of the two, and the computational complexity of our algorithm is significantly less than that of the TSS. We also propose hardware architecture for realizing two kinds of motion estimations in the same hardware. We implemented the flexible and reconfigurable hardware architecture by using address generator unit, delay unit, and parameters and by using the hardware description language (VHDL) and the SYNOPSYS synthesis design tools. We analyze the performance of the algorithm and present adapted algorithm for a low cost real time application.
Barry SHACKLEFORD Motoo TANAKA Richard J. CARTER Greg SNIDER
Studies of cellular automata (CA) based random number generators (RNGs) have focused mainly upon symmetrically connected networks with neighborhood sizes of three or five. Popular field programmable gate array configurations feature a four-input (i.e., 16-row) lookup table. Full utilization of the four-input lookup table leads to the potential for asymmetrically connected cellular automata networks with a neighborhood size of four. From each of various 1-d, 2-d, and 3-d networks with periodic boundary conditions, the 1000 highest entropy CA RNGs were selected from the set of 65,536 possible uniform (all CA truth tables the same) implementations. Each set of 1000 high-entropy CA was then submitted to Marsaglia's DIEHARD suite of random number tests. A number of 64-bit, neighbor-of-four CA-based RNGs have been discovered that pass all tests in DIEHARD without resorting to either site spacing or time spacing to improve the RNG quality.
Ing-Jer HUANG Li-Rong WANG Yu-Min WANG Tai-An LU
This paper presents a case study of synthesis of the industrial embedded microcontroller HT48100 and analysis of performance, cost and software compatibility for its implementation alternatives, using the hardware/software co-design system for microcontrollers/microprocessors PIPER-II. The synthesis tool accepts as input the instruction set architecture (behavioral) specification, and produces as outputs the pipelined RTL designs with their simulators, and the reordering constraints which guide the compiler backend to optimize the code for the synthesized designs. A compiler backend is provided to optimize the application software according to the reordering constraints. The study shows that the co-design approach was able to help the original design team to analyze the architectural properties, identify inefficient architecture features, and explore possible architectural improvements and their impacts in both hardware and software. Feasible future upgrades for the microcontroller family have been identified by the study.
Hiroshi MIZUNO Hiroyuki KOBAYASHI Takao ONOYE Isao SHIRAKAWA
This paper devises a sophisticated approach to the performance estimation of an embedded hardware-software codesign system at the architecture level, which intends to optimize the hardware-software configuration in terms of processing time, power dissipation, and hardware cost. A distinctive feature of this approach consists in constructing a performance estimation model proper to each component of an embedded system, such as CPU core, RAM/ROM, cache memory, and application-specific hardware, by taking account of not only the functional performance but also the data transfer. The proposed estimation schemes are incorporated into an existing instruction set simulator, so that the actual performance can be estimated accurately at the architecture level. The experimental results demonstrate that the performance estimation approach enables the precise design decision at the architecture level, which greatly contributes toward enhancing the design ability dedicatedly for mobile appliances.
Chang-Jae PARK Ando KI In-Cheol PARK Chong-Min KYUNG
This paper describes an automatic interface insertion scheme for in-system verification of algorithm models. To insert the interface, an algorithm model described in C is translated into another source code that includes the communication with hardware components in the target system to be validated with the algorithm model. The communication between the algorithm model and hardware components is achieved using transactors that perform transformation between access operations and bus cycle transactions. I/O terminal is introduced as an interface model to relate the transactions to access operations during the execution of the algorithm model, i.e., accesses to I/O terminals invoke bus cycle transactions in hardware and vice versa. An automatic interface insertion tool is developed using the source-to-source translation to identify the I/O terminals and insert interface function calls in the source code. The proposed automatic interface insertion scheme is validated by emulating several multimedia algorithms written in C on real target systems.
Shinichi NODA Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
This paper proposes a high-level energy-optimizing algorithm which can synthesize low energy system VLSIs. Given an initial system hardware obtained from an abstract behavioral description, the proposed algorithm applies to it the three energy reduction techniques, 1) reducing supply voltage, 2) selecting lower energy modules, and 3) applying gated clocks. By incorporating our area/delay/power estimation, the proposed algorithm can obtain low energy system VLSIs meeting the constraints of area, delay, and execution time. The proposed algorithm has been incorporated into a high-level synthesis system and experimental results demonstrate effectiveness and efficiency of the algorithm.
Kimiyoshi USAMI Naoyuki KAWABE Masayuki KOIZUMI Katsuhiro SETA Toshiyuki FURUSAWA
In portable applications such as W-CDMA cell phones, high performance and low standby leakage are both required. We propose an automated design technique to selectively use multi-threshold CMOS (MTCMOS) in a cell-by-cell fashion. MT cells consisting of low-Vth transistors and high-Vth sleep transistors are newly introduced. MT cells are assigned to critical paths to speed up, while High-Vth cells are assigned to non-critical paths to reduce leakage. Compared to the conventional MTCMOS, the gate delay is not affected by the discharge patterns of other gates because there is no virtual ground to be shared. We applied this technique to a test chip of a DSP core for W-CDMA baseband LSI. The worst path-delay was improved by 14% over the single high-Vth design without increasing standby leakage at 10% area overhead.
Eunjung OH Soo-Hyun KIM Dong-Ik LEE Ho-Yong CHOI
In this paper, we have proposed an efficient high-level test generation method for asynchronous circuits. The test generation is based on specification level, especially on Signal Transition Graph (STG), which is a kind of specification method for asynchronous circuits. We define a high-level fault model, called a single State Transition Fault (STF) model on STG. Test patterns for STFs are generated based on Stable State Graph (SSG), which can be derived from STG directly. The state space explored in test generation is greatly reduced and hence the test generation cost is small in terms of execution time. To enhance the fault coverage at gate-level, we have also proposed an extended STF (ESTF) model with additional gate-level information. Experimental results show that the generated test for STFs achieves high fault coverage with low cost for single stuck-at faults of its corresponding synthesized gate-level circuit. The generated test for ESTFs attains higher fault coverage with same benchmark in cost of longer execution time. Further, we have also proposed a 3-phase test generation based on the above proposed methods. An effective test generation is implemented by 3-phase: 1) test generation for STFs, 2) test generation for ESTFs, and 3) test generation using an asynchronous product machine traversal method. Experimental results also show that the proposed 3-phase test generation achieves higher fault coverage in cost of longer execution time.
Tomohiro YONEDA Eric MERCER Chris MYERS
This paper develops a modular synthesis algorithm for timed circuits that is dramatically accelerated by partial order reduction. This algorithm synthesizes each module in a hierarchical design individually. It utilizes partial order reduction to reduce the state space explored for the other modules by considering a single interleaving of concurrently enabled transitions. This approach better manages the state explosion problem resulting in a more than 2 order of magnitude reduction in synthesis time. The improved synthesis time enables the synthesis of a larger class of timed circuits than was previously possible.
Munehiro MATSUURA Tsutomu SASAO Jon T. BUTLER Yukihiro IGUCHI
A shared binary decision diagram (SBDD) represents a multiple-output function, where nodes are shared among BDDs representing the various outputs. A partitioned SBDD consists of two or more SBDDs that share nodes. The separate SBDDs are optimized independently, often resulting in a reduction in the number of nodes over a single SBDD. We show a method for partitioning a single SBDD into two parts that reduces the node count. Among the benchmark functions tested, a node reduction of up to 23% is realized.
Shinji KIMURA Atsushi ISHII Takashi HORIYAMA Masaki NAKANISHI Hirotsugu KAJIHARA Katsumasa WATANABE
The paper describes the folding method of logic functions to reduce the size of memories to keep the functions. The folding is based on the relation of fractions of logic functions. If the logic function includes 2 or 3 same parts, then only one part should be kept and other parts can be omitted. We show that the logic function of 1-bit addition can be reduced to half size using the bit-wise NOT relation and the bit-wise OR relation. The paper also introduces 3-1 LUT's with the folding mechanism. A full adder can be implemented using only one 3-1 LUT with the folding. Multi-bit AND and OR operations can be mapped to our LUT's not using the extra cascading circuit but using the carry circuit for addition. We have also tested the mapping capability of 4 input functions to our 3-1 LUT's with folding and carry propagation mechanisms. We have shown the reduction of the area consumption when using our LUT's compared to the case using 4-1 LUT's on several benchmark circuits.
Takashi HIRAYAMA Yasuaki NISHITANI Toru SATO
It has been considered difficult to obtain the minimum AND-EXOR expression of a given function with six variables in a practical computing time. In this paper, a faster algorithm of minimizing AND-EXOR expressions is proposed. We believe that our algorithm can compute the minimum AND-EXOR expressions of any six-variable and some seven-variable functions practically. In this paper, we first present a naive algorithm that searches the space of expansions of a given n-variable function f for a minimum expression of f. The space of expansions are generated by using all combinations of (n-1)-variable product terms. Then, how to prune the branches in the search process and how to restrict the search space to obtain the minimum solutions are discussed as the key point of reduction of the computing time. Finally a faster algorithm is constructed by using the methods discussed. Experimental results to demonstrate the effectiveness of these methods are also presented.
Functional decomposition is an essential technique of logic synthesis and is important especially for FPGA design. Bertacco and Damiani proposed an efficient algorithm finding simple disjoint decomposition using Binary Decision Diagrams (BDDs). However, their algorithm is not complete and does not find all the decompositions. This paper presents a complete theory of simple disjoint decomposition and describes an efficient algorithm using BDDs.
Chin-Ngai SZE Wangning LONG Yu-Liang WU Jinian BIAN
In this paper, we present a novel algorithm to the alternative wiring problem by analyzing the implication relationship between nodes of alternative wires. Alternative wiring, or rewiring, refers to the process of adding a redundant connection to a circuit so as to make a target connection redundant and removable from the circuit without altering the functionality of the circuit. The well-known ATPG-based alternative wiring scheme, Redundancy Addition and Removal for Multi-level Boolean Optimization (RAMBO), has shown its effectiveness in solving the problem in the last decade. But, the deficiency of RAMBO lies in its long execution time for redundancy identification among a large set of candidate alternative wires. Our approaches of redundancy identification by source node and destination node implication relationship indicate that a large subset of unnecessary redundancy check processes can be further avoided to improve the efficiency significantly. We propose an algorithm, the Implication Based Alternative Wiring Logic Transformation (IBAW), to integrate the two adroit techniques. IBAW provides a competent solution to the alternative wiring problem and shows an outstanding efficiency in our experiments. Experiments were performed on MCNC benchmark circuits. Results show that IBAW runs 6.8 times faster than the original RAMBO in locating alternative wires and solution quality is maintained.
Hirofumi HAMAMURA Hiroaki KOMATSU
This paper describes special-purpose hardware for large-scale logic simulation, called SP2, which executes an event driven algorithm and can simulate up to sixteen million gates. SP2 was developed, in 1992, for system verification of large-scale computer designs as a successor to SP1, which was developed in 1987. SP2 provides enhanced performance, throughput, and delay accuracy over SP1. Since 1992, SP2 has been widely used for system-level simulation of mainframes, super computers, UNIX servers and microprocessors. It is used as a powerful simulator, in all stages of design verification, or in early stages, before regression testing, by using emulators.
Keiichi KUROKAWA Takuya YASUI Yoichi MATSUMURA Masahiko TOYONAGA Atsushi TAKAHASHI
In several researches in recent years, it is shown that the circuit of a higher clock frequency can be obtained by controlling the clock-input timing of each register. However, the power consumption of the clock-tree obtained by them tends to be larger since the locations of registers are not well taken into account in clock scheduling. In this paper, we propose a novel clock tree synthesis that attains both the higher clock frequency and the lower power consumption. Our proposed algorithm determines the clock-input timings of registers step by step in constructing a clock tree structure. First, the clock period of a circuit is improved by controlling the clock-input timing of each register, and second, the clock-input timings are modified to construct a low power clock tree without deteriorating the obtained clock period. According to our experiments using several benchmark circuits, the power consumption of our clock trees attain about 9.5% smaller than previous methods.
Makoto SAITOH Masaaki AZUMA Atsushi TAKAHASHI
We introduce a clock schedule algorithm to obtain a clock schedule that achieves a shorter clock period and that can be realized by a light clock tree. A shorter clock period can be achieved by controlling the clock input timing of each register, but the required wire length and power consumption of a clock tree tends to be large if clock input timings are determined without considering the locations of registers. To overcome the drawback, our algorithm constructs a cluster that consists of registers with the same clock input timing located in a close area. The registers in each cluster are driven by a buffer and a shorter wire length can be achieved. In our algorithm, first registers are partitioned into clusters by their locations, and clusters are modified to improve the clock period while maintaining the radius of each cluster small. In our experiments, the clock period achieved in average is about 13% shorter than that achieved by a zero-skew clock tree, and about 4% longer than the theoretical minimum. The wire length and power consumption of a clock tree according to an obtained clock schedule is comparable to these of a zero skew tree.
As a remarkable development of VLSI technology, a gate switching delay is reduced and a signal delay of a net comes to have a considerable effect on the clock period. Therefore, it is required to minimize signal delays in digital VLSIs. There are a number of ways to evaluate a signal delay of a net, such as cost, radius, and Elmore's delay. Delays of those models can be computed in linear time. Elmore's delay model takes both capacitance and resistance into account and it is often regarded as a reasonable model. So, it is important to investigate the properties of this model. In this paper, we investigate the properties of the model and construct a heuristic algorithm based on these properties for computing a wiring of a net to minimize the interconnection delay. We show the effectiveness of our proposed algorithm by comparing ERT algorithm which is proposed in [2] for minimizing the maximum Elmore's delay of a sink. Our proposed algorithm decreases the average of the maximum Elmore's delay by 10-20% for ERT algorithm. We also compare our algorithm with an O(n4) algorithm proposed in [15] and confirm the effectiveness of our algorithm though its time complexity is O(n3).
Shinya YAMASAKI Shingo NAKAYA Shin'ichi WAKABAYASHI Tetsushi KOIDE
In this paper, we propose a floorplanning method for VLSI building block layout. The proposed method produces a floorplan under the timing constraint for a given netlist. To evaluate the wiring delay, the proposed method estimates the global routing cost for each net with buffer insertion and wire sizing. The slicing structure is adopted to represent a floorplan, and the Elmore delay model is used to estimate the wiring delay. The proposed method is based on simulated annealing. To shorten the computation time, a table look-up method is adopted to calculate the wiring delay. Experimental results show that the proposed algorithm performs well for producing satisfactory floorplans for industrial data.
Chikaaki KODAMA Kunihiro FUJIYOSHI
The sequence-pair was proposed as a representation method of block placement to determine the densest possible placement of rectangular modules in VLSI layout design. A method of achieving bottom left corner packing in O(n2) time based on a given sequence-pair of n rectangles was proposed using horizontal/vertical constraint graphs. Also, a method of determining packing from a sequence-pair in O(n log n) time was proposed. Another method of obtaining packing in O(n log log n) time was recently proposed, but further improvement was still required. In this paper, we propose a method of obtaining packing via the Q-sequence (representation of rectangular dissection) in O(n+k) time from a given sequence-pair of n rectangles with k subsequences called adjacent crosses, given the position of adjacent crosses and the insertion order of dummy modules into adjacent crosses. The position of adjacent crosses and insertion order of dummy modules can be obtained from a sequence-pair in O(n+k) time using the conventional method. Here, we prove that arbitrary packing can be represented by a sequence-pair, keeping the value of k not more than n-3. Therefore, we can determine packing from a sequence-pair with k of O(n) in linear time using the proposed method and the conventional method.
Takahiro KAKIMOTO Hiroyuki OCHI Takao TSUDA
As a design flow for low-power FPGA implementation, Datapath-Layout-Driven Design (DLDD) has been proposed. This letter reports the effect of DLDD for standard-cell-based ASIC implementation, and proposes necessary improvements. Experimental results shows that about 8.3% reduction of power dissipation is achieved in the best case.
Masanori HASHIMOTO Hidetoshi ONODERA
This paper discusses a statistical effect of performance optimization to uncertainty in circuit delay. Performance optimization has an effect of balancing the delay of each path in a circuit, i.e. the delay times of long paths are shortened and the delay times of short paths are lengthened. In these path-balanced circuits, the uncertainty in circuit delay, which is caused by delay calculation error, manufacturing variability, fluctuation of operating condition, etc., becomes worse by a statistical characteristic of circuit delay. Thus, a highly-optimized circuit may not satisfy delay constraints. In this paper, we demonstrate some examples that uncertainty in circuit delay is increased by path-balancing, and we then raise a problem that performance optimization increases statistically-distributed circuit delay.
Masahiro UMEHIRA Takatoshi SUGIYAMA
OFDM (Orthogonal Frequency Division Multiplexing) and CDMA (Code Division Multiple Access) are being used to enable broadband mobile wireless access under severe multipath fading in IMT-2000 and 5 GHz band WLAN (Wireless Local Area Networks), respectively. Both of them are expected to play important roles in future broadband mobile communication systems such as fourth generation cellular and next generation broadband WLAN. This paper overviews the features of OFDM and CDMA technologies and discusses their roles in future broadband mobile communication systems. It suggests an OFDM/CDMA approach combined with link adaptation and SDM (Space Division Multiplexing) over MIMO (Multiple Input Multiple Output) channel to achieve high transmission rate and to improve frequency utilization efficiency for high system capacity.
Hsi-Pin MA Steve Hengchen HSU Tzi-Dar CHIUEH
This paper presents architecture design, FPGA implementation, and measurement results of a real-time signal processing circuit for WCDMA uplink baseband receiver. To enhance uplink signal-to-interference-plus-noise ratio (SINR) performance, a four-element antenna array and a four-finger Rake combiner are integrated in the proposed receiver. Moreover, a low-complexity beamforming architecture using a correlator-based beam searcher, a decision-directed carrier synchronization loop, and a matched-filter based channel estimator is also designed. Simulations are based on the standard Doppler-fading scalar channel models provided by 3GPP and an extension to vector channel models that specify angle of arrival for each path is also made for beamformer simulation. Simulation and hardware emulation results show that the proposed architecture meets the specified requirements. In addition, this architecture, with its correlator-based beamformer weights, achieves such performance improvement with relatively low hardware complexity.
Kazutoshi SUGIMOTO Hiraku OKADA Takaya YAMAZATO Masaaki KATAYAMA
In narrow band power-line communication (PLC) systems, which use frequency band below a few hundred kHz, the noise on power-line is non-white and non-stationary. Under such environment, the performance of Orthogonal Frequency Division Multiplex (OFDM) modulation system is analyzed, and time and frequency dependence of bit error rate (BER) is clarified. In addition, the possibility of performance improvement with the symbol level repetition coding employing cyclo-stationary feature of power-line noise is presented.
Masaaki FUJIYOSHI Takashi TACHIBANA Hitoshi KIYA
A novel data embedding method for high-quality images, e.g., an image with a peak signal-to-noise ratio of better than 60 [dB] is proposed in this paper. The proposed method precisely generates a watermarked image of the desired and high quality for any images. To do this, this method considers the finite word-length of a luminance value of pixels, i.e., both quantization errors and the range limitation of luminance. The proposed method embeds a watermark sequence, modulated by the mechanism of a spread spectrum scheme, into the dc values of an image in the spatial domain. By employing spread spectrum technology as well as embedding a watermark into the dc values, this method guarantees the high image quality and, simultaneously, provides adequate JPEG tolerance.
Hiroyasu ISHIKAWA Naoki FUKE Keizo SUGIYAMA Hideyuki SHINONAGA
A wireless communications system with a transmission speed of 18 Mbit/s is presented using the 2.4 GHz ISM band. This system employs the
Ken-ichi TAKIZAWA Shigenobu SASAKI Jie ZHOU Shogo MURAMATSU Hisakazu KIKUCHI
In this paper, an online SNR estimator is proposed for parallel combinatorial SS (PC/SS) systems in Nakagami fading channels. The PC/SS systems are called as partial-code-parallel multicode DS/SS systems, which have the higher-speed data transmission capability comparing with conventional multicode DS/SS systems referred to as all-code-parallel systems. We propose an SNR estimator based on a statistical ratio of correlator outputs at the receiver. The SNR at the correlator output is estimated through a simple polynomial from the statistical ratio. We investigate the SNR estimation accuracy in Nakagami fading channels through computer simulations. In addition, we apply it to the convolutional coded PC/SS systems with iterative demodulation and decoding to evaluate the estimation performance from the viewpoint of error rate. Numerical results show that the PC/SS systems with the proposed SNR estimator have superior estimation performance to conventional DS/SS systems. It is also shown that the bit error rate performance using our SNR estimation method is close to the performance with perfect knowledge of channel state information in Nakagami fading channels and correlated Rayleigh fading channels.
Noriyoshi SUZUKI Hideyuki UEHARA Mitsuo YOKOYAMA
In an orthogonal frequency division multiplexing (OFDM) system, the bit error performance is degraded in the presence of multiple propagation paths whose excess delays are longer than the Guard Interval (GI), because the orthogonality between subcarriers cannot be maintained. In this paper, we propose a new OFDM demodulation method with a variable-length effective symbol and a multi-stage inter-carrier interference (ICI) canceller, in order to improve the bit error performance in the presence of multipaths whose excess delays are longer than the GI. The influence of the inter-symbol interference (ISI) is eliminated by the variable-length effective symbol, and then the ICI component is reduced by the multi-stage ICI canceller. The principle of the proposed method is explained, and the performance of the proposed method is then evaluated by computer simulation. The results show that the proposed method improves the system availability under more various multipath fading environments without changing the system parameters.
Yasuhiro HIRAYAMA Hiraku OKADA Takaya YAMAZATO Masaaki KATAYAMA
In this paper, we propose a medium access control (MAC) protocol for multimedia code division multiple access (CDMA) communications. In the proposed protocol, a base station (BS) estimates the instantaneous number of simultaneously transmitted packets in the future slots with exploiting a stochastic property of traffic. In order to carry out this estimation, we employ an adaptive algorithm. We evaluate the performance of the proposed protocol by comparing that with two different cases. One is no estimation case and the other is perfect estimation case. From these results, we clarify the advantage of the proposed MAC protocol.
Wuk KIM Jang-Gyu LEE Gyu-In JEE
A hybrid location system for a mobile station consists of a wireless-assisted GPS and a kind of cellular signals. This letter presents a location estimator improving the performance of the hybrid mobile station location for all terrain environments including inside or between buildings. An estimation structure eliminating non-line-of-sight propagation-delay error effectively improves location accuracy of the hybrid location system.
In this paper, we propose and describe a new synchronizer for the FFT timing applicable to multi-carrier spread-spectrum (MC-SS) communication systems. The performance of the synchronizer is evaluated in terms of false- and miss-detection probabilities in the presence of additive white Gaussian noise (AWGN) and Rayleigh fading.
Takashi KURASHINA Satomi OGAWA Kenzo WATANABE
This paper presents a second-generation CMOS current conveyor (CCII) consisting of a rail-to-rail complementary n- and p-channel differential input stage for the voltage input, a class AB push-pull stage for the current input, and current mirrors for the current outputs. The CCII was implemented using a double-poly triple-metal 0.6 µm n-well CMOS process, to confirm its operation experimentally. A prototype chip achieves a rail-to-rail swing
Yin-He SU Ching-Hwa CHENG Shih-Chieh CHANG
The purpose of a testability analysis program is to estimate the difficulty of testing a fault. A good measurement can give an early warning about the testing problem so as to provide guidance in improving the testability of a circuit. There have been researches attempting to efficiently compute the testability analysis. Among those, the Controllability and Observability Procedure COP can calculate the testability value of a stuck-at fault efficiently in a tree-structured circuit but may be very inaccurate for a general circuit. The inaccuracy in COP is due to the ignorance of signal correlations. Recently, the algorithm of TAIR in [5] proposes a testability analysis algorithm, which starts from the result of COP and then gradually improves the result by applying a set of rules. The set of rules in TAIR can capture some signal correlations and therefore the results of TAIR are more accurate than COP. In this paper, we first prove that the rules in TAIR can be replaced by a closed-form formulation. Then, based on the closed-form formulation, we proposed two novel techniques to further improve the testability analysis results. Our experimental results have shown improvement over the results of TAIR.
Yoshihiro KANEKO Shoji SHINODA
A problem of obtaining an optimal file transfer of a file transmission net N is to consider how to transmit, with the minimum total cost, copies of a certain file of information from some vertices, called sources, to other vertices of N by the respective vertices' copy demand numbers. This problem is NP-hard for a general file transmission net N. Some classes of N, on each of which a polynomial time algorithm for obtaining an optimal file transfer can be designed, are known. In the characterization, we assumed that file given originally to the source remains at the source without being transmitted. In this paper, we relax the assumption to the one that a sufficient number of copies of the file are given to the source and those copies can be transmitted from the source to other vertices on N. Under this new assumption, we characterize a class of file transmission nets, on each of which a polynomial time algorithm for obtaining an optimal file transfer can be designed. A minimum spanning tree with degree constraints plays a key role in the algorithm.
Hiroaki SUZUKI Tadashi DOHI Hiroyuki OKAMURA
In this paper, we consider the similar software cost models with periodic rejuvenation to Garg, Puliafito, Telek and Trivedi (1995) under the cost effectiveness criteria. First, an alternative model as well as the original one are analyzed by Markov regenerative processes. We derive analytically the optimal periodic software rejuvenation policies which maximize the cost-effectiveness in the steady state for two models. Further, we develop statistical non-parametric algorithms to estimate the optimal software rejuvenation policies, provided that the sample data to characterize the system failure times are given. Then, the total time on test (TTT) concept is used. In numerical examples, we compare the periodic software rejuvenation policy with the non-periodic one, and investigate the asymptotic properties of the non-parametric estimators for the optimal software rejuvenation policies through a simulation experiment.
Shigeru YOSHIDA Takashi MORIHARA Hironori YAHAGI Noriko ITANI
16-bit Asian language codes can not be compressed well by conventional 8-bit sampling text compression schemes. Previously, we reported the application of a word-based text compression method that uses 16-bit sampling for the compression of Japanese texts. This paper describes our further efforts in applying a word-based method with a static canonical Huffman encoder to both Japanese and Chinese texts. The method was proposed to support a multilingual environment, as we replaced the word-dictionary and the canonical Huffman code table for the respective language appropriately. A computer simulation showed that this method is effective for both languages. The obtained compression ratio was a little less than 0.5 without regarding the Markov context, and around 0.4 when accounting for the first order Markov context.
A blind equalizer which uses the differential constant modulus algorithm (DCMA) is introduced. An anchored FIR equalizer applied to a first-order autoregressive channel and updated according to the DCMA is shown to converge to the inverse of that channel regardless of the initial tap-weights and the gain along the direct path.
Migyoung JUNG Gueesang LEE Sungju PARK Rolf DRECHSLER
OPKFDDs (Ordered Pseudo-Kronecker Functional Decision Diagrams) are a data structure that provides compact representation of Boolean functions. The size of OPKFDDs depends on a variable ordering and on decomposition type choices. Finding an optimal representation is very hard and the size of the search space is n!
Noritaka KOBAYASHI Tatsuhiro TSUCHIYA Tohru KIKUNO
2-Factor covering designs, a type of combinatorial designs, have recently received attention since they have industrial applications including software testing. For these applications, even a small reduction on the size of a design is significant, because it directly leads to the reduction of testing cost. In this letter, we report ten new designs that we constructed, which improve on the previously best known results.