Fawnizu Azmadi HUSSIN Tomokazu YONEDA Hideo FUJIWARA
The IEEE 1500 standard wrapper requires that its inputs and outputs be interfaced directly to the chip's primary inputs and outputs for controllability and observability. This is typically achieved by providing a dedicated Test Access Mechanism (TAM) between the wrapper and the primary inputs and outputs. However, when the embedded Network-on-Chip (NoC) interconnect is reused instead of a dedicated TAM, the standard wrapper cannot be used as is because of the packet-based transfer mechanism and other functional requirements imposed by the NoC. In this paper, we describe two NoC-compatible wrappers that overcome these limitations of the 1500 wrapper. The wrappers (Type 1 and Type 2) complement each other to optimize NoC bandwidth utilization while minimizing the area overhead. The Type 2 wrapper incurs a larger area overhead to increase bandwidth efficiency, while Type 1 takes advantage of some special configurations that may not require a complex, high-cost wrapper. Two wrapper optimization algorithms are applied to both wrapper designs under channel-bandwidth and test-time constraints, resulting in very little or no increase in the test application time compared to conventional dedicated TAM approaches.
In this paper, we investigate the proportional fair scheduling (PFS) problem for multiuser OFDM systems, considering the impact of packet length. Packet length influences scheduling in that each scheduled packet must be transmitted completely within the scheduled frames. We formulate the PFS problem as an optimization problem. Based on observations on the structure of optimal solutions, we propose a heuristic scheduling algorithm that consists of two stages. In the first stage, subcarriers are allocated among users without considering the packet length constraint. In the second stage, subcarriers are readjusted so that surplus subcarriers from length-satisfied users are released and allocated among length-unsatisfied users. The objective is to provide proportional fairness among users while guaranteeing complete transmission of each scheduled packet. Simulation results show that the proposed scheme performs close to the optimal scheme in terms of Multi-carrier Proportional Fairness Measure (MCPFM), throughput and average packet delay.
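The two stages described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name, the per-subcarrier rate table, the use of the classic proportional-fair metric (achievable rate divided by average throughput) in both stages, and the rule of releasing the lowest-rate subcarriers first are all assumptions made for illustration.

```python
def two_stage_allocate(rates, avg_throughput, packet_bits):
    """Hypothetical two-stage subcarrier allocation.

    rates[k][c]       -- bits user k can send on subcarrier c in the frame
    avg_throughput[k] -- past average throughput of user k (PF denominator)
    packet_bits[k]    -- length of the packet user k must send completely
    Returns owner[c], the user holding subcarrier c (None if left idle).
    """
    users = range(len(rates))
    n_sub = len(rates[0])

    # Stage 1: ignore packet lengths; each subcarrier goes to the user
    # with the largest proportional-fair metric on that subcarrier.
    owner = [max(users, key=lambda k: rates[k][c] / avg_throughput[k])
             for c in range(n_sub)]

    def bits(k):
        return sum(rates[k][c] for c in range(n_sub) if owner[c] == k)

    # Stage 2a: length-satisfied users release surplus subcarriers,
    # weakest first, as long as their packet still fits.
    released = []
    for k in users:
        mine = sorted((c for c in range(n_sub) if owner[c] == k),
                      key=lambda c: rates[k][c])
        total = bits(k)
        for c in mine:
            if total - rates[k][c] >= packet_bits[k]:
                total -= rates[k][c]
                owner[c] = None
                released.append(c)

    # Stage 2b: hand released subcarriers to length-unsatisfied users.
    for c in released:
        needy = [k for k in users if bits(k) < packet_bits[k]]
        if not needy:
            break  # remaining released subcarriers stay idle in this sketch
        owner[c] = max(needy, key=lambda k: rates[k][c] / avg_throughput[k])
    return owner


# Two users, four subcarriers: user 0 wins everything in stage 1,
# then gives up two subcarriers so that user 1 can finish its packet.
owner = two_stage_allocate([[10, 10, 10, 10], [4, 4, 4, 4]],
                           [1.0, 1.0], [20, 8])
```

In this toy case both packets fit after readjustment; a real scheduler would also need a policy for users whose packets cannot be completed in the frame.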
Po-Hsun CHENG Sao-Jie CHEN Jin-Shin LAI Feipei LAI
This paper illustrates a feasible health informatics domain knowledge management process that helps gather useful technology information and reduces knowledge misunderstandings among the engineers who participated in the IBM mainframe rightsizing project at National Taiwan University (NTU) Hospital. We design an asynchronous sharing mechanism to facilitate knowledge transfer, and our health informatics domain knowledge management process can be used to publish and retrieve documents dynamically. It effectively creates an acceptable discussion environment and even lessens the traditional meeting burden among development engineers. An overall description of the current software development status is presented. Then, the knowledge management implementation of health information systems is proposed.
Nobuo KARAKI Takashi NANMOTO Satoshi INOUE
This paper presents an asynchronous design technique, an enabler for the emerging technology of flexible microelectronics that feature low-temperature processed polysilicon (LTPS) thin-film transistors (TFTs) and surface-free technology by laser annealing/ablation (SUFTLA®). The first design instance chosen is an 8-bit microprocessor. LTPS TFTs are well suited to realizing displays with integrated VLSI circuits at lower cost. However, LTPS TFTs have drawbacks, including substantial deviations in characteristics and the self-heating phenomenon. To solve these problems, the authors adopted the asynchronous circuit design technique and developed an asynchronous design language called Verilog+, which is based on a subset of Verilog HDL® and includes minimal primitives for describing the communications between modules, together with dedicated tools including a translator called xlator and a synthesizer called ctrlsyn. The flexible 8-bit microprocessor operates stably at 500 kHz, drawing 180 µA from a 5 V power source. The microprocessor's electromagnetic emissions are 21 dB lower than those of its synchronous counterpart.
Shinpei HAYASHI Junya KATADA Ryota SAKAMOTO Takashi KOBAYASHI Motoshi SAEKI
One approach to improving program understanding is to extract which design patterns are used in existing object-oriented software. This paper proposes a technique for efficiently and accurately detecting occurrences of design patterns in source code. We use both static and dynamic analyses to achieve highly accurate detection. Moreover, to reduce computation and maintenance costs, detection conditions are specified hierarchically, based on Pree's meta patterns as common structures of design patterns. Using Prolog to represent the detection conditions enables us to add and modify them easily. Finally, we have implemented an automated tool as an Eclipse plug-in and conducted experiments with Java programs. The experimental results show the effectiveness of our approach.
Hamid NOORI Maziar GOUDARZI Koji INOUE Kazuaki MURAKAMI
Energy consumption is a major concern in embedded computing systems. Several studies have shown that cache memories account for 40% or more of the total energy consumed in these systems. Active power used to be the primary contributor to the total power dissipation of CMOS designs, but with technology scaling, the share of leakage in the total power consumption of digital systems continues to grow. Moreover, leakage current increases exponentially with temperature. In this paper, we show the effect of temperature on the optimal (minimum-energy-consuming) cache configuration for low-energy embedded systems. Our results show that for a given application and technology, the optimal cache size moves toward smaller caches at higher temperatures, due to the larger leakage. Consequently, a Temperature-Aware Configurable Cache (TACC) is an effective way to save energy in finer technologies when the embedded system is used at different temperatures. Our results show that, using a TACC, up to 61% of the energy can be saved for the instruction cache and 77% for the data cache compared to a configurable cache that has been configured for only the corner-case temperature (100°C). Furthermore, the TACC also enhances performance by up to 28% for the instruction cache and up to 17% for the data cache.
Sumek WISAYATAKSIN Dongju LI Tsuyoshi ISSHIKI Hiroaki KUNIEDA
We propose a low-cost, stand-alone platform-based SoC for an H.264/AVC decoder, targeting practical mobile applications such as handheld video players. Both low cost and stand-alone operation are particularly emphasized. The SoC, consisting of a RISC core and a decoder core, has advantages in terms of flexibility, testability and various I/O interfaces. For the decoder core design, the proposed H.264/AVC coprocessor in the SoC employs a new block pipelining scheme instead of a conventional macroblock or hybrid one, which drastically reduces the size of the core and its pipelining buffer. In addition, the decoder schedule is optimized at the block level, which makes it easy to program. As a result, the core size is reduced to 138 Kgates with 3.5 kbytes of memory. In our practical development, a single external SDRAM is sufficient for both the reference frame buffer and the display buffer. Various peripheral interfaces, such as compact flash, a digital broadcast receiver and an LCD driver, are also provided on chip.
Shin-ichi OHKAWA Hiroo MASUDA Yasuaki INOUE
We have proposed a random curved surface model as a new mathematical concept which enables the expression of spatial correlation. The model gives us an appropriate methodology to deal with the systematic components of device variation in an LSI chip. The key idea of the model is the fitting of a polynomial to an array of Gaussian random numbers. The curved surface is expressed by a new extension from the Legendre polynomials to form two-dimensional formulas. The formulas were proven to be suitable to express the spatial correlation with reasonable computational complexity. In this paper, we show that this approach is useful in analyzing characteristics of device variation of actual chips by using experimental data.
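The key idea, fitting a polynomial surface to an array of Gaussian random numbers, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the two-dimensional formulas, so the tensor-product basis P_i(x)P_j(y), the grid mapping to [-1, 1], and the function names are all hypothetical choices, and the fit is an ordinary least-squares fit solved through the normal equations.

```python
import random


def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_surface(grid, degree):
    """Least-squares fit of sum_ij c[i, j] * P_i(x) * P_j(y) to a square grid.

    grid[row][col] is sampled at y = coords[row], x = coords[col],
    with coords spread uniformly over [-1, 1] (an assumed mapping).
    """
    size = len(grid)
    coords = [-1 + 2 * k / (size - 1) for k in range(size)]
    terms = [(i, j) for i in range(degree + 1) for j in range(degree + 1)]
    rows, rhs = [], []
    for yi, y in enumerate(coords):
        for xi, x in enumerate(coords):
            rows.append([legendre(i, x) * legendre(j, y) for i, j in terms])
            rhs.append(grid[yi][xi])
    n = len(terms)
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(n)] for p in range(n)]
    atb = [sum(r[p] * v for r, v in zip(rows, rhs)) for p in range(n)]
    return dict(zip(terms, solve(ata, atb)))


# Fit a low-order surface to an 8 x 8 array of Gaussian random numbers.
rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(8)]
coeffs = fit_surface(noise, degree=2)
```

Because the fitted surface is smooth, neighbouring grid points receive similar values, which is one way to see how such a model can express spatial correlation.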
Hoojin LEE Jeffrey G. ANDREWS Edward J. POWERS
Space-time block codes (STBCs) from coordinate interleaved orthogonal designs (CIODs) have attracted a great deal of attention due to their full-diversity and linear maximum likelihood (ML) decodability. In this letter, we propose a simple detection technique, particularly for full-rate STBCs from CIODs to overcome the performance degradation caused by time-selective fading channels. Furthermore, we evaluate the effects of time-selective fading channels and imperfect channel estimation on STBCs from CIODs by using a newly-introduced index, the results of which demonstrate that full-rate STBCs from CIODs are more robust against time-selective fading channels than conventional full-rate STBCs.
The present paper describes a method for the construction of a zero-correlation zone sequence set from a perfect sequence. Both the cross-correlation function and the side-lobe of the auto-correlation function of the proposed sequence sets are zero for phase shifts within the zero-correlation zone. These sets can be generated from an arbitrary perfect sequence, the length of which is the product of a pair of odd integers ((2n+1)(2k+1) for k ≥ 1 and n ≥ 0). The proposed sequence construction method can generate an optimal zero-correlation zone sequence set that achieves the theoretical bound on the number of sequences in the set, given the size of the zero-correlation zone and the sequence period. The peak in the out-of-phase correlation function of the constructed sequences is restricted to be lower than half the power of the sequence itself. The proposed sequence sets could provide CDMA communication without co-channel interference or, in an ultrasonic synthetic aperture imaging system, improve the signal-to-noise ratio of the acquired image.
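For readers unfamiliar with the starting point, a perfect sequence is one whose periodic autocorrelation is zero at every nonzero shift. The sketch below checks this property numerically for a Zadoff-Chu sequence of length 15 = 3 × 5, a well-known perfect sequence chosen here only as an example of an odd-by-odd length. It does not reproduce the paper's zero-correlation zone construction, and the function names are illustrative.

```python
import cmath


def zadoff_chu(length, root=1):
    """Zadoff-Chu sequence of odd length; root must be coprime to length."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def periodic_correlation(x, y, shift):
    """Periodic cross-correlation of x and y at the given shift."""
    n = len(x)
    return sum(x[(i + shift) % n] * y[i].conjugate() for i in range(n))


def is_perfect(seq, tol=1e-9):
    """True if every out-of-phase periodic autocorrelation value is ~0."""
    return all(abs(periodic_correlation(seq, seq, s)) < tol
               for s in range(1, len(seq)))


seq = zadoff_chu(15)                          # length 15 = 3 * 5
peak = periodic_correlation(seq, seq, 0)      # in-phase value: the energy
perfect = is_perfect(seq)
```

The same `periodic_correlation` helper could be used to verify a zero-correlation zone by restricting the checked shifts to the zone and also testing pairs of distinct sequences.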
Wilaiporn LEE Suwich KUNARUTTANAPRUK Somchai JITAPUNKUL
This paper proposes a novel technique for designing the optimum pulse shape for ultra wideband (UWB) systems in the presence of timing jitter. In UWB systems, pulse transmission power and timing jitter tolerance are both crucial to successful communication. While there is a strong desire to maximize both of them, one must be traded off against the other. In the literature, much effort has been devoted to optimizing each of them separately, without considering the penalty to the other. In this paper, both factors are considered jointly. The proposed pulse attains adequate power to survive the noise floor and at the same time provides good resistance to timing jitter. The proposed pulse also meets the power spectral mask restriction prescribed by the Federal Communications Commission (FCC) for indoor UWB systems. Simulation results confirm the advantages of the proposed pulse over previously known UWB pulses. Parameters of the proposed optimization algorithm are also investigated in this paper.
Areeyata SRIPETCH Poompat SAENGUDOMLERT
In a power grid used to distribute electricity, optical fibers can be inserted inside overhead ground wires to form an optical network infrastructure for data communications. Dense wavelength division multiplexing (DWDM)-based optical networks present a promising approach to achieving a scalable backbone network for power grids. This paper proposes a complete optimization procedure for optical network designs based on an existing power grid. We design a network as a subgraph of the power grid and divide the network topology into two layers: backbone and access networks. The design procedure includes physical topology design, routing and wavelength assignment (RWA) and optical amplifier placement. We formulate the topology design problem in two steps: selecting the concentrator nodes and their node members, and finding the connections among concentrators subject to the two-connectivity constraint on the backbone topology. Selection and connection of concentrators are done using integer linear programming (ILP). We solve the RWA and optical amplifier placement problems together since they are closely related. Since the ILP for solving these two problems becomes intractable with increasing network size, we propose a simulated annealing approach. We choose a neighborhood structure based on path-switching operations using k shortest paths for each source and destination pair. The optimal number of optical amplifiers is found by local search among these neighbors. We present numerical results for several randomly generated power grid topologies.
Thomas Edison YU Tomokazu YONEDA Danella ZHAO Hideo FUJIWARA
The rapid advancement of VLSI technology has made it possible for chip designers and manufacturers to embed the components of a whole system onto a single chip, called a System-on-Chip or SoC. SoCs make use of pre-designed modules, called IP cores, which shorten design time and time-to-market. Furthermore, SoCs that operate with multiple clock domains and under very low power requirements are being used in the latest communications, networking and signal processing devices. As a result, the testing of SoCs and multi-clock domain embedded cores under power constraints has been rapidly gaining importance. In this research, a novel method for designing power-aware test wrappers for embedded cores with multiple clock domains is presented. By effectively partitioning the various clock domains, we are able to increase the solution space of possible test schedules for the core. Previous methods were limited to testing all the clock domains concurrently; we remove this limitation by making use of bandwidth conversion, multiple shift frequencies and proper gating of the clock signals to control the shift activity of various core logic elements. The combination of these techniques gives us greater flexibility when determining an optimal test schedule under very tight power constraints. Furthermore, since it is computationally intensive to search the entire expanded solution space for possible test schedules, we propose a heuristic 3-D bin packing algorithm to determine the optimal wrapper architecture and test schedule while minimizing the test time under power and bandwidth constraints.
Masato NAKAZATO Michiko INOUE Satoshi OHTAKE Hideo FUJIWARA
In this paper, we propose a design-for-testability method for test programs of software-based self-test using test program templates. Software-based self-test using templates suffers from error masking, where some faults detected during test generation for a module are not detected by the test program synthesized from the test. The proposed method achieves 100% template-level fault efficiency, that is, it completely avoids error masking. Moreover, the proposed method causes no performance degradation (it adds only observation points) and enables at-speed testing.
Jihyung KIM Sangho NAM Dongjun LEE Jonghan KIM Jongae PARK Daesik HONG
In this letter, we propose a new preamble structure for channel estimation in a MIMO OFDM-based WLAN system. Both backward compatibility with IEEE 802.11a and low overhead are considered in designing the preamble. Simulation results show that the proposed preamble has low overhead and good performance gain for channel estimation.
Suhua TANG Naoto KADOWAKI Sadao OBANA
In this paper we analyze the characteristics of vehicle mobility and propose a novel Mobility Prediction Progressive Routing (MP2R) protocol for Inter-Vehicle Communication (IVC) that is based on cross-layer design. MP2R utilizes the additional gain provided by the directional antennas to improve link quality and connectivity; interference is reduced by the directional transmission. Each node learns its own position and speed and that of other nodes, and performs position prediction. (i) With the predicted progress and link quality, the forwarding decision of a packet is locally made, just before the packet is actually transmitted. In addition the load at the forwarder is considered in order to avoid congestion. (ii) The predicted geographic direction is used to control the beam of the directional antenna. The proposed MP2R protocol is especially suitable for forwarding burst traffic in highly mobile environments. Simulation results show that MP2R effectively reduces Packet Error Ratio (PER) compared with both topology-based routing (AODV [1], FSR [2]) and normal progressive routing (NADV [18]) in the IVC scenarios.
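The prediction-based forwarding decision can be sketched roughly as below. This is a simplified illustration, not the MP2R protocol itself: the constant-velocity prediction, the score (predicted progress toward the destination multiplied by a link-quality value between 0 and 1), the dictionary layout, and all names are assumptions, and forwarder load is left out.

```python
import math


def predict(pos, vel, dt):
    """Constant-velocity position prediction dt seconds ahead."""
    return (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)


def choose_forwarder(me, neighbors, dst, dt):
    """Pick the neighbor with the best predicted progress x link quality.

    Returns (name, beam_bearing_radians), or None if no neighbor is
    predicted to be closer to the destination than this node.
    """
    my_pos = predict(me["pos"], me["vel"], dt)
    best_name, best_pos, best_score = None, None, 0.0
    for name, node in neighbors.items():
        pos = predict(node["pos"], node["vel"], dt)
        progress = math.dist(my_pos, dst) - math.dist(pos, dst)
        score = progress * node["link_quality"]
        if score > best_score:
            best_name, best_pos, best_score = name, pos, score
    if best_name is None:
        return None
    # Steer the directional antenna toward the predicted neighbor position.
    bearing = math.atan2(best_pos[1] - my_pos[1], best_pos[0] - my_pos[0])
    return best_name, bearing


me = {"pos": (0.0, 0.0), "vel": (10.0, 0.0)}
neighbors = {
    "A": {"pos": (50.0, 0.0), "vel": (10.0, 0.0), "link_quality": 0.9},
    # B is currently further ahead but is driving the opposite way.
    "B": {"pos": (80.0, 0.0), "vel": (-30.0, 0.0), "link_quality": 0.9},
}
choice = choose_forwarder(me, neighbors, dst=(200.0, 0.0), dt=1.0)
```

Here B would look like the better next hop from current positions alone, but one second ahead A offers more progress, which is the point of deciding just before transmission using predicted positions.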
Because the leakage current of a digital circuit depends on the states of the circuit's logic gates, assigning a minimum leakage vector (MLV) to the primary inputs and the flip-flop outputs of a circuit operating in sleep mode is a popular technique for leakage current reduction. In this paper, we propose a novel probability-based algorithm and technique that can rapidly find an MLV. Unlike most traditional techniques, which ignore the leakage current overhead of the newly added vector controller, our technique takes this overhead into account. Ignoring this overhead during solution space exploration may have the side effect of mistaking a non-optimal solution for an optimal one. Experimental results show that our heuristic algorithm can reduce the leakage current by up to 59.5% and can find the optimal solutions for most of the small MCNC benchmark circuits. Moreover, the CPU time required by our probability-based program is significantly less than that of a random search program.
Masaaki IIJIMA Kayoko SETO Masahiro NUMA Akira TADA Takashi IPPOSHI
Instability of SRAM memory cells caused by aggressive technology scaling has recently become one of the most significant issues. Although a 7T-SRAM cell with an area-tolerable separated read port improves read margins even at sub-1 V supply voltages, it unfortunately degrades write margins. In order to assist the write operation, we propose a new memory cell employing a look-ahead body bias, which dynamically controls the threshold voltage. Simulation results show improvement in both the write margins and the access time without increasing the leakage power derived from the body bias.
Yow-Tyng NIEH Shih-Hsu HUANG Sheng-Yu HSU
Although much research effort has been devoted to minimizing the total power consumption caused by the clock tree, no attention has been paid to minimizing the peak current caused by it. In this paper, we propose an opposite-phase clock scheme to reduce the peak current incurred by the clock tree. Our basic idea is to balance the charging and discharging activities. According to the output operation, the clock buffers that switch simultaneously are divided into two groups: half of the clock buffers switch in the same phase as the clock source, while the other half switch in the opposite phase. As a consequence, the opposite-phase clock scheme significantly reduces the peak current caused by the clock tree. Experimental data show that our approach can be applied at different design stages in the existing design flow.
In this paper, we present a new fast Fourier transform (FFT) algorithm to reduce the table size of twiddle factors required in pipelined FFT processing. In long-point FFT processing, the table is large enough to consume significant area and power. The proposed algorithm can reduce the table size to half, compared to the radix-2² algorithm, while retaining the simple structure. To verify the proposed algorithm, a 2048-point pipelined FFT processor is designed using a 0.18 µm CMOS process. By combining the proposed algorithm and the radix-2² algorithm, the table size is reduced to 34% and 51% compared to the radix-2 and radix-2² algorithms, respectively. The FFT processor occupies 1.28 mm² and achieves a signal-to-quantization-noise ratio (SQNR) of more than 50 dB.