Hamid NOORI Farhad MEHDIPOUR Koji INOUE Kazuaki MURAKAMI
Encapsulating critical computation subgraphs as application-specific instruction set extensions is an effective technique to enhance the performance of embedded processors. However, the addition of custom functional units to the base processor is required to support the execution of these custom instructions. Although automated tools have been developed to reduce the long design time needed to produce a new extensible processor for each application, short time-to-market requirements and significant non-recurring engineering and design costs remain issues. To address these concerns, we introduce an adaptive extensible processor in which custom instructions are generated and added after chip fabrication. To support this feature, custom functional units (CFUs) are replaced by a reconfigurable functional unit (RFU). The proposed RFU is based on a multi-cycle matrix of functional units with the capability of conditional execution. A quantitative approach is used to propose an efficient architecture for the RFU and fix its constraints. To generate more effective custom instructions, they are extended over basic blocks, and hence multi-exit custom instructions are proposed. Conditional execution has been added to the RFU to support the multi-exit feature of custom instructions. Experimental results show that multi-exit custom instructions enhance the performance by an average of 67% compared to custom instructions limited to one basic block. A maximum speedup of 4.7 and an average speedup of 1.85, compared to a general embedded processor, were achieved on the MiBench benchmark suite.
Haruhiko KAIYA Akira OSADA Kenji KAIJIRI
We present a method to identify stakeholders and their preferences about non-functional requirements (NFR) by using use case diagrams of existing systems. We focus on changes in NFR because such changes help stakeholders to identify their preferences. Comparing different use case diagrams of the same domain helps us to find changes that may occur. We use the Goal-Question-Metric (GQM) method to identify variables that characterize NFR, so that we can systematically represent changes in NFR using these variables. Use cases that represent system interactions help us to bridge the gap between goals and metrics (variables), and we can easily construct measurable NFR. To validate and evaluate our method, we applied it to the application domain of Mail User Agent (MUA) systems.
Fawnizu Azmadi HUSSIN Tomokazu YONEDA Alex ORAILOLU Hideo FUJIWARA
This paper proposes a test methodology for core-based testing of System-on-Chips by utilizing the functional bus as a test access mechanism. The functional bus is used as a transportation channel for the test stimuli and responses between a tester and the cores under test (CUTs). To enable test concurrency, local test buffers are added to all CUTs. In order to limit the buffer area overhead while minimizing the test application time, we propose a packet-based scheduling algorithm called PAcket Set Scheduling (PASS), which finds the complete packet delivery schedule under a given power constraint. The use of test packets, each consisting of a small number of bits of test data, for test data delivery allows efficient sharing of bus bandwidth with the help of an effective buffer-based test architecture. The experimental results show that the methodology is highly effective, especially for smaller bus widths, compared to previous approaches that do not use the functional bus.
Takashi WATANABE Tomoya MASUKO Achmad ARIFIN Makoto YOSHIZAWA
Functional Electrical Stimulation (FES) can be effective in assisting or restoring paralyzed motor functions. The purpose of this study is to examine experimentally a fuzzy controller based on cycle-to-cycle control for FES-induced gait. A basic experimental test was performed on controlling the maximum knee extension angle with normal subjects. In most control trials, the joint angle was controlled well, compensating for changes in muscle responses to electrical stimulation. The results show that the fuzzy controller would be practical in clinical applications of gait control by FES. In practice, automatic parameter tuning would be required for quick responses in reaching the target and in compensating for changes in muscle responses without causing oscillating responses.
Munehiro MATSUURA Tsutomu SASAO
A multiple-output function can be represented by a binary decision diagram for characteristic function (BDD_for_CF). This paper presents a method to represent multiple-output incompletely specified functions using BDD_for_CFs. An algorithm to reduce the widths of BDD_for_CFs is presented. This method is useful for decomposition of incompletely specified multiple-output functions. Experimental results for radix converters, adders, a multiplier, and lists of English words show that this method is useful for the synthesis of LUT cascades. An implementation of an English word list using LUT cascades and an auxiliary memory is also shown.
Satoshi SHIGEMATSU Hiroki MORIMURA Toshishige SHIMAMURA Takahiro HATANO Namiko IKEDA Yukio OKAZAKI Katsuyuki MACHIDA Mamoru NAKANISHI
This paper describes logic and analog test schemes that improve the testability of a pixel-parallel fingerprint identification circuit. Each pixel contains a processing circuit and a capacitive fingerprint sensor circuit. For the logic test, we propose a test method using a pseudo scan circuit to check the processing circuits of all pixels simultaneously. In the analog test, the sensor circuit employs dummy capacitance to mimic the state of a finger touching the chip. This enables an evaluation of the sensitivity of all sensor circuits on a logic LSI tester without touching the chip with a finger. To check the effectiveness of the schemes, we applied them to a pixel array in a fingerprint identification LSI. The pseudo scan circuit achieved a 100% failure-detection rate for the processing circuit. The analog test determines whether the sensitivities of the sensor circuits in all pixels are in the proper range. The results of the tests confirmed that the proposed schemes can completely detect defects in the circuits. Thus, the schemes will pave the way to logic and analog tests of chips integrating highly functional devices stacked on an LSI.
Yukihiro IGUCHI Tsutomu SASAO Munehiro MATSUURA
In arithmetic circuits for digital signal processing, radixes other than two are often used to make circuits faster. In such cases, radix converters are necessary. However, in general, radix converters tend to be complex. This paper considers design methods for p-nary to binary converters. First, it considers Look-Up Table (LUT) cascade realizations. Then, it introduces a new design technique called arithmetic decomposition, which uses LUTs and adders. Finally, it compares the amount of hardware and performance of radix converters implemented on FPGAs. 12-digit ternary to binary converters on Cyclone II FPGAs designed by the proposed method are faster than those designed by conventional methods.
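The general idea of realizing a radix converter from LUTs and adders can be illustrated in software. The sketch below is not the paper's arithmetic decomposition, whose details the abstract does not give; it only assumes that the p-nary digits are split into fixed-size groups, that each group indexes a table holding its weighted binary value, and that the table outputs are added. The function names and the group size of 4 are hypothetical.

```python
from itertools import product


def build_luts(p, n_digits, group_size):
    """Build one lookup table per digit group (hypothetical helper).

    Each table maps a tuple of p-nary digits (most significant first)
    to the binary value of that group, already multiplied by its weight.
    """
    if n_digits % group_size != 0:
        raise ValueError("n_digits must be a multiple of group_size")
    n_groups = n_digits // group_size
    luts = []
    for g in range(n_groups):
        # Weight = p ** (number of digits to the right of this group).
        weight = p ** ((n_groups - 1 - g) * group_size)
        table = {}
        for digits in product(range(p), repeat=group_size):
            value = 0
            for d in digits:
                value = value * p + d
            table[digits] = value * weight
        luts.append(table)
    return luts


def pnary_to_binary(digits, p=3, group_size=4):
    """Convert p-nary digits (most significant first) to an integer
    by looking up each group and adding the partial results."""
    luts = build_luts(p, len(digits), group_size)
    total = 0
    for g, table in enumerate(luts):
        group = tuple(digits[g * group_size:(g + 1) * group_size])
        total += table[group]  # the "adder" stage
    return total


if __name__ == "__main__":
    # 12-digit ternary number with all digits equal to 2.
    print(pnary_to_binary([2] * 12))
```

With 12 ternary digits and groups of 4, this uses three tables of 81 entries each instead of a single table of 3^12 entries, which is the kind of trade-off between table size and adder count that such designs explore.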
In July 2006, the International Telecommunication Union-Telecommunication Standardization Sector (ITU-T) Study Group 13 initiated the approval process for a batch of framework Recommendations on the Next Generation Network (NGN) Release 1. One of the new Recommendations, Y.2012, illustrates the NGN from the viewpoint of a functional architecture consisting of various functional blocks, namely functional entities. In conjunction with this Recommendation, this paper explains how the NGN can be built and how the NGN utilizes functional entities to provide expected services and required capabilities. This paper also identifies open issues for extending the functional architecture towards Release 2.
Eui-Young CHUNG Hyuk-Jun LEE Sung Woo CHUNG
We present a scenario-aware bus functional modeling method which improves the accuracy of traditional methods without sacrificing the simulation run time. Existing methods focused on the behavior of individual IP (Intellectual Property) components and neglected the interplay effects among them, resulting in accuracy degradation from the system perspective. On the other hand, our method thoroughly considers such effects and increases the analysis accuracy by adopting control signal modeling and hierarchical stochastic modeling. Furthermore, our method minimizes the additional design time by reusing the simulation results of each IP component and an automated design flow. The experimental results show that the accuracy of our method is over 90% of RTL simulation in a multimedia SoC (System-on-Chip) design.
Kouichi WATANABE Masashi IMAI Masaaki KONDO Hiroshi NAKAMURA Takashi NANYA
As VLSI technology advances, delay variations will become more serious. Delay-insensitive asynchronous dual-rail circuits tolerate any delay variation, but their energy consumption is more than double that of single-rail circuits because signal transitions occur every cycle in all bits regardless of the input bit pattern. However, in functional units, a significant number of input bits often do not change from the previous input. In such a situation, calculation of these bits is not required. Thus, we propose a method, called unflip-bits control, that exploits this situation to reduce energy consumption. We evaluate the energy consumption and performance penalty of the method using HSPICE and the Verilog-XL simulator, and compare the method with the conventional dual-rail circuit and a synchronous circuit. Our evaluation results reveal that the proposed asynchronous dual-rail circuit consumes 12-60% less energy than a conventional asynchronous dual-rail circuit.
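The observation behind this approach, that many input bits stay the same between consecutive operations, can be made concrete with a small software model. The sketch below is not the proposed circuit; it only counts, for a hypothetical sequence of inputs, how many bit positions differ from the previous input versus the total number of bit positions processed. The function names and the assumption that the initial previous input is zero are illustrative.

```python
def changed_bits(prev, curr, width):
    """Return a mask with a 1 in every bit position where curr differs from prev."""
    mask = (1 << width) - 1
    return (prev ^ curr) & mask


def count_bit_activity(inputs, width):
    """Return (changed, total) for a sequence of input words.

    changed: number of bit positions that differ from the previous input
    total:   number of bit positions if every bit is processed every cycle
    The previous input before the first word is assumed to be 0.
    """
    prev = 0
    changed = 0
    for value in inputs:
        changed += bin(changed_bits(prev, value, width)).count("1")
        prev = value
    return changed, width * len(inputs)


if __name__ == "__main__":
    changed, total = count_bit_activity([0b1010, 0b1011, 0b1011], width=4)
    print(changed, total)
```

The gap between the two counts gives a rough upper bound on how many per-bit evaluations could be skipped; the actual energy saving depends on the circuit and is what the paper measures by simulation.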
Hiroki NAKAHARA Tsutomu SASAO Munehiro MATSUURA
This paper presents a cycle-based logic simulation method using an LUT cascade emulator, where an LUT cascade consists of multiple-output LUTs (cells) connected in series. The LUT cascade emulator is an architecture that emulates LUT cascades. It has a control part, a memory for logic, and registers. It connects the memory to the registers through a programmable interconnection circuit, and evaluates the given circuit stored in the memory. The LUT cascade emulator runs on an ordinary PC. This paper also compares the method with a Levelized Compiled Code (LCC) simulator and a simulator using a Quasi-Reduced Multi-valued Decision Diagram (QRMDD). Our simulator is 3.5 to 10.6 times faster than the LCC simulator, and 1.1 to 3.9 times faster than the one using a QRMDD. The simulation setup time is 2.0 to 9.8 times shorter than that of the LCC simulator. The necessary amount of memory is 1/1.8 to 1/5.5 of that of the simulator using a QRMDD.
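How a series of LUT cells evaluates a function can be sketched in a few lines. The example below is a minimal illustration and not the emulator described in the paper: it assumes each cell is a table indexed by the value passed from the previous cell (called the rail here) together with that cell's own external inputs, and that the last cell's output is the result. The helper names and the ones-counting example are hypothetical.

```python
from itertools import product


def make_cell(rail_values, n_ext, func):
    """Tabulate func(rail, ext) for every rail value and every
    combination of n_ext binary external inputs."""
    table = {}
    for rail in rail_values:
        for ext in product((0, 1), repeat=n_ext):
            table[(rail, ext)] = func(rail, ext)
    return n_ext, table


def evaluate_cascade(cells, inputs):
    """Evaluate the cells in series; each cell is one table lookup."""
    rail = 0  # assumed initial rail value for the first cell
    pos = 0
    for n_ext, table in cells:
        ext = tuple(inputs[pos:pos + n_ext])
        pos += n_ext
        rail = table[(rail, ext)]
    return rail


# Example: three cells that together count the ones among three input bits.
# Cell i can only receive rail values 0..i, so its table covers just those.
cells = [make_cell(range(i + 1), 1, lambda r, e: r + e[0]) for i in range(3)]

if __name__ == "__main__":
    print(evaluate_cascade(cells, [1, 0, 1]))
```

Evaluation cost is one lookup per cell regardless of the logic each cell encodes, which is why the number of cells and the table sizes are the main design parameters in this style of realization.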
Yongmin QI Wei GUO Yi ZHANG Siye ZUO Yaohui JIN Weisheng HU
We study the configuration issue of three-stage multi-granularity optical cross-connects (MG-OXC) for the dynamic traffic model in all-optical networks. From the single node point of view, we propose a configuration algorithm to configure different granularity cross-connects for arrival sub-requests with different traffic types and bandwidths. The performance of the configuration algorithm is evaluated by simulation and, furthermore, is validated by experiment based on our flexible Multi-functional Optical Switching Testbed (MOST).
Sang-Min HAN Ji-Yong PARK Tatsuo ITOH
A simple self-biased receiver system with a dual-branch architecture consisting of a low-power receiver and a rectenna is introduced. The system is efficiently integrated with a dual-fed circular sector antenna with harmonic rejection characteristics without a BPF. The receiver portion is designed using a low-noise amplifier (LNA) with low power consumption and a self-heterodyne mixer, while the rectenna achieves a high conversion efficiency of up to 80%, thanks to the harmonic rejection of the circular sector antenna. The rectified DC power from the rectenna is used to bias the receiver without any external bias. Simultaneously, ASK digital signal demodulation without an extra power supply is implemented successfully.
Hiroki NAKAHARA Tsutomu SASAO Munehiro MATSUURA
This paper shows a design method for a sequential circuit by using a Look-Up Table (LUT) ring. The method consists of two steps. The first step partitions the outputs into groups. The second step realizes them by LUT cascades, and allocates the cells of the cascades into the memory. The system automatically finds a fast implementation by maximally utilizing available memory. With the presented algorithm, we can easily design sequential circuits satisfying given specifications. The paper also compares the LUT ring with a logic simulator for realizing sequential circuits: the LUT ring is 25 to 237 times faster than a logic simulator that uses the same amount of memory.
Tso-Bing JUANG Shen-Fu HSIAO Ming-Yu TSAI Jenq-Shiun JAN
In this paper, a cell-driven multiplier generator is developed that can produce high-performance gate-level netlists for multiplier-related arithmetic functional units, including multipliers, multiply-accumulate (MAC) units, and dot-product calculators. The generator optimizes the speed/area performance both in the partial product compression and in the final addition stage for the specified process technology. In addition to the conventional CMOS full adder cells, we have also designed fast compression elements based on pass-transistor logic for further performance improvement of the generated multipliers. Simulation results show that our proposed generator can produce better multiplier-related functional units than those generated using the Synopsys DesignWare library or other previously proposed approaches.
Nozomu TOGAWA Koichi TACHIKAKE Yuichiro MIYAOKA Masao YANAGISAWA Tatsuo OHTSUKI
This paper focuses on SIMD processor synthesis and proposes a SIMD instruction set/functional unit synthesis algorithm. Given an initial assembly code and a timing constraint, the proposed algorithm synthesizes an area-optimized processor core with optimal SIMD functional units. It also synthesizes a SIMD instruction set. The input initial assembly code is assumed to run on a full-resource SIMD processor (virtual processor) which has all the possible SIMD functional units. In our algorithm, we introduce SIMD operation decomposition and apply it to the initial assembly code and the full-resource SIMD processor. By gradually reducing or decomposing SIMD operations, we can finally find a processor core with small area under the given timing constraint. Promising experimental results are also shown.
Achmad ARIFIN Takashi WATANABE Nozomu HOSHIMIYA
We proposed a fuzzy control scheme to implement cycle-to-cycle control for restoring the swing phase of gait using functional electrical stimulation (FES). We designed two fuzzy controllers for the biceps femoris short head (BFS) and the vastus muscles to control flexion and extension of the knee joint during the swing phase. Control capabilities of the designed fuzzy controllers were tested and compared to proportional-integral-derivative (PID) and adaptive PID controllers in automatic generation of stimulation burst duration and compensation of muscle fatigue through computer simulations using a musculo-skeletal model. Parameter adaptations in the adaptive PID controllers did not significantly improve the control performance of the PID controllers. The fuzzy controllers were superior to the PID and adaptive PID controllers under several subject conditions and different fatigue levels. These results showed that the fuzzy controller would be suitable to implement cycle-to-cycle control of FES-induced gait.
Yasuhiko TAMURA Junichi NAKAYAMA
This paper deals with TE plane wave reflection and transmission from a one-dimensional random slab by means of the stochastic functional approach. The relative permittivity of the random slab is described by a Gaussian random field in the vertical direction with finite thickness, and is uniform in the horizontal direction with infinite extent. An explicit form of the random wavefield is obtained in terms of a Wiener-Hermite expansion with approximate expansion coefficients (Wiener kernels) under a small fluctuation case. By using the first three terms of the random wavefield representation, the optical theorem is illustrated in figures for several physical parameters. It is then found that the optical theorem holds with good accuracy.
Sachie FUJITA Mika MATSUI Hiroshi MATSUNO Satoru MIYANO
Through many studies on modeling and analyzing biological pathways, the Petri net has been recognized as a promising method for representing biological pathways. Recently, Matsuno et al. (2003) introduced the hybrid functional Petri net (HFPN) to provide a more intuitive and natural method for modeling biological pathways than existing Petri nets. They also developed Genomic Object Net (GON), which employs the HFPN as its basic architecture. Many kinds of biological pathways have been modeled with the HFPN and simulated by GON. This paper gives a new HFPN model of the "cell cycle of fission yeast", presenting six basic HFPN components of typical biological reactions and demonstrating how biological pathways can be modeled with these HFPN components. Simulation results by GON suggest a new hypothesis which will help biologists perform further experiments.
Hiroki HIGA Ikuo NAKAMURA Nozomu HOSHIMIYA
In this paper, the use of head movements was considered as a control command input method for a functional electrical stimulation (FES) system. In order to detect the head movements, we designed a prototype control command input device using acceleration sensors and verified its validity in experiments. The experimental results showed that head movements in lateral flexion and in flexion/extension were well detected and separated by the acceleration sensors.