Fu-Shing CHIM Tak-Kei LAM Yu-Liang WU Hongbing FAN
The digital logic rewiring technique has been shown to be one of the most powerful logic transformation methods. Rewiring has been shown to further improve some already excellent results on many EDA problems, ranging from logic minimization and partitioning to FPGA technology mapping and final routing. Previous studies have shown that ATPG-based rewiring is one of the most powerful tools for logic perturbation, while a graph-based rewiring engine is able to cover nearly one fifth of the target wires with a 50-fold runtime speedup. For problems that require only good-enough and very quick solutions, this new rewiring technique may serve as a useful and more practical alternative. In this work, essential elements of graph-based rewiring, such as rewiring patterns, pattern size, and locality, have been studied to understand their relationship with rewiring performance. A structural analysis of the target-alternative wire pairs discovered by ATPG-based and graph-based engines has also been conducted to identify the structural characteristics that favor the identification of alternative wires. We have also developed a hybrid rewiring approach that combines the advantages of both ATPG-based and graph-based rewiring. Experimental results suggest that our hybrid engine achieves about 50% of the alternative wire coverage of the state-of-the-art ATPG-based rewiring engine with only 4% of its runtime. By applying our hybrid rewiring approach to the FPGA technology mapping problem, we could achieve similar depth level and look-up table number reductions with much shorter runtime. This shows that the fast runtime of our hybrid approach does not sacrifice the quality of certain rewiring applications.
Memory accesses are a major cause of energy consumption in embedded systems. This paper presents the implementation of a purely software technique that places stack and static data into a scratch-pad memory (SPM) in order to reduce the energy consumed by the processor while accessing them. Since an SPM is usually too small to hold all these data, some of them must be left in the external main memory (MM). Therefore, further energy reduction is achieved by moving some stack data between both memories at run time. The technique employs integer linear programming to find, at compile time, the optimal placement of static data and management of the stack, and implements it by inserting stack operations into the code. Experimental results show that with an SPM of only 1 KB, our technique reduces the energy consumption related to static and stack data accesses by more than 90% for several applications, and by 57% on average, compared to the case where these data are placed entirely in the main memory.
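As a rough illustration of the static-data placement part only, the choice of which objects to keep in the SPM can be viewed as a 0/1 knapsack problem. The sketch below solves that simplified problem by dynamic programming rather than by the integer linear program used in the paper, and it ignores run-time stack management entirely. The object names, sizes, and energy savings are hypothetical.

```python
def place_in_spm(objects, capacity_bytes):
    """Pick the objects to place in the SPM so that total energy saving is maximal.

    objects: list of (name, size_bytes, energy_saving) with integer sizes >= 1.
    Returns (total_saving, [names placed in the SPM]).
    Simplified knapsack view; not the ILP formulation of the paper.
    """
    # best[c] = (saving, names) achievable using at most c bytes of SPM
    best = [(0.0, [])] * (capacity_bytes + 1)
    for name, size, saving in objects:
        # iterate capacities downwards so each object is used at most once
        for c in range(capacity_bytes, size - 1, -1):
            candidate = best[c - size][0] + saving
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - size][1] + [name])
    return best[capacity_bytes]


# Hypothetical static objects: (name, size in bytes, energy saved if in SPM)
example = [("buf", 600, 9.0), ("table", 500, 8.0), ("state", 400, 6.0)]
saving, placed = place_in_spm(example, 1000)
```

With a 1000-byte SPM, the two objects that fit together and save the most energy are selected; everything else stays in main memory.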
Katherine Shu-Min LI Chih-Yun PAI Liang-Bi CHEN
This paper presents an interconnect resilient (IR) methodology with maximal interconnect fault tolerance, yield, and reliability for both single and multiple interconnect faults under stuck-at and open fault models. By exploiting multiple routes inherent in an interconnect structure, this method can tolerate faulty connections by efficiently finding alternative paths. The proposed approach is compatible with previous interconnect detection and diagnosis methods under oscillation ring schemes, and together they can be applied to implement a robust interconnect structure that may still provide correct communication even under multiple link faults in Networks-on-Chip (NoCs). With such knowledge, designers can significantly improve interconnect reliability by augmenting vulnerable interconnect structures in NoCs. Experimental results show that alternative paths in NoCs can be found for almost all paths. Hence, the proposed method provides a good way to achieve fault tolerance and reliability/yield improvement.
Toshihiro KONISHI Hyeokjong LEE Shintaro IZUMI Takashi TAKEUCHI Masahiko YOSHIMOTO Hiroshi KAWAGUCHI
We propose a transfer gate phase coupler for a low-power multi-phase oscillator (MPOSC). The phase coupler is an nMOS transfer gate, which does not waste charge to the ground and thus achieves low power. The proposed MPOSC can set the number of outputs to an arbitrary number. The test circuits in a 180-nm process and a 65-nm process exhibit 20 phases, including 90 different angles. The designs in a 180-nm CMOS process and a 65-nm CMOS process were fabricated to confirm process scalability; in the respective designs, we observed 36.6% and 38.3% improvements in power-delay product compared with conventional MPOSCs using inverters and nMOS latches. In the 65-nm process, the measured DNL and 3σ period jitter are, respectively, less than 1.22 and 5.82 ps. The power is 284 µW at 1.85 GHz.
In order to improve the cell boundary throughput performance and to extend the coverage area, relaying transmission with relay stations (RSs) is becoming a promising architecture for next-generation cellular systems. However, if RSs are operated in every cell, the interference between cells increases and the throughput improvement from RSs tends to be limited. In this paper, we propose a scheme that reduces the interference from other cells by using packet transmission control. This packet transmission control is realized by a compound scheduling technique that combines Proportional Fair (PF) scheduling and Maximum Carrier-to-Interference power Ratio (Max CIR) scheduling. The proposed scheme can improve the throughput around the cell boundary by controlling the transmission timing of each cell with appropriate power and user assignment. The simulation results show that the proposed method can also improve the fairness of user throughput and the system throughput across all users in the cell.
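For readers unfamiliar with the two schedulers named above, the following minimal sketch shows their textbook selection rules. It does not reproduce how the paper combines them or controls transmission timing between cells. The function names and the averaging window are assumptions made for illustration.

```python
def max_cir_pick(cir_db):
    """Max CIR: serve the user with the highest carrier-to-interference ratio."""
    return max(range(len(cir_db)), key=lambda u: cir_db[u])


def pf_pick(inst_rate, avg_throughput, eps=1e-9):
    """Proportional Fair: serve the user with the largest ratio of
    instantaneous achievable rate to past average throughput."""
    return max(range(len(inst_rate)),
               key=lambda u: inst_rate[u] / (avg_throughput[u] + eps))


def update_average(avg_throughput, served, inst_rate, window=100.0):
    """Exponentially weighted moving average of per-user throughput.
    Only the served user receives data in this slot."""
    updated = []
    for u, avg in enumerate(avg_throughput):
        received = inst_rate[u] if u == served else 0.0
        updated.append((1.0 - 1.0 / window) * avg + received / window)
    return updated


# Hypothetical three-user slot
cir = [3.0, 12.0, 7.5]      # dB
rate = [1.0, 4.0, 2.0]      # achievable rate this slot
avg = [0.2, 4.0, 1.0]       # past average throughput
cir_user = max_cir_pick(cir)
pf_user = pf_pick(rate, avg)
new_avg = update_average(avg, pf_user, rate, window=10.0)
```

Max CIR favors the user with the best channel, which maximizes throughput but can starve cell-edge users; PF trades some throughput for fairness by favoring users whose current rate is high relative to what they have received so far.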
Wan Yeon LEE Hyogon KIM Heejo LEE
The proposed scheduling scheme minimizes the energy consumption of a real-time task on the multi-core processor with the dynamic voltage and frequency scaling capability. The scheme allocates a pertinent number of cores to the task execution, inactivates unused cores, and assigns the lowest frequency meeting the deadline. For a periodic real-time task with consecutive real-time instances, the scheme prepares the minimum-energy solutions for all input cases at off-line time, and applies one of the prepared solutions to each real-time instance at runtime.
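The selection step described above can be sketched as a small search over core counts and frequency levels. The speedup model (Amdahl's law), the cubic dynamic power model, and all constants below are assumptions chosen for illustration; the abstract does not specify the models used in the paper.

```python
def plan(work_cycles, deadline_s, freqs_hz, max_cores,
         parallel_fraction=0.9, c_dyn=1e-27, p_static_w=0.1):
    """Return (energy_J, cores, freq_hz) with minimum energy, or None if infeasible.

    Assumed models (not from the paper):
      time   = work_cycles * ((1 - p) + p / cores) / freq
      power  = cores * (c_dyn * freq**3 + p_static_w), unused cores are off
    For each core count, the lowest frequency that meets the deadline is used.
    """
    best = None
    for cores in range(1, max_cores + 1):
        scale = (1.0 - parallel_fraction) + parallel_fraction / cores
        for freq in sorted(freqs_hz):
            exec_time = work_cycles * scale / freq
            if exec_time <= deadline_s:
                energy = cores * (c_dyn * freq ** 3 + p_static_w) * exec_time
                if best is None or energy < best[0]:
                    best = (energy, cores, freq)
                break  # lowest feasible frequency found for this core count
    return best


# Hypothetical task: 1e9 cycles, 1 s deadline, three frequency levels, 4 cores
result = plan(1e9, 1.0, [0.5e9, 1e9, 2e9], 4)
```

Because the task parameters are known in advance for a periodic task, a table of such solutions could be computed off-line for every input case and looked up at runtime, which is the division of work the abstract describes.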
Yuki YOSHIKAWA Tomomi NUWA Hideyuki ICHIHARA Tomoo INOUE
In this paper, we propose a hybrid test application in partial skewed-load (PSL) scan design. The PSL scan design in which some flip-flops (FFs) are controlled as skewed-load FFs and the others are controlled as broad-side FFs was proposed in [1]. We notice that the PSL scan design potentially has a capability of two test application modes: one is the broad-side test mode, and the other is the hybrid test mode which corresponds to the test application considered in [1]. According to this observation, we present a hybrid test application of the two test modes in the PSL scan design. In addition, we also address a way of skewed-load FF selection based on propagation dominance of FFs in order to take advantage of the hybrid test application. Experimental results for ITC'99 benchmark circuits show that the hybrid test application in the proposed PSL scan design can achieve higher fault coverage than the design based on the skewed-load FF selection [1] does.
Linchen CHANG Kazuhiko FUKAWA Hiroshi SUZUKI Satoshi SUYAMA
This paper proposes a precoding scheme for downlink multiuser MIMO-OFDM systems. The proposed precoding employs the minimum average bit error rate (MABER) criterion, and obtains precoding matrices by the steepest descent algorithm in order to minimize average BER of mobile stations. As the cost function of the proposed scheme, an upper bound of the average BER is derived from the pairwise error probability (PEP) and is averaged with respect to channel state information (CSI) errors. Thus, the MABER scheme is robust against imperfect CSI. Computer simulations under a frequency-selective fading condition demonstrate that the proposed precoder is more robust against the CSI errors than both the zero-forcing (ZF) precoder and a robust sum mean square error (SMSE) precoder, and that it is superior in BER to the conventional schemes.
Hisahiro SASABE Masatoshi ISHIBA Yong-Jin PU Junji KIDO
We designed and synthesized alkoxyphenyl group containing starburst host materials 1. Using 1 as a host material, efficient phosphorescent OLEDs with power efficiencies of 32 lm W⁻¹ for blue and 85 lm W⁻¹ for green at 100 cd m⁻² were developed.
Yu HEMMI Koichi ADACHI Tomoaki OHTSUKI
A combination of single-carrier frequency-division multiple-access (SC-FDMA) and relay transmission is effective for performance improvement in uplink transmission. In SC-FDMA, the mapping strategy of a user's spectrum has an enormous impact on system performance. In relay communication, the optimum mapping strategy may differ from that in direct communication because of the independently distributed channels among nodes. In this letter, we study how each link should be considered in subcarrier mapping and present the impact of mapping strategies on the average bit error rate (BER) performance of single-user SC-FDMA relay communications.
Megumi KANEKO Kazunori HAYASHI Petar POPOVSKI Hideaki SAKAI
We consider Downlink (DL) scheduling for a multi-user cooperative cellular system with fixed relays. The conventional scheduling trend is to avoid interference by allocating orthogonal radio resources to each user, although simultaneous allocation of users on the same resource has been proven to be superior in, e.g., the broadcast channel. Therefore, we design a scheduler in which, in each frame, two selected relayed users are supported simultaneously through the Superposition Coding (SC) based scheme proposed in this paper. In this scheme, the messages destined for the two users are superposed in the modulation domain into three SC layers, allowing them to benefit from their high quality relayed links, thereby increasing the sum-rate. We derive the optimal power allocation over these three layers that maximizes the sum-rate under an equal-rate constraint. Simulation results show that the proposed SC scheduler, which integrates this scheme, provides high throughput and rate outage probability performance, indicating a significant fairness improvement. This validates the approach of simultaneous allocation versus orthogonal allocation in the cooperative cellular system.
Naoki OYAMA Sho KANEKO Katsuaki MOMIYAMA Fumihiko HIROSE
Current density-voltage (J-V) and capacitance-voltage (C-V) characteristics of P3HT/n⁻-silicon heterojunction diodes were investigated to clarify the carrier conduction mechanism at the organic/inorganic heterojunction. The J-V characteristics of the P3HT/n⁻-Si junctions can be explained by a Schottky diode model with an interfacial layer. Diode parameters such as Schottky barrier height and ideality factor were estimated to be 0.78 eV and 3.2, respectively. The C-V analysis suggests that the depletion layer appears in the n⁻-Si layer with a thickness of 1.2 µm from the junction at zero bias, and the diffusion potential was estimated at 0.40 eV at the open-circuit condition. The present heterojunction allows photovoltaic operation with power conversion efficiencies up to 0.38% under simulated solar light exposure of 100 mW/cm². The forward bias current was enhanced by coating the Si surface with a SiC layer, where the ideality factor was improved to the level of 1.45-1.50.
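For reference, the textbook relations usually used to extract such parameters are given below. They are the standard thermionic-emission and abrupt-junction expressions, stated here as an assumption; the abstract does not give the exact form of the interfacial-layer model used in the paper. Here $n$ is the ideality factor, $\phi_B$ the barrier height (in volts), $A^{*}$ the effective Richardson constant, $V_d$ the diffusion potential, $N_D$ the donor density, $\varepsilon_s$ the permittivity of silicon, and $C$ the capacitance per unit area.

```latex
% Forward J-V characteristic (thermionic emission with ideality factor n)
J = J_0 \left[ \exp\!\left( \frac{qV}{n k T} \right) - 1 \right],
\qquad
J_0 = A^{*} T^{2} \exp\!\left( -\frac{q \phi_B}{k T} \right)

% C-V relation and depletion width for a uniformly doped n-type layer
\frac{1}{C^{2}} = \frac{2\,(V_d - V)}{q\,\varepsilon_s N_D},
\qquad
W = \sqrt{\frac{2\,\varepsilon_s\,(V_d - V)}{q\,N_D}}
```

In this picture, the slope of $\ln J$ versus $V$ yields $n$, the intercept yields $J_0$ and hence $\phi_B$, and the intercept of $1/C^{2}$ versus $V$ yields $V_d$.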
In this paper, we consider Peer-to-Peer Video-on-Demand (P2P VoD) systems based on the BitTorrent file sharing protocol. Since the Rarest First policy adopted in the original BitTorrent protocol frequently fails to collect pieces corresponding to a video file by their playback time, we need to develop a new piece selection rule particularly designed for P2P VoDs. In the proposed scheme, we assume the existence of a media server which can upload any piece upon request, and try to bound the load of such media server with two techniques. The first technique is to estimate pieces which are not held by any peer and prefetch them from the media server. The second technique is to switch the mode of each peer according to the estimated size of the P2P network. The performance of the proposed scheme is evaluated by simulation.
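The two ingredients mentioned above, Rarest First selection and prefetching of pieces that no peer holds, can be sketched as follows. This is only an illustration under assumptions: the playback window used to limit server requests and all function names are hypothetical, and the paper's actual estimation and mode-switching rules are not given in the abstract.

```python
from collections import Counter


def availability(peer_bitfields, num_pieces):
    """Count how many known peers hold each piece."""
    counts = Counter()
    for held in peer_bitfields:
        counts.update(held)
    return [counts[i] for i in range(num_pieces)]


def rarest_first(missing, avail):
    """Original BitTorrent rule: request the missing piece held by the fewest
    peers (ties broken by piece index). Ignores playback deadlines."""
    candidates = [p for p in missing if avail[p] > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (avail[p], p))


def server_prefetch_list(missing, avail, playback_pos, window):
    """Pieces near the playback position that no known peer holds; these are
    the ones to request from the media server (window is an assumption)."""
    return [p for p in sorted(missing)
            if avail[p] == 0 and playback_pos <= p < playback_pos + window]


# Hypothetical swarm with three peers and six pieces
peers = [{0, 1, 2}, {1, 2, 4}, {2}]
avail = availability(peers, 6)
missing = {0, 3, 4, 5}
next_piece = rarest_first(missing, avail)
from_server = server_prefetch_list(missing, avail, playback_pos=2, window=3)
```

The example makes the weakness of pure Rarest First visible: it picks a piece by scarcity alone, whereas piece 3, which is needed soon and held by nobody, can only come from the media server.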
In this paper, we propose a quantitative metric for measuring the degree of visual fatigue in stereoscopy. To the best of our knowledge, this is the first simplified relative quantitative approach to describing the visual fatigue value of stereoscopic content. Our experimental results show that a correlation index of more than 98% is obtained between our Simplified Relative Visual Fatigue (SRVF) model and the Mean Opinion Score (MOS).
Joo Myoung SEOK Junggon KO Younghun LEE Doug Young SUH
For the panoramic video streaming service, this letter proposes a visual perception-based view navigation trick mode (VP-VNTM) that reduces bandwidth requirements by adjusting the quality of transmitting views in accordance with the view navigation velocity without decreasing the user's visual sensitivity. Experiments show that the proposed VP-VNTM reduces bandwidth requirements by more than 44%.
Hideki MIWA Ryutaro SUSUKITA Hidetomo SHIBAMURA Tomoya HIRAO Jun MAKI Makoto YOSHIDA Takayuki KANDO Yuichiro AJIMA Ikuo MIYOSHI Toshiyuki SHIMIZU Yuji OINAGA Hisashige ANDO Yuichi INADOMI Koji INOUE Mutsumi AOYAGI Kazuaki MURAKAMI
In the near future, interconnection networks of massively parallel computer systems will connect more than a hundred thousand computing nodes. The performance evaluation of interconnection networks can provide real insight that helps in developing efficient communication libraries. Hence, to evaluate the performance of such interconnection networks, simulation tools capable of modeling the networks in sufficient detail, supporting a user-friendly interface to describe communication patterns, providing users with enough performance information, and completing simulations within a reasonable time are a real necessity. This paper introduces a novel interconnection network simulator, NSIM, for evaluating the performance of extreme-scale interconnection networks. The simulator implements a simplified simulation model so as to run faster without any loss of accuracy. Unlike existing simulators, NSIM is built on the execution-driven simulation approach. The simulator also provides an MPI-compatible programming interface. Thus, the simulator can emulate parallel program execution and correctly simulate point-to-point and collective communications that are dynamically changed by network congestion. The experimental results in this paper show sufficient accuracy of the simulator through comparison with a real machine. We also confirmed that the simulator is capable of evaluating ultra-large-scale interconnection networks, consumes less memory, and runs faster than the existing simulator. This paper also introduces a simulation service built on a cloud environment. Without installing NSIM, users can simulate interconnection networks with various configurations by using a web browser.
Futao KANEKO Akira BABA Kazunari SHINBO Keizo KATO
In this review, we introduce a variety of surface sensitive techniques for the study of organic thin films, and applications to organic devices. These studies include surface plasmon emission light, organic thin film transistors, combination of quartz crystal microbalance and optical waveguide spectroscopy, evaluation of alignment of liquid crystal molecules at surfaces, and biosensor applications.
Ken-ichi SHINKAI Masanori HASHIMOTO Takao ONOYE
Device-parameter estimation sensors inside a chip are gaining importance as post-fabrication tuning becomes practical. In the estimation of variational parameters using on-chip sensors, it is often assumed that the outputs of variation sensors are not affected by random variations. However, random variations can deteriorate the accuracy of the estimation result. In this paper, we propose a device-parameter estimation method with on-chip variation sensors that explicitly considers random variability. The proposed method derives the global variation parameters and the standard deviation of the random variability using maximum likelihood estimation. We experimentally verified that the proposed method improves the accuracy of device-parameter estimation by 11.1 to 73.4% compared to the conventional method that neglects random variations.
Masashi NOMURA Shigemasa TAKAI
In the framework of supervisory control of timed discrete event systems (TDESs), a supervisor decides the set of events to be enabled to occur and the set of events to be forced to occur in order for a given specification to be satisfied. In this paper, we consider decentralized supervisory control of TDESs where enforcement decisions of local supervisors are fused by the AND rule or the OR rule. We derive existence conditions of a decentralized supervisor under these decision fusion rules.
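The two fusion rules can be illustrated with a deliberately simplified sketch in which each local supervisor reports the set of events it votes to force. This ignores partial observation and the fact that each local supervisor may only decide on its own subset of events, so it is not the formal setting of the paper. The event names are hypothetical.

```python
def fuse_and(local_decisions):
    """AND rule: an event is forced only if every local supervisor forces it."""
    decisions = [set(d) for d in local_decisions]
    if not decisions:
        return set()
    return set.intersection(*decisions)


def fuse_or(local_decisions):
    """OR rule: an event is forced if at least one local supervisor forces it."""
    fused = set()
    for d in local_decisions:
        fused |= set(d)
    return fused


# Hypothetical enforcement decisions of two local supervisors
votes = [{"start", "stop"}, {"stop", "reset"}]
forced_and = fuse_and(votes)
forced_or = fuse_or(votes)
```

The AND rule is the more conservative of the two, since a single dissenting supervisor prevents forcing, while the OR rule lets any one supervisor force an event.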
Coscheduling has gained a resurgence of interest as an effective technique to enhance the performance of parallel applications in multi-programmed clusters. However, existing coscheduling schemes do not adequately handle priority boost conflicts, leading to significantly degraded performance. To address this problem, in our previous study, we devised a novel algorithm that reorders the scheduling sequence of conflicting processes based on the rescheduling latency of their correspondents in remote nodes. In this paper, we exhaustively explore the design issues and implementation details of our contention-aware coscheduling scheme on a Myrinet-based cluster system. We also analyze in practice the impact of various system parameters and job characteristics on the performance of all considered schemes on a heterogeneous Linux cluster using a generic coscheduling framework. The results show that our approach outperforms existing schemes (by up to 36.6% in average job response time), reducing both the boost conflict ratio and the overall message delay.