Tsutomu SASAO Yuta URANO Yukihiro IGUCHI
This paper shows a method to find a linear transformation that reduces the number of variables to represent a given incompletely specified index generation function. It first generates the difference matrix, and then finds a minimal set of variables using a covering table. Linear transformations are used to modify the covering table to produce a smaller solution. Reduction of the difference matrix is also considered.
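The steps named in this abstract (difference matrix, covering table, linear transformation) can be illustrated with a minimal sketch. It is not the authors' algorithm: the exhaustive search, the restriction to XORs of two variables, the function names, and the four example vectors are all assumptions chosen for illustration.

```python
from itertools import combinations

def difference_matrix(vectors):
    """One row per pair of registered vectors: the bitwise XOR of the pair."""
    return [tuple(a ^ b for a, b in zip(u, v)) for u, v in combinations(vectors, 2)]

def minimum_cover(rows, n_cols):
    """Smallest column set such that every row has a 1 in a chosen column.

    Exhaustive search over column subsets; only practical for tiny examples.
    """
    for size in range(n_cols + 1):
        for cols in combinations(range(n_cols), size):
            if all(any(row[c] for c in cols) for row in rows):
                return cols
    return None

def add_xor_pairs(vectors):
    """Append compound variables x_i XOR x_j (a degree-2 linear transformation)."""
    n = len(vectors[0])
    pairs = list(combinations(range(n), 2))
    extended = [tuple(v) + tuple(v[i] ^ v[j] for i, j in pairs) for v in vectors]
    labels = [f"x{i + 1}" for i in range(n)] + [f"x{i + 1}^x{j + 1}" for i, j in pairs]
    return extended, labels

# Hypothetical registered vectors (a 1-out-of-4 code); all other inputs are don't cares.
registered = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

# Covering with the original variables only.
plain = minimum_cover(difference_matrix(registered), 4)

# Covering after adding compound variables.
extended, labels = add_xor_pairs(registered)
best = minimum_cover(difference_matrix(extended), len(labels))

print("original variables needed:", [labels[c] for c in plain])
print("with XOR compound variables:", [labels[c] for c in best])
```

In this toy case three original variables are needed to tell the four registered vectors apart, while two compound variables suffice, which is the kind of reduction the covering-table modification aims for.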
Akihiro SUDA Hideki TAKASE Kazuyoshi TAKAGI Naofumi TAKAGI
We propose a method for synthesizing nested loops into parallelized circuits by integrating polyhedral optimization, which is a state-of-the-art technique in the field of software, into high-level synthesis. Our method constructs circuits equipped with multiple processing elements (PEs), using information generated by the polyhedral optimizing compiler. Since multiple PEs cannot concurrently access the off-chip RAM, a method for constructing on-chip buffers is also proposed. Our buffering method reduces the off-chip RAM access conflicts and further enables burst accesses and data reuse. In our experimental results, the buffered circuits generated by our method are 8.2 times faster on average and 26.5 times faster at maximum than the sequential non-buffered ones, when each of the parallelized circuits is configured with eight PEs.
Michitarou YABUUCHI Ryo KISHIDA Kazutoshi KOBAYASHI
We analyze the correlation between BTI (Bias Temperature Instability)-induced degradations and process variations. Those reliability issues are correlated. BTI is one of the most significant aging degradations in LSIs. Threshold voltages of MOSFETs increase with time when bias stress is applied to their gates. BTI has a strong effect on highly scaled LSIs, in the same way as process variations do. Accurate prediction of the combined effects is indispensable. We should analyze both aging degradations and process variations of MOSFETs to explain the correlation. We measure frequencies of ROs (Ring Oscillators) of 65-nm process test circuits on two types of LSIs, ASICs and FPGAs. There are 98 and 837 ROs on our ASICs and FPGAs, respectively. The frequencies of the ROs follow Gaussian distributions. We describe the highest frequency group as the “fast” condition, the average group as the “typical” condition and the lowest group as the “slow” condition. We measure the aging degradations of the ROs of the three conditions in an accelerated test. The degradations can be approximated by a logarithmic function of stress time. The degradation at the “fast” condition has a higher impact on the frequency than that at the “slow” one. The correlation coefficient is 0.338. In this case, we can define a smaller design margin for BTI-induced degradations than that without considering the correlation, because the degradation at the “slow” condition is smaller than those at the “typical” and “fast” conditions.
Industrial applications such as automotive ones require a cheap communication mechanism that delivers messages from node to node by their deadlines. This paper presents a design paradigm in which we optimize both the assignment of network nodes to buses and the slot multiplexing of a FlexRay network system under hard real-time constraints, so that we can minimize the cost of the wire harness for the FlexRay network system. We formulate the cost minimization problem as a non-linear model. We developed a network synthesis tool based on simulated annealing. Our experimental results show that our design paradigm achieved a 50.0% lower cost than a previously proposed approach for a virtual cost model.
Yiqiang SHENG Atsushi TAKAHASHI
In this paper, we investigate a novel high-performance heuristic algorithm, named the relay-race algorithm (RRA), which was proposed to approach a global optimal solution for NP-hard problems by exploring similar local optimal solutions more efficiently within a shorter runtime. RRA includes three basic parts: rough search, focusing search and relay. The rough search is designed to get over small hills in the solution space and to approach a local optimal solution as fast as possible. The focusing search is designed to get as close as possible to the local optimal solution. The relay is designed to escape from the local optimal solution in only one step while maintaining search continuity. As a typical application, the multi-objective placement problem in physical design optimization is solved by the proposed RRA. Experiments confirm that the computational performance is considerably improved. RRA achieves an overall Pareto improvement of two conflicting objectives: power consumption and maximal delay. RRA has potential applications in improving existing search methods for harder problems.
Satoshi HASHIMOTO Takahiro TANAKA Kazuaki AOKI Kinya FUJITA
Frequently interrupting someone who is busy will decrease his or her productivity. To minimize this risk, a number of interruptibility estimation methods based on PC activity such as typing or mouse clicks have been developed. However, these estimation methods do not take account of the effect of conversations on the interruptibility of office workers engaged in intellectual activities such as scientific research. This study proposes an interruptibility estimation method that takes account of the conversation status. Two conversation indices, “In conversation” and “End of conversation”, were used in a method that we developed based on our analysis of 50 hours' worth of recorded activity. Experiments using the conversation status as judged by the Wizard-of-OZ method demonstrated that the estimation accuracy can be improved by the two indices. Furthermore, an automatic conversation status recognition system was developed to replace the Wizard-of-OZ procedure. The results of using it for interruptibility estimation suggest the effectiveness of the automatically recognized conversation status.
Jie GUO Bin SONG Fang TIAN Haixiao LIU Hao QIN
For compressed sensing, to address problems which do not involve reconstruction, a correlation analysis between measurements and the transform coefficients is proposed. It is shown that there is a linear relationship between them, which indicates that we can abstract the inner property of images directly in the measurement domain.
A generalized chirp-like (GCL) sequence of period N is constructed by modulating a Zadoff-Chu sequence of period N with an arbitrary unimodular sequence of period m, where m divides N. Under some specific conditions, the cross-correlations between two GCL sequences are shown to have exactly the same magnitudes as those of their corresponding Zadoff-Chu sequences regardless of the employed unimodular sequences. In this paper, we first investigate the sufficient conditions under which such a relation holds. We then use them to construct a new class of optimal zero-correlation zone (ZCZ) sequence sets which can be considered to be an extension of the so-called GCL-ZCZ sequence sets.
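The construction described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the odd-period Zadoff-Chu formula, the choice N = 9 and m = 3 (so that N = s·m², the classic condition for ideal GCL autocorrelation), the root u = 1, and the phases of the unimodular sequence `b` are not taken from the abstract. The sufficient conditions on cross-correlation that the paper investigates are not reproduced here.

```python
import cmath
from math import gcd

def zadoff_chu(u, N):
    """Zadoff-Chu sequence of odd period N with root u, where gcd(u, N) = 1."""
    if N % 2 == 0 or gcd(u, N) != 1:
        raise ValueError("this sketch assumes odd N and gcd(u, N) = 1")
    return [cmath.exp(-1j * cmath.pi * u * k * (k + 1) / N) for k in range(N)]

def gcl(u, N, b):
    """Modulate the Zadoff-Chu sequence with a unimodular sequence b of period m = len(b)."""
    m = len(b)
    if N % m != 0:
        raise ValueError("m must divide N")
    a = zadoff_chu(u, N)
    return [a[k] * b[k % m] for k in range(N)]

def periodic_correlation(x, y, shift):
    """Periodic correlation of x and y at the given shift."""
    N = len(x)
    return sum(x[(k + shift) % N] * y[k].conjugate() for k in range(N))

# Hypothetical parameters: N = 9 = 1 * 3**2, m = 3, arbitrary phases for b.
N, u = 9, 1
b = [cmath.exp(1j * phase) for phase in (0.3, 1.7, -2.1)]
zc = zadoff_chu(u, N)
s = gcl(u, N, b)

# Off-peak autocorrelation magnitudes of both sequences.
zc_offpeak = [abs(periodic_correlation(zc, zc, t)) for t in range(1, N)]
gcl_offpeak = [abs(periodic_correlation(s, s, t)) for t in range(1, N)]
print(max(zc_offpeak), max(gcl_offpeak))
```

Both printed maxima are zero up to floating-point rounding, showing that in this setting the modulation by `b` leaves the ideal autocorrelation of the Zadoff-Chu sequence intact.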
We propose an unsharp-masking technique which preserves the hue of colors in images. This method magnifies the contrast of colors and spatially sharpens textures in images. The contrast magnification ratio is adaptively controlled. We show by experiments that this method enhances the color tone of photographs while keeping their perceptual scene depth.
Kanako YAMAGUCHI Huu Phu BUI Yasutaka OGAWA Toshihiko NISHIMURA Takeo OHGANE
Although multi-user multiple-input multiple-output (MIMO) systems provide high data rate transmission, they may suffer from interference. Block diagonalization and eigenbeam-space division multiplexing (E-SDM) can suppress interference. The transmitter needs to determine beamforming weights from channel state information (CSI) to use these techniques. However, MIMO channels change in time-varying environments during the time intervals between when transmission parameters are determined and actual MIMO transmission occurs. The outdated CSI causes interference and seriously degrades the quality of transmission. Channel prediction schemes have been developed to mitigate the effects of outdated CSI. We evaluated the accuracy of prediction of autoregressive (AR)-model-based prediction and Lagrange extrapolation in the presence of channel estimation error. We found that Lagrange extrapolation was easy to implement and that it provided performance comparable to that obtained with the AR-model-based technique.
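The ease of implementation claimed for Lagrange extrapolation can be seen in a minimal sketch. It assumes equally spaced past channel estimates and one-step-ahead prediction; the function name, the polynomial order, and the sample values are hypothetical and are not taken from the paper.

```python
from math import comb

def lagrange_extrapolate(samples, order):
    """Predict the next equally spaced sample from the last order + 1 samples.

    Fits the unique polynomial of degree `order` through the most recent
    samples and evaluates it one step ahead, using the closed form
    h[n+1] = sum_{i=1..p} (-1)**(i+1) * C(p, i) * h[n+1-i], with p = order + 1.
    """
    p = order + 1
    if len(samples) < p:
        raise ValueError("need at least order + 1 past samples")
    return sum((-1) ** (i + 1) * comb(p, i) * samples[-i] for i in range(1, p + 1))

# Hypothetical complex channel coefficients sampled at t = 0, 1, 2, 3.
history = [complex(t * t, 2 * t) for t in range(4)]

first_order = lagrange_extrapolate(history, 1)   # 2*h[n] - h[n-1]
second_order = lagrange_extrapolate(history, 2)  # 3*h[n] - 3*h[n-1] + h[n-2]
print(first_order, second_order)
```

The predictor is a fixed weighted sum of recent estimates, so no statistics of the channel need to be estimated, unlike the coefficients of an AR model.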
Suyong EUM Masahiro JIBIKI Masayuki MURATA Hitoshi ASAEDA Nozomu NISHINAGA
This article introduces a self-organizing model which builds the topology of a DHT mapping system for ICN. Due to its self-organizing operation and low average degree of maintenance, the management overhead of the system is reduced dramatically, which yields inherent scalability. The proposed model can improve latency by around 10% compared to an existing approach which has a near optimal average distance when the number of nodes and degree are given. In particular, its operation is simple which eases maintenance concerns. Moreover, we analyze the model theoretically to provide a deeper understanding of the proposal.
Yohei UMEKI Koji YANAGIDA Shusuke YOSHIMOTO Shintaro IZUMI Masahiko YOSHIMOTO Hiroshi KAWAGUCHI Koji TSUNODA Toshihiro SUGII
This paper reports a 65nm 8Mb spin transfer torque magnetoresistance random access memory (STT-MRAM) operating at a single supply voltage with a process-variation-tolerant sense amplifier. The proposed sense amplifier comprises a boosted-gate nMOS and negative-resistance pMOSs as loads, which maximizes the readout margin at any process corner. The STT-MRAM achieves a cycle time of 1.9µs (=0.526MHz) at 0.38V. The operating power is 1.70µW at this voltage. The minimum energy per access is 1.12 pJ/bit when the supply voltage is 0.44V. The proposed STT-MRAM operates at a lower energy than an SRAM when the utilization of the memory bandwidth is 14% or less.
Fumio TERAOKA Sho KANEMARU Kazuma YONEMURA Motoki IDE Shinji KAWAGUCHI Kunitake KANEKO
Using a “clean-slate approach” to redesign the Internet has attracted considerable attention. ZNA (Z Network Architecture) is one of the clean-slate network architectures based on the layered model. The major features of ZNA are as follows: (1) introducing the session layer to provide the applications with sophisticated communication services, (2) employing inter-node cross-layer cooperation to adapt to the dynamically changing network conditions, (3) splitting the node identifier and the node locator for mobility, multi-homing, and heterogeneity of network layer protocols, (4) splitting the data plane and the control plane for high manageability, and (5) introducing a recursive layered model to support network virtualization. This paper focuses on the first three topics as well as the basic design of ZNA.
Tetsunao MATSUTA Tomohiko UYEMATSU
In this paper, we consider the lossy source coding problem with delayed side information at the decoder. We assume that delay is unknown but the maximum of delay is known to the encoder and the decoder, where we allow the maximum of delay to change with the block length. In this coding problem, we show an upper bound and a lower bound of the rate-distortion (RD) function, where the RD function is the infimum of rates of codes in which the distortion between the source sequence and the reproduction sequence satisfies a certain distortion level. We also clarify that the upper bound coincides with the lower bound when maximums of delay per block length converge to a constant. Then, we give a necessary and sufficient condition in which the RD function is equal to that for the case without delay. Furthermore, we give an example of a source which does not satisfy this necessary and sufficient condition.
Nozomi MIYA Tota SUKO Goki YASUDA Toshiyasu MATSUSHIMA
In this paper, sequential prediction is studied. The typical assumptions about the probabilistic model in sequential prediction fall into the following two cases. One is the case in which a certain probabilistic model is given and the parameters are unknown. The other is the case in which not a certain probabilistic model but a class of probabilistic models is given and the parameters are unknown. If there exist some parameters and some models such that the distributions identified by them equal the source distribution, an assumed model or a class of models can represent the source distribution. In this case, the specifiable condition is said to be satisfied. In this study, the decision based on the Bayesian principle is made for a class of probabilistic models (not for a certain probabilistic model). The case in which the specifiable condition is not satisfied is studied. Then, the asymptotic behaviors of the cumulative logarithmic loss for an individual sequence, in the sense of almost sure convergence, and of the expected loss, i.e., the redundancy, are analyzed, and the constant terms of the asymptotic equations are identified.
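The quantity analyzed in this abstract, the cumulative logarithmic loss of a Bayesian predictor over a class of models, can be illustrated with a small sketch. The finite class of Bernoulli parameters, the uniform prior, and the binary sequence below are assumptions made for illustration; the paper's model classes and its asymptotic analysis are not reproduced.

```python
from math import log2

def bayes_mixture_log_loss(sequence, thetas, prior):
    """Cumulative log loss (in bits) of the Bayes mixture predictor.

    `thetas` are candidate Bernoulli parameters P(x = 1) and `prior` their
    prior weights. Each symbol is predicted with the posterior-weighted
    mixture, then the posterior is updated by Bayes' rule.
    """
    posterior = list(prior)
    total = 0.0
    for x in sequence:
        likelihoods = [t if x == 1 else 1.0 - t for t in thetas]
        predictive = sum(w * l for w, l in zip(posterior, likelihoods))
        total += -log2(predictive)
        posterior = [w * l / predictive for w, l in zip(posterior, likelihoods)]
    return total

# Hypothetical model class and individual sequence.
thetas = [0.2, 0.5, 0.8]
prior = [1.0 / 3.0] * 3
sequence = [1, 0, 1, 1]

loss = bayes_mixture_log_loss(sequence, thetas, prior)
print(loss)
```

By the chain rule, the cumulative loss equals the negative logarithm of the prior-weighted probability that the class assigns to the whole sequence, which is why its asymptotic behavior can be studied for individual sequences.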
The index generation function is a multi-valued logic function which checks whether the given input vector is registered or not, and returns its index value if the vector is registered. If the latency of the operation is critical, dedicated hardware is used for implementing index generation functions. This paper proposes a method for implementing index generation functions using a parallel index generator. A novel and efficient algorithm called ‘conflict free partitioning’ is proposed to synthesize parallel index generators. Experimental results show that the proposed method outperforms other existing methods. Also, a novel index generator architecture which is suitable for parallelized implementation is introduced. The new architecture has advantages in terms of both area and delay.
Hideki TAKASE Gang ZENG Lovic GAUTHIER Hirotaka KAWASHIMA Noritoshi ATSUMI Tomohiro TATEMATSU Yoshitake KOBAYASHI Takenori KOSHIRO Tohru ISHIHARA Hiroyuki TOMIYAMA Hiroaki TAKADA
This paper presents a framework for reducing the energy consumption of embedded real-time systems. We implemented the presented framework as both an optimization toolchain and an energy-aware real-time operating system. The framework consists of the integration of multiple techniques to optimize the energy consumption. The main idea behind our approach is to utilize trade-offs between the energy consumption and the performance of different processor configurations during task checkpoints, and to maintain memory allocation during task context switches. In our framework, a target application is statically analyzed at both intra-task and inter-task levels. Based on these analyzed results, runtime optimization is performed in response to the behavior of the application. A case study shows that our toolchain and real-time operating systems have achieved energy reduction while satisfying the real-time performance. The toolchain has also been successfully applied to a practical application.
While Triple Modular Redundancy (TMR) is effective in eliminating soft errors in LSIs, the overhead of the triplicated area as well as the triplicated energy consumption is a problem. In addition to the spatial TMR mode, where executions are simply triplicated and the majority is taken, a temporal TMR mode is available, where only two copies of an operation are executed and the results are compared; if the results differ, the third copy is executed to get the correct result. Appropriately selecting the power supply voltage is also an effective technique to reduce the energy consumption. In this paper, a method to derive a TMR design is proposed which selects the TMR mode and supply voltage for each operation to minimize the energy consumption within the time and area constraints.
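The difference between the two TMR modes can be sketched in software. This is a behavioral illustration only, assuming at most one faulty execution per operation; the helper `make_flaky`, which injects a bit flip on chosen calls, and the returned execution count are hypothetical devices for the example, and the paper's voltage selection and scheduling are not modeled.

```python
from collections import Counter

def spatial_tmr(op, x):
    """Execute three copies and take the majority. Always costs 3 executions."""
    results = [op(x), op(x), op(x)]
    value, _ = Counter(results).most_common(1)[0]
    return value, 3

def temporal_tmr(op, x):
    """Execute two copies; run the third only if the first two disagree."""
    first, second = op(x), op(x)
    if first == second:
        return first, 2
    third = op(x)
    value, _ = Counter([first, second, third]).most_common(1)[0]
    return value, 3

def make_flaky(fn, faulty_calls):
    """Wrap fn so that its result has bit 0 flipped on the listed call indices."""
    count = 0
    def op(x):
        nonlocal count
        result = fn(x)
        if count in faulty_calls:
            result ^= 1  # simulated soft error
        count += 1
        return result
    return op

def increment(x):
    return x + 1

print(spatial_tmr(make_flaky(increment, {0}), 4))
print(temporal_tmr(make_flaky(increment, set()), 4))
print(temporal_tmr(make_flaky(increment, {1}), 4))
```

The second element of each result is the number of executions spent. In the fault-free case the temporal mode uses two executions instead of three, which is the energy saving that the mode selection trades against the extra time needed when a third execution is required.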
Ryo YAMAGUCHI Shouhei KIDERA Tetsuo KIRIMOTO
Radar systems using ultra-wideband (UWB) signals have definitive advantages in high range resolution. These are suitable for accurate 3-dimensional (3-D) sensing by rescue robots operating in disaster zone settings, where optical sensing is not applicable because of thick smog or high-density gas. For such applications, where no a priori information on target shape and position is given, an accurate method for 3-D imaging and motion estimation is strongly required for effective target recognition. In addressing this issue, we have already proposed a non-parametric 2-dimensional (2-D) imaging method for a target with arbitrary shape and motion, including rotation and translation, tracked using a multi-static radar system. This is based on matching target boundary points obtained using the range points migration (RPM) method extended to the multi-static radar system. Whereas this method accomplishes accurate imaging and motion estimation for single targets, accuracy is degraded severely for multiple targets, due to interference effects. To resolve this difficulty, this paper proposes a method based on a novel matching scheme using not only target points but also normal vectors on the target boundary estimated by the Envelope method; interference effects are effectively suppressed when incorporating the RPM approach. Results from numerical simulations for both 2-D and 3-D models show that the proposed method simultaneously achieves accurate target imaging and motion tracking, even for multiple moving targets.
To support the processing of spatial window queries efficiently in a non-flat wireless data broadcasting system, we propose a Two-Tier Spatial Index (TTSI) that uses a two tier data space to distinguish hot and regular data items. Unlike an existing index which repeats regular data items located near hot items at the same time as the hot data items during the broadcast cycle, TTSI makes it possible to repeat only hot data items during a cycle. Simulations show that the proposed TTSI outperforms the existing scheme with respect to access time and energy consumption.