Tsuyoshi SADAKATA Yusuke MATSUNAGA
A Multi-Functional unit provides several functions, which can be selected with a control signal. In High-Level Synthesis, using Multi-Functional units in operation chaining makes it possible to obtain a solution with the same number of control steps and fewer resources than is possible without them. This paper proposes an operation chaining method that considers Multi-Functional units. The method formulates module selection, scheduling, and functional unit allocation with operation chaining as a 0/1 integer linear problem and obtains an optimal solution with the minimum number of control steps under area and clock-cycle type constraints. The first contribution of this paper is a global search for operation chaining with Multi-Functional units that have multiple outputs as well as those with a single output. The second contribution is to consider the area constraint as the resource constraint, instead of the type and number of functional units. Experimental results show that chaining with Multi-Functional units is effective and that the proposed method is useful for evaluating heuristic algorithms.
We propose a new tableau construction which builds an FSM, instead of a Kripke structure, from a formula in a class of temporal logic named ASTL. This FSM is a maximal model of the formula under the preorder derived from simulation relations. Additionally, we propose a method using the tableaus to build controllers in a certain topology of interconnected FSMs. We can use ASTL to describe the desired behaviors of the control system. This method is applicable to generating digital circuits. Moreover, this method accepts a wider range of specifications than conventional methods.
Jiang ZHANG Li-Feng SUN Yun TANG Shi-Qiang YANG
Key agreement for collaborative groups has become an increasingly popular research area. However, most previous work requires each member not only to maintain the whole key tree structure, whose size is O(N), where N is the size of the group, but also to take part in a rekeying operation upon each membership change. This results in high storage, communication, and computation costs and thus poor scalability. In this paper, we propose a scalable Distributed and collaborative group key agreement scheme using a Virtual Key Tree (D-VKT). Each group member in D-VKT keeps and maintains only partial information about the whole key tree structure, with a requirement of O(log N). Furthermore, a distributed tree balancing algorithm is presented to keep the whole key tree as balanced as possible for rekeying efficiency. In addition, a distributed group batch rekeying protocol is proposed to further reduce the computation and communication workload of group rekeying in a highly dynamic environment. Experimental results demonstrate that D-VKT can scale to large and dynamic collaborative groups.
Hideki NODA Yohsuke TSUKAMIZU Michiharu NIIMI
This paper presents two steganographic methods for JPEG2000 still images which approximately preserve the histograms of discrete wavelet transform coefficients. Compared with conventional JPEG2000 steganography, the two methods show better histogram preservation. The proposed methods are promising candidates for JPEG2000 steganography that is secure against histogram-based attacks.
Due to its importance in engineering applications, the bilinear transformation has been studied extensively in the literature. In this letter, two new algorithms are presented to compute the transformation matrix for the bilinear s-z transformation.
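For orientation, the bilinear s-z transformation substitutes s = (2/T)(z-1)/(z+1) into a rational transfer function H(s). The sketch below is a minimal illustration by direct polynomial expansion; it is not one of the two matrix algorithms of the letter, which the abstract does not describe. The function names, the ascending-power coefficient convention, and the sampling period T are assumptions made for this example.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as ascending-power coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, k):
    """Raise polynomial p to the non-negative integer power k."""
    out = [1.0]
    for _ in range(k):
        out = poly_mul(out, p)
    return out


def bilinear(num, den, T):
    """Map H(s) = num(s)/den(s) to H(z) using s = (2/T) * (z - 1) / (z + 1).

    num and den hold coefficients in ascending powers of s. The result is a
    pair of coefficient lists in ascending powers of z, scaled so that the
    highest-order denominator coefficient is 1 (assumed to be nonzero).
    """
    c = 2.0 / T
    n = max(len(num), len(den)) - 1  # common order after clearing (z + 1)^n

    def transform(p):
        acc = [0.0] * (n + 1)
        for k, pk in enumerate(p):
            # s^k * (z + 1)^n  ->  c^k * (z - 1)^k * (z + 1)^(n - k)
            term = poly_mul(poly_pow([-1.0, 1.0], k), poly_pow([1.0, 1.0], n - k))
            for i, t in enumerate(term):
                acc[i] += pk * (c ** k) * t
        return acc

    b, a = transform(num), transform(den)
    lead = a[-1]
    return [x / lead for x in b], [x / lead for x in a]
```

For example, `bilinear([1.0], [1.0, 1.0], 2.0)` maps H(s) = 1/(1 + s) with T = 2 to H(z) = (0.5 + 0.5z)/z.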
Shin'ichi KOUYAMA Tomonori IZUMI Hiroyuki OCHI Yukihiro NAKAMURA
Recently, self-reconfigurable devices, which can be partially reprogrammed by another part of the same device, have been proposed. However, since conventional self-reconfigurable devices are LUT-array-based fine-grained devices, their time efficiency is degraded by the reconfiguration overhead of loading a large amount of configuration data. Their architectures therefore need to be improved. At the architecture design phase, it is difficult to estimate the performance of self-reconfigurable devices, including reconfiguration overhead, by static analysis, since it depends on many architecture parameters and on unpredictable run-time behavior. In this paper, we propose a simulation-based platform for design exploration of self-reconfigurable devices. As a demonstration of the proposed platform, we implement an adaptive load distribution model on devices of various reconfiguration granularities and evaluate their performance.
Qiang LI Jiansong GAN Yunzhou LI Shidong ZHOU Yan YAO
Spatial multiplexing (SM) offers a linear increase in transmission rate without bandwidth expansion or power increase. In SM systems, the LMMSE receiver offers a good tradeoff between complexity and performance. However, the performance of the LMMSE receiver is degraded by MIMO channel estimation errors. This letter focuses on obtaining the asymptotic convergence of the output interference power and SIR performance for the LMMSE receiver with channel uncertainty. Simulation results match the analysis exactly, verifying its validity under the large-system assumption. Furthermore, we find that the analytical results are also valid, in the sense of average results, for limited-scale systems in spite of the asymptotic assumption used in the derivation.
Naoto EGASHIRA Hiroo TAKAYAMA Takahiko SABA
In multi-input multi-output orthogonal frequency division multiplexing (MIMO-OFDM) systems, phase tracking schemes suffer from co-channel interference (CCI) and inter-carrier interference (ICI) caused by residual frequency offset. In this paper, we propose a residual frequency offset compensation scheme using feedback phase tracking to eliminate the effect of both ICI and CCI for MIMO-OFDM systems. The proposed phase tracking scheme periodically estimates the amount of residual frequency offset in the frequency domain and compensates for it in the time domain. Thus, the effect of ICI can be reduced. Furthermore, we consider two methods of channel estimation that enable the system to estimate the channel response several times within a packet to eliminate the effect of CCI. This is because the channel is generally estimated at the beginning of a packet, and this estimation is affected by residual frequency offset. The first method employs midambles; the second reuses the preamble. When the channel is estimated several times within a packet, the effect of CCI can be reduced. Simulation results show that the proposed scheme can compensate for residual frequency offset and CCI more accurately than the conventional scheme and improves the packet error rate (PER) performance.
Yun TANG Lifeng SUN Jianguang LUO Shiqiang YANG Yuzhuo ZHONG
In recent years, the inherent effectiveness of Peer-to-Peer (P2P) networks has been advocated to address scalability issues in large-scale Internet-based on-Demand streaming services. Most existing works adopt the Cache-and-Relay (CR) scheme to exploit a cooperative paradigm among peers. In this paper, we present a practical evaluation study of the scalability of the CR scheme that takes into account more than 20,000,000 collected real traces. Based on trace-driven simulations, we conclude that the CR scheme is not as effective as previously reported in terms of saving server bandwidth.
Yueguang BIAN Youzheng WANG Jing WANG
In this letter, we propose a new modification to the belief propagation (BP) decoding algorithm for Finite-Geometry low-density parity-check (LDPC) codes. The modification is based on introducing feedback into the iterative process, which can break the oscillations of bit log-likelihood ratio (LLR) values. Simulations show that, for a given maximum number of iterations, the "feedback BP" (FBP) algorithm can achieve better performance than the conventional belief propagation algorithm.
Shinya SUENAGA Yoshihiro HAYAKAWA Koji NAKAJIMA
To introduce burst firing, a dynamic feature of nerve cells, we extend the Inverse function Delayed model (ID model), a neuron model that is able to oscillate and has powerful information-processing ability. Burst firing has been discussed in relation to functional roles in the brain and is characterized by repeated patterns of closely spaced action potentials. It is expected that this additional characteristic adds extra functions to neural networks. Using the relation between the ID model and the reduced Hodgkin-Huxley model, we propose a neuron model capable of bursting. The proposed model outperformed the ID model in solving the N-Queen problem. Additionally, a prototype chip for the burst ID model was implemented and measured.
Noriaki ODA Hironori IMURA Naoyoshi KAWAHARA Masayoshi TAGAMI Hiroyuki KUNISHIMA Shuji SONE Sadayuki OHNISHI Kenta YAMADA Yumi KAKUHARA Makoto SEKINE Yoshihiro HAYASHI Kazuyoshi UENO
A novel interconnect design concept named "ASIS (Application-specific Interconnect Structure)" is presented for 45 nm CMOS performance maximization. The basic scheme of ASIS is that the interconnect structure and the metal thickness are individually optimized for each application, such as high-performance, low-power, or high-reliability, in order to maximize chip-level performance for that application. Our investigation shows that for low-power applications, the increased resistivity of scaled-down Cu wire is not a main issue, so thinner wire is more advantageous. For high-performance applications, a partially double-pitch structure for local and intermediate layers is advantageous. For high-reliability requirements, a Cu-Al alloy or a CoWP cap metal is quite effective for boosting reliability.
Hiroyuki KOBAYASHI Nobuto ONO Takashi SATO Jiro IWAI Hidenari NAKASHIMA Takaaki OKUMURA Masanori HASHIMOTO
With recent advances in process technology scaling, process parameter variation has become one of the major issues in SoC design, especially for timing convergence. Statistical Static Timing Analysis (SSTA) has recently been proposed as a promising way to account for process parameter variation, but it has not yet been widely used. To estimate the delay yield, designers have to know and understand the accuracy of SSTA. However, this accuracy has not been thoroughly studied from a practical point of view. This paper proposes two metrics to measure the pessimism/optimism of SSTA: the first corresponds to yield estimation error, and the second examines delay estimation error. We apply the metrics to a problem that has been widely discussed in the SSTA community, namely the normal-distribution approximation of the max operation. We also apply the proposed metrics to benchmark circuits and discuss a potential problem originating from the normal-distribution approximation. Our metrics indicate that the appropriateness of the approximation depends not only on the given input distributions but also on the target yield of the product, which is an important message for SSTA users.
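As an illustration of what a normal-distribution approximation of the max operation looks like, the sketch below applies Clark's moment-matching formulas, a commonly used choice in SSTA, to the maximum of two jointly normal delays. The abstract does not say which approximation the authors analyze, so this is an assumption; the function and parameter names are hypothetical, and the proposed pessimism/optimism metrics are not implemented here.

```python
import math


def norm_pdf(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def clark_max(mu1, sigma1, mu2, sigma2, rho=0.0):
    """Mean and standard deviation of max(X1, X2) by Clark's moment matching.

    X1 ~ N(mu1, sigma1^2) and X2 ~ N(mu2, sigma2^2) with correlation rho.
    The first two moments are exact; treating the result as normal is the
    approximation whose accuracy is in question.
    """
    a2 = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2
    if a2 <= 0.0:
        # X1 - X2 is constant, so the max is the input with the larger mean.
        return (mu1, sigma1) if mu1 >= mu2 else (mu2, sigma2)
    a = math.sqrt(a2)
    alpha = (mu1 - mu2) / a
    first = mu1 * norm_cdf(alpha) + mu2 * norm_cdf(-alpha) + a * norm_pdf(alpha)
    second = ((mu1 ** 2 + sigma1 ** 2) * norm_cdf(alpha)
              + (mu2 ** 2 + sigma2 ** 2) * norm_cdf(-alpha)
              + (mu1 + mu2) * a * norm_pdf(alpha))
    return first, math.sqrt(max(second - first * first, 0.0))
```

For two independent standard normal inputs, the matched mean is 1/sqrt(pi) and the matched variance is 1 - 1/pi. The true distribution of the maximum is skewed, which is why a yield read from the fitted normal can be optimistic or pessimistic depending on the target quantile.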
Recently, research on parallel processing systems is very active, and many complex topologies have been proposed. A burnt pancake graph is one such topology. In this paper, we prove that a faulty burnt pancake graph with degree n has a fault-free Hamiltonian cycle if the number of the faulty elements is n-2 or less, and it has a fault-free Hamiltonian path between any pair of nonfaulty nodes if the number of the faulty elements is n-3 or less.
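To make the topology concrete, the sketch below enumerates the nodes and neighbors of a burnt pancake graph under its usual definition: nodes are signed permutations of 1..n, and two nodes are adjacent when one is obtained from the other by reversing a prefix and negating every element in it. It does not construct the Hamiltonian cycles or paths of the paper, and the function names are assumptions made for this example.

```python
from itertools import permutations, product


def nodes(n):
    """Yield every signed permutation of 1..n (n! * 2^n nodes in total)."""
    for perm in permutations(range(1, n + 1)):
        for signs in product((1, -1), repeat=n):
            yield tuple(s * p for s, p in zip(signs, perm))


def neighbors(node):
    """Return the n neighbors of a node, one per signed prefix reversal."""
    return [
        # Reverse the first i elements and flip their signs.
        tuple(-x for x in reversed(node[:i])) + node[i:]
        for i in range(1, len(node) + 1)
    ]
```

Each node has exactly n neighbors, which matches the degree n in the statement above, and applying the same prefix reversal twice returns the original node, so adjacency is symmetric.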
Ming SHAO Zhenyu LIU Satoshi GOTO Takeshi IKENAGA
Fractional Motion Estimation (FME) is an advanced feature of the H.264/AVC video compression standard that provides quarter-pixel accuracy. Although FME achieves considerably higher encoding efficiency, sub-pixel interpolation and sum of absolute transformed difference (SATD) computation, the main parts of FME, considerably increase the computational complexity. To reduce the complexity of FME, this paper proposes a fully computation-reusable, VLSI-oriented algorithm. By exploiting the similarity among the motion vectors (MVs) of partitions in the same macroblock (MB), temporary computation results can be fully reused. Furthermore, a simple and effective search method is adopted to make the proposed method more suitable for VLSI implementation. Experimental results show that up to 80% of add operations and 85% of internal reference frame memory access operations are saved without any degradation in coding quality.
Thatsanee CHAROENPORN Canasai KRUENGKRAI Thanaruk THEERAMUNKONG Virach SORNLERTLAMVANICH
Manually collecting contexts of a target word and grouping them by meaning yields a set of word senses, but the task is quite tedious. Towards automated lexicography, this paper proposes a word-sense discrimination method based on two techniques: the EM algorithm and principal component analysis (PCA). A spherical Gaussian EM algorithm, enhanced with PCA for robust initialization, is proposed to cluster the word senses of a target word automatically. Three variants of the algorithm, namely PCA, sGEM, and PCA-sGEM, are investigated using a gold standard dataset of two polysemous words. The clustering results are evaluated using the measures of purity and entropy as well as a more recent measure called normalized mutual information (NMI). The experimental results indicate that the proposed algorithms achieve promising performance in discriminating word senses and that PCA-sGEM outperforms the other two methods to some extent.
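The evaluation measures named above can be sketched as follows. This is a minimal illustration of purity and NMI for a clustering compared against gold-standard sense labels; it assumes the variant of NMI that normalizes mutual information by the geometric mean of the two entropies, which may differ from the definition used in the paper. The function names are hypothetical, and `label_entropy` is only the Shannon entropy helper needed for NMI, not the cluster-entropy measure mentioned in the abstract.

```python
import math
from collections import Counter


def purity(clusters, labels):
    """Fraction of items that carry the majority gold label of their cluster."""
    joint = Counter(zip(clusters, labels))
    best = {}
    for (c, _label), count in joint.items():
        best[c] = max(best.get(c, 0), count)
    return sum(best.values()) / len(labels)


def label_entropy(xs):
    """Shannon entropy (natural log) of a list of discrete assignments."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n) for c in Counter(xs).values())


def nmi(clusters, labels):
    """Mutual information normalized by sqrt(H(clusters) * H(labels))."""
    n = len(labels)
    joint = Counter(zip(clusters, labels))
    pc, pl = Counter(clusters), Counter(labels)
    mi = sum(
        (count / n) * math.log(count * n / (pc[c] * pl[l]))
        for (c, l), count in joint.items()
    )
    denom = math.sqrt(label_entropy(clusters) * label_entropy(labels))
    # Assumption: return 0 when either partition is trivial (zero entropy).
    return mi / denom if denom > 0 else 0.0
```

A clustering that reproduces the gold senses exactly scores 1 on both measures, while putting every context into a single cluster gives a purity equal to the share of the most frequent sense and an NMI of 0 under the convention chosen here.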
Yoshitomo SHIRAMIZU Nobuo GOTO
An all-optical analog-to-digital converter consisting of an optical polarization modulator using nonlinear phase shift and polarization-based switches is proposed. The principle of operation is discussed using the Jones matrix. The optical polarization states through the system and the limit of resolution are evaluated. The resolution is optimized by maintaining the polarization state in the converter and refining the polarization of the incident sampling signal. Parallel use of converter modules is proposed to increase the dynamic range, where the cyclic nature of the optical phase plays an important role. Application of our converter to photonic routing is also proposed.
Keiichirou KUSAKARI Yuki CHIBA
The completeness (i.e. confluent and terminating) property is an important concept when using a term rewriting system (TRS) as a computational model of functional programming languages. Knuth and Bendix have proposed a procedure known as the KB procedure for generating a complete TRS. A TRS cannot, however, directly handle higher-order functions that are widely used in functional programming languages. In this paper, we propose a higher-order KB procedure that extends the KB procedure to the framework of a simply-typed term rewriting system (STRS) as an extended TRS that can handle higher-order functions. We discuss the application of this higher-order KB procedure to a certification technique called inductionless induction used in program verification, and its application to fusion transformation, a typical kind of program transformation.
Hiroki IURA Hiroyoshi YAMADA Yasutaka OGAWA Yoshio YAMAGUCHI
An antenna array is an essential component of multiple-input multiple-output (MIMO) wireless systems. Since the antenna array is composed of closely spaced elements, the mutual coupling among the elements cannot be ignored if the array is to achieve its best performance. Mutual coupling affects the MIMO channel, so the performance of a MIMO system, including channel capacity and diversity, varies with the degree of mutual coupling. The effect of mutual coupling is a function of the antenna load impedance. Therefore, designing an optimal element-matched array for a MIMO system requires consideration of the optimal matching condition for the array elements, that is, the one that maximizes the channel capacity. We evaluated the effects of mutual coupling under various matching conditions in dipole arrays and investigated their effects on the path correlation and channel capacity of MIMO systems. Simulations showed that the conventional conjugate matching of each element is still suitable for closely spaced elements except when the separation is less than about 0.1λ. A theoretical consideration of the received power of a closely-spaced-element array is also provided to show the effects of mutual coupling.
Masahiko OMURA Toshiki KANAMOTO Michiko TSUKAMOTO Mitsutoshi SHIROTA Takashi NAKAJIMA Masayuki TERAI
This paper proposes a new, efficient method of characterizing a memory compiler in order to reduce computation time and remove human error. Three features make our method efficient: (1) high-speed circuit simulation of the whole memory module using a hierarchical LPE (Layout Parasitic Extractor) and a hierarchical circuit simulator, (2) automatic generation of circuit simulation input data from a corresponding parameterized description termed the template file, and (3) careful selection of the environmental conditions of the circuit-level simulator together with minimization of the number of simulator runs. We demonstrate the effectiveness of the proposed method by applying it to single-port SRAM generators using 90 nm CMOS technology.