Zhenhai TAN Yun YANG Xiaoman WANG Fayez ALQAHTANI
Chenrui CHANG Tongwei LU Feng YAO
Takuma TSUCHIDA Rikuho MIYATA Hironori WASHIZAKI Kensuke SUMOTO Nobukazu YOSHIOKA Yoshiaki FUKAZAWA
Shoichi HIROSE Kazuhiko MINEMATSU
Toshimitsu USHIO
Yuta FUKUDA Kota YOSHIDA Takeshi FUJINO
Qingping YU Yuan SUN You ZHANG Longye WANG Xingwang LI
Qiuyu XU Kanghui ZHAO Tao LU Zhongyuan WANG Ruimin HU
Lei Zhang Xi-Lin Guo Guang Han Di-Hui Zeng
Meng HUANG Honglei WEI
Yang LIU Jialong WEI Shujian ZHAO Wenhua XIE Niankuan CHEN Jie LI Xin CHEN Kaixuan YANG Yongwei LI Zhen ZHAO
Ngoc-Son DUONG Lan-Nhi VU THI Sinh-Cong LAM Phuong-Dung CHU THI Thai-Mai DINH THI
Lan XIE Qiang WANG Yongqiang JI Yu GU Gaozheng XU Zheng ZHU Yuxing WANG Yuwei LI
Jihui LIU Hui ZHANG Wei SU Rong LUO
Shota NAKAYAMA Koichi KOBAYASHI Yuh YAMASHITA
Wataru NAKAMURA Kenta TAKAHASHI
Chunfeng FU Renjie JIN Longjiang QU Zijian ZHOU
Masaki KOBAYASHI
Shinichi NISHIZAWA Masahiro MATSUDA Shinji KIMURA
Keisuke FUKADA Tatsuhiko SHIRAI Nozomu TOGAWA
Yuta NAGAHAMA Tetsuya MANABE
Baoxian Wang Ze Gao Hongbin Xu Shoupeng Qin Zhao Tan Xuchao Shi
Maki TSUKAHARA Yusaku HARADA Haruka HIRATA Daiki MIYAHARA Yang LI Yuko HARA-AZUMI Kazuo SAKIYAMA
Guijie LIN Jianxiao XIE Zejun ZHANG
Hiroki FURUE Yasuhiko IKEMATSU
Longye WANG Lingguo KONG Xiaoli ZENG Qingping YU
Ayaka FUJITA Mashiho MUKAIDA Tadahiro AZETSU Noriaki SUETAKE
Xingan SHA Masao YANAGISAWA Youhua SHI
Jiqian XU Lijin FANG Qiankun ZHAO Yingcai WAN Yue GAO Huaizhen WANG
Sei TAKANO Mitsuji MUNEYASU Soh YOSHIDA Akira ASANO Nanae DEWAKE Nobuo YOSHINARI Keiichi UCHIDA
Kohei DOI Takeshi SUGAWARA
Yuta FUKUDA Kota YOSHIDA Takeshi FUJINO
Mingjie LIU Chunyang WANG Jian GONG Ming TAN Changlin ZHOU
Hironori UCHIKAWA Manabu HAGIWARA
Atsuko MIYAJI Tatsuhiro YAMATSUKI Tomoka TAKAHASHI Ping-Lun WANG Tomoaki MIMOTO
Kazuya TANIGUCHI Satoshi TAYU Atsushi TAKAHASHI Mathieu MOLONGO Makoto MINAMI Katsuya NISHIOKA
Masayuki SHIMODA Atsushi TAKAHASHI
Yuya Ichikawa Naoko Misawa Chihiro Matsui Ken Takeuchi
Katsutoshi OTSUKA Kazuhito ITO
Rei UEDA Tsunato NAKAI Kota YOSHIDA Takeshi FUJINO
Motonari OHTSUKA Takahiro ISHIMARU Yuta TSUKIE Shingo KUKITA Kohtaro WATANABE
Iori KODAMA Tetsuya KOJIMA
Yusuke MATSUOKA
Yosuke SUGIURA Ryota NOGUCHI Tetsuya SHIMAMURA
Tadashi WADAYAMA Ayano NAKAI-KASAI
Li Cheng Huaixing Wang
Beining ZHANG Xile ZHANG Qin WANG Guan GUI Lin SHAN
Sicheng LIU Kaiyu WANG Haichuan YANG Tao ZHENG Zhenyu LEI Meng JIA Shangce GAO
Kun ZHOU Zejun ZHANG Xu TANG Wen XU Jianxiao XIE Changbing TANG
Soh YOSHIDA Nozomi YATOH Mitsuji MUNEYASU
Ryo YOSHIDA Soh YOSHIDA Mitsuji MUNEYASU
Nichika YUGE Hiroyuki ISHIHARA Morikazu NAKAMURA Takayuki NAKACHI
Ling ZHU Takayuki NAKACHI Bai ZHANG Yitu WANG
Toshiyuki MIYAMOTO Hiroki AKAMATSU
Yanchao LIU Xina CHENG Takeshi IKENAGA
Kengo HASHIMOTO Ken-ichi IWATA
Shota TOYOOKA Yoshinobu KAJIKAWA
Kyohei SUDO Keisuke HARA Masayuki TEZUKA Yusuke YOSHIDA
Hiroshi FUJISAKI
Tota SUKO Manabu KOBAYASHI
Akira KAMATSUKA Koki KAZAMA Takahiro YOSHIDA
Tingyuan NIE Jingjing NIE Kun ZHAO
Xinyu TIAN Hongyu HAN Limengnan ZHOU Hanzhou WU
Shibo DONG Haotian LI Yifei YANG Jiatianyi YU Zhenyu LEI Shangce GAO
Kengo NAKATA Daisuke MIYASHITA Jun DEGUCHI Ryuichi FUJIMOTO
Jie REN Minglin LIU Lisheng LI Shuai LI Mu FANG Wenbin LIU Yang LIU Haidong YU Shidong ZHANG
Ken NAKAMURA Takayuki NOZAKI
Yun LIANG Degui YAO Yang GAO Kaihua JIANG
Guanqun SHEN Kaikai CHI Osama ALFARRAJ Amr TOLBA
Zewei HE Zixuan CHEN Guizhong FU Yangming ZHENG Zhe-Ming LU
Bowen ZHANG Chang ZHANG Di YAO Xin ZHANG
Zhihao LI Ruihu LI Chaofeng GUAN Liangdong LU Hao SONG Qiang FU
Kenji UEHARA Kunihiko HIRAISHI
David CLARINO Shohei KURODA Shigeru YAMASHITA
Qi QI Zi TENG Hongmei HUO Ming XU Bing BAI
Ling Wang Zhongqiang Luo
Zongxiang YI Qiuxia XU
Donghoon CHANG Deukjo HONG Jinkeon KANG
Xiaowu LI Wei CUI Runxin LI Lianyin JIA Jinguo YOU
Zhang HUAGUO Xu WENJIE Li LIANGLIANG Liao HONGSHU
Seonkyu KIM Myoungsu SHIN Hanbeom SHIN Insung KIM Sunyeop KIM Donggeun KWON Deukjo HONG Jaechul SUNG Seokhie HONG
Manabu HAGIWARA
Tetsunao MATSUTA Tomohiko UYEMATSU
In this paper, we consider the lossy source coding problem with delayed side information at the decoder. We assume that the delay is unknown but that the maximum delay is known to both the encoder and the decoder, where we allow the maximum delay to change with the block length. For this coding problem, we show an upper bound and a lower bound on the rate-distortion (RD) function, where the RD function is the infimum of rates of codes for which the distortion between the source sequence and the reproduction sequence satisfies a certain distortion level. We also clarify that the upper bound coincides with the lower bound when the maximum delay per block length converges to a constant. Then, we give a necessary and sufficient condition under which the RD function is equal to that for the case without delay. Furthermore, we give an example of a source that does not satisfy this necessary and sufficient condition.
Masanori HIROTOMO Masakatu MORII
In this paper, we propose an efficient method for computing the weight spectrum of LDPC convolutional codes based on circulant matrices of quasi-cyclic codes. In the proposed method, we reduce the memory size of their parity-check matrices while keeping the same distance profile as the original codes, and apply a forward and backward tree search algorithm to the parity-check matrices of reduced memory. We show numerical results of computing the free distance and the low-part weight spectrum of LDPC convolutional codes with memory of about 130.
Daichi YUGAWA Tadashi WADAYAMA
An Invertible Bloom Lookup Table (IBLT) is a data structure that supports insertion, deletion, retrieval, and listing operations for key-value pairs. An IBLT can be used to realize efficient set reconciliation for database synchronization. The most notable feature of the IBLT is the complete listing operation of key-value pairs, based on an algorithm similar to the peeling algorithm for low-density parity-check (LDPC) codes. In this paper, we present a stopping set (SS) analysis for the IBLT that reveals the finite-length behavior of the listing failure probability. The key to the analysis is the enumeration of the number of stopping matrices of a given size. We derive a novel recursive formula useful for computationally efficient enumeration. An upper bound on the listing failure probability based on the union bound accurately captures the error floor behavior.
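The operations named in this abstract can be illustrated with a toy table. The following is a minimal sketch, not the construction analyzed in the paper: the class name `IBLT`, the cell layout, the SHA-256-based hashing, the restriction to non-negative integer keys and values, and the default sizes are all assumptions made for illustration. Listing repeatedly peels "pure" cells (count 1 with a matching checksum) and fails when no pure cell remains while the table is still non-empty.

```python
import hashlib


class IBLT:
    """Toy invertible Bloom lookup table for non-negative integer keys and values."""

    def __init__(self, cells_per_hash=8, num_hashes=3):
        self.k = num_hashes
        self.w = cells_per_hash
        # Each cell stores [count, XOR of keys, XOR of values, XOR of key checksums].
        self.cells = [[0, 0, 0, 0] for _ in range(self.k * self.w)]

    def _digest(self, tag, key):
        h = hashlib.sha256(f"{tag}:{key}".encode()).digest()
        return int.from_bytes(h[:8], "big")

    def _indices(self, key):
        # One cell per hash function, each in its own sub-table, so indices are distinct.
        return [j * self.w + self._digest(j, key) % self.w for j in range(self.k)]

    def _update(self, key, value, sign):
        check = self._digest("chk", key)
        for i in self._indices(key):
            cell = self.cells[i]
            cell[0] += sign
            cell[1] ^= key
            cell[2] ^= value
            cell[3] ^= check

    def insert(self, key, value):
        self._update(key, value, +1)

    def delete(self, key, value):
        self._update(key, value, -1)

    def list_entries(self):
        """Peel pure cells; returns (pairs, success). Empties the table as it goes."""
        pairs = []
        progress = True
        while progress:
            progress = False
            for cell in self.cells:
                if cell[0] == 1 and cell[3] == self._digest("chk", cell[1]):
                    key, value = cell[1], cell[2]
                    pairs.append((key, value))
                    self._update(key, value, -1)  # remove the pair from all its cells
                    progress = True
        # Listing succeeded only if nothing is left in the table.
        success = all(c == [0, 0, 0, 0] for c in self.cells)
        return pairs, success
```

When `list_entries` returns `success == False`, the entries left in the table occupy cells none of which is pure, which is the kind of configuration a stopping set analysis counts.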
The capacity (i.e., maximum flow) of a unicast network is known to be equal to the minimum s-t cut capacity due to the max-flow min-cut theorem. If the topology of a network (or its link capacities) is dynamically changing or unknown, it is not trivial to predict the statistical properties of the maximum flow of the network. In this paper, we present a probabilistic analysis for evaluating the cumulative distribution of the minimum s-t cut capacity on random graphs. The graph ensemble treated in this paper consists of undirected graphs with an arbitrary specified degree distribution. The main contribution of our work is a lower bound on the cumulative distribution of the minimum s-t cut capacity. The key feature of our approach is that it utilizes the correspondence between the cut space of an undirected graph and a binary LDGM (low-density generator-matrix) code. Computer experiments show that the lower bound derived here reflects the actual statistical behavior of the minimum s-t cut capacity of random graphs with specified degrees.
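The max-flow min-cut equality that this abstract starts from can be checked numerically on a small undirected graph. The sketch below is an illustration only and says nothing about the paper's probabilistic bound: it uses the standard Edmonds-Karp algorithm, and the function names and the example graph are assumptions.

```python
from collections import deque
from itertools import combinations


def max_flow(n, edges, s, t):
    """Edmonds-Karp on an undirected graph; edges are (u, v, capacity)."""
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow
        # Bottleneck on the augmenting path, then update residual capacities.
        push, v = float("inf"), t
        while v != s:
            push = min(push, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= push
            cap[v][parent[v]] += push
            v = parent[v]
        flow += push


def min_cut_brute_force(n, edges, s, t):
    """Minimum s-t cut capacity by enumerating every vertex subset with s inside and t outside."""
    others = [v for v in range(n) if v not in (s, t)]
    best = float("inf")
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            side = {s, *extra}
            best = min(best, sum(c for u, v, c in edges if (u in side) != (v in side)))
    return best


# Hypothetical example graph with 4 nodes, source 0 and sink 3.
EXAMPLE_EDGES = [(0, 1, 2), (0, 2, 1), (1, 2, 1), (1, 3, 1), (2, 3, 2)]
```

Calling `max_flow(4, EXAMPLE_EDGES, 0, 3)` and `min_cut_brute_force(4, EXAMPLE_EDGES, 0, 3)` returns the same value for this graph. The brute-force cut enumeration is exponential in the number of nodes and is only meant as a cross-check.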
In anonymous reputation systems, after an interaction between anonymous users, one of the users evaluates the peer by giving a rating. Ratings for a user are accumulated to form the reputation of the user. By using the reputation, we can judge the reliability of an anonymous user. Anonymous reputation systems based on an anonymous e-cash scheme have been proposed previously. However, in the e-cash-based systems, the bank learns the accumulated reputations of all users and the fluctuation of those reputations, both of which are private information for users. Furthermore, a timing attack using the deposit times is possible, which weakens the anonymity. In this paper, we propose an anonymous reputation system in which the reputations of users are kept secret even from the reputation manager, such as the bank. Our approach is to adopt an anonymous credential certifying the accumulated reputation of a user. Initially, a user registers with the reputation manager and is issued an initial certificate. After each interaction with a rater, the user, as the ratee, obtains an updated certificate certifying the sum of the previous reputation and the current rating. The update protocol is based on zero-knowledge proofs, and thus the reputations remain secret from the reputation manager. On the other hand, due to the certificate, the user cannot maliciously alter his reputation.
Yasuyuki NOGAMI Kazuki TADA Satoshi UEHARA
Let p be an odd characteristic and m be the degree of a primitive polynomial f(x) over the prime field F_p. Let ω be a zero of f(x), that is, a primitive element of F*_{p^m}. Then the sequence S={s_i}, s_i=Tr(ω^i) for i=0,1,2,…, is a non-binary maximum length sequence, where Tr(·) is the trace function over F_p. Based on this fact, this paper proposes to binarize the sequence by using the Legendre symbol. The result is a class of geometric sequences, but its properties such as the period, autocorrelation, and linear complexity have not been discussed. This paper shows that the generated binary sequence (a geometric sequence by Legendre symbol) has the period n=2(p^m-1)/(p-1) and a typical periodic autocorrelation. Moreover, it is experimentally observed that its linear complexity attains the maximum, that is, the period n. In particular, for the case m=2, the maximum linear complexity is proven theoretically. Finally, this paper demonstrates these properties with a small example.
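The period claim can be checked on a small case. The sketch below is an illustration under stated assumptions rather than the paper's procedure: it handles only m=2, generates the trace sequence through its linear recurrence instead of explicit extension-field arithmetic, uses f(x)=x^2+x+2 over F_5 as a hypothetical example polynomial, and maps Legendre symbol -1 to bit 1 and symbols 0 and +1 to bit 0 (the mapping of 0 is an assumption; the abstract does not state it).

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p via Euler's criterion: 0, 1, or -1."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def trace_sequence(p, c1, c0, length):
    """s_i = Tr(w^i) for a root w of f(x) = x^2 + c1*x + c0 over F_p (case m = 2).

    The trace sequence obeys s_{i+2} = -c1*s_{i+1} - c0*s_i,
    with s_0 = Tr(1) = 2 and s_1 = Tr(w) = -c1.
    """
    s = [2 % p, (-c1) % p]
    while len(s) < length:
        s.append((-c1 * s[-1] - c0 * s[-2]) % p)
    return s[:length]


def binarize(s, p):
    # Assumed mapping: Legendre symbol -1 -> 1, symbols 0 and +1 -> 0.
    return [1 if legendre(x, p) == -1 else 0 for x in s]


def minimal_period(seq, max_period):
    """Smallest T <= max_period with seq[i] == seq[i + T] wherever both are defined."""
    for t in range(1, max_period + 1):
        if all(seq[i] == seq[i + t] for i in range(len(seq) - t)):
            return t
    return None


p, m = 5, 2
s = trace_sequence(p, 1, 2, 2 * (p**m - 1))  # two full periods of the non-binary sequence
b = binarize(s, p)
```

For this example, `minimal_period(s, 24)` gives the period p^m-1 of the non-binary sequence, while `minimal_period(b, 24)` gives 2(p^m-1)/(p-1).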
Takafumi HAYASHI Takao MAEDA Shigeru KANEMOTO Shinya MATSUFUJI
The present paper introduces a novel method for the construction of sequences that have a zero-correlation zone. For the proposed sequence set, both the cross-correlation function and the side lobe of the autocorrelation function are zero for phase shifts within the zero-correlation zone. The proposed scheme can generate a set of sequences, each of length 16n^2, from an arbitrary Hadamard matrix of order n and a set of 4n trigonometric function sequences of length 2n. The proposed construction can generate an optimal sequence set that satisfies, for a given zero-correlation zone and sequence period, the theoretical bound on the number of members. The peak factor of the proposed sequence set is equal to √2.
Nozomi MIYA Tota SUKO Goki YASUDA Toshiyasu MATSUSHIMA
In this paper, sequential prediction is studied. Two assumptions about the probabilistic model are typical in sequential prediction. In one, a certain probabilistic model is given and its parameters are unknown. In the other, not a single probabilistic model but a class of probabilistic models is given, and the parameters are unknown. If there exist parameters and models such that the distributions they identify equal the source distribution, the assumed model or class of models can represent the source distribution. In this case, the specifiable condition is said to be satisfied. In this study, the decision based on the Bayesian principle is made for a class of probabilistic models (not for a single probabilistic model), and the case in which the specifiable condition is not satisfied is studied. The asymptotic behaviors of the cumulative logarithmic loss for an individual sequence, in the sense of almost sure convergence, and of the expected loss, i.e., the redundancy, are analyzed, and the constant terms of the asymptotic equations are identified.
N-Shift Regional Low Correlation (NS-RLC) sequences have low correlation values only at N-shift positions. In particular, N-Shift Regional Zero Correlation (NS-RZC) sequences have zero correlation values at N-shift positions. In this letter, an algorithm is proposed for generating NS-RLC/RZC sequences from a Three Low Correlation Zones (T-LCZ) sequence set and a Three Zero Correlation Zones (T-ZCZ) sequence set. To highlight the relationship between these sequences, the corresponding theoretical bound is calculated and analyzed.
DetF (Detect-and-Forward) is studied as a relay method in multi-hop networks. When an error detection scheme is introduced, DetF is likely to achieve efficient transmission. This study focuses on the AMI (Alternate Mark Inversion) code as an error detection scheme. The error detection performance of ternary PSK (Phase Shift Keying) using the AMI code and that of binary PSK using a parity check code are examined. It is shown that ternary PSK using the AMI code has good error detection performance.
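The error detection property of the AMI code comes from its alternation rule: successive marks must have opposite polarity, so two consecutive marks of the same polarity reveal an error. The sketch below illustrates this rule only; it does not model PSK modulation, the channel, or the relay, and the function names are assumptions.

```python
def ami_encode(bits):
    """Map 0 -> 0 and each 1 -> alternately +1 and -1 (the first mark is +1)."""
    out, polarity = [], 1
    for b in bits:
        if b:
            out.append(polarity)
            polarity = -polarity
        else:
            out.append(0)
    return out


def ami_violation(symbols):
    """True if two consecutive nonzero symbols share a polarity (a bipolar violation)."""
    last = 0
    for s in symbols:
        if s != 0:
            if s == last:
                return True
            last = s
    return False
```

A relay could forward a block only when `ami_violation` returns `False`. This check does not catch every error; for example, a corrupted symbol after the final mark can leave the alternation intact.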
Michitarou YABUUCHI Ryo KISHIDA Kazutoshi KOBAYASHI
We analyze the correlation between BTI (Bias Temperature Instability)-induced degradations and process variations. These reliability issues are correlated. BTI is one of the most significant aging degradations in LSIs. Threshold voltages of MOSFETs increase with time when biases stress their gates. BTI has a strong effect on highly scaled LSIs, as do process variations. Accurate prediction of the combined effects is indispensable. We should analyze both aging degradations and process variations of MOSFETs to explain the correlation. We measure frequencies of ROs (Ring Oscillators) of 65-nm process test circuits on two types of LSIs, ASICs and FPGAs. There are 98 and 837 ROs on our ASICs and FPGAs, respectively. The frequencies of the ROs follow Gaussian distributions. We describe the highest frequency group as the “fast” condition, the average group as the “typical” condition, and the lowest group as the “slow” condition. We measure the aging degradations of the ROs of the three conditions in an accelerated test. The degradations can be approximated by a logarithmic function of stress time. The degradation at the “fast” condition has a higher impact on the frequency than that at the “slow” one. The correlation coefficient is 0.338. In this case, we can define a smaller design margin for BTI-induced degradations than would be needed without considering the correlation, because the degradation at the “slow” condition is smaller than that at the “typical” and “fast” conditions.
Daisuke FUKUDA Kenichi WATANABE Naoki IDANI Yuji KANAZAWA Masanori HASHIMOTO
As VLSI process nodes continue to shrink, the chemical mechanical planarization (CMP) process for copper interconnect has become an essential technique for enabling many-layer interconnection. Recently, Edge-over-Erosion error (EoE-error), which originates from overpolishing and can cause yield loss, has been observed in various CMP processes, while its mechanism is still unclear. To predict these errors, we propose an EoE-error prediction method that exploits machine learning algorithms. The proposed method consists of (1) an error analysis stage, (2) a layout parameter extraction stage, (3) a model construction stage, and (4) a prediction stage. In the error analysis and parameter extraction stages, we analyze test chips and identify layout parameters that have an impact on the EoE phenomenon. In the model construction stage, we construct a prediction model using the proposed multi-level machine learning method, and in the prediction stage we make predictions for designed layouts. Experimental results show that the proposed method attained a 2.7 to 19.2% accuracy improvement in EoE-error prediction and a 0.8 to 10.1% improvement in non-EoE-error prediction compared with general machine learning methods. The proposed method makes it possible to prevent unexpected yield loss by recognizing EoE-errors before manufacturing.
Hirofumi SHIMIZU Hiromitsu AWANO Masayuki HIROMOTO Takashi SATO
The modeling of random telegraph noise (RTN) of MOS transistors is becoming increasingly important. In this paper, a novel method is proposed for automated estimation of two important RTN-model parameters: the number of interface states and the corresponding threshold voltage shift. The proposed method utilizes a Gaussian mixture model (GMM) to represent the voltage distributions, and estimates their parameters using the expectation-maximization (EM) algorithm. Using information criteria, the optimal estimate is obtained automatically while avoiding overfitting. In addition, we use a shared variance for all the Gaussian components in the GMM to deal with the noise in RTN signals. The proposed method improves estimation accuracy when large measurement noise is present.
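The idea of a GMM whose components share one variance can be sketched with a plain EM loop on one-dimensional data. This is a minimal illustration, not the authors' estimator: the function name, the initialization, the fixed iteration count, and the fixed number of components k are assumptions, and model selection by information criteria is left out.

```python
import math


def em_shared_variance(data, k, iterations=100):
    """EM for a 1-D Gaussian mixture whose k >= 2 components share one variance."""
    n = len(data)
    lo, hi = min(data), max(data)
    # Assumed initialization: means spread evenly over the data range.
    means = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    weights = [1.0 / k] * k
    mean_all = sum(data) / n
    var = sum((x - mean_all) ** 2 for x in data) / n
    for _ in range(iterations):
        # E-step: responsibilities. The Gaussian normalizing constant is the
        # same for every component (shared variance), so it cancels.
        resp = []
        for x in data:
            p = [w * math.exp(-((x - mu) ** 2) / (2 * var)) for w, mu in zip(weights, means)]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: per-component weights and means, one pooled variance.
        nk = [sum(r[j] for r in resp) for j in range(k)]
        weights = [v / n for v in nk]
        means = [sum(r[j] * x for r, x in zip(resp, data)) / nk[j] for j in range(k)]
        var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data) for j in range(k)) / n
    return weights, means, var


# Hypothetical two-level signal: samples scattered around 0.0 and 1.0.
SAMPLES = [-0.1, 0.0, 0.1, 0.9, 1.0, 1.1]
```

Running `em_shared_variance(SAMPLES, 2)` recovers two means near the two signal levels and a single pooled variance. Tying the variance reflects the assumption that the same measurement noise is added to every level.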
Takehiko AMAKI Masanori HASHIMOTO Takao ONOYE
This paper presents an oscillator-based true random number generator (TRNG) that dynamically unbiases 0/1 probability. The proposed TRNG automatically adjusts the duty cycle of a fast oscillator to 50%, and generates unbiased random numbers tolerating process variation and dynamic temperature fluctuation. A prototype chip of the proposed TRNG was fabricated with a 65nm CMOS process. Measurement results show that the developed duty cycle monitor obtained the probability of ‘1’ 4,100 times faster than the conventional output bit observation, or estimated the probability with 70 times higher accuracy. The proposed TRNG adjusted the probability of ‘1’ to within 50±0.07% in five chips in the temperature range of 0°C to 75°C. Consequently, the proposed TRNG passed the NIST and DIEHARD tests at 7.5Mbps with an area of 6,670 µm^2.
James LIN Masaya MIYAHARA Akira MATSUZAWA
This paper proposes an ultra-low-voltage, wide signal swing, and clock-scalable differential dynamic amplifier using a common-mode voltage detection technique. The essential characteristics of an amplifier, such as gain, linearity, power consumption, noise, etc., are analyzed. In measurement, the proposed dynamic amplifier achieves a 13dB gain with less than 1dB drop over a differential output signal swing of 340mVpp with a supply voltage of 0.5V. The attained maximum operating frequency is 700MHz. With a 0.7V supply, the gain increases to 16dB with a signal swing of 700mVpp. The prototype amplifier is fabricated in 90nm CMOS technology with the low threshold voltage and the deep N-well options.
Yohei UMEKI Koji YANAGIDA Shusuke YOSHIMOTO Shintaro IZUMI Masahiko YOSHIMOTO Hiroshi KAWAGUCHI Koji TSUNODA Toshihiro SUGII
This paper reports a 65nm 8Mb spin transfer torque magnetoresistance random access memory (STT-MRAM) operating at a single supply voltage with a process-variation-tolerant sense amplifier. The proposed sense amplifier comprises a boosted-gate nMOS and negative-resistance pMOSs as loads, which maximizes the readout margin at any process corner. The STT-MRAM achieves a cycle time of 1.9µs (=0.526MHz) at 0.38V. The operating power is 1.70µW at this voltage. The minimum energy per access is 1.12 pJ/bit when the supply voltage is 0.44V. The proposed STT-MRAM operates at a lower energy than an SRAM when the utilization of the memory bandwidth is 14% or less.
Yiqiang SHENG Atsushi TAKAHASHI
This paper investigates a novel high-performance heuristic algorithm for NP-hard problems, named the relay-race algorithm (RRA), which was proposed to approach a global optimal solution by exploring similar local optimal solutions more efficiently within a shorter runtime. RRA includes three basic parts: rough search, focusing search, and relay. The rough search is designed to get over small hills in the solution space and to approach a local optimal solution as fast as possible. The focusing search is designed to get as close as possible to the local optimal solution. The relay escapes from the local optimal solution in only one step while maintaining search continuity. As a typical application, the multi-objective placement problem in physical design optimization is solved by the proposed RRA. Experiments confirm that the computational performance is considerably improved. RRA achieves an overall Pareto improvement of two conflicting objectives: power consumption and maximal delay. RRA has the potential to improve existing search methods for harder problems.
Tsutomu SASAO Yuta URANO Yukihiro IGUCHI
This paper shows a method to find a linear transformation that reduces the number of variables to represent a given incompletely specified index generation function. It first generates the difference matrix, and then finds a minimal set of variables using a covering table. Linear transformations are used to modify the covering table to produce a smaller solution. Reduction of the difference matrix is also considered.
Mika FUJISHIRO Masao YANAGISAWA Nozomu TOGAWA
The LED (Light Encryption Device) block cipher, one of the lightweight block ciphers, is very compact in hardware. Its encryption process is composed of AES-like rounds. Recently, a scan-based side-channel attack has been reported that retrieves the secret information inside a cryptosystem by utilizing scan chains, one of the design-for-test techniques. In this paper, a scan-based attack method on the LED block cipher using scan signatures is proposed. In the proposed method, we focus on a particular 16-bit position in the scanned data obtained from an LED LSI chip and retrieve its secret key using scan signatures. Experimental results show that the proposed method successfully retrieves the 64-bit secret key using 36 plaintexts on average if the scan chain is connected only to the LED block cipher. These experimental results also show that the key is successfully retrieved even if the scan chain includes an additional 130,000 bits of data.
Hayato MASHIKO Yukihide KOHIRA
Due to the progress of LSI process technology, the yield of LSI chips is reduced by timing violations caused by delay variations. To recover from timing violations, delay tuning methods insert programmable delay elements (PDEs) into the clock tree before fabrication and tune their delays after fabrication. The yield improvement of existing methods is not sufficient. In this paper, a delay tuning method for PDEs with an ordered finite set of delays is proposed for yield improvement. The proposed delay tuning method is based on a modified Bellman-Ford algorithm. Therefore, its optimality is guaranteed and its time complexity is polynomial. In experiments using Monte-Carlo simulation, the yield achieved by the proposed method improves as the number of delays in each PDE increases.
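For orientation, the textbook Bellman-Ford algorithm can decide whether a system of difference constraints x[v] - x[u] <= w has a solution, which is the form in which setup and hold timing constraints on clock arrival times are commonly written. The sketch below shows only that textbook version over continuous values; it is not the modified algorithm of the paper, it does not restrict delays to a finite ordered set, and the function name and constraint encoding are assumptions.

```python
def solve_difference_constraints(num_vars, constraints):
    """Find x with x[v] - x[u] <= w for every (u, v, w), or return None if infeasible.

    Textbook Bellman-Ford from a virtual source joined to every variable
    by a zero-weight edge.
    """
    dist = [0] * num_vars  # equivalent to relaxing the virtual-source edges once
    for _ in range(num_vars):
        changed = False
        for u, v, w in constraints:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return dist  # every constraint is satisfied by dist
    return None  # still changing after num_vars passes: a negative cycle exists
```

For example, the constraints `[(0, 1, 3), (1, 2, -2), (2, 0, 0)]` are feasible, while `[(0, 1, -1), (1, 0, 0)]` form a negative cycle and admit no assignment.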
The index generation function is a multi-valued logic function that checks whether a given input vector is registered or not, and returns its index value if the vector is registered. If the latency of the operation is critical, dedicated hardware is used to implement index generation functions. This paper proposes a method for implementing index generation functions using a parallel index generator. A novel and efficient algorithm called ‘conflict free partitioning’ is proposed to synthesize parallel index generators. Experimental results show that the proposed method outperforms existing methods. In addition, a novel index generator architecture suitable for parallelized implementation is introduced. The new architecture has advantages in both area and delay.
Yukihide KOHIRA Atsushi TAKAHASHI
Multi-domain clock skew scheduling in the general-synchronous framework is an effective technique for improving the performance of sequential circuits using a practical clock distribution network. Although the upper bound on the performance of a circuit increases as the number of clock domains increases in multi-domain clock skew scheduling, the improvement in performance becomes smaller while the cost of the clock distribution network increases considerably. In this paper, a linear time algorithm that finds an optimum two-domain clock skew schedule in the general-synchronous framework is proposed. Experimental results on ISCAS89 benchmark circuits and artificial data show that optimum circuits are obtained efficiently by our method in a short time.
Heming SUN Dajiang ZHOU Peilin LIU Satoshi GOTO
In this paper, we present an area-efficient 4/8/16/32-point inverse discrete cosine transform (IDCT) architecture for an HEVC decoder. Compared with previous work, this work reduces the hardware cost in two ways. First, we reduce the logic cost of the 1D IDCT by proposing a reordered parallel-in serial-out (RPISO) scheme. By using the RPISO scheme, we can reduce the required calculations for butterfly inputs in each cycle. Second, we reduce the area of the transpose architecture by proposing a cyclic data mapping scheme that can achieve 100% I/O utilization of each SRAM. To design a fully pipelined 2D IDCT architecture, we propose a pipelining schedule for the row and column transforms. The results show that the area normalized by maximum throughput for the logic part of the IDCT can be reduced by 25%, and the memory area can be reduced by 62%. The maximum throughput reaches 1248 Mpixels/s, which can support real-time decoding of a 4K × 2K 60fps video sequence.
Hideki TAKASE Gang ZENG Lovic GAUTHIER Hirotaka KAWASHIMA Noritoshi ATSUMI Tomohiro TATEMATSU Yoshitake KOBAYASHI Takenori KOSHIRO Tohru ISHIHARA Hiroyuki TOMIYAMA Hiroaki TAKADA
This paper presents a framework for reducing the energy consumption of embedded real-time systems. We implemented the presented framework as both an optimization toolchain and an energy-aware real-time operating system. The framework consists of the integration of multiple techniques to optimize the energy consumption. The main idea behind our approach is to utilize trade-offs between the energy consumption and the performance of different processor configurations during task checkpoints, and to maintain memory allocation during task context switches. In our framework, a target application is statically analyzed at both intra-task and inter-task levels. Based on these analyzed results, runtime optimization is performed in response to the behavior of the application. A case study shows that our toolchain and real-time operating systems have achieved energy reduction while satisfying the real-time performance. The toolchain has also been successfully applied to a practical application.
Jiayi ZHU Dajiang ZHOU Shinji KIMURA Satoshi GOTO
High efficiency video coding (HEVC) is the new generation video compression standard. Sample adaptive offset (SAO) is a new compression tool adopted in HEVC that reduces the distortion between original samples and reconstructed samples. SAO estimation is the process of determining SAO parameters in video encoding. It is divided into two phases: statistics collection and parameter determination. There are two difficulties in the VLSI implementation of SAO estimation. The first is that a huge number of samples must be processed in the statistics collection phase. The second is that the complexity of Rate Distortion Optimization (RDO) in the parameter determination phase is very high. In this article, a fast SAO estimation algorithm and its corresponding VLSI architecture are proposed. For the first difficulty, we use bitmaps to collect statistics of all the 16 samples in one 4×4 block simultaneously. For the second difficulty, we simplify a series of complicated procedures in HM to balance the algorithm's complexity and BD-rate performance. Experimental results show that the proposed algorithm maintains the picture quality improvement. The VLSI design based on this algorithm can be implemented using 156.32K gates and 8,832 bits of single-port RAM for the 8-bit depth case. It can be synthesized at 400MHz in 65nm technology and is capable of 8K×4K @ 120fps encoding.
Akihiro SUDA Hideki TAKASE Kazuyoshi TAKAGI Naofumi TAKAGI
We propose a method for synthesizing nested loops into parallelized circuits by integrating polyhedral optimization, a state-of-the-art technique in the field of software, into high-level synthesis. Our method constructs circuits equipped with multiple processing elements (PEs), using information generated by a polyhedral optimizing compiler. Since multiple PEs cannot concurrently access the off-chip RAM, a method for constructing on-chip buffers is also proposed. Our buffering method reduces off-chip RAM access conflicts and further enables burst accesses and data reuse. In our experiments, the buffered circuits generated by our method are 8.2 times faster on average, and up to 26.5 times faster, than the sequential non-buffered ones when each of the parallelized circuits is configured with eight PEs.
Hiroaki YOSHIDA Masayuki WAKIZAKA Shigeru YAMASHITA Masahiro FUJITA
With shorter time-to-market and rising costs in SoC development, the demand for post-silicon programmability has been increasing. Recently, programmable accelerators have attracted more attention as an enabling solution for post-silicon engineering change. However, programmable accelerators suffer from 5 to 10X lower energy efficiency than fixed-function accelerators, mainly due to their extensive use of memories. This paper proposes a highly energy-efficient accelerator that enables post-silicon engineering change by a control patching mechanism. We then propose a patch compilation method that takes a given pair of an original design and a modified design. We also propose a design method that adds redundant wires in advance to decrease the amount of patch memory necessary for post-silicon engineering change. Experimental results demonstrate that the proposed accelerators offer high energy efficiency competitive with fixed-function accelerators and can achieve about 5X higher efficiency than existing programmable accelerators. We also show the trade-off between redundant wires and the necessary amount of patch memory.
Hiroaki KONOURA Dawood ALNAJJAR Yukio MITSUYAMA Hajime SHIMADA Kazutoshi KOBAYASHI Hiroyuki KANBARA Hiroyuki OCHI Takashi IMAGAWA Kazutoshi WAKABAYASHI Masanori HASHIMOTO Takao ONOYE Hidetoshi ONODERA
This paper proposes a mixed-grained reconfigurable architecture consisting of fine-grained and coarse-grained fabrics, each of which can be configured for different levels of reliability depending on the reliability requirement of the target application, from mission-critical applications to consumer products. Thanks to the fine-grained fabrics, the architecture can accommodate a state machine, which is indispensable for exploiting C-based behavioral synthesis to trade latency against resource usage through multi-step processing using dynamic reconfiguration. In implementing the architecture, the strategy of dynamic reconfiguration, the assignment of configuration storage, and the number of implementable states are key factors that determine the achievable trade-off between silicon area and latency. We thus split the configuration bits into two classes, state-wise configuration bits and state-invariant configuration bits, to minimize the area overhead of configuration bit storage. Through a case study, we experimentally explore the appropriate number of implementable states. A proof-of-concept VLSI chip was fabricated in a 65nm process. Measurement results show that applications on the chip can operate in a harsh radiation environment. Irradiation tests also show the correlation between the number of sensitive bits and the mean time to failure. Furthermore, the temporal error rate of an example application due to soft errors in the datapath was measured and demonstrated for reliability-aware mapping.
While Triple Modular Redundancy (TMR) is effective in eliminating soft errors in LSIs, the overhead of the triplicated area as well as the triplicated energy consumption is a problem. In addition to the spatial TMR mode, where executions are simply triplicated and the majority is taken, a temporal TMR mode is available, where only two copies of an operation are executed and the results are compared; if the results differ, the third copy is executed to get the correct result. Appropriately selecting the power supply voltage is also an effective technique for reducing the energy consumption. In this paper, a method to derive a TMR design is proposed that selects the TMR mode and supply voltage for each operation to minimize the energy consumption within the time and area constraints.
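The difference between the two TMR modes can be sketched in software by counting how many copies of an operation run. This is an illustration of the two modes as described in the abstract, not the proposed design method: supply voltage selection and the time and area constraints are not modeled, and `make_op` is a hypothetical fault-injection helper that corrupts the result of one chosen call.

```python
def majority(a, b, c):
    """Majority vote of three results (assumes at most one of them is faulty)."""
    return a if a == b or a == c else b


def spatial_tmr(op, x):
    """Spatial TMR: always run three copies and vote. Returns (result, executions)."""
    return majority(op(x), op(x), op(x)), 3


def temporal_tmr(op, x):
    """Temporal TMR: run two copies, and a third only when they disagree."""
    r1, r2 = op(x), op(x)
    if r1 == r2:
        return r1, 2
    return majority(r1, r2, op(x)), 3


def make_op(faulty_call=None):
    """Hypothetical squaring operation whose faulty_call-th execution is corrupted."""
    calls = {"n": 0}

    def op(x):
        calls["n"] += 1
        result = x * x
        return result + 1 if calls["n"] == faulty_call else result

    return op
```

In the fault-free case `temporal_tmr` executes the operation twice instead of three times, which is where its energy saving comes from; the price is the extra time needed for the third execution whenever the first two results differ.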
We consider single and multiple attacker scenarios in guessing and obtain bounds on various success parameters in terms of Rényi entropies. We also obtain a new derivation of the union bound.
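As a small illustration of how guessing relates to Rényi entropy, the following Python sketch computes the expected number of guesses of a single attacker who guesses in order of decreasing probability, and compares it with exp(H_{1/2}), a classical upper bound for this setting. The example distribution is made up, and this bound is not necessarily one of the bounds derived in the paper.

```python
import math


def expected_guesses(probs):
    """Expected number of guesses when values are tried in decreasing probability."""
    ordered = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (alpha != 1), in nats."""
    if alpha == 1:
        raise ValueError("order 1 is the Shannon limit and is not handled here")
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)


probs = [0.5, 0.25, 0.125, 0.125]            # hypothetical distribution
guesses = expected_guesses(probs)            # 1*0.5 + 2*0.25 + 3*0.125 + 4*0.125
upper = math.exp(renyi_entropy(probs, 0.5))  # equals (sum of sqrt(p))**2
```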
A generalized chirp-like (GCL) sequence of period N is constructed by modulating a Zadoff-Chu sequence of period N with an arbitrary unimodular sequence of period m, where m divides N. Under some specific conditions, the cross-correlations between two GCL sequences are shown to have exactly the same magnitudes as those of their corresponding Zadoff-Chu sequences regardless of the employed unimodular sequences. In this paper, we first investigate the sufficient conditions under which such a relation holds. We then use them to construct a new class of optimal zero-correlation zone (ZCZ) sequence sets which can be considered to be an extension of the so-called GCL-ZCZ sequence sets.
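The modulation described above can be sketched in Python as follows. The sketch assumes the common Zadoff-Chu definition with a root index u coprime to N, and it picks N = s·m² (here N = 9, m = 3), the condition under which GCL sequences are usually stated to have ideal periodic autocorrelation; the phases of the modulating sequence `b` are arbitrary illustrative values. It checks autocorrelation only and does not reproduce the cross-correlation conditions studied in the paper.

```python
import cmath
import math


def zadoff_chu(N, u):
    """Zadoff-Chu sequence of period N with root index u (gcd(u, N) must be 1)."""
    if math.gcd(u, N) != 1:
        raise ValueError("u must be coprime to N")
    if N % 2:
        return [cmath.exp(-1j * math.pi * u * k * (k + 1) / N) for k in range(N)]
    return [cmath.exp(-1j * math.pi * u * k * k / N) for k in range(N)]


def gcl(N, u, b):
    """Modulate a Zadoff-Chu sequence with a unimodular sequence b of period m | N."""
    m = len(b)
    if N % m:
        raise ValueError("m must divide N")
    a = zadoff_chu(N, u)
    return [a[k] * b[k % m] for k in range(N)]


def periodic_autocorrelation(s, shift):
    """Periodic autocorrelation of s at the given cyclic shift."""
    N = len(s)
    return sum(s[k] * s[(k + shift) % N].conjugate() for k in range(N))


b = [cmath.exp(1j * phi) for phi in (0.3, 1.7, -2.1)]  # arbitrary unimodular values
seq = gcl(9, 1, b)
peak = abs(periodic_autocorrelation(seq, 0))
sidelobes = [abs(periodic_autocorrelation(seq, t)) for t in range(1, 9)]
```

For these parameters the out-of-phase values in `sidelobes` vanish up to floating-point rounding, independently of the phases chosen for `b`.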
Longye WANG Xiaoli ZENG Hong WEN
An asymmetric zero correlation zone (A-ZCZ) sequence set is a type of ZCZ sequence set that consists of multiple sequence subsets. Its most important property is that the cross-correlation function between arbitrary sequences belonging to different sequence subsets has quite a large zero-cross-correlation zone (ZCCZ). The proposed A-ZCZ sequence sets can be constructed, based on an interleaving technique and an orthogonality-preserving transformation, from any perfect sequence of length P=Nq(2k+1) and Hadamard matrices of order T≥2, where N≥1, q≥1 and k≥1. If q=1, the novel sequence set is an optimal ZCZ sequence set, which has parameters (TP,TN,2k+1) for all positive integers P=N(2k+1). The proposed A-ZCZ sequence sets have a much larger ZCCZ and are expected to be useful for designing spreading sequences for QS-CDMA systems.
Young-Tae KIM Min Kyu SONG Dae San KIM Hong-Yeop SONG
In this paper, we show that if the d-decimation of a (q-1)-ary Sidelnikov sequence of period q-1=p^m-1 is the d-multiple of the same Sidelnikov sequence, then d must be a power of a prime p. Also, we calculate the crosscorrelation magnitude between some constant multiples of d- and d'-decimations of a Sidelnikov sequence of period q-1 to be upper bounded by (d+d'-1)√q+3.
In this paper, with a modification of our earlier construction in [12], new classes of optimal LHZ FHS sets with new parameters are obtained which are optimal in the sense that their parameters meet the Peng-Fan-Lee bound. It is shown that all the sequences in the proposed FHS sets are shift distinct. The proposed FHS sets are suitable for quasi-synchronous time/frequency hopping code division multiple access systems to eliminate multiple-access interference.
Yusuke TAKAMARU Sachin RAI Hiromasa HABUCHI
A code shift keying (CSK) scheme using pseudo-noise (PN) codes for optical wireless communications with intensity modulation and direct detection (IM/DD) is considered. Since CSK has several PN codes, the data transmission rate and the bit error rate (BER) performance can be improved by increasing the number of PN codes. However, the conventional optical PN codes are not suitable for optical CSK with IM/DD because the ratio of the number of PN codes to the code length of the PN code, M/L, is smaller than 1/√L. In this paper, an optical CSK with a new PN code, which combines the generalized modified prime sequence code (GMPSC) and the Hadamard code, is analyzed. The new PN code can achieve M/L=1. Moreover, the BER performance and the data transmission rate of the CSK system with the new PN code are evaluated through theoretical analysis by taking the scintillation, background noise, avalanche photodiode (APD) noise, thermal noise, and signal-dependent noise into account. It is found that the CSK system with the new PN code outperforms the conventional optical CSK system.
Kyohei SUMIKAWA Hiromasa HABUCHI
In this paper, the low density generator matrix (LDGM) coded scheme with unequal transmission power allocation (UTPA) in an optical wireless channel is evaluated by computer simulation. In particular, the bit error rate (BER) performance of LDGM-coded binary pulse position modulation (LDGM-BPPM) with the UTPA scheme is investigated in the presence of avalanche photodiode (APD) noise, scintillation and background noise. The results show that the BER performance of the LDGM-BPPM with UTPA is better than that of the conventional LDGM-BPPM. It is found that an optimum power ratio (R) exists and that it varies with scintillation and background noise. For example, when the average received laser power is -47dBm, the variance of scintillation is 0.1, and the background noise is -45dBm, the optimum R is 3.1. Thus, the LDGM-BPPM with the UTPA scheme is superior to the conventional LDGM-BPPM system.
Yuta IDA Chang-Jun AHN Takahiro MATSUMOTO Shinya MATSUFUJI
To achieve higher-speed and higher-quality wireless communication systems, orthogonal frequency division multiple access (OFDMA) has been proposed. Moreover, OFDMA exploiting multiuser diversity (MUDiv) has also been proposed to achieve higher system performance. On the other hand, the conventional MUDiv/OFDMA requires high complexity to select the subcarriers of each user. To solve this problem, we have proposed a MUDiv/OFDMA based on the low granularity block (LGB). However, it degrades the system performance in environments that contain many deeply faded subcarrier channels. Therefore, in this paper, we propose a cooperative LGB-MUDiv/OFDMA to mitigate the influence of the deeply faded subcarrier channels.
Takahiro MATSUMOTO Hideyuki TORII Yuta IDA Shinya MATSUFUJI
In this paper, we propose a new structure for a compact matched filter bank for a mutually orthogonal zero-correlation zone (MO-ZCZ) sequence set consisting of ternary sequence pairs obtained by Hadamard and binary ZCZ sequence sets; this construction reduces the number of two-input adders and delay elements. The matched filter banks are implemented on a field-programmable gate array (FPGA) with 51,840 logic elements (LEs). The proposed matched filter bank for an MO-ZCZ sequence set of length 160 can be constructed by a circuit size that is about 8.6% that of a conventional matched filter bank.
Shunsuke YAMAKI Masahide ABE Masayuki KAWAMATA
This paper proposes a statistical analysis of phase-only correlation functions based on linear statistics and directional statistics. We derive the expectation and variance of the phase-only correlation functions, assuming the phase-spectrum differences of two input signals to be random variables. We first assume linear probability distributions for the phase-spectrum differences. We next assume circular probability distributions for the phase-spectrum differences, considering phase-spectrum differences to be circular data. As a result, we can simply express the expectation and variance of phase-only correlation functions as linear and quadratic functions of the circular variance of phase-spectrum differences, respectively.
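For readers unfamiliar with the phase-only correlation (POC) function itself, a minimal Python sketch is given below. It uses a naive DFT to stay dependency-free, adopts one common sign convention (the normalized cross spectrum G·conj(F)), and uses a made-up test signal whose spectrum has no zero bins; none of these choices come from the paper, and the statistical analysis is not reproduced.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (inverse includes the 1/N factor)."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out


def phase_only_correlation(f, g):
    """Inverse DFT of the cross spectrum after discarding its magnitude."""
    F, G = dft(f), dft(g)
    cross = [Gk * Fk.conjugate() for Fk, Gk in zip(F, G)]
    # Keep only the phase; bins with (near-)zero magnitude are set to zero.
    normalized = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    return [v.real for v in dft(normalized, inverse=True)]


f = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]       # hypothetical test signal
shift = 3
g = [f[(n - shift) % len(f)] for n in range(len(f))]  # circularly shifted copy
poc = phase_only_correlation(f, g)
peak_index = poc.index(max(poc))
```

When the phase-spectrum difference is an exact linear phase, as here, the POC function is a unit impulse located at the shift; random perturbations of that phase difference lower and spread the peak, which is the effect the expectation and variance above quantify.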
Takashi YOSHIDA Yosuke SUGIURA Naoyuki AIKAWA
Maximally flat digital differentiators (MFDDs) are widely used in many applications. By using MFDDs, we obtain the derivative of an input signal with high accuracy around the center frequency of their flat property. Moreover, to avoid the influence of noise, it is desirable to attenuate the magnitude response of MFDDs except in the vicinity of the center frequency. In this paper, we introduce a design method for linear-phase FIR band-pass MFDDs with an arbitrary center frequency. The proposed transfer function for both TYPE III and TYPE IV filters can be obtained as a closed-form function using Jacobi polynomials. Furthermore, we can easily derive the weighting coefficients of the proposed MFDDs using a recursive formula. Through some design examples, we confirm that the proposed method can arbitrarily adjust the center frequency and the bandwidth over which the response is flat.
Takahide TERADA Hiroshi SHINODA
A two-dimensional (2D) wireless power transmission (WPT) system that handles a wide range of transmitted and received power is proposed and evaluated. A transmitter outputs the power to an arbitrary position on a 2D waveguide sheet by using a beam-forming technique. The 2D waveguide sheet does not require an absorber on its edge. The minimum propagation power on the sheet is increased 18 times by using the beam-forming technique. Power amplifier (PA) efficiency was improved from 19% to 46% when the output power was 10dB smaller than peak power due to the use of a PA supply-voltage and input power control method. Peak PA efficiency was 60%. A receiver inputs a wide range of power levels and drives various load impedances with a parallel rectifier. This rectifier enables a number of rectifying units to be tuned dynamically. The rectifier efficiency was improved 1.5 times while input power range was expanded by 6dB and the load-impedance range was expanded fourfold. The rectifier efficiency was 66-73% over an input power range of 18-36dBm at load impedances of 100 and 400Ω.
Jiunn-Tsair FANG Zong-Yi CHEN Chen-Cheng CHAN Pao-Chi CHANG
Rate control, which is required to regulate the bitrate of video coding, is critical to time-sensitive video applications used over networks. However, the H.264/AVC standard does not respond to scene changes, and this causes the transmission quality to deteriorate when a scene change occurs. In this work, a scene change is detected by comparing the ratio of the sum of absolute differences (SAD) between two consecutive frames. When a scene change is detected, the proposed method, which is modified from the reference software of H.264/AVC, re-assigns a quantization parameter (QP) value to regulate the bitrate. Because inter-prediction works poorly for the scene-changed frame, the proposed method estimates its frame complexity based on the content, and further creates another Q-R model to assign the QP. The adaptive rate control mechanism presented in this study can quickly respond to the heavy bitrate increment caused by a change of scene. Simulation results show that the proposed method improves the average peak signal-to-noise ratio (PSNR) by approximately 1.1dB, with a smaller buffer size, compared with the performance of the reference software JM version 17.2.
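The SAD-based detection step can be sketched in Python as follows. The abstract does not give the exact ratio or threshold, so this sketch assumes that a scene change is flagged when the SAD of the current frame pair exceeds a fixed multiple of the SAD of the previous pair; the function names, the threshold of 3.0 and the tiny 2x2 frames are all hypothetical, and the QP re-assignment is not modeled.

```python
def sad(frame_a, frame_b):
    """Sum of absolute differences between two equally sized frames (lists of rows)."""
    return sum(abs(x - y)
               for row_a, row_b in zip(frame_a, frame_b)
               for x, y in zip(row_a, row_b))


def detect_scene_changes(frames, ratio_threshold=3.0):
    """Return indices of frames whose SAD jumps relative to the previous frame pair."""
    changes = []
    previous_sad = None
    for i in range(1, len(frames)):
        current_sad = sad(frames[i - 1], frames[i])
        # max(..., 1) avoids a zero reference when consecutive frames are identical.
        if previous_sad is not None and current_sad > ratio_threshold * max(previous_sad, 1):
            changes.append(i)
        previous_sad = current_sad
    return changes


frames = [
    [[10, 10], [10, 10]],
    [[11, 10], [10, 11]],
    [[12, 10], [10, 11]],
    [[200, 190], [180, 210]],   # abrupt content change
    [[201, 190], [180, 210]],
]
scene_changes = detect_scene_changes(frames)
```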
In this paper, an image prior based on soft-morphological filters and its application to image recovery are presented. In morphological image processing, a gray-scale image is represented as a subset in a three-dimensional space, which is spanned by spatial and intensity axes. Morphological opening and closing, which are basic operations in morphological image processing, respectively approximate the image subset and its complementary images as the unions of structuring elements that are translated in the three-dimensional space. In this study, the opening and closing filters are applied to an image prior to resolve the regularization problem of image recovery. When the proposed image prior is applied, the image is recovered as an image that has no noise component, which is eliminated by the opening and closing. However, the closing and opening filters are less able to eliminate Gaussian noise. In order to improve the robustness against Gaussian noise, the closing and opening filters are respectively approximated as soft-closing and soft-opening with relaxed max and min functions. In image recovery experiments, image denoising and deblurring using the proposed prior are demonstrated. Comparisons of the proposed prior with the existing priors that impose a penalty on the gradient of the intensity are also shown.
Guobing QIAN Liping LI Hongshu LIAO
The maximization of non-Gaussianity is an effective approach to solving the complex independent component analysis (ICA) problem. However, the traditional complex maximization of non-Gaussianity (CMN) algorithm does not consider the influence of noise. In this letter, a modification of the fixed-point algorithm is proposed for the more practical case of the complex noisy ICA model. Simulations show that the proposed method demonstrates significantly improved performance over the traditional CMN algorithm in the noisy ICA model when the sample size is sufficient.
Keunsang LEE Younghyun BAEK Dongwook KIM Junil SOHN Youngcheol PARK
This paper presents an adaptive feedback canceller (AFC) based on a pseudo affine projection (PAP) algorithm that can provide fast and stable adaptation to a time-varying environment. The proposed algorithm utilizes adaptive linear prediction (LP) to obtain the LP coefficients of the input signal model and an inverse gain filter (IGF) to alleviate the effect of the compensation gain. As a result, when the input is modeled as an AR signal, the proposed algorithm satisfies the condition for having an almost unbiased estimate of the feedback path, and its performance is then relatively independent of the gain setting of hearing aids. Simulation results showed that the proposed algorithm is capable of obtaining unbiased feedback path estimates and high speech quality.
Bit-Na KWON Hyun-Jun SHIN Hyoung-Kyu SONG
In this letter, a cooperative scheme based on orthogonal frequency division multiplexing (OFDM) in a vehicular communication system is proposed. In the conventional scheme, a destination exploits only one base station to communicate information. The proposed scheme can use an extra source from another base station through a relay, since the power restrictions in a vehicle are less severe than those in a cellular device. If a destination is distant from a base station, the performance is degraded. In that case, the proposed scheme employing space time block code (STBC) and cyclic delay diversity (CDD) achieves better bit error rate (BER) performance and higher throughput than the conventional scheme.
Chen WU Yifeng ZHANG Yuhui SHI Li ZHAO Minghai XIN
Recently, the design of sparse finite impulse response (FIR) digital filters has attracted much attention due to its ability to reduce the implementation cost. However, finding a filter with the fewest nonzero coefficients subject to prescribed frequency domain constraints is a rather difficult problem because of its non-convexity. In this paper, an algorithm based on binary particle swarm optimization (BPSO) is proposed, which successively thins the filter coefficients until no sparser solution can be obtained. The proposed algorithm is evaluated on a set of examples, and it achieves better results than other existing algorithms.
Pranab KUMAR DHAR Tetsuya SHIMAMURA
This letter presents a new blind audio watermarking scheme using eigenvalue decomposition (EVD). Initially, the original audio is divided into frames and the samples of each frame are arranged into a square matrix. EVD is applied to each of these matrices. Watermark data is embedded into the largest eigenvalue of each diagonal matrix by quantization. Data embedding rate of the proposed scheme is 172.39bps. Simulation results confirm the imperceptibility of the proposed scheme and its higher robustness against various attacks compared to the state-of-the-art watermarking methods available in the literature.
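Embedding a bit into a scalar by quantization can be sketched in Python as below. The abstract does not specify the quantizer, so this sketch assumes a simple parity-based quantization index modulation applied to a single value standing in for the largest eigenvalue; the step size, the function names and the sample value are hypothetical, and the framing and eigenvalue decomposition steps are omitted.

```python
import math


def embed_bit(value, bit, step):
    """Move value to the centre of the nearest-above cell whose index parity equals bit."""
    index = math.floor(value / step)
    if index % 2 != bit:
        index += 1
    return index * step + step / 2


def extract_bit(value, step):
    """Blind extraction: only the step size is needed, not the original value."""
    return math.floor(value / step) % 2


eigenvalue = 37.3   # stand-in for the largest eigenvalue of one frame
step = 4.0          # assumed quantization step
marked_one = embed_bit(eigenvalue, 1, step)
marked_zero = embed_bit(eigenvalue, 0, step)
```

Because the marked value sits at the centre of a quantization cell, extraction in this sketch tolerates perturbations smaller than half the step size, which is the usual trade-off between robustness and imperceptibility when choosing the step.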
You Sung KANG Dong-Jo PARK Daniel W. ENGELS Dooho CHOI
We present a dynamic key generation method, KeyQ, for establishing shared secret keys in EPCglobal Generation 2 (Gen2) compliant systems. Widespread adoption of Gen2 technologies has increased the need for protecting communications in these systems. The highly constrained resources on Gen2 tags limit the usability of traditional key distribution techniques. Dynamic key generation provides a secure method to protect communications with limited key distribution requirements. Our KeyQ method dynamically generates fresh secret keys based on the Gen2 adaptive Q algorithm. We show that the KeyQ method generates fresh and unique secret keys that cannot be predicted with probability greater than 10^-250 when the number of tags exceeds 100.
Minghui YANG Dongdai LIN Minjia SHI
The stability theory of stream ciphers plays an important role in designing good stream cipher systems. Two algorithms are presented to determine the optimal shift and the minimum linear complexity of a sequence that differs from a given sequence over F_q with period q^n-1 by one digit. We also describe how the linear complexity changes when one digit of a given sequence is altered.
Hideki YOSHIKAWA Masahiro KAMINAGA Arimitsu SHIKODA Toshinori SUZUKI
A method of round addition attack on substitution-permutation network (SPN) block ciphers using differential fault analysis (DFA) is presented. For the 128-bit advanced encryption standard (AES), we show that secret keys can be extracted using one correct ciphertext and two faulty ciphertexts. Furthermore, we experimentally evaluate the success rate of a round addition DFA attack. The proposed method can also be applied to lightweight SPN block ciphers such as KLEIN and LED.
Misbehaving nodes intrinsic to the physical vulnerabilities of ad-hoc sensor networks pose a challenging constraint on the design of data fusion. To address this issue, a statistics-based reputation method for reliable data fusion is proposed in this study. Unlike traditional reputation methods that only compute the general reputation of a node, the proposed method, modeled by negative binomial reputation, consists of two separate reputation metrics: fusion reputation and sensing reputation. Fusion reputation is used to select data fusion points, and sensing reputation is used to weigh the data reported by sensor nodes to the fusion point. This method can therefore prevent a compromised node from covering its misbehavior in the process of sensing or fusion by behaving well in the fusion or sensing. To tackle unexpected events such as packet loss, a discounting factor is introduced into the proposed method. Additionally, Local Outlier Factor (LOF) based outlier detection is applied to evaluate the behavior of sensor nodes. Simulations show that the proposed method can enhance the reliability of data fusion and is more accurate than the general reputation method when applied in reputation evaluation.
Meng ZHANG Huihui BAI Meiqin LIU Anhong WANG Mengmeng ZHANG Yao ZHAO
As an ongoing video compression standard, High Efficiency Video Coding (HEVC) has achieved better rate distortion performance than H.264, but it also leads to enormous encoding complexity. In this paper, we propose a novel fast coding unit partition algorithm in the intra prediction of HEVC. Firstly, instead of the time-consuming rate distortion optimization for coding mode decision, just-noticeable-difference (JND) values can be exploited to partition the coding unit according to human visual system characteristics. Furthermore, coding bits in HEVC can also be considered as assisted information to refine the partition results. Compared with HEVC test model HM10.1, the experimental results show that the fast intra mode decision algorithm provides over 28% encoding time saving on average with comparable rate distortion performance.
The image sharing method is a secure way of protecting secret images. In 2011, Wang et al. proposed an image sharing method with verification. The idea of the method is to embed the secret and the watermark images into two shares by two equations to achieve the goal of secret sharing. However, the constructed shares are meaningless images, which are difficult to manage. The authors utilize the torus automorphism algorithm to increase the security of the shares. However, the torus automorphism algorithm takes much time to encrypt and decrypt an image. This paper proposes a friendly image sharing method to address the above problems. Experimental results show the significant efficiency of the proposed method.
Tung-chin LEE Young-cheol PARK Dae-hee YOUN
This paper proposes a method of improving the performance of blind reverberation time (RT) estimation in noisy environments. RT estimation is conducted using a maximum likelihood (ML) method based on the autocorrelation function of the linear predictive residual signal. To reduce the effect of environmental noise, a noise reduction technique is applied to the noisy speech signal. In addition, a frequency coefficient selection is performed to eliminate signal components with low signal-to-noise ratio (SNR). Experimental results confirm that the proposed method improves the accuracy of RT measures, particularly when the speech signal is corrupted by a colored noise with a narrow bandwidth.
It is shown that an infinite lumped-element LC-ladder network generates all Bessel functions J_n(t) of the first kind as a response to a single non-zero initial condition. Closed-form expressions for the voltage responses in the time domain are presented if the LC-ladder is driven by a step-like input voltage.
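The connection to Bessel functions can be checked numerically with a short Python sketch. It assumes a normalized ladder whose interleaved state variables x_n (alternating node voltages and branch currents) obey 2 dx_n/dt = x_{n-1} - x_{n+1} for n ≥ 1 with dx_0/dt = -x_1, which are exactly the recurrences satisfied by J_n(t); the normalization, the boundary condition, the truncation to 30 states and the step size are assumptions of this sketch, not details taken from the paper.

```python
import math


def bessel_j(n, t, terms=40):
    """Power-series evaluation of the Bessel function J_n(t) of the first kind."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (t / 2) ** (2 * k + n) for k in range(terms))


def derivative(x):
    """Right-hand side of the normalized, truncated ladder equations."""
    size = len(x)
    dx = [0.0] * size
    dx[0] = -x[1]
    for n in range(1, size):
        right = x[n + 1] if n + 1 < size else 0.0   # truncation of the infinite ladder
        dx[n] = 0.5 * (x[n - 1] - right)
    return dx


def simulate(t_end, sections=30, dt=0.005):
    """Integrate with classical RK4 from the single non-zero initial condition x_0 = 1."""
    x = [1.0] + [0.0] * (sections - 1)
    for _ in range(round(t_end / dt)):
        k1 = derivative(x)
        k2 = derivative([a + 0.5 * dt * b for a, b in zip(x, k1)])
        k3 = derivative([a + 0.5 * dt * b for a, b in zip(x, k2)])
        k4 = derivative([a + dt * b for a, b in zip(x, k3)])
        x = [a + dt / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(x, k1, k2, k3, k4)]
    return x


state = simulate(5.0)
errors = [abs(state[n] - bessel_j(n, 5.0)) for n in range(6)]
```

Under these assumptions the simulated states agree with J_0(5) through J_5(5) to within the integration error, since the truncated far end of the ladder has had no noticeable influence by t = 5.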