In spite of continuous improvement of computational power of multi/many-core processors, the memory access performance of the processors has not been improved sufficiently, and thus the overall performance of recent processors is often restricted by the delay of off-chip memory accesses. Low-delay data compression for last level cache (LLC) would be effective to improve the processor performance because the compression increases the effective size of LLC, and thus reduces the number of off-chip memory accesses. This paper proposes a novel data compression method suitable for high-speed parallel decoding in the LLC. Since cache line data often have periodicity of certain lengths, such as 32- or 64-bit instructions, 32-bit integers, and 64-bit floating point numbers, an information word is encoded as a base pattern and a differential pattern between the original word and the base pattern. Evaluation using a GPU simulator shows that the compression ratio of the proposed coding is comparable to LZSS coding and X-Match Pro and superior to other conventional compression algorithms for cache memories. Also this paper presents an experimental decoder designed for ASIC, and the synthesized result shows that the decoder can decompress cache line data of length 32bytes in four clock cycles. Evaluation of the IPC on the GPU simulator shows that, for several benchmark programs, the IPC achieved by the proposed coding is higher than that by the conventional BΔI coding, where the maximum improvement of the IPC is 20%.
Yichao LU Gang HE Guifen TIAN Satoshi GOTO
Recently, non-binary low-density parity-check (NB-LDPC) codes starts to show their superiority in achieving significant coding gains when moderate codeword lengths are adopted. However, the overwhelming decoding complexity keeps NB-LDPC codes from being widely employed in modern communication devices. This paper proposes a hybrid message-passing decoding algorithm which consumes very low computational complexity. It achieves competitive error performance compared with conventional Min-max algorithm. Simulation result on a (255,174) cyclic code shows that this algorithm obtains at least 0.5dB coding gain over other state-of-the-art low-complexity NB-LDPC decoding algorithms. A partial-parallel NB-LDPC decoder architecture for cyclic NB-LDPC codes is also developed based on this algorithm. Optimization schemes are employed to cut off hard decision symbols in RAMs and also to store only part of the reliability messages. In addition, the variable node units are redesigned especially for the proposed algorithm. Synthesis results demonstrate that about 24.3% gates and 12% memories can be saved over previous works.
This letter presents a technique to reduce the complexity of the soft-output multiple-input multiple-output symbol detection based on Dijkstra's algorithm. By observing that the greedy behavior of Dijkstra's algorithm can entail unnecessary tree-visits for the symbol detection, this letter proposes a technique to evict non-promising candidates early from the search space. The early eviction technique utilizes layer information to determine if a candidate is promising, which is simple but effective. When the SNR is 30dB for 6×6 64-QAM systems, the average number of tree-visits in the proposed method is reduced by 72.1% in comparison to that in the conventional Dijkstra's algorithm-based symbol detection without the early eviction.
Xubo ZHAO Hang ZHOU Xiaoping LI
Under random linear coding (RLC) scheme, we present a simple expression of the probability distribution p(D=K) for decoding delay D incurred by the lossy channel, where K is a positive integer. In contrast with the previous contribution, our focus is firstly on deriving the cumulative distribution function of the discrete random variable D over a perfect channel. One benefit of such dispose is that, from the overall viewpoint, computing the cumulative distribution function of delay D can be related with calculating the cardinalities of sets of some special matrices, so that the former can be obtained from the latter. Moreover, our expression of the probability distribution is an explicit form, and is valid for any number of packets M, freewill field size q and arbitrary channel loss rate ε.
Kazumitsu SAKAMOTO Ken HIRAGA Tomohiro SEKI Tadao NAKAGAWA Kazuhiro UEHARA
A Simple decoding method for short-range MIMO (SR-MIMO) transmission can reduce the power consumption for MIMO decoding, but the distance between the transceivers requires millimeter-order accuracy in order to satisfy the required transmission quality. In this paper, we propose a phase difference control method between each propagation channel to alleviate the requirements for the transmission distance accuracy. In the proposed method, the phase difference between each propagation channel is controlled by changing the transmission (or received) power ratio of each element of sub-array antennas. In millimeter-wave broadband transmission simulation, we clarified that when sub-array antenna spacing is set to 6.6 mm and element spacing of sub-array antenna is set to 2.48mm, the proposed method can extend the transmission distance range satisfying the required transmission quality, which is that bit error rate (BER) before error correction is less than 10-2 from 9∼29mm to 0∼50mm in QPSK, from 15∼19mm to 0∼30mm in 16QAM, and from only 15mm to 4∼22mm in 64QAM.
This paper introduces two schemes to put the decoding of the convolutional network code (CNC) into practice, which are named the Intermittent Packet Transmission Scheme (IPTS) and the Redundancy Packet Transmission Scheme (RPTS). According to the decoding formula of the sink nodes, we can see that, at the time k+δ in order to decode the source packet generated at time k, the sink node should know all the source packets generated before k-1. This is impractical. The two schemes we devised make it unnecessary. A construction algorithm is also given about the RPTS networks. For the two schemes, we analyze the strengths and weaknesses and point out their implemented condition.
Gopalan, Klivans, and Zuckerman proposed a list-decoding algorithm for Reed-Muller codes. Their algorithm works up to a given list-decoding radius. Dumer, Kabatiansky, and Tavernier improved the complexity of the algorithm for binary Reed-Muller codes by using the well-known Plotkin construction. In this study, we propose a list-decoding algorithm for non-binary Reed-Muller codes as a generalization of Dumer et al.'s algorithm. Our algorithm is based on a generalized Plotkin construction, and is more suitable for parallel computation than the algorithm of Gopalan et al. Since the list-decoding algorithms of Gopalan et al., Dumer et al., and ours can be applied to more general codes than Reed-Muller codes, we give a condition for codes under which these list-decoding algorithms works.
Meng XU Xincun JI Jianhui WU Meng ZHANG
This paper presents a low-power LDPC decoder that can be used in Multimedia Wireless Sensor Networks. Three low power design techniques are proposed in the decoder design: a layered decoding algorithm, a modified Benes network and a modified memory bypassing scheme. The proposed decoder is implemented in TSMC 0.13 µm, 1.2 V CMOS process. Experiments show that when the clock frequency is 32 MHz, the power consumption of the proposed decoder is 38.4 mW, the energy efficiency is 53.3 pJ/bit/ite and the core area is 1.8 mm2.
Masayuki HIRATA Toshiki YOSHIMINE
Magnetoencephalography (MEG) measures very weak neuromagnetic signals using SQUID sensors. Standard MEG analyses include averaged waveforms, isofield maps and equivalent current dipoles. Beamforming MEG analyses provide us with frequency-dependent spatiotemporal information about the cerebral oscillatory changes related to not only somatosensory processing but also language processing. Language dominance is able to be evaluated using laterality of power attenuation in the low γ band in the frontal area. Neuromagnetic signals of the unilateral upper movements are able to be decoded using a support vector machine.
Yang YU Shiro HANDA Fumihito SASAMORI Osamu TAKYU
In this paper, through extrinsic information transfer (EXIT) band chart analysis, an adaptive iterative decoding approach (AIDA) is proposed to reduce the iterative decoding complexity and delay for finite-length differentially encoded Low-density parity-check (DE-LDPC) coded systems with multiple-symbol differential detection (MSDD). The proposed AIDA can adaptively adjust the observation window size (OWS) of the MSDD soft-input soft-output demodulator (SISOD) and the outer iteration number of the iterative decoder (consisting of the MSDD SISOD and the LDPC decoder) instead of setting fixed values for the two parameters of the considered systems. The performance of AIDA depends on its stopping criterion (SC) which is used to terminate the iterative decoding before reaching the maximum outer iteration number. Many SCs have been proposed; however, these approaches focus on turbo coded systems, and it has been proven that they do not well suit for LDPC coded systems. To solve this problem, a new SC called differential mutual information (DMI) criterion, which can track the convergence status of the iterative decoding, is proposed; it is based on tracking the difference of the output mutual information of the LDPC decoder between two consecutive outer iterations of the considered systems. AIDA using the DMI criterion can adaptively adjust the out iteration number and OWS according to the convergence situation of the iterative decoding. Simulation results show that compared with using the existing SCs, AIDA using the DMI criterion can further reduce the decoding complexity and delay, and its performance is not affected by a change in the LDPC code and transmission channel parameters.
Sun-Ting LIN Shou-Sheu LIN Je-An LAI
A stopping criterion is an indispensable function to reduce unnecessary power consumption and decoding delay in turbo decoding. Until now, a common design philosophy in previous works has involved using the entire block of information from the MAP decoder and its input/output information to calculate the stopping index. It is an intuitive method but suffers from heavy memory requirements and high calculation complexity. In this paper, a low-complexity stopping criterion is proposed that avoids the aforementioned disadvantages. A general abstraction model is utilized to analyze the design bottleneck of stopping criteria. Instead of using an entire block of information, a compact representation derived from the internal information of the MAP decoder at a single time instant is used as a low-complexity stopping index. Theoretical explanation is provided to justify the feasibility of the proposed criterion. Simulation results show that the proposed criterion can reduce the complexity of stopping criterion dramatically while continuing to achieve the same level of performance as previous works.
Shan LU Jun CHENG Ying LI Yoichiro WATANABE
Physical-layer network coding with binary turbo coding in a two-way relay channel is considered. A two-user turbo decoding scheme is proposed with a simplified sum trellis. For two-user iterative decoding at a relay, the component decoder with its simplified sum trellis decodes the superimposed signal to the arithmetic sum of two users' messages. The simplified sum trellis is obtained by removing one of the states in a pair of mutual symmetrical states from a sum trellis. This removal reduces the decoding complexity to half of that with the sum trellis, and does not degrade decoding performance over AWGN channel since two output sequences from the pair of mutual symmetrical states are the same.
Meng CHENG Xiaobo ZHOU Khoirul ANWAR Tad MATSUMOTO
In this work, a simple doped accumulator (DACC)-assisted relay system is proposed by using bit-interleaved coded modulation with iterative decoding (BICM-ID). An extrinsic information transfer (EXIT) chart analysis shows that DACC keeps the convergence tunnel of the EXIT curves open until almost the (1, 1) point of the mutual information, which avoids the error floor. In the relay system, errors may happen in the source-relay link (intra-link), however, they are allowed in our proposed technique where the correlation knowledge between the source and the relay is exploited at the destination node. Strong codes are not needed and even the systematic source bits can be simply extracted at the relay even though the systematic part may contain some errors. Hence, the complexity of the relay can be significantly reduced, and thereby the proposed system is energy-efficient. Furthermore, the error probability of the intra-link can be estimated at the receiver by utilizing the a posteriori log-likelihood ratios (LLRs) of the two decoders, and it can be further utilized in the iterative processing. Additionally, we provide the analysis of different relay location scenarios and compare the system performances by changing the relay's location. The transmission channels in this paper are assumed to suffer from additive white Gaussian noise (AWGN) and block Rayleigh fading. The theoretical background of this technique is the Slepian-Wolf/Shannon theorem for correlated source coding. The simulation results show that the bit-error-rate (BER) performances of the proposed system are very close to theoretical limits supported by the Slepian-Wolf/Shannon theorem.
Xiaobo ZHOU Xin HE Khoirul ANWAR Tad MATSUMOTO
In this paper, we reformulate the issue related to wireless mesh networks (WMNs) from the Chief Executive Officer (CEO) problem viewpoint, and provide a practical solution to a simple case of the problem. It is well known that the CEO problem is a theoretical basis for sensor networks. The problem investigated in this paper is described as follows: an originator broadcasts its binary information sequence to several forwarding nodes (relays) over Binary Symmetric Channels (BSC); the originator's information sequence suffers from independent random binary errors; at the forwarding nodes, they just further interleave, encode the received bit sequence, and then forward it, without making heavy efforts for correcting errors that may occur in the originator-relay links, to the final destination (FD) over Additive White Gaussian Noise (AWGN) channels. Hence, this strategy reduces the complexity of the relay significantly. A joint iterative decoding technique at the FD is proposed by utilizing the knowledge of the correlation due to the errors occurring in the link between the originator and forwarding nodes (referred to as intra-link). The bit-error-rate (BER) performances show that the originator's information can be reconstructed at the FD even by using a very simple coding scheme. We provide BER performance comparison between joint decoding and separate decoding strategies. The simulation results show that excellent performance can be achieved by the proposed system. Furthermore, extrinsic information transfer (EXIT) chart analysis is performed to investigate convergence property of the proposed technique, with the aim of, in part, optimizing the code rate at the originator.
Ken HARIMA Hirobumi SAITO Takuji EBINUMA
On rocket-borne GPS receivers, the high dynamics make the use of PLL is impractical and a FLL is used. This in turn leads to a string of errors on the demodulated navigation message. The present paper proposes two decoding algorithms for GPS navigation messages suited for strings of errors. The first philosophy uses parity check syndromes to correct demodulation errors. The second philosophy uses Log Correlation Ratio (LCR) comparison along with parity check syndromes to correct for the most probable error pattern. GPS data availability can be significantly improved by using the latter.
Zhiliang HUANG Ming CHEN Chunjuan DIAO Jiamin LI
This letter presents a novel weighted reliability-based (WRB) algorithm for decoding low-density parity-check (LDPC) codes. Viewing the well-known normalized min sum (NMS) algorithm as reliability-based, the WRB algorithm can be seen as a simplified version of the NMS algorithm. Unlike the NMS algorithm, the WRB algorithm does not update the soft information sent between the variable nodes and check nodes, which greatly reduces the decoding complexity. For finite geometry LDPC codes with larger row redundancy and column weights, simulation results show that the WRB algorithm almost matches the error performance of the NMS algorithm.
Hua JIANG Kanglian ZHAO Yang LI Sidan DU
In this letter we design a new family of space-time block codes (STBC) for multi-input multi-output (MIMO) systems. The complex orthogonal STBC achieves full diversity and full transmission rate with fast maximum-likelihood decoding when only two transmit antennas are employed. By combining the Alamouti STBC and the multidimensional signal constellation rotation based on the cyclotomic number field, we construct cyclotomic orthogonal space-time block codes (COSTBCs) which can achieve full diversity and full rate for multiple transmit antennas. Theoretical analysis and simulation results demonstrate excellent performance of the proposed codes, while the decoding complexity is further reduced.
Ji-Woong CHOI Jungwon LEE Jihwan P. CHOI Hui-Ling LOU
In this paper, we propose a soft-decoding near-ML MIMO demodulation scheme that achieves near optimal performance with fixed and low complexity. Exploiting the regular structure of bit-to-symbol mapping, the proposed scheme performs hard demodulation to find the first candidate symbol for each stream followed by selection of nearby candidate points such that at least one candidate exists for the computation of likelihood information of bit 0 and 1 without intermediate calculation of the Euclidean distance. This demodulation scheme enables an improvement in performance by guaranteeing the existence of candidates and a significant reduction in the number of distance calculations which is a major complexity burden. The performance is evaluated by computer simulation, and computational complexity is also assessed in terms of the number of complex multiplication.
Masakazu YOSHIDA Manabu HAGIWARA Takayuki MIYADERA Hideki IMAI
Entangled states play crucial roles in quantum information theory and its applied technologies. In various protocols such as quantum teleportation and quantum key distribution, a good entangled state shared by a pair of distant players is indispensable. In this paper, we numerically examine entanglement sharing protocols using quantum LDPC CSS codes. The sum-product decoding method enables us to detect uncorrectable errors, and thus, two protocols, Detection and Resending (DR) protocol and Non-Detection (ND) protocol are considered. In DR protocol, the players abort the protocol and repeat it if they detect the uncorrectable errors, whereas in ND protocol they do not abort the protocol. We show that DR protocol yields smaller error rate than ND protocol. In addition, it is shown that rather high reliability can be achieved by DR protocol with quantum LDPC CSS codes.
Chen LIU Xin JIN Tianruo ZHANG Satoshi GOTO
High-definition (HD) videos become more and more popular on portable devices these years. Due to the resolution mismatch between the HD video sources and the relative low-resolution screens of portable devices, the HD videos are usually fully decoded and then down-sampled (FDDS) for the displays, which not only increase the cost of both computational power and memory bandwidth, but also lose the details of video contents. In this paper, an encoder-unconstrained partial decoding scheme for H.264/AVC is presented to solve the problem by only decoding the object of interest (OOI) related region, which is defined by users. A simplified compression domain tracking method is utilized to ensure that the OOI locates in the center of the display area. The decoded partial area (DPA) adaptation, the reference block relocation (RBR) and co-located temporal Intra prediction (CTIP) methods are proposed to improve the visual quality for the DPA with low complexity. The simulation results show that the proposed partial decoding scheme provides an average of 50.16% decoding time reduction comparing to the fully decoding process. The displayed region also presents the original HD granularity of OOI. The proposed partial decoding scheme is especially useful for displaying HD video on the devices of which the battery life is a crucial factor.