Young-Woong KO Min-Ja KIM Jeong-Gun LEE Chuck YOO
In this paper, we propose a new user-level file system that supports block relocation by modifying the file allocation table without actually copying data. The key idea of the proposed system is to provide block insertion and deletion functions for file manipulation. This approach can be used very effectively for applications that modify files in block-aligned units, such as compression utilities and the TAR archival system. To show the usefulness of the proposed file system, we applied the new functionality to the TAR application by modifying the TAR file to support an efficient sub-file management scheme. Experimental results show that the proposed system can significantly reduce file I/O overhead and improve the I/O performance of a file system.
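The idea of relocating blocks by editing only the allocation table can be illustrated with a minimal sketch. It assumes a FAT-style table stored as a dictionary that maps each block number to the next block in the file's chain; the names `END`, `chain`, `insert_after`, and `delete_after` are hypothetical and are not taken from the paper.

```python
END = -1  # hypothetical end-of-chain marker

def chain(fat, head):
    """Return the block numbers of a file's chain, in order."""
    blocks = []
    cur = head
    while cur != END:
        blocks.append(cur)
        cur = fat[cur]
    return blocks

def insert_after(fat, prev, new_block):
    """Link new_block into the chain right after prev; no data is copied."""
    fat[new_block] = fat[prev]
    fat[prev] = new_block

def delete_after(fat, prev):
    """Unlink the block that follows prev and return its number."""
    victim = fat[prev]
    if victim == END:
        raise ValueError("no block after prev")
    fat[prev] = fat[victim]
    del fat[victim]
    return victim

# A three-block file: 3 -> 7 -> 5
fat = {3: 7, 7: 5, 5: END}
insert_after(fat, 7, 9)
after_insert = chain(fat, 3)    # 3 -> 7 -> 9 -> 5
removed = delete_after(fat, 3)  # unlinks block 7
after_delete = chain(fat, 3)    # 3 -> 9 -> 5
```

Both operations touch a constant number of table entries regardless of file size, which is why block-aligned insertion and deletion can avoid rewriting the data that follows.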
Osamu TAKYU Yohtaro UMEDA Fumihito SASAMORI Shiro HANDA
This paper proposes the assignment of resource blocks (RBs) to reduce the peak-to-average power ratio (PAPR) of orthogonal frequency division multiplexing (OFDM) in a multi-user OFDM system. This system ranks the users according to the channel state information (CSI) for RB assignment. In our proposed technique, an RB is assigned to either the first- or second-ranked mobile station (MS) to minimize the PAPR of the OFDM signal. While this process reduces the PAPR, the throughput is also reduced because of the user diversity gain loss. A PAPR-throughput tradeoff is then established. Theoretical analyses and computer simulations confirm that when the number of MSs becomes large, the PAPR-throughput tradeoff is eased because of the minimal effect of the diversity gain loss. Therefore, significant PAPR reduction is achieved with only a slight degradation in the throughput.
Tiebin WU Hengzhu LIU Botao ZHANG
This paper presents a novel test data compression scheme for SoCs based on block merging and compatibility. The technique exploits the properties of compatibility and inverse compatibility between consecutive blocks, consecutive merged blocks, and the two halves of the encoding merged block itself to encode the pre-computed test data. The decompression circuit is simple to implement and has the advantage of being test-independent. In addition, the proposed scheme is applicable to IP cores in SoCs since it compresses the test data without requiring any structural information about the circuit under test. Experimental results demonstrate that the proposed technique can achieve an average compression ratio of up to 68.02% with significantly low test application time.
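The notions of compatibility and inverse compatibility between test data blocks can be sketched as follows. This is an illustration under assumptions, not the paper's encoder: blocks are strings over `0`, `1`, and `X` (don't-care), and the function names are hypothetical.

```python
def _check_same_length(a, b):
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")

def compatible(a, b):
    """True if no position has two specified bits that differ."""
    _check_same_length(a, b)
    return all(x == 'X' or y == 'X' or x == y for x, y in zip(a, b))

def inverse_compatible(a, b):
    """True if no position has two specified bits that are equal."""
    _check_same_length(a, b)
    return all(x == 'X' or y == 'X' or x != y for x, y in zip(a, b))

def merge(a, b):
    """Merge two compatible blocks, keeping every specified bit."""
    if not compatible(a, b):
        raise ValueError("blocks are not compatible")
    return ''.join(y if x == 'X' else x for x, y in zip(a, b))

merged = merge("1X0X", "110X")
```

Merging compatible blocks fills don't-care positions without contradicting any specified bit, so one merged block can stand in for several consecutive blocks.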
Xiangyu ZHANG Yangdong DENG Shuai MU
General purpose computing on GPU (GPGPU) has become a popular computing model for high-performance, data-intensive applications. Accordingly, there is a strong need to develop highly efficient data structures to ease the development of GPGPU applications. In this work, we proposed an efficient concurrent queue data structure for GPU computing. The GPU based provably correct, lock-free FIFO queue allows a massive number of concurrent producers and consumers. Warp-centric en-queue and de-queue procedures are introduced to better match the underlying Single-Instruction, Multiple-Thread execution model of modern GPUs. It outperforms the best previous GPU queues by up to 40 fold. The correctness of the proposed queue operations is formally validated by linearizability criteria.
In this paper, we first prove beyond-birthday-bound security for the Misty structure. Specifically, we show that an r-round Misty structure is secure against CCA attacks up to $O(2^{\frac{rn}{r+7}})$ query complexity, where n is the size of each round permutation. So for any ε>0, a sufficient number of rounds would guarantee the security of the Misty structure up to $2^{n(1-\epsilon)}$ query complexity.
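As a worked step, and ignoring the constant hidden in the O-notation (an assumption made here for illustration), the number of rounds needed for a given ε follows from comparing the two exponents:

$$\frac{rn}{r+7} \ge n(1-\epsilon) \iff r \ge (1-\epsilon)(r+7) \iff r\epsilon \ge 7(1-\epsilon) \iff r \ge \frac{7(1-\epsilon)}{\epsilon}.$$

For example, ε = 0.1 gives r ≥ 63 under this simplification.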
Yu GU Chuanyi LIU Dongsheng WANG
Cloud computing has risen as a popular new service paradigm with typical advantages such as ease of use, unlimited resources, and a pay-as-you-go pricing model. Cloud resources are more flexible and cost-effective than private or colocation resources and are thus more suitable for storing the outdated backup data that are infrequently accessed by continuous data protection (CDP) systems. However, while the cloud achieves low cost, it may slow down the recovery procedure because of its low bandwidth and high latency. In this paper, a novel block-level CDP system architecture, MYCDP, is proposed to utilize cloud resources as the back-end storage. Unlike traditional delta-encoding based CDP approaches, which must traverse all the dependent versions and decode the recovery point, MYCDP adopts a data deduplication mechanism to eliminate data redundancy between all versions of all blocks and constructs a version index for all versions of the protected storage, so that it can use a query-and-fetch process to recover version data. With a specific version index data structure and a disk/memory hybrid cache module, MYCDP reduces the storage space consumption and the data transfer between local storage and the cloud. It also supports deletion of arbitrary versions without the risk of invalidating other versions. Experimental results demonstrate that MYCDP achieves much lower cost than traditional local CDP approaches while retaining almost the same recovery speed as the local deduplication approach for most recovery cases. Furthermore, MYCDP obtains both faster recovery and lower cost than cloud-based delta-encoding CDP approaches for any recovery point. MYCDP also yields greater benefits when protecting multiple systems together.
Tomoharu SHIBUYA Kazuki KOBAYASHI
In this paper, we propose a new encoding method applicable to any linear code over an arbitrary finite field, whose computational complexity is O(δ*n), where δ* and n denote the maximum column weight of a parity check matrix of the code and the code length, respectively. This means that if a code has a parity check matrix with a constant maximum column weight, as LDPC codes do, it can be encoded with O(n) computation. We also clarify the relation between the proposed method and conventional methods, and compare the computational complexity of those methods. We then show that the proposed encoding method is much more efficient than the conventional ones.
Sho IKEDA Sangyeop LEE Tatsuya KAMIMURA Hiroyuki ITO Noboru ISHIHARA Kazuya MASU
This paper proposes an ultra-low-power 5.5-GHz PLL which employs a new divide-by-4 injection-locked frequency divider (ILFD) and a class-C VCO with a linearity-compensated varactor for low supply voltage operation. A forward-body-biasing (FBB) technique can decrease the threshold voltage of MOS transistors, which can improve the operation frequency and widen the lock range of the ILFD. The FBB is also employed for linear frequency tuning of the VCO under a low supply voltage of 0.5V. A double-switch injection technique is also proposed to widen the lock range of the ILFD. A digital calibration circuit is introduced to control the lock range of the ILFD automatically. The proposed PLL was fabricated in a 65nm CMOS process. With a 34.3-MHz reference, it shows a 1-MHz-offset phase noise of -106dBc/Hz at 5.5GHz output. The supply voltage is 0.54V for the divider and 0.5V for the other components. Total power consumption is 0.95mW.
Jaeyoung LEE Hyundong SHIN Jun HEO
In this paper, we consider decouple-and-forward (DCF) relaying, where the relay encodes and amplifies decoupled data using orthogonal space-time block codes (OSTBCs), to achieve the maximum diversity gain of multiple-input multiple-output (MIMO) amplify-and-forward (AF) relaying. Since the channel status of all antennas is generally unknown and time-varying for cooperation in multi-antenna multiple-relay systems, we investigate an opportunistic relaying scheme for DCF relaying to harness distributed antennas and minimize the cooperation overheads by not using the global channel state information (CSI). In addition, for realistic wireless channels which have spatial fading correlation due to closely-spaced antenna configurations and poor scattering environments, we analyze the exact and lower bound on the symbol error probability (SEP) of the opportunistic DCF relaying over spatially correlated MIMO Rayleigh fading channels. Numerical results show that, even in the presence of spatial fading correlation, the proposed opportunistic relaying scheme is efficient and achieves additional performance gain with low overhead.
Lechang LIU Keisuke ISHIKAWA Tadahiro KURODA
Parametric resonance based solutions for sub-gigahertz radio frequency transceivers with a 0.3V supply voltage are proposed in this paper. As an implementation example, a 0.3V 720µW variation-tolerant injection-locked frequency multiplier is developed in 90nm CMOS. It features a parametric resonance based multi-phase synthesis scheme, thereby achieving the lowest supply voltage among state-of-the-art frequency synthesizers, with a phase noise of -110dBc at 600kHz and a locking range of 873MHz to 1.008GHz.
Pil-Ho LEE Hyun Bae LEE Young-Chan JANG
A 125MHz 64-phase delay-locked loop (DLL) is implemented for time recovery in a digital wire-line system. The architecture of the proposed DLL comprises a coarse-locking circuit added to a conventional DLL circuit, which consists of a delay line including a bias circuit, phase detector, charge pump, and loop filter. The proposed coarse-locking circuit reduces the locking time of the DLL and prevents harmonic locking, regardless of the duty cycle of the clock. In order to verify the performance of the proposed coarse-locking circuit, a 64-phase DLL with an operating frequency range of 40 to 200MHz is fabricated using a 0.18-µm 1-poly 6-metal CMOS process with a 1.8V supply. The measured rms and peak-to-peak jitter of the output clock are 3.07ps and 21.1ps, respectively. The DNL and INL of the 64-phase output clock are measured to be -0.338/+0.164 LSB and -0.464/+0.171 LSB, respectively, at an operating frequency of 125MHz. The area and power consumption of the implemented DLL are 0.3mm² and 12.7mW, respectively.
The Generalized Feistel Structure (GFS) is one of the structures used in the design of blockciphers and hash functions. There are several types of GFSs, and we focus on Type 1 and Type 2 GFSs. The security of these structures is well studied, and they are adopted in various practical blockciphers and hash functions. The round function used in GFSs consists of two layers. The first layer applies nonlinear functions: Type 1 GFS uses one nonlinear function in this layer, while Type 2 GFS uses half as many nonlinear functions as there are sub-blocks. The second layer is a sub-block-wise permutation, and the cyclic shift is generally used in this layer. In this paper, we formalize Type 1.x GFS, which is the natural extension of Type 1 and Type 2 GFSs with respect to the number of nonlinear functions in one round. Next, for Type 1.x GFS using two nonlinear functions in one round, we propose a permutation which has a good diffusion property. We demonstrate that Type 1.x GFS with this permutation has a better diffusion property than other Type 1.x GFSs with the sub-block-wise cyclic shift. We also present experimental results evaluating the diffusion property and the security against the saturation attack, impossible differential attack, differential attack, and linear attack of Type 1.x GFSs with various permutations.
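The two-layer round described above can be sketched in a few lines. This is a toy illustration under stated assumptions: sub-blocks are small integers, `toy_f` is a placeholder rather than a real round function, and applying the nonlinear functions to the leading sub-block pairs is an assumption made here, not necessarily the placement the paper formalizes for Type 1.x GFS.

```python
def gfs_round(state, num_f, f):
    """One round: XOR f(x[2i]) into x[2i+1] for i < num_f,
    then rotate the sub-blocks left by one position (cyclic shift)."""
    if not 1 <= num_f <= len(state) // 2:
        raise ValueError("num_f must be between 1 and len(state) // 2")
    x = list(state)
    for i in range(num_f):
        x[2 * i + 1] ^= f(x[2 * i])
    return x[1:] + x[:1]

def toy_f(v):
    """Placeholder nonlinear function on 8-bit values (hypothetical)."""
    return (v + 1) & 0xFF

# Four sub-blocks: one function per round (Type 1) versus
# half the number of sub-blocks (Type 2).
type1 = gfs_round([1, 2, 3, 4], 1, toy_f)
type2 = gfs_round([1, 2, 3, 4], 2, toy_f)
```

Values of `num_f` strictly between these two extremes correspond, in this sketch, to the intermediate numbers of nonlinear functions per round that the abstract refers to.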
Jeonghoon HAN Masaya MIYAHARA Akira MATSUZAWA
This paper derives the maximum lock range of an injection-locked ring oscillator with the direct injection method and presents an injection-locked charge-pump phase-locked loop (CPPLL) with a replica of the ring oscillator. The proposed injection-locked PLL separates the injection-locked VCO from the continuous phase-tracking loop of the PLL so that it can provide stable lock-state maintenance and tolerance to temperature and supply voltage variation. The measurement results show that the proposed injection-locked PLL tolerates a voltage variation of 11.2% at a supply voltage of 1.2V. The in-band noise levels of the injection-locked oscillator at offset frequencies of 10kHz and 100kHz are -108.2dBc/Hz and -114.6dBc/Hz, respectively.
Leida LI Hancheng ZHU Jiansheng QIAN Jeng-Shyang PAN
This letter presents a no-reference blocking artifact measure based on analysis of color discontinuities in YUV color space. Color shift and color disappearance are first analyzed in JPEG images. For color-shifting and color-disappearing areas, the blocking artifact scores are obtained by computing the gradient differences across the block boundaries in U component and Y component, respectively. An overall quality score is then produced as the average of the local ones. Extensive simulations and comparisons demonstrate the efficiency of the proposed method.
Shice NI Yong DOU Kai CHEN Jie ZHOU
This letter proposes a novel high-performance crypto coprocessor that relies on Reconfigurable Cryptographic Blocks. We implement a prototype of the coprocessor on a Xilinx FPGA chip and adopt pipelining to achieve data parallelism. The results show that the coprocessor, running at 189MHz, outperforms the software-based SSL protocol.
Koh JOHGUCHI Kasuaki YOSHIOKA Ken TAKEUCHI
In this paper, we propose an optimum access method for a phase change memory (PCM) with NAND strings. A PCM with a block erase interface is proposed. The method, which has a SET block erase operation and fast RESET programming, is proposed since the SET operation causes a slow access time for conventional PCM. From the results of measurement, the SET-ERASE operation is successfully completed while the RESET-ERASE operation is incomplete owing to the serial connection. As a result, the block erase interface with the SET-ERASE and RESET program method realizes a 7.7 times faster write speed than a conventional RAM interface, which is slowed by the long SET time. We also give pass-transistor design guidelines for PCM with NAND strings. In addition, the write-capability and write-disturb problems are investigated. The ERASE operation for the proposed device structure can be realized with the same current as that for the SET operation of a single cell. For the pass transistor, an on-current about 4.4 times larger than the minimum RESET current of a single cell is needed to carry out the RESET operation and to avoid the write-disturb problem. In this paper, the SET programming method is also verified for a conventional RAM interface. The experimental results show that the write-capability and write-disturb problems are negligible.
SinNyoung KIM Akira TSUCHIYA Hidetoshi ONODERA
This paper proposes a radiation-hardened phase-locked loop (RH-PLL) with a switchable dual modular redundancy (DMR) structure. After radiation strikes, unhardened PLLs suffer clock perturbations. Conventional RH-PLLs have been proposed to reduce recovery time after perturbation. However, this recovery still requires tens of clock cycles. Our proposal involves ‘detecting’ and ‘switching’, rather than ‘recovering’ from clock perturbation. Detection speed is crucial for robust perturbation-immunity. We identify types of clock perturbation and then propose a set of detectors to detect each type. With this method, the detectors guarantee high-speed detection that leads to perturbation-immune switching from a radiated clock to an undistorted clock. The proposed RH-PLL was fabricated and then verified with a radiation test on real silicon.
Many kinds of data can be represented as a network or graph. It is crucial to infer the latent structure underlying such a network and to predict unobserved links in the network. Mixed Membership Stochastic Blockmodel (MMSB) is a promising model for network data. Latent variables and unknown parameters in MMSB have been estimated through Bayesian inference with the entire network; however, it is important to estimate them online for evolving networks. In this paper, we first develop online inference methods for MMSB through sequential Monte Carlo methods, also known as particle filters. We then extend them for time-evolving networks, taking into account the temporal dependency of the network structure. We demonstrate through experiments that the time-dependent particle filter outperformed several baselines in terms of prediction performance in an online condition.
SinNyoung KIM Akira TSUCHIYA Hidetoshi ONODERA
This paper presents an analysis of radiation-induced clock perturbation in a phase-locked loop (PLL). Because of the trade-off between cost, performance, and reliability, radiation-hardened PLL design needs a robust strategy. Thus, evaluating radiation vulnerability is important for choosing that strategy. The conventional evaluation method, however, is based on brute-force analysis, namely SPICE simulation and experiment. The presented analysis eliminates the brute-force analysis from the evaluation of radiation vulnerability. A set of equations makes it possible to predict the radiation-induced clock perturbation at every sub-circuit. In a demonstration, the most vulnerable nodes have been found, and they are validated using a PLL fabricated in a 0.18µm CMOS process.
Chia-Shao HUNG Shanq-Jang RUAN
Image binarization refers to converting gray-level images into binary ones, and many binarization algorithms have been developed. These algorithms can be classified as favoring either high-quality computation or high-speed performance. This letter presents an algorithm that ensures both benefits at the same time. The proposed algorithm intelligently segments input images into several sub-images, after which binarization is performed on each sub-image independently. Experimental results reveal that our algorithm provides appropriate quality at medium speed.
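The general idea of binarizing sub-images independently can be sketched as below. This is not the letter's algorithm: the intelligent segmentation step is replaced here by fixed square tiles, and each tile is thresholded at its own mean gray level. The function name `binarize_tiles` and the `tile` parameter are hypothetical.

```python
def binarize_tiles(img, tile):
    """Threshold each tile x tile sub-image at its own mean gray level.

    img is a list of equal-length rows of gray levels; returns rows of 0/1.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            rows = range(top, min(top + tile, h))
            cols = range(left, min(left + tile, w))
            pixels = [img[r][c] for r in rows for c in cols]
            threshold = sum(pixels) / len(pixels)
            for r in rows:
                for c in cols:
                    out[r][c] = 1 if img[r][c] > threshold else 0
    return out

# A dark left half and a bright right half: each 2x2 tile gets its own
# threshold, so detail inside both halves is kept.
img = [[10, 20, 200, 220],
       [30, 40, 210, 250]]
binary = binarize_tiles(img, 2)
```

A single global threshold on this example would mark the whole left half as 0 and the whole right half as 1, whereas the per-tile thresholds separate lighter from darker pixels within each half.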