Pengjun WANG Yuejun ZHANG Jun HAN Zhiyi YU Yibo FAN Zhang ZHANG
In modern cryptographic systems, physical unclonable functions (PUFs) are efficient mechanisms for many security applications; they extract intrinsic random physical variations to generate secret keys. Classical PUFs mainly exhibit static challenge-response behavior and generate static keys, while many practical cryptographic systems need reconfigurable PUFs that allow dynamic keys to be derived from the same circuit. In this paper, the concept of reconfigurable multi-port PUFs (RM-PUFs) is proposed. RM-PUFs not only allow the keys to be updated without physical replacement, but also generate multiple keys from different ports in one clock cycle. A practical RM-PUF construction is designed based on an asynchronous clock and fabricated in a TSMC low-power 65 nm CMOS process. The area of the test chip is 1.1 mm², and the maximum clock frequency is 0.8 GHz at 1.2 V. The average power consumption is 27.6 mW at 27 °C. Finally, test results show that the RM-PUFs generate four reconfigurable 128-bit secret keys, and that the keys are secure and reliable over a range of environmental variations such as supply voltage and temperature.
Xiangyu MENG Kangfeng WEI Zhiyi YU Xinlun CAI
This paper proposes a low-power 100 Gb/s four-level pulse amplitude modulation (PAM-4) driver based on a linear distortion compensation structure for thin-film lithium niobate (LiNbO3) modulators, which achieves high output linearity. Inductive peaking and an open-drain structure enable the overall circuit to achieve a 31-GHz bandwidth. Designed in a 65-nm process, the proposed PAM-4 driver chip occupies 0.292 mm² and consumes 37.7 mW. Post-layout simulation results show that the power efficiency is 0.37 mW/Gb/s, the RLM is more than 96%, and the FOM value is 8.84.
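The two headline metrics in the abstract above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' evaluation flow: it assumes that "power efficiency" means power divided by data rate, and that RLM follows the level separation mismatch ratio commonly used for PAM-4 in IEEE 802.3. The function names and the example voltage levels are hypothetical.

```python
def power_efficiency_mw_per_gbps(power_mw, data_rate_gbps):
    """Power per unit of data rate, in mW/(Gb/s)."""
    return power_mw / data_rate_gbps


def level_separation_mismatch_ratio(v0, v1, v2, v3):
    """RLM for four PAM-4 levels ordered from lowest (v0) to highest (v3).

    Assumed definition (IEEE 802.3 style): the two inner levels are
    compared against their ideal positions at 1/3 of the outer swing.
    """
    v_mid = (v0 + v3) / 2
    es1 = (v1 - v_mid) / (v0 - v_mid)
    es2 = (v2 - v_mid) / (v3 - v_mid)
    return min(3 * es1, 3 * es2, 2 - 3 * es1, 2 - 3 * es2)


if __name__ == "__main__":
    # Figures quoted in the abstract: 37.7 mW at 100 Gb/s.
    print(power_efficiency_mw_per_gbps(37.7, 100.0))
    # Hypothetical, perfectly spaced levels give RLM = 1 (100%).
    print(level_separation_mismatch_ratio(-3.0, -1.0, 1.0, 3.0))
```

With the quoted numbers the first call gives about 0.377 mW/Gb/s, consistent with the reported 0.37 figure. Under this definition, an RLM above 96% means the inner levels deviate only slightly from equal spacing.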
Xiangyu MENG Yecong LI Zhiyi YU
This paper proposes a high-speed interconnection design between optical modules and electrical modules using bonding wires and coplanar waveguide transmission lines on printed circuit boards for 400 Gbps 4-channel optical communication systems. To broaden the interconnection bandwidth, interdigitated capacitors are integrated with on-chip GSG pads for the first time. Simulation results indicate that the reflection coefficient is below -10 dB from DC to 53 GHz and the insertion loss is below 1 dB from DC to 45 GHz. Both indicators show that the proposed interconnection structure can effectively satisfy the communication bandwidth requirements of PAM4 signals at 100 Gbps or even higher data rates.
Bei HUANG Kaidi YOU Yun CHEN Zhiyi YU Xiaoyang ZENG
Reed-Solomon (RS) codes are widely used in digital communication and storage systems. Unlike the usual VLSI approaches, this paper presents a high-throughput, fully programmable Reed-Solomon decoder on a multi-core processor. The multi-core processor platform is a two-dimensional mesh array of Single Instruction Multiple Data (SIMD) cores, and it is well suited to digital communication applications. By fully extracting the parallelizable operations of the RS decoding process, we propose multiple optimization techniques to improve system throughput, including task-level parallelism across cores, data-level parallelism on each SIMD core, minimized memory access, and task mapping that minimizes route length. For RS(255, 239, 8), experimental results show that our 12-core implementation achieves a throughput of 4.35 Gbps, which is much better than several other published implementations. The results suggest that, with our approach, throughput scales linearly with the number of cores.
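To illustrate the kind of parallelizable operation the abstract above refers to, the sketch below computes the 16 syndromes of an RS(255, 239) word over GF(2^8). Each syndrome is independent of the others, which is what makes the step a natural candidate for data-level or task-level parallelism. This is an assumption-laden illustration, not the paper's implementation: the primitive polynomial 0x11D, the first root alpha^0, and all function names are choices made here, and the mapping onto SIMD cores is not shown.

```python
def build_gf256_tables(prim_poly=0x11D):
    """Build exp/log tables for GF(2^8) (assumed primitive polynomial 0x11D)."""
    exp = [0] * 512
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= prim_poly
    # Duplicate the table so gf_mul needs no modulo on the exponent sum.
    for i in range(255, 512):
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_gf256_tables()


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(received, n_parity=16):
    """Evaluate the received polynomial at alpha^0 .. alpha^(n_parity-1).

    received[0] is the highest-degree coefficient. Each loop iteration
    over i is independent, so the syndromes could be computed in parallel.
    """
    result = []
    for i in range(n_parity):
        root = EXP[i]
        acc = 0
        for symbol in received:  # Horner evaluation
            acc = gf_mul(acc, root) ^ symbol
        result.append(acc)
    return result


if __name__ == "__main__":
    word = [0] * 255          # the all-zero word is a valid codeword
    word[254 - 10] = 7        # inject one symbol error at degree 10
    print(syndromes(word))    # non-zero syndromes reveal the error
```

For RS(255, 239), the 16 parity symbols allow up to 8 symbol errors to be corrected, which matches the third parameter in the RS(255, 239, 8) notation.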
Yan YING Dan BAO Zhiyi YU Xiaoyang ZENG Yun CHEN
In this paper, a cost-efficient LDPC decoder for DVB-S2 is presented. Based on the Normalized Min-Sum algorithm and the turbo-decoding message-passing (TDMP) algorithm, a dual line-scan scheduling scheme is proposed to enable hardware reuse. Furthermore, we present a solution to the address conflict issue caused by the structure of the parity-check matrix defined for DVB-S2 LDPC codes. Implemented in an SMIC 0.13 µm standard CMOS process, the LDPC decoder has an area of 12.51 mm². An operating frequency of 105 MHz is required to meet the throughput requirement of 135 Mbps with a maximum of 30 iterations. Compared with the latest published DVB-S2 LDPC decoder, the proposed decoder reduces area cost by 34%.
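The Normalized Min-Sum algorithm named in the abstract above replaces the exact check-node computation with a scaled minimum. The sketch below shows one check-node update in that style. It is a generic textbook form rather than the decoder's actual datapath: the normalization factor 0.75 and the function name are assumptions, and the dual line-scan scheduling and TDMP layering are not modeled.

```python
def check_node_update(llrs, alpha=0.75):
    """Normalized min-sum update for one check node.

    llrs: incoming variable-to-check messages (log-likelihood ratios),
          at least two values.
    alpha: normalization factor (0.75 is an assumed, commonly used value).
    Returns one outgoing message per edge, each computed from all
    *other* incoming messages.
    """
    signs = [1 if v >= 0 else -1 for v in llrs]
    mags = [abs(v) for v in llrs]

    # Product of all signs; multiplying by an edge's own sign removes it.
    total_sign = 1
    for s in signs:
        total_sign *= s

    # Only the two smallest magnitudes are needed for every output.
    min1_idx = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[min1_idx]
    min2 = min(m for i, m in enumerate(mags) if i != min1_idx)

    out = []
    for i in range(len(llrs)):
        mag = min2 if i == min1_idx else min1
        out.append(alpha * total_sign * signs[i] * mag)
    return out


if __name__ == "__main__":
    print(check_node_update([2.0, -1.0, 4.0]))
```

Tracking only the smallest and second-smallest magnitudes is what keeps this update cheap in hardware, since every outgoing message is one of those two values with a sign applied.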
Wenhua FAN Chen CHEN Yun CHEN Zhiyi YU Xiaoyang ZENG
This paper presents an efficient implementation of an OFDM inner receiver on a programmable multi-core processor platform, with CMMB as the application. The platform consists of an array of programmable SIMD processors interconnected in a 2-D mesh network, which provides high performance and is well suited to wireless communication applications. Implemented on one cluster with 8 cores, the receiver includes symbol timing, carrier frequency offset and sampling frequency offset synchronization, channel estimation, and equalization. Multiple optimization techniques are explored to improve system throughput, such as task-level parallelism across many cores, data-level parallelism on SIMD cores, minimized memory access, and task mapping that minimizes route length. In addition, an efficient memory strategy and specific instructions for complex computation increase performance. The simulation results show that the inner receiver can achieve a throughput of up to 120 Mbps when operating at 750 MHz.
Zewen SHI Xiaoyang ZENG Zhiyi YU
Manufacturing defects in deep sub-micron VLSI processes and aging-related device problems during the product lifecycle are inevitable, so fault-tolerant routing algorithms are important for providing the required communication in NoCs in spite of failures. The proposed algorithm, referred to as scalable and reconfigurable fault-tolerant distributed routing (RFDR), partitions the system into nine regions using the concept of divide-and-conquer. It is a distributed algorithm: each router guarantees fault tolerance within its own region, and the system can still be sustained with multiple fault areas. The proposed RFDR has excellent scalability, with hardware cost remaining constant independent of system size. It is also completely reconfigurable when new nodes fail. Simulations under various synthetic traffic patterns show better performance than the Extended-XY routing algorithm. Moreover, there is almost no hardware overhead compared to Logic-Based Distributed Routing (LBDR), while the fault-tolerance capacity is enhanced. Hardware cost is reduced by 37% compared to Reconfigurable Distributed Scalable Predictable Interconnect Network (R-DSPIN), which supports only a single fault region.
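For context on the comparison above, the sketch below shows plain dimension-order XY routing on a 2-D mesh, the fault-free scheme that the name Extended-XY refers to. It is not RFDR and contains no fault handling; the abstract does not give enough detail to reconstruct the nine-region partition. The coordinate convention (x grows east, y grows north) and the function names are assumptions made for illustration.

```python
def xy_next_hop(current, destination):
    """Return the output port chosen by plain XY routing.

    Packets travel along X until the column matches, then along Y.
    Assumed convention: x grows to the EAST, y grows to the NORTH.
    """
    cx, cy = current
    dx, dy = destination
    if cx != dx:
        return "EAST" if dx > cx else "WEST"
    if cy != dy:
        return "NORTH" if dy > cy else "SOUTH"
    return "LOCAL"


def xy_route(source, destination):
    """List every router visited from source to destination."""
    moves = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}
    path = [source]
    node = source
    while True:
        port = xy_next_hop(node, destination)
        if port == "LOCAL":
            return path
        step_x, step_y = moves[port]
        node = (node[0] + step_x, node[1] + step_y)
        path.append(node)


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

Because each router decides the next hop from only its own coordinates and the destination, the per-router logic does not grow with mesh size. That is the property distributed schemes such as the one described above aim to preserve while adding fault tolerance.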