Yun CHEN Xubin CHEN Zhiyuan GUO Xiaoyang ZENG Defeng HUANG
A highly parallel turbo decoder for 3GPP LTE/LTE-Advanced systems is presented. It consists of 32 radix-4 soft-in/soft-out (SISO) decoders, each based on the proposed full-parallel sliding window (SW) schedule. Implemented in a 0.13 µm CMOS technology, the proposed design occupies 12.96 mm² and achieves 1.5 Gb/s while decoding size-6144 blocks with 5.5 iterations. Compared with the conventional SW schedule, the throughput is improved by 30–76% with 19.2% area overhead and negligible energy overhead.
Chuang ZHU Xiao Feng HUANG Guo Qing XIANG Hui Hui DONG Jia Wen SONG
In this paper, we propose a highly efficient mobile visual search algorithm. For the descriptor extraction process, we propose a low-complexity feature detection method that uses the local key points detected in the coarse octaves to guide scale-space construction and feature detection in the fine octave. The Gaussian and Laplacian operations are skipped for unimportant areas, which saves computing time. In addition, feature selection is placed before orientation computation, so that unimportant local points are discarded early and the complexity of feature detection is reduced further. For the image retrieval process, we design a high-performance reranking method that merges the global descriptor matching score with the local descriptor similarity score (LDSS). In calculating the LDSS, tf-idf weighted histogram matching is performed to integrate the statistical information of the database. The results show that the proposed approach achieves performance comparable to the state of the art for mobile visual search, while the descriptor extraction complexity is largely reduced.
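The tf-idf weighted histogram matching mentioned for the LDSS can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the version below is an assumption: local descriptors are taken to be already quantized to visual-word ids, idf is the usual log(N/df), and matching is a normalized histogram intersection. The names `idf_weights`, `tfidf_histogram` and `histogram_match_score` are hypothetical.

```python
import math
from collections import Counter


def idf_weights(database):
    """Compute idf = log(N / df) for every visual word in the database.

    `database` is a list of images, each given as a list of visual-word ids.
    """
    n_images = len(database)
    doc_freq = Counter()
    for words in database:
        doc_freq.update(set(words))  # count each word once per image
    return {w: math.log(n_images / df) for w, df in doc_freq.items()}


def tfidf_histogram(words, idf):
    """Build a tf-idf weighted histogram; words unseen in the database get weight 0."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: (c / total) * idf.get(w, 0.0) for w, c in counts.items()}


def histogram_match_score(query_hist, ref_hist):
    """Normalized histogram intersection in [0, 1] (assumed matching rule)."""
    inter = sum(min(v, ref_hist.get(w, 0.0)) for w, v in query_hist.items())
    norm = min(sum(query_hist.values()), sum(ref_hist.values()))
    return inter / norm if norm > 0 else 0.0


# Toy database: word 1 occurs in every image, so its idf (and weight) is zero.
database = [[1, 2, 3], [1, 4], [1, 2, 5]]
idf = idf_weights(database)
query = tfidf_histogram([1, 2, 3], idf)
scores = [histogram_match_score(query, tfidf_histogram(img, idf)) for img in database]
```

Because word 1 appears in every database image, it contributes nothing to the score, which is the sense in which the weighting brings database statistics into the match.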
In this paper we investigate a low complexity channel estimation and data transmission scheme for bi-directional relaying networks. We also propose a semi-orthogonal pilot structure for channel estimation to increase the efficiency of data transmission between the Base Station (BS) and Mobile Station (MS) via a fixed Relay Node (RN).
I-Shyan HWANG I-Feng HUANG Chih-Dar CHIEN David H. SU
This work proposes a distributed fault protection mechanism called the Dynamic-Shared Segment Protection (DSSP) algorithm for WDM (Wavelength Division Multiplexing) mesh networks. The objectives are to assure a high probability of path protection and efficient use of network resources. The proposed approach exploits the segment protection mode, which combines the characteristics of path-based and link-based protection, to provide finer service granularities and satisfy the versatile requirements of critical applications in the foreseeable future. To show that DSSP can improve performance efficiency, simulations are conducted using four networks (NSFNET, USANET, Mesh 66, Mesh 99) for a comparative study of the proposed DSSP versus ordinary shared protection schemes and SLSP (Short Leap Shared Protection). Simulation results reveal that the proposed DSSP method yields much lower blocking probability and higher network utilization. Consequently, it is well suited to real-time WDM networks whose status changes dynamically.
Aibin YAN Huaguo LIANG Zhengfeng HUANG Cuiyun JIANG Maoxiang YI
In this paper, a self-recoverable, frequency-aware and cost-effective robust latch (referred to as RFC) is proposed in 45 nm CMOS technology. By means of three Muller C-elements that feed back to one another, the internal nodes and the output node of the latch are self-recoverable from a single event upset (SEU), i.e. a logic upset induced by a particle strike, regardless of the energy of the striking particle. The proposed robust latch offers a much wider range of working clock frequencies on account of its smaller delay and its insensitivity to the high-impedance state. It also has lower power and area costs than most of the compared latches. SPICE simulation results demonstrate that the area-power-delay product is reduced by 73.74% on average compared with previous radiation-hardened latches.
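The masking behaviour that such latches rely on can be seen in a minimal behavioural model of a classic two-input, non-inverting Muller C-element. This is only a sketch of the building block: the class name `CElement` is hypothetical, and the RFC latch's actual topology of three interconnected C-elements is not reproduced here.

```python
class CElement:
    """Behavioural model of a two-input, non-inverting Muller C-element."""

    def __init__(self, state=0):
        self.state = state

    def update(self, a, b):
        # The output follows the inputs only when they agree;
        # otherwise the previous output is held.
        if a == b:
            self.state = a
        return self.state


c = CElement(state=0)
trace = [
    c.update(1, 1),  # both inputs high: output switches to 1
    c.update(0, 1),  # transient flip on one input: output is held
    c.update(1, 1),  # input recovers: output unchanged
    c.update(0, 0),  # both inputs low: output switches to 0
]
```

A flip on a single input therefore never reaches the output, which is why feeding C-elements from redundant nodes can filter an upset at one node.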
Tianming NI Huaguo LIANG Mu NIE Xiumin XU Aibin YAN Zhengfeng HUANG
Three-dimensional integrated circuits (3D ICs) that employ through-silicon vias (TSVs) to integrate multiple dies vertically have opened up the potential of highly improved circuit designs. However, various types of TSV defects may occur during the assembly process, especially clustered TSV faults caused by the winding level of the thinned wafer and the surface roughness and cleanness of the silicon dies, which greatly reduce TSV yield. To tackle this fault clustering problem, router-based and ring-based TSV redundancy architectures were previously proposed. However, these schemes either require too much area overhead or have limited reparability for tolerating clustered TSV faults. Furthermore, the repair paths of these schemes are too long to be ignored, leading to additional delay overhead, which may cause timing violations. In this paper, we propose a region-based TSV redundancy design that achieves relatively high reparability as well as low additional delay overhead. Simulation results show that for a given number of TSVs (8×8) and TSV failure rate (1%), our design reduces delay overhead by 11.27% and 20.79% compared with the router-based design and the ring-based scheme, respectively. In addition, the reparability of our proposed scheme is 30.84% better than that of the ring-based design and close to that of the router-based scheme. More importantly, the overall TSV yield of our design reaches 99.88%, which is slightly higher than that of both the router-based method (99.53%) and the ring-based design (99.00%).
Jinfeng HU Huanrui ZHU Huiyong LI Julan XIE Jun LI Sen ZHONG
Recently, many neural networks have been proposed for radar sea clutter suppression. However, they perform poorly at low signal to interference plus noise ratio (SINR). In this letter, we put forward a novel method based on an optimal filter to detect a small target embedded in sea clutter. The proposed method keeps the energy in the frequency cell under test (FCUT) invariant while minimizing the signals at other frequencies. The target is then detected by judging the output SINR of every frequency cell. Compared with the neural networks, the proposed algorithm can detect targets at lower SINR. Using real-life radar data, we show that our method can detect the target effectively when the SINR is higher than -39 dB, which is 23 dB lower than that needed by the neural networks.
Semi-bent functions have almost maximal nonlinearity. In this paper, two classes of semi-bent functions are constructed by modifying the supports of two quadratic Boolean functions $f_1(x_1,x_2,\cdots,x_n)=\bigoplus\limits^{k}_{i=1}x_{2i-1}x_{2i}$ with $n=2k+1\geq3$ and $f_2(x_1,x_2,\cdots,x_n)=\bigoplus\limits^{k}_{i=1}x_{2i-1}x_{2i}$ with $n=2k+2\geq4$. Meanwhile, the algebraic normal forms of the newly constructed semi-bent functions are determined.
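The semi-bent property can be checked by brute force for small n. The sketch below uses the standard definition, namely that the Walsh spectrum takes only the values 0 and ±2^((n+1)/2) for odd n, or 0 and ±2^((n+2)/2) for even n, and applies it to the two base quadratic functions only. The modified functions constructed in the paper are not reproduced, and the helper names are hypothetical.

```python
from itertools import product


def f_quadratic(x, k):
    """x1x2 XOR x3x4 XOR ... XOR x_{2k-1}x_{2k}; any further variables are ignored."""
    value = 0
    for i in range(k):
        value ^= x[2 * i] & x[2 * i + 1]
    return value


def walsh_spectrum(f, n):
    """Walsh transform W(a) = sum over x of (-1)^(f(x) XOR a.x), by exhaustive search."""
    points = list(product((0, 1), repeat=n))
    spectrum = []
    for a in points:
        total = 0
        for x in points:
            dot = sum(ai & xi for ai, xi in zip(a, x)) & 1
            total += -1 if (f(x) ^ dot) else 1
        spectrum.append(total)
    return spectrum


def is_semi_bent(f, n):
    """True if the absolute Walsh values are exactly {0, 2^((n+1)/2)} or {0, 2^((n+2)/2)}."""
    exponent = (n + 1) // 2 if n % 2 else (n + 2) // 2
    return {abs(w) for w in walsh_spectrum(f, n)} == {0, 2 ** exponent}


# Base functions with k = 1: n = 2k + 1 = 3 and n = 2k + 2 = 4.
f1_ok = is_semi_bent(lambda x: f_quadratic(x, 1), 3)
f2_ok = is_semi_bent(lambda x: f_quadratic(x, 1), 4)
```

The exhaustive transform costs 4^n evaluations, so it is only practical as a sanity check for small n.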
Chiao-Chan HUANG Zhi-Feng HUANG Ann-Chen CHANG
A minor component analysis approach based on the generalized sidelobe canceler is presented to realize blind suppression of multiple-access interference in multicarrier code division multiple access systems. Given only rough estimates of the user code and timing, the proposed method requires less computation, performs as well as minimum mean square error detectors, and outperforms existing blind detectors. Simulation results illustrate the effectiveness of the blind multiuser detection.
Huaguo LIANG Xin LI Zhengfeng HUANG Aibin YAN Xiumin XU
With the scaling of technology, nanoscale CMOS integrated circuits are becoming more sensitive to single event double node upsets induced by charge sharing. A novel highly robust hardened latch design is presented that is fully resilient to single event double node upsets and single node upsets. The proposed latch employs multiple redundant C-elements to form a dual interlocked structure in which the redundant C-elements can bring the affected nodes back to the correct states regardless of the energy of the striking particle. Detailed HSPICE results confirm that the proposed latch is completely resilient to double node upsets and achieves an improved trade-off in terms of robustness, area, delay and power in comparison with previous latches. Extensive Monte Carlo simulations validate that the proposed latch is less sensitive to process, supply voltage and temperature variations.
Feng HU Wei LI Hua ZHANG Matti LATVA-AHO Xiaohu YOU
Reducing the energy consumption of wireless communication systems with new technologies and solutions continues to be an important concern in developing future standards. In this paper, we study routing strategies in multi-hop relaying networks. For a 2-way assignment routing method, an efficient feedback scheme is presented to minimize the power consumption over the whole system. Whereas the traditional feedback scheme requires full channel information, the proposed scheme requires only the backward accumulated feedback metrics. If the proposed routing calculation is used, there is no performance loss. When the number of hops and relays is large, the new scheme achieves a significant reduction in feedback overhead. Moreover, we prove the optimality of the presented routing strategy by mathematical induction.
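The idea that backward accumulated metrics suffice can be illustrated with a small dynamic-programming sketch. The abstract does not define the metric or the 2-way assignment method, so the model here is an assumption: each hop has an additive power cost, every node in a stage can reach every node in the next stage, and each node feeds back only its best accumulated cost to the destination. The functions `backward_route` and `brute_force` are hypothetical names.

```python
from itertools import product


def backward_route(hop_costs):
    """Select a minimum-cost route using only backward accumulated metrics.

    hop_costs[h][i][j] is the cost from node i in stage h to node j in stage h + 1.
    Returns (total_cost, path), where path lists one node index per stage.
    """
    acc = [0.0] * len(hop_costs[-1][0])  # accumulated metric at the final stage
    choices = []
    for costs in reversed(hop_costs):
        new_acc, best_next = [], []
        for row in costs:
            totals = [c + a for c, a in zip(row, acc)]
            j = min(range(len(totals)), key=totals.__getitem__)
            new_acc.append(totals[j])  # the only value fed back to the previous stage
            best_next.append(j)
        acc = new_acc
        choices.append(best_next)
    choices.reverse()
    start = min(range(len(acc)), key=acc.__getitem__)
    path = [start]
    for best_next in choices:
        path.append(best_next[path[-1]])
    return acc[start], path


def brute_force(hop_costs):
    """Exhaustive search over all routes, which needs full per-hop information."""
    sizes = [len(hop_costs[0])] + [len(c[0]) for c in hop_costs]
    return min(
        sum(hop_costs[h][p[h]][p[h + 1]] for h in range(len(hop_costs)))
        for p in product(*[range(s) for s in sizes])
    )


# One source, two relay stages with two relays each, one destination.
hop_costs = [
    [[2.0, 1.0]],
    [[1.0, 4.0], [3.0, 1.5]],
    [[1.0], [0.5]],
]
cost, path = backward_route(hop_costs)
```

Each stage passes back one number per node instead of a full cost matrix, and the result matches the exhaustive search. The inductive argument is the usual one: if the accumulated metric is optimal at stage h + 1, adding the best single hop makes it optimal at stage h.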
Wei XIA Wei LIU Xinglong XIA Jinfeng HU Huiyong LI Zishu HE Sen ZHONG
The recently proposed distributed adaptive direct position determination (D-ADPD) algorithm provides an efficient way to locate a radio emitter using a sensor network. However, this algorithm may be suboptimal when the emitted signals are colored. We propose an enhanced distributed adaptive direct position determination (EDA-DPD) algorithm. Simulations validate that the proposed EDA-DPD outperforms the D-ADPD in colored emitted signal scenarios and performs similarly to the D-ADPD in white emitted signal scenarios.
Soon-Young OH Jang-Gn YUN Bin-Feng HUANG Yong-Jin KIM Hee-Hwan JI Sang-Bum HUH Han-Seob CHA Ui-Sik KIM Jin-Suk WANG Hi-Deok LEE
A novel NiSi technology with a bi-layer Co/TiN structure as a capping layer is proposed for highly thermally immune Ni silicide technology. By introducing the Co/TiN bi-layer capping, much better thermal immunity of the Ni silicide was verified up to post-silicidation furnace annealing at 700°C for 30 min. The proposed structure is successfully applied to a nano-scale CMOSFET with a gate length of 80 nm. The sheet resistance of the nano-scale gate poly shows little degradation even after high-temperature furnace annealing at 650°C for 30 min. The Ni/Co/TiN structure is very promising for nano-scale MOSFET technology, which needs ultra-shallow junctions and high-temperature post-silicidation processes.
Qinghua SHENG Yu CHENG Xiaofang HUANG Changcai LAI Xiaofeng HUANG Haibin YIN
Dependent Quantization (DQ) is a new quantization tool introduced in the Versatile Video Coding (VVC) standard. While it provides better rate-distortion calculation accuracy, it also increases the computational complexity and hardware cost compared to the widely used scalar quantization. To address this issue, this paper proposes a parallel dependent quantization hardware architecture implemented in Verilog HDL. The architecture preprocesses the coefficients with a scalar quantizer and a high-frequency filter, and then segments the coefficients and processes them in parallel using the Viterbi algorithm. Additionally, the weight bit width of the rate-distortion calculation is reduced to decrease the quantization cycle and computational complexity. The final quantization of the TU is then determined by sequentially scanning and comparing the rate-distortion costs. Experimental results show that the proposed algorithm reduces the quantization cycle by an average of 56.96% compared to VVC’s reference platform VTM, with a Bjøntegaard delta bit rate (BDBR) loss of 1.03% and 1.05% under the Low-delay P and Random Access configurations, respectively. Verification on the AMD FPGA development platform demonstrates that the hardware implementation meets the quantization requirements for 1080P@60Hz video hardware encoding.
This Letter proposes a way of resolving spreading code mismatch in blind multiuser detection with a subspace-based technique. It has been shown that subspace-based (SSB) blind multiuser detectors offer faster convergence and less sensitivity to spreading code mismatch than constrained mean output energy (CMOE) detectors. By correcting the desired user code, the proposed method is more robust than existing SSB techniques. Numerical results show the effectiveness of the proposed technique.
Ann-Chen CHANG Chiao-Chan HUANG Zhi-Feng HUANG
Two simple frequency offset estimators based on projection approaches are proposed for multicarrier code-division multiple access systems, without using specific training sequences. They not only estimate and correct the frequency offset, but also have a lower computational load. Several computer simulations are provided to illustrate the effectiveness of the blind estimation approaches.
Hongjun LIU Baokang ZHAO Xiaofeng HU Dan ZHAO Xicheng LU
Root cause analysis of BGP updates is the key to debugging and troubleshooting BGP routing problems. However, it is a challenge to precisely diagnose the cause and the origin of routing instability. In this paper, we are the first to distinguish link failure events from policy change events based on BGP updates from single vantage points. We do so by analyzing the relationship between the closed loops formed by intersecting all the transient paths during instability and the length variation of the stable paths after instability. Once link failure events are recognized, their origins are precisely inferred with 100% accuracy. Simulations show that our method effectively distinguishes link failure events from link restoration events and policy-related events, and reduces the size of the candidate set of origins.
Chuang ZHU Jie LIU Xiao Feng HUANG Guo Qing XIANG
This paper reports a high-quality, hardware-friendly integer motion estimation (IME) scheme. According to the characteristics of the CTU content, the proposed method adopts different adaptive multi-resolution strategies coupled with accurate full-PU-mode IME at the finest level. In addition, by using motion vector derivation, IME for the second reference frame is simplified, and hardware resources are saved greatly through processing element (PE) sharing. It is shown that the proposed architecture can support real-time processing of 4K-UHD @60fps, while the BD-rate increases by just 0.53%.
ChangCheng WU Min WANG JunJie WANG WeiMing LUO JiaFeng HUA XiTao CHEN Wei GENG Yu LU Wei SUN
Although the classical vector median filter (VMF) has been widely used to suppress impulse noise in color images, many thin color curve pixels aligned in arbitrary directions are usually removed as impulse noise. This serious problem is solved by the proposed method, which protects thin curves in arbitrary directions in a color image while removing the impulse noise. First, samples in the 3x3 filter window are considered to preliminarily detect whether the center pixel is corrupted by impulse noise. Then, samples outside a 5x5 filter window are conditionally and partly considered to accurately distinguish impulse noise from noise-free pixels. Finally, based on the previous outputs, samples at the processed positions in a 3x3 filter window are chosen as the samples of the VMF operation to suppress the impulse noise. Extensive experimental results indicate that the proposed algorithm can remove the impulse noise of a color image while protecting thin curves in arbitrary directions.
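The thin-curve problem of the classical VMF is easy to reproduce. The sketch below implements only the classical vector median of one window, using Euclidean distance in RGB as an assumption; the proposed detection stages are not reproduced, and `vector_median` is a hypothetical name.

```python
import math


def vector_median(window):
    """Return the pixel whose summed Euclidean distance to all window pixels is smallest."""

    def total_distance(p):
        return sum(math.dist(p, q) for q in window)

    return min(window, key=total_distance)


blue, red = (0, 0, 255), (255, 0, 0)

# 3x3 window with one impulse at the center: the impulse is replaced.
noisy = [blue] * 4 + [red] + [blue] * 4
impulse_out = vector_median(noisy)

# 3x3 window crossed by a one-pixel-wide red line: the line's center pixel is
# also replaced by the background, which is the drawback described above.
line = [blue, red, blue, blue, red, blue, blue, red, blue]
line_out = vector_median(line)
```

In both cases the filter returns the blue background, so a plain VMF cannot tell a thin line apart from an impulse using the 3x3 window alone.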
Shan DING Gang ZENG Ryo KURACHI Ruifeng HUANG
As a next-generation CAN (Controller Area Network), CAN FD (CAN with flexible data rate) has attracted much attention recently. However, how to use the improved bus bandwidth efficiently in CAN FD is still an open issue. In contrast to existing methods that use greedy approximation algorithms, this paper proposes a genetic algorithm for CAN FD frame packing. It tries to minimize the bandwidth utilization by considering the different periods of signals when packing them in the same frame. Moreover, it checks the schedulability of the packed frames to guarantee the real-time constraints of each frame, and proposes a merging algorithm to improve the schedulability for signal sets with high bus load. Experimental results validate that, for a given set of signals, the proposed algorithm achieves significantly lower bandwidth utilization and better schedulability than existing methods.
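Why signal periods matter when packing can be seen from a simplified cost model. The sketch below is not the paper's genetic algorithm. It assumes that a frame is sent at the shortest period of its signals, and it counts payload bits only, ignoring header, CRC and bit-stuffing overhead. The one protocol fact it relies on is the set of valid CAN FD payload lengths (0 to 8, 12, 16, 20, 24, 32, 48 and 64 bytes). The function names are hypothetical.

```python
import bisect

# Valid CAN FD data-field lengths in bytes.
CANFD_PAYLOAD_SIZES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 16, 20, 24, 32, 48, 64]


def canfd_payload_size(n_bytes):
    """Round a payload up to the smallest valid CAN FD data-field length."""
    if not 0 <= n_bytes <= 64:
        raise ValueError("CAN FD payload must be between 0 and 64 bytes")
    return CANFD_PAYLOAD_SIZES[bisect.bisect_left(CANFD_PAYLOAD_SIZES, n_bytes)]


def payload_bandwidth(frames):
    """Payload bits per second for a packing (simplified model, overhead ignored).

    `frames` is a list of frames; each frame is a list of (size_bytes, period_ms)
    signals and is assumed to be sent at the shortest period among its signals.
    """
    total = 0.0
    for signals in frames:
        payload = canfd_payload_size(sum(size for size, _ in signals))
        period_ms = min(period for _, period in signals)
        total += payload * 8 * 1000.0 / period_ms
    return total


a, b, c = (4, 10), (4, 10), (8, 100)  # (bytes, period in ms)
all_in_one = payload_bandwidth([[a, b, c]])      # slow signal c is sent every 10 ms
split_by_period = payload_bandwidth([[a, b], [c]])
```

Under this model, putting the slow signal into the fast frame nearly doubles the payload bandwidth. With real per-frame overhead included, merging frames can still pay off, which is the trade-off a packing algorithm has to search over.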