A new Hybrid-Carry-Selection (HCS) approach for deriving an efficient modulo 2^n-1 addition is presented in this study. The resulting adder architecture is simple and applicable to all values of n. Based on 180-nm CMOS technology, the HCS-based modulo 2^n-1 adder demonstrates superior Area-Time (AT) performance over existing solutions.
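For readers unfamiliar with modulo 2^n-1 addition, the following is a minimal behavioural sketch of the textbook end-around-carry formulation. It is not the HCS architecture, whose details the abstract does not give; the function name `add_mod_2n_minus_1` is hypothetical.

```python
def add_mod_2n_minus_1(a: int, b: int, n: int) -> int:
    """Add two n-bit residues modulo 2**n - 1 with an end-around carry.

    Illustrative model only (assumption): this is the generic
    end-around-carry scheme, not the HCS adder from the abstract.
    """
    mask = (1 << n) - 1            # n ones; numerically equal to the modulus
    s = a + b                      # plain sum, at most n + 1 bits wide
    s = (s & mask) + (s >> n)      # fold the carry-out back into the low bits
    # The all-ones word is a second encoding of zero; normalise it.
    return 0 if s == mask else s


if __name__ == "__main__":
    n = 4
    modulus = (1 << n) - 1
    for a in range(modulus):
        for b in range(modulus):
            assert add_mod_2n_minus_1(a, b, n) == (a + b) % modulus
    print("end-around-carry model agrees with (a + b) % (2**n - 1) for n = 4")
```

A single fold is enough because the sum of two n-bit operands produces at most one carry bit, and adding that bit back cannot overflow again.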
Chi-Chia SUN Ming-Hwa SHEU Jui-Yang CHI Yan-Kai HUANG
In this paper, a people re-identification algorithm for nonoverlapping multi-camera settings is proposed. It applies inflated major color features for re-identification to reduce computation time. The inflated major color features can dramatically improve efficiency while retaining high accuracy of object re-identification. The proposed method is evaluated over a wide range of experimental databases. On average, the accuracy reaches upwards of 40.7% at Rank 1 and 84% at Rank 10, while the method runs three to 15 times faster than algorithms reported in the literature. The proposed algorithm has been implemented on a SOC-FPGA platform and reaches 50 FPS at 1280×720 HD resolution and 25 FPS at 1920×1080 FHD resolution for real-time processing. The results show a performance improvement and a reduction in computational complexity, which makes the method well suited to embedded platforms.
Jin-Fa LIN Yin-Tsung HWANG Ming-Hwa SHEU
A low-power pulse generator design using a hybrid logic realization of a 3-input NAND gate is presented. The hybrid logic approach shortens the critical path along the discharging transistor stack and thus reduces the short-circuit power consumption during pulse generation. Combining pass transistor and full CMOS logic styles in one NAND gate design also helps minimize the required transistor size, which in turn reduces the loading capacitance of the clock tree. Simulation results reveal that, compared with prior work, our design achieves savings of 20.5% in power and 23% in circuit area.
Po-Yu KUO Chia-Hsin HSIEH Jin-Fa LIN Ming-Hwa SHEU Yi-Ting HUNG
A novel low-power sense-amplifier-based flip-flop (FF) is presented. By using a simplified SRAM-based latch design and a pass transistor logic (PTL) circuit scheme, both the transistor count and the leakage power of the FF design are greatly reduced. The performance claims are verified through extensive post-layout simulations. Compared to the conventional sense-amplifier FF design, the proposed circuit achieves a 19.6% leakage reduction. Moreover, the delay and area are reduced by 21.8% and 31%, respectively. The performance advantage becomes even greater when the flip-flop is integrated in an N-bit register file.
Ming-Hwa SHEU Yuan-Ching KUO Su-Hon LIN Siang-Min SIAO
This paper presents a novel adaptable 4-moduli set {2^(n+k), 2^n+1, 2^n-1, 2^(2n)+1}. It offers diverse dynamic ranges (DRs) from 2^(5n)-2^n to 2^(5n+k)-2^(n+k), which are used to overcome the over-range issue in RNS-application hardware designs. The proposed adaptable set has a coarse parameter n and a fine parameter k. It not only has better parallelism and a larger dynamic range than the existing adaptive 3-moduli sets, but is also more sizable and flexible than the general 4-moduli sets with a single parameter. For the adaptable residue-to-binary (R-to-B) conversion, this paper first derives a fast reverse conversion algorithm based on the Chinese Remainder Theorem (CRT) and then presents an efficient converter architecture. Experimental results show that the proposed adaptable converter achieves better hardware performance across various DRs. The proposed converter is implemented in TSMC 0.18 µm CMOS technology and saves at least 20.93% of the Area-Delay-Power (ADP) product on average compared with the latest converter designs.
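To make the dynamic-range claim concrete, here is a minimal sketch that builds the moduli set {2^(n+k), 2^n+1, 2^n-1, 2^(2n)+1}, checks that its product equals 2^(5n+k)-2^(n+k), and recovers an integer from its residues. It uses the textbook CRT, not the paper's fast reverse conversion algorithm, and the helper names are hypothetical.

```python
from math import prod


def moduli_set(n, k):
    """Return the four moduli for coarse parameter n and fine parameter k."""
    return [2 ** (n + k), 2 ** n + 1, 2 ** n - 1, 2 ** (2 * n) + 1]


def to_residues(x, moduli):
    """Forward (binary-to-residue) conversion."""
    return [x % m for m in moduli]


def crt(residues, moduli):
    """Textbook CRT reverse conversion (assumption: not the paper's
    optimised algorithm). Requires pairwise coprime moduli."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)   # modular inverse, Python 3.8+
    return total % big_m


if __name__ == "__main__":
    n, k = 3, 2
    moduli = moduli_set(n, k)
    dynamic_range = prod(moduli)
    assert dynamic_range == 2 ** (5 * n + k) - 2 ** (n + k)
    x = 100_000
    assert crt(to_residues(x, moduli), moduli) == x
    print(moduli, dynamic_range)
```

With n fixed, increasing k only widens the power-of-two channel, which is why k acts as the fine adjustment of the dynamic range.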
Jin-Fa LIN Yin-Tsung HWANG Ming-Hwa SHEU
A dual-mode pulse-triggered flip-flop design supporting functional versatility is presented. A low-complexity unified logic module, consisting of only five transistors, for dual-mode pulse generation is devised using pass transistor logic (PTL). The potential threshold voltage loss problem is resolved to ensure signal integrity. Despite the extra logic for dual-mode operation, the circuit complexity of the proposed design is comparable to that of single-mode designs. Simulations across different process corners and switching activities demonstrate the competitive performance of the proposed design against various single-mode designs.
Su-Hon LIN Ming-Hwa SHEU Chao-Hsiang WANG
The moduli set (2^n, 2^(n+1)-1, 2^n-1), which is free of (2^n+1)-type moduli, is advantageous for constructing a high-performance residue number system (RNS). In this paper, we derive a reduced-complexity residue-to-binary conversion algorithm for the moduli set (2^n, 2^(n+1)-1, 2^n-1) using the New Chinese Remainder Theorem (CRT). The resulting converter architecture consists mainly of simple adders and multiplexers (MUXes), which makes it suitable for efficient VLSI implementation. For various dynamic range (DR) requirements, the experimental results show that the proposed converter achieves an average Area-Time (AT) saving of at least 23.3% compared with the latest designs. Based on UMC 0.18 µm CMOS cell-based technology, the chip area of the 16-bit residue-to-binary converter is 931×931 µm² and its working frequency is about 135 MHz, including I/O pads.
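As a rough illustration of residue-to-binary conversion for the moduli set (2^n, 2^(n+1)-1, 2^n-1), the sketch below applies the general three-modulus form of New CRT-I. It is an assumption that this is the starting point of the paper; the reduced-complexity algorithm itself is not given in the abstract. The function name `new_crt_1` is hypothetical.

```python
def new_crt_1(x1, x2, x3, p1, p2, p3):
    """Recover X from residues (x1, x2, x3) for pairwise coprime p1, p2, p3.

    General New CRT-I form (not the paper's simplified version):
        X = x1 + p1 * | k1*(x2 - x1) + k2*p2*(x3 - x2) |_(p2*p3)
    where k1*p1 = 1 (mod p2*p3) and k2*p1*p2 = 1 (mod p3).
    """
    k1 = pow(p1, -1, p2 * p3)        # modular inverse, Python 3.8+
    k2 = pow(p1 * p2, -1, p3)
    inner = (k1 * (x2 - x1) + k2 * p2 * (x3 - x2)) % (p2 * p3)
    return x1 + p1 * inner


if __name__ == "__main__":
    n = 4
    p1, p2, p3 = 2 ** n, 2 ** (n + 1) - 1, 2 ** n - 1
    for x in range(p1 * p2 * p3):
        assert new_crt_1(x % p1, x % p2, x % p3, p1, p2, p3) == x
    print("round trip verified for n = 4")
```

Placing the power-of-two modulus first means the final step `x1 + p1 * inner` is a plain bit concatenation in hardware, since x1 occupies the low n bits.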
Jin-Fa LIN Yin-Tsung HWANG Ming-Hwa SHEU
A novel signal transition detector design using as few as 8 transistors is presented. The proposed design exploits the property of a specific internal state transition to mitigate the voltage degradation problem by employing only one extra transistor. It is thus capable of supporting level-intact output signals and eliminating DC power consumption in the trailing buffer. The proposed design, featuring low circuit complexity and low power consumption, is considered useful for applications in self-timed circuits. Simulation results show that, compared with other counterpart designs based on pass transistor logic, the proposed design achieves savings of as much as 46% in power and 28% in area.
Chung-chi LIN Ming-hwa SHEU Huann-keng CHIANG Chih-Jen WEI Chishyan LIAW
Scene changes occur frequently in film broadcasting and, when de-interlacing methods are applied, tend to degrade the output with blurring, jagged edges, and other artifacts. This paper presents an efficient VLSI architecture for video de-interlacing that takes scene changes into account to improve the quality of the video results. The de-interlacing architecture contains three main parts. The first is scene change detection, which is based on examining the absolute pixel difference between two adjacent even or odd fields. The second is a background index mechanism for classifying the motion and non-motion pixels of the input field. The third is a spatial-temporal edge-based median filter, which handles the interpolation of the motion pixels. Compared with existing de-interlacing approaches, our architecture significantly improves the PSNR of video sequences with various scene changes; in other situations, it also maintains better performance. The proposed architecture has been implemented as a VLSI chip in UMC 0.18-µm CMOS technology. The total gate count is 30114 and the layout area is about 710×710 µm². The power consumption is 39.78 mW at a working frequency of 128.2 MHz, which is sufficient to de-interlace HDTV video in real time.
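The scene change detection step can be sketched as follows. The abstract states only that the detector examines the absolute pixel difference between two adjacent fields of the same parity; averaging the differences and comparing against a fixed threshold are assumptions made for illustration, as are the function names and the threshold value.

```python
def mean_abs_field_difference(prev_field, curr_field):
    """Mean absolute luminance difference between two same-parity fields.

    Each field is a list of rows of integer pixel values. Averaging is an
    assumption; the abstract does not say how differences are aggregated.
    """
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev_field, curr_field):
        for p, c in zip(row_prev, row_curr):
            total += abs(p - c)
            count += 1
    return total / count if count else 0.0


def is_scene_change(prev_field, curr_field, threshold=30.0):
    """Flag a scene change when the mean difference exceeds a threshold.

    The threshold of 30.0 is a hypothetical value, not from the paper.
    """
    return mean_abs_field_difference(prev_field, curr_field) > threshold


if __name__ == "__main__":
    even_a = [[10, 12, 11], [13, 12, 10]]
    even_b = [[11, 12, 10], [13, 14, 10]]        # same scene, small changes
    even_c = [[200, 190, 210], [180, 220, 205]]  # very different content
    print(is_scene_change(even_a, even_b))
    print(is_scene_change(even_a, even_c))
```

Comparing fields of the same parity avoids the one-line vertical offset between even and odd fields, which would otherwise inflate the difference even for static content.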
Jin-Fa LIN Yin-Tsung HWANG Ming-Hwa SHEU
Two novel low-complexity dual-mode pulse generator designs suitable for FFs with triggering mode control are presented. The proposed designs integrate XOR/OR (AND/XNOR) functions into a unified pass transistor logic (PTL) module to provide control over single- or double-edge operation. The designs use as few as 8 transistors each and avoid the signal degradation problem inherent in most PTL circuits. As the only dual-mode designs so far, the proposed designs also outperform rival single-mode designs in both circuit complexity and power consumption.