Jingjie YAN Bojie YAN Ruiyu LIANG Guanming LU Haibo LI Shipeng XIE
In this paper, we present a novel regression-based robust locality preserving projections (RRLPP) method to deal effectively with noise and occlusion in facial expression recognition. Like the robust principal component analysis (RPCA) and robust regression (RR) approaches, RRLPP introduces a low-rank term and a sparse term for the facial expression image sample matrix, which overcomes the shortcomings of the locality preserving projections (LPP) method and enhances the robustness of facial expression recognition. Unlike them, however, RRLPP is a nonlinear robust subspace method that can effectively describe the local structure of facial expression images. Test results on the Multi-PIE facial expression database indicate that RRLPP effectively overcomes the noise and occlusion problem in facial expression images, and that it achieves facial expression recognition rates better than or comparable to those of both non-robust and robust subspace methods.
Sun-Mi PARK Ku-Young CHANG Dowon HONG Changho SEO
In this paper, we present a new three-way split formula for binary polynomial multiplication (PM) with five recursive multiplications. The scheme is based on a recently proposed multievaluation and interpolation approach using field extension. The proposed PM formula achieves the smallest space complexity. Moreover, its time complexity is about 40% lower than the best known results. In addition, using the techniques developed for PM formulas, we propose a three-way split formula for the Toeplitz matrix vector product with five recursive products, which has considerably lower complexity than the previously known one.
Rei UENO Naofumi HOMMA Takafumi AOKI
This paper presents a system for the automatic generation of Galois-field (GF) arithmetic circuits, named the GF Arithmetic Module Generator (GF-AMG). The proposed system employs a graph-based circuit description called the GF Arithmetic Circuit Graph (GF-ACG). First, we present an extension of the GF-ACG to handle GF(p^m) (p≥3) arithmetic circuits, which can be efficiently implemented by multiple-valued logic circuits in addition to the conventional binary circuits. We then show the validity of the generation system through the experimental design of GF(p^m) multipliers for different p-values. In addition, we evaluate the performance of three types of GF(2^m) multipliers and typical GF(p^m) multipliers (p≥3) empirically generated by our system. We confirm from the results that the proposed system can generate a variety of GF parallel multipliers, including practical multipliers over GF(p^m) having extension degrees greater than 128.
Takahiro YAMAMOTO Ittetsu TANIGUCHI Hiroyuki TOMIYAMA Shigeru YAMASHITA Yuko HARA-AZUMI
Approximate computing is considered a promising approach to the design of power- or area-efficient digital circuits. This paper proposes a systematic methodology for the design and worst-case accuracy analysis of approximate array multipliers. Our methodology systematically designs a series of approximate array multipliers with different area, delay, power and accuracy characteristics so that an LSI designer can select the one that best fits the requirements of her/his applications. Our experiments explore the trade-offs among area, delay, power and accuracy of the approximate multipliers.
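As a rough illustration of worst-case accuracy analysis, the sketch below models one simple approximation, an unsigned array multiplier that omits every partial-product bit below a chosen column, and finds its worst-case error by exhaustive search. The truncation scheme and the names `approx_array_mul`, `worst_case_error`, and `dropped_cols` are assumptions made for illustration; the abstract does not describe the paper's actual designs.

```python
def approx_array_mul(a, b, width, dropped_cols):
    """Unsigned array multiplier that ignores partial-product bits whose
    column index (i + j) is below dropped_cols (assumed scheme)."""
    total = 0
    for i in range(width):
        for j in range(width):
            if i + j >= dropped_cols and (a >> i) & 1 and (b >> j) & 1:
                total += 1 << (i + j)
    return total


def worst_case_error(width, dropped_cols):
    """Exhaustively search all operand pairs for the largest error."""
    return max(
        a * b - approx_array_mul(a, b, width, dropped_cols)
        for a in range(1 << width)
        for b in range(1 << width)
    )
```

Sweeping `dropped_cols` covers only the accuracy axis of the trade-off. Area, delay, and power would have to come from synthesis, which this sketch does not model.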
Nobutaka KITO Kazushi AKIMOTO Naofumi TAKAGI
A floating-point multiplier with concurrent error detection capability by partial duplication is proposed. Instead of full duplication, it uses a truncated multiplier to check the significand (mantissa) multiplication. The proposed multiplier can detect any erroneous output with an error larger than one unit in the last place (1 ulp) of the significand, which may be overlooked by residue checking. Its circuit area is smaller than that of a fully duplicated one. The area overhead of a single-precision multiplier is about 78% and that of a double-precision one is about 65%.
This paper presents optimal implementation methods for 256-bit elliptic curve digital signature algorithm (ECDSA) signature generation processors with high-speed Montgomery multipliers. We have explored the radix of the data path of the Montgomery multiplier from 2-bit to 256-bit operation and proposed the use of pipelined Montgomery multipliers to optimize signature generation speed, area, and energy. The key factor in the design optimization is how to perform modular multiplication. The high-radix Montgomery multiplier is known to be an efficient implementation for high-speed modular multiplication. We have implemented ECDSA signature generation processors with high-radix Montgomery multipliers using 65-nm SOTB CMOS technology. Post-layout results show the fastest ECDSA signature generation time of 63.5µs with a radix-256-bit, two-module, four-stream pipeline architecture; the smallest area of 0.365mm² with a radix-16-bit zero-pipeline architecture; and the smallest signature generation energy of 9.51µJ with a radix-256-bit zero-pipeline architecture.
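For readers unfamiliar with Montgomery multiplication, the following is a minimal software sketch of the textbook algorithm with R = 2^k, performing the reduction in a single full-width step. It does not model the high-radix, word-serial, or pipelined hardware data paths discussed in the abstract, and the function names are hypothetical.

```python
def montgomery_params(n, k):
    """Return (R, n_prime) with R = 2**k and n * n_prime = -1 (mod R)."""
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # modular inverse via pow needs Python 3.8+
    return r, n_prime


def montgomery_multiply(a_bar, b_bar, n, k, n_prime):
    """Return a_bar * b_bar * R^-1 mod n for operands below n < 2**k."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k  # exact shift: t + m*n is a multiple of R
    return u - n if u >= n else u


def modmul(a, b, n, k=256):
    """Convenience wrapper: convert to Montgomery form, multiply, convert back."""
    r, n_prime = montgomery_params(n, k)
    a_bar, b_bar = (a * r) % n, (b * r) % n
    ab_bar = montgomery_multiply(a_bar, b_bar, n, k, n_prime)
    return montgomery_multiply(ab_bar, 1, n, k, n_prime)
```

In practice the conversion into and out of Montgomery form is done once around a long chain of multiplications, which is why the shift-based reduction pays off in scalar multiplication.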
Sun-Mi PARK Ku-Young CHANG Dowon HONG Changho SEO
We propose subquadratic space complexity multipliers for any finite field $\mathbb{F}_{q^n}$ over the base field $\mathbb{F}_q$ using the Dickson basis, where q is a prime power. It is shown that a field multiplication in $\mathbb{F}_{q^n}$ based on the Dickson basis results in computations of Toeplitz matrix vector products (TMVPs). Therefore, an efficient computation of a TMVP yields an efficient multiplier. In order to derive efficient $\mathbb{F}_{q^n}$ multipliers, we develop computational schemes for a TMVP over $\mathbb{F}_{q}$. As a result, the $\mathbb{F}_{2^n}$ multipliers, as special cases of the proposed $\mathbb{F}_{q^n}$ multipliers, have lower time complexities as well as space complexities compared with existing results. For example, in the case that n is a power of 3, the proposed $\mathbb{F}_{2^n}$ multiplier for an irreducible Dickson trinomial has about 14% reduced space complexity and lower time complexity compared with the best known results.
Guang-Ming TANG Kazuyoshi TAKAGI Naofumi TAKAGI
A rapid single-flux-quantum (RSFQ) 4-bit bit-slice multiplier is proposed. A new systolic-like multiplication algorithm suitable for RSFQ implementation is developed. The multiplier is designed using the cell library for the AIST 10-kA/cm² 1.0-µm fabrication technology (ADP2). Concurrent flow clocking is used to obtain a fully pipelined RSFQ logic design. A 4n×4n-bit multiplier consists of 2n+17 stages. To verify the algorithm and the logic design, a physical layout of the 8×8-bit multiplier has been designed with a target operating frequency of 50GHz and simulated. It consists of 21 stages and 11,488 Josephson junctions. The simulation results show correct operation up to 62.5GHz.
Yo-Hao TU Jen-Chieh LIU Kuo-Hsing CHENG
This paper proposes proportional static-phase-error reduction (SPER) for the frequency-multiplier-based delay-locked-loop (DLL) architecture. The frequency multiplier (FM) can synthesize a combined clock to address the high operational frequency demanded of the DLL. However, the FM is sensitive to the static phase error of the DLL. The SPER loop adopts a timing amplifier and a coarse-fine tuning technique to improve the deterministic jitter performance of the FM. The SPER loop proportionally reduces the static phase error and can extend the operating range of the FM.
Fang TIAN Jie GUO Bin SONG Haixiao LIU Hao QIN
Distributed compressed video sensing (DCVS), which combines the advantages of compressed sensing and distributed video coding, has been developed as a novel and powerful system for obtaining a low-complexity encoder. Nevertheless, it remains unclear how to achieve effective video recovery by exploiting realistic signal characteristics as much as possible. To address this, we present a novel spatiotemporal dictionary learning (DL) based reconstruction method for DCVS, where both the DL model and the l1-analysis based recovery with correlation constraints are included in the minimization problem to achieve the joint optimization of sparse representation and signal reconstruction. In addition, a numerical algorithm based on the alternating direction method of multipliers (ADMM) is outlined for solving the underlying optimization problem. Simulation results demonstrate that the proposed method outperforms other methods, with 0.03-4.14 dB increases in PSNR and a 0.13-15.31 dB gain for non-key frames.
Mona MORADI Reza FAGHIH MIRZAEE Keivan NAVI
This paper presents new Binary Converters (or current-mode compressors) using carbon nanotube field effect transistors. The new designs are made of three parts: 1) the input currents, which are converted to voltage; 2) threshold detectors; and 3) the output current flow paths. In addition, an 8×8-bit multiplier is used as a benchmark to evaluate their efficiency. The first approach is based on high-order Binary Converters, and the second one is composed only of 4BCs and Half Adders.
This paper proposes an analytical, closed-form AC-DC voltage multiplier model and investigates the dependency of output current and input power on circuit and device parameters. The model uses no fitting parameters, and its frequency term applies both to multipliers using diodes and to those using metal-oxide semiconductor field effect transistors (MOSFETs). The analysis enables circuit designers to estimate circuit parameters, such as the number of stages and capacitance per stage, and device parameters such as saturation current (in the case of diodes) or transconductance (in the case of MOSFETs). Comparisons of the proposed model with SPICE simulation results as well as other models are also provided for validation. In addition, design optimizations and the impact of AC power source impedance on output power are investigated.
Sun-Mi PARK Ku-Young CHANG Dowon HONG Changho SEO
A field multiplication in the extended binary field is often expressed using Toeplitz matrix-vector products (TMVPs) whose matrices have special properties, such as being symmetric or triangular. We show that such TMVPs can be efficiently implemented by taking advantage of these matrix properties. This yields an efficient multiplier when a field multiplication involves such TMVPs. For example, we propose an efficient multiplier based on the Dickson basis that reduces the number of XOR gates by an average of 34% compared with previously known results.
Cell voltage equalizers are necessary to ensure years of operation and maximize the chargeable/dischargeable energy of series-connected supercapacitors (SCs). A two-switch voltage equalizer using a series-resonant voltage multiplier operating in frequency-multiplied discontinuous conduction mode (DCM) is proposed for series-connected SCs in this paper. The frequency-multiplied mode virtually increases the operation frequency and hence mitigates the negative impact of the impedance mismatch of capacitors on equalization performance, allowing multi-layer ceramic capacitors (MLCCs) to be used instead of bulky and costly tantalum capacitors, the conventional approach when using voltage multipliers in equalizers. Furthermore, the DCM operation inherently provides a constant current characteristic, realizing the excessive current protection that is desirable for SCs, which can be at 0V and then behave as an equivalent short-circuit load. Experimental equalization tests were performed for eight SCs connected in series under two frequency conditions to verify the improved equalization performance at the increased virtual operation frequencies. The standard deviation of cell voltages under the higher-frequency condition was lower than that under the lower-frequency condition, demonstrating superior equalization performance at higher frequencies.
Chin-Long WEY Ping-Chang JUI Muh-Tian SHIUE
A constant multiplier performs a multiplication of a data input by a constant value. Constant multipliers are essential components in various types of arithmetic circuits, such as filters in digital signal processor (DSP) units, and they are prevalent in modern VLSI designs. This study presents an efficient algorithm and fast hardware implementation for performing the multiply-by-(1+2^k) operation with additions. No multiplications are needed. The value of (1+2^k)N can be computed by adding N to its k-bit left-shifted value 2^k N. The additions can be performed by the full-adder-based (FA-based) ripple carry adder (RCA) for a simple architecture. This paper introduces unit cells for additions (UCAs) to construct the UCA-based RCA, which is 35% faster than the FA-based RCA. Further, in order to improve the speed performance, a simple and modular hybrid adder is presented with the proposed UCA concept, in which the carry lookahead adder (CLA) serves as a module and multiple CLA modules are serially connected in a fashion similar to the RCA. Results show that the hybrid adder significantly improves the speed performance.
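The shift-and-add identity (1+2^k)N = N + 2^k N can be sketched in software as follows. The adder here is a bit-level model of an FA-based ripple carry adder only; the UCA cells and the hybrid CLA adder of the paper are not modeled, and the function names are hypothetical.

```python
def ripple_carry_add(a, b, width):
    """Add two unsigned integers bit by bit with full-adder logic."""
    carry = 0
    result = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                    # full-adder sum bit
        carry = (x & y) | (carry & (x ^ y))  # full-adder carry-out
        result |= s << i
    return result | (carry << width)         # keep the final carry-out


def multiply_by_one_plus_pow2(n, k, width):
    """Compute (1 + 2**k) * n as n + (n << k), with no multiplier.

    `width` is the bit width of n; the shifted operand needs width + k bits.
    """
    return ripple_carry_add(n, n << k, width + k)
```

For example, `multiply_by_one_plus_pow2(13, 3, 4)` adds 13 to 13 shifted left by 3 bits (104), giving 9 × 13.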
Sun-Mi PARK Ku-Young CHANG Dowon HONG Changho SEO
In several important applications, we often encounter the computation of a Toeplitz matrix vector product (TMVP). In this work, we propose a k-way splitting method for a TMVP over any field F, which is a generalization of that over GF(2) presented by Hasan and Negre. Furthermore, as an application of the TMVP method over F, we present the first subquadratic space complexity multiplier over any finite field GF(p^n) defined by an irreducible trinomial.
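To make the idea of splitting a TMVP concrete, the sketch below implements a familiar two-way split that replaces four half-size products with three. It is not the k-way method of the paper. It is written over the integers so that it runs with plain Python arithmetic; the same identities hold over a field F, and over GF(2) subtraction coincides with addition. The storage convention T[i][j] = t[i - j + n - 1] and the function names are assumptions made for this example.

```python
def tmvp_schoolbook(t, v):
    """Toeplitz matrix-vector product with T[i][j] = t[i - j + n - 1]."""
    n = len(v)
    return [sum(t[i - j + n - 1] * v[j] for j in range(n)) for i in range(n)]


def tmvp_two_way(t, v):
    """Recursive two-way split using three half-size TMVPs."""
    n = len(v)
    if n == 1 or n % 2:
        return tmvp_schoolbook(t, v)  # base case, or odd size: no split
    h = n // 2
    # T = [[T1, T0], [T2, T1]]; each block is Toeplitz with 2h - 1 entries.
    t0, t1, t2 = t[0:2 * h - 1], t[h:3 * h - 1], t[2 * h:4 * h - 1]
    v0, v1 = v[:h], v[h:]
    p0 = tmvp_two_way([x - y for x, y in zip(t0, t1)], v1)  # (T0 - T1) V1
    p1 = tmvp_two_way(t1, [x + y for x, y in zip(v0, v1)])  # T1 (V0 + V1)
    p2 = tmvp_two_way([x - y for x, y in zip(t2, t1)], v0)  # (T2 - T1) V0
    # Top half: T1 V0 + T0 V1 = P0 + P1; bottom half: T2 V0 + T1 V1 = P1 + P2.
    return [x + y for x, y in zip(p0, p1)] + [x + y for x, y in zip(p1, p2)]
```

The split works because sums and differences of Toeplitz blocks are again Toeplitz, so each of the three products is itself a TMVP of half the size and the recursion can continue.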
Junping DENG Xian-Hua HAN Yen-Wei CHEN Gang XU Yoshinobu SATO Masatoshi HORI Noriyuki TOMIYAMA
Chronic liver disease is a major worldwide health problem. Diagnosis and staging of chronic liver diseases is an important issue. In this paper, we propose a quantitative method of analyzing local morphological changes for accurate and practical computer-aided diagnosis of cirrhosis. Our method is based on sparse and low-rank matrix decomposition, since the matrix of the liver shapes can be decomposed into two parts: a low-rank matrix, which can be considered similar to that of a normal liver, and a sparse error term that represents the local deformation. Compared with the previous global morphological analysis strategy based on the statistical shape model (SSM), our proposed method improves the accuracy of both normal and abnormal classifications. We also propose using the norm of the sparse error term as a simple measure for classification as normal or abnormal. The experimental results of the proposed method are better than those of the state-of-the-art SSM-based methods.
Lechang LIU Keisuke ISHIKAWA Tadahiro KURODA
Parametric resonance based solutions for sub-gigahertz radio frequency transceivers with a 0.3V supply voltage are proposed in this paper. As an implementation example, a 0.3V 720µW variation-tolerant injection-locked frequency multiplier is developed in 90nm CMOS. It features a parametric resonance based multi-phase synthesis scheme, thereby achieving the lowest supply voltage among state-of-the-art frequency synthesizers, with -110dBc @ 600kHz phase noise and an 873MHz-1.008GHz locking range.
Xizhu PENG Yuki YAMANASHI Nobuyuki YOSHIKAWA Akira FUJIMAKI Naofumi TAKAGI Kazuyoshi TAKAGI Mutsuo HIDAKA
Recently, we proposed a new data-path architecture, named a large-scale reconfigurable data-path (LSRDP), based on single-flux-quantum (SFQ) circuits, to establish a fundamental technology for future high-end computers. In this architecture, a large number of SFQ floating-point units (FPUs) are used as core components, and their high performance and low power consumption are essential. In this research, we implemented an SFQ half-precision bit-serial floating-point multiplier (FPM) with a target clock frequency of 50GHz, using the AIST 10kA/cm² Nb process. The FPM was designed based on a systolic-array architecture. It contains 11,066 Josephson junctions, including on-chip high-speed test circuits. The size and power consumption of the FPM are 6.66mm × 1.92mm and 2.83mW, respectively. Its correct operation was confirmed by on-chip high-speed tests at a maximum frequency of 93.4GHz for the exponent part and 72.0GHz for the significand part.
Yasushi IGARASHI Tadashi CHIBA Shin-ichi O'UCHI Meishoku MASAHARA Kunihiro SAKAMOTO
Voltage multiplier (VM) circuits for RF (2.45GHz)-to-DC conversion are developed for battery-less sensor nodes. The converted DC power is stored on a storage capacitor before driving a wireless sensor module. The charging time of the storage capacitor with the proposed VM circuits is reduced to 1/10 of that with conventional VM circuits, because the proposed circuits have constant current characteristics owing to self-control of the body bias in diode-connected SOI MOSFETs. The wireless sensor system, composed of the fabricated VM chip and a commercially available sensor module, is operated using an RF signal from a wireless LAN modem (2.45GHz) as a power source.