Jianglin WEI Anna KUWANA Haruo KOBAYASHI Kazuyoshi KUBO
In this paper, an algorithm based on Taylor series expansion is proposed to calculate the logarithm (log2 x) of a floating-point number to IEEE 754 binary32 accuracy by a multi-domain partitioning method. The general mantissa (1≤x<2) is multiplied by 2, 4, 8, … (or equivalently left-shifted by 1, 2, 3, … bits), the regions (2≤x<4), (4≤x<8), (8≤x<16), … are considered, and Taylor-series expansion is applied. In those regions, the slope of f(x)=log2 x with respect to x is gentler than in the region (1≤x<2), which reduces the required number of terms. We also consider the trade-offs among the numbers of additions, subtractions, and multiplications and the Look-Up Table (LUT) size in hardware, so that engineers can select the algorithm best suited to their design and hardware.
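The idea of scaling the mantissa into a higher region and expanding around a tabulated point can be sketched as follows. This is only an illustration under stated assumptions: the equal-width sub-intervals, the LUT of log2 at each sub-interval centre, and the names `log2_taylor`, `shift`, `segments`, and `terms` are hypothetical and are not taken from the paper, whose actual partitioning and term counts are not given in the abstract.

```python
import math

def log2_taylor(x, shift=1, segments=16, terms=4):
    """Approximate log2(x) for x > 0 (illustrative sketch, not the paper's design)."""
    m, e = math.frexp(x)           # x = m * 2**e with 0.5 <= m < 1
    m, e = m * 2.0, e - 1          # renormalise so that 1 <= m < 2
    lo = float(1 << shift)         # region is [2**shift, 2**(shift + 1))
    y = m * lo                     # same as left-shifting the mantissa by `shift` bits
    width = lo / segments          # assumed equal-width sub-intervals
    idx = min(int((y - lo) / width), segments - 1)
    a = lo + (idx + 0.5) * width   # expansion point: centre of the sub-interval
    lut = math.log2(a)             # in hardware this would be a precomputed LUT entry
    t = (y - a) / a
    # ln(1 + t) = t - t^2/2 + t^3/3 - ... truncated after `terms` terms
    series = sum((-1) ** (n + 1) * t ** n / n for n in range(1, terms + 1))
    # log2(x) = e + log2(y) - shift, and log2(y) = log2(a) + ln(1 + t) / ln 2
    return e - shift + lut + series / math.log(2.0)
```

With these assumed defaults, `log2_taylor(3.7)` agrees with `math.log2(3.7)` to roughly seven decimal places; raising `segments` trades a larger table for fewer series terms.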
Masamitsu TANAKA Kazuyoshi TAKAGI Naofumi TAKAGI
We present circuit implementations for computing exponentials and logarithms suitable for rapid single-flux-quantum (RSFQ) logic. We propose hardware algorithms based on the sequential table-lookup (STL) method using the radix-2 signed-digit representation that achieve high-throughput, digit-serial calculations. The circuits are implemented by processing elements formed in systolic-array-like, regularly-aligned pipeline structures. The processing elements are composed of adders, shifters, and readouts of precomputed constants. The iterative calculations are fully overlapped, and throughputs approach the maximum throughput of serial processing. The circuit size for calculating significand parts is estimated to be approximately 5-10 times larger than that of a bit-serial floating-point adder or multiplier.
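As a rough illustration of table-lookup-based logarithm computation built from adders, shifters, and precomputed constants, the sketch below uses a conventional non-redundant shift-and-add recurrence with constants ln(1 + 2^-k). It is an assumption-laden stand-in: the paper's radix-2 signed-digit, digit-serial RSFQ formulation is not reproduced, and the name `ln_shift_add` and the parameter `n` are hypothetical.

```python
import math

def ln_shift_add(x, n=30):
    """Approximate ln(x) for 1 <= x < 2 with n table constants (illustrative)."""
    # Precomputed constants ln(1 + 2^-k), k = 1..n (the lookup table).
    table = [math.log(1.0 + 2.0 ** -k) for k in range(1, n + 1)]
    y, acc = 1.0, 0.0
    for k in range(1, n + 1):
        cand = y + y * 2.0 ** -k   # y * (1 + 2^-k): one shift and one add
        if cand <= x:              # accept the factor only if it does not overshoot x
            y = cand
            acc += table[k - 1]    # accumulate the matching table constant
    return acc                     # y has converged towards x, so acc approximates ln(x)
```

Each iteration needs only a shift, an add, a comparison, and one table readout, which is why recurrences of this kind map naturally onto regular pipeline stages.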
Masamitsu TANAKA Atsushi KITAYAMA Masakazu OKADA Tomohito KOUKETSU Takumi TAKINAMI Masato ITO Akira FUJIMAKI
We report the successful operation of a low-power arithmetic logic unit (ALU) based on a low-voltage rapid single-flux-quantum (LV-RSFQ) logic circuit, whereby a dc bias current is fed to circuits from lowered constant-voltage sources through small resistors. Both the static and dynamic energy consumptions are reduced because of the reduction in the amplitudes of voltage pulses across the Josephson junctions, with a trade-off of slightly slower switching speeds. The designed bias voltage was set to 0.25 mV, which is one-tenth that of our standard RSFQ circuit design. We investigated several issues related to such low-voltage operation, including margins and timing design. To achieve successful operation, we tuned the circuit parameters in the logic gate design and carefully controlled the timing by considering the interference of pulse signals. We show test results for the low-voltage ALU in on-chip high-speed testing. The circuit was fabricated using the AIST Nb/AlOx/Nb Advanced Process with a critical current density of 10 kA/cm². We verified that arithmetic and logical operations were correctly implemented and obtained dc bias margins of 18% at a target clock frequency of 20 GHz and achieved a maximum clock frequency of 28 GHz with a power consumption of 28 µW. These experimental results indicate energy efficiency of 3.6 times that of the standard RSFQ circuit design.
Yong-Eun KIM Kyung-Ju CHO Jin-Gyun CHUNG Xinming HUANG
This paper presents an error compensation method for fixed-width group canonic signed digit (GCSD) multipliers that receive a W-bit input and generate a W-bit product. To compensate efficiently for the truncation error, the encoded signals from the GCSD multiplier are used to generate the error compensation bias. Synopsys simulations show that the proposed method achieves up to 84% reduction in power consumption and up to 78% reduction in area compared with fixed-width modified Booth multipliers.
Young-Geun LEE Han-Sam JUNG Ki-Seok CHUNG
Many DSP applications such as FIR filtering and DCT (discrete cosine transform) require multiplication by constants. Therefore, optimizing the performance of constant multiplication improves the overall performance of these applications. It is well known that a shift can replace a constant multiplication if the constant is a power of two. In this paper, we extend this idea: by employing two or more barrel shifters, we can design highly efficient constant multipliers. We have found that by using two or three shifters, we can generate a large set of constants. Using these constants, we can execute a typical set of FIR or DCT applications with few errors. Furthermore, with variable precision support, we can carry out a fairly large class of DSP applications with high computational efficiency. Compared to conventional multipliers, we can achieve power savings of up to 56% with negligible computational errors.
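The notion of replacing a constant multiplication with a few shifts can be sketched as below. This is a hedged illustration only: the greedy selection of signed powers of two, the fixed-point scaling by `frac_bits`, and the names `shift_terms` and `shift_multiply` are assumptions for the example and do not describe the paper's shifter architecture or constant set.

```python
def shift_terms(c, max_terms=3, frac_bits=8):
    """Greedily approximate c * 2**frac_bits by up to max_terms signed powers of two."""
    residual = round(c * (1 << frac_bits))
    terms = []                                   # list of (sign, shift amount)
    for _ in range(max_terms):
        if residual == 0:
            break
        sign = 1 if residual > 0 else -1
        r = abs(residual)
        p = r.bit_length() - 1                   # 2**p <= r < 2**(p + 1)
        low, high = 1 << p, 1 << (p + 1)
        shift = p if r - low <= high - r else p + 1   # nearest power of two
        terms.append((sign, shift))
        residual -= sign * (1 << shift)
    return terms

def shift_multiply(x, terms, frac_bits=8):
    """Multiply integer x by the approximated constant using shifts and adds only."""
    total = sum(sign * (x << shift) for sign, shift in terms)
    return total >> frac_bits                    # drop the fixed-point fraction (floors)
```

For example, `shift_terms(0.7071)` yields three terms representing 184/256 ≈ 0.719, and `shift_multiply(100, shift_terms(0.7071))` returns 71; a power-of-two constant such as 4.0 needs a single shift and is exact.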
Yong-Eun KIM Kyung-Ju CHO Jin-Gyun CHUNG
In this paper, an efficient modified Booth multiplier design method for predetermined coefficient groups is proposed, based on a variation of the modified Booth encoding method. For the pulse-shaping filter design used in CDMA, it is shown that the proposed method reduces area and power consumption by up to 44% and 48%, respectively, compared with conventional designs. It is also shown that for a 128-point radix-2^4 FFT, the area and power consumption can be reduced by 18% and 36%, respectively.
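For readers unfamiliar with modified Booth encoding, the standard radix-4 recoding can be sketched as follows. Note that this is the generic textbook recoding, not the paper's variation for predetermined coefficient groups; the function names and the `bits` parameter are hypothetical.

```python
def booth_radix4_digits(y, bits=8):
    """Recode a two's-complement integer y (bits wide, bits even) into
    radix-4 digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    if bits % 2 != 0:
        raise ValueError("bits must be even")
    u = y & ((1 << bits) - 1)          # two's-complement bit pattern of y
    digits, prev = [], 0               # prev is bit 2i-1 (0 below the LSB)
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits

def booth_multiply(x, y, bits=8):
    """Multiply x by y using one partial product per radix-4 Booth digit."""
    return sum(d * x * 4 ** i for i, d in enumerate(booth_radix4_digits(y, bits)))
```

The recoding halves the number of partial products relative to a bit-by-bit multiplier, since an 8-bit multiplier operand produces only four digits, each selecting 0, ±x, or ±2x.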
Pisana PLACIDI Leonardo VERDUCCI Guido MATRELLA Luca ROSELLI Paolo CIAMPOLINI
In this paper, the characteristics of a digital system dedicated to the fast execution of the FDTD algorithm, widely used for electromagnetic simulation, are presented. The system is conceived as a module communicating with a host personal computer via a PCI bus, and is based on a VLSI ASIC that implements the "field-update" engine. The system structure is defined by means of a hardware description language, which keeps the high-level system specification independent of the actual fabrication technology. A virtual implementation of the system has been carried out by mapping this description, in a standard-cell style, onto a commercial 0.35 µm technology. Simulations show that a significant speed-up can be achieved with respect to state-of-the-art software implementations of the same algorithm.
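The kind of "field-update" computation that such an engine accelerates can be illustrated with a one-dimensional FDTD loop in normalized units. This is a minimal software sketch under assumptions: the grid size, the Courant factor, the Gaussian source, and the name `fdtd_1d` are all hypothetical and say nothing about the ASIC's actual data path or dimensionality.

```python
import math

def fdtd_1d(steps, size=200, src=100, courant=0.5):
    """Run a 1D FDTD simulation on a staggered grid and return (ez, hy)."""
    ez = [0.0] * size                  # electric field samples
    hy = [0.0] * size                  # magnetic field samples (half-cell offset)
    for t in range(steps):
        # Magnetic field update from the spatial difference of E.
        for k in range(size - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Electric field update from the spatial difference of H.
        for k in range(1, size):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Additive Gaussian pulse source (assumed shape and timing).
        ez[src] += math.exp(-((t - 30.0) / 10.0) ** 2)
    return ez, hy
```

Each cell update is a multiply and a couple of additions on neighbouring values, repeated over the whole grid at every time step, which is the regular, data-parallel pattern that makes dedicated hardware attractive.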