Norihiro KAMAE Akira TSUCHIYA Hidetoshi ONODERA
A body bias generator (BBG) for fine-grained body biasing (FGBB) is proposed. FGBB is effective in reducing variability and power consumption in a system-on-chip (SoC). Since FGBB needs a number of BBGs, the BBG should preferably be implemented through a cell-based design procedure. In cell-based design, it is inefficient to provide an extra supply voltage for the BBGs. We devised a BBG with a switched-capacitor configuration, which enables the BBG to operate over a wide supply-voltage range from 0.6V to 1.2V. We fabricated the BBG in a 65nm CMOS process to control a 0.1mm² core circuit, with an area overhead of 1.4% for the BBG.
Xizhu PENG Yuki YAMANASHI Nobuyuki YOSHIKAWA Akira FUJIMAKI Naofumi TAKAGI Kazuyoshi TAKAGI Mutsuo HIDAKA
Recently, we proposed a new data-path architecture, named a large-scale reconfigurable data-path (LSRDP), based on single-flux-quantum (SFQ) circuits, to establish a fundamental technology for future high-end computers. In this architecture, a large number of SFQ floating-point units (FPUs) are used as core components, and their high performance and low power consumption are essential. In this research, we implemented an SFQ half-precision bit-serial floating-point multiplier (FPM) with a target clock frequency of 50GHz, using the AIST 10kA/cm² Nb process. The FPM was designed based on a systolic-array architecture. It contains 11,066 Josephson junctions, including on-chip high-speed test circuits. The size and power consumption of the FPM are 6.66mm × 1.92mm and 2.83mW, respectively. Its correct operation was confirmed at a maximum frequency of 93.4GHz for the exponent part and of 72.0GHz for the significand part by on-chip high-speed tests.
Akira FUJIMAKI Masamitsu TANAKA Ryo KASAGI Katsumi TAKAGI Masakazu OKADA Yuhi HAYAKAWA Kensuke TAKATA Hiroyuki AKAIKE Nobuyuki YOSHIKAWA Shuichi NAGASAWA Kazuyoshi TAKAGI Naofumi TAKAGI
We describe a large-scale integrated circuit (LSI) design of rapid single-flux-quantum (RSFQ) circuits and demonstrate several reconfigurable data-path (RDP) processor prototypes based on the ISTEC Advanced Process (ADP2). The ADP2 LSIs are made up of nine Nb layers and Nb/AlOx/Nb Josephson junctions with a critical current density of 10kA/cm², allowing higher operating frequencies and integration. To realize truly large-scale RSFQ circuits, careful design is necessary, with several compromises in the device structure, logic gates, and interconnects, balancing the competing demands of integration density, design flexibility, and fabrication yield. We summarize numerical and experimental results related to the development of a cell-based design in the ADP2, which features a unit cell size reduced to 30-µm square and up to four strip line tracks in the unit cell underneath the logic gates. The ADP LSIs can achieve ∼10 times the device density and double the operating frequency with the same power consumption per junction as conventional LSIs fabricated using the Nb four-layer process. We report the design and test results of RDP processor prototypes using the ADP2 cell library. The RDP processors are composed of many arrays of floating-point units (FPUs) and switch networks, and serve as accelerators in a high-performance computing system. The prototypes are composed of two-dimensional arrays of several arithmetic logic units instead of FPUs. The experimental results include a successful demonstration of full operation and reconfiguration in a 2×2 RDP prototype made up of 11.5k junctions at 45GHz after precise timing design. Partial operation of a 4×4 RDP prototype made up of 28.5k junctions is also demonstrated, indicating the scalability of our timing design.
Yusuke MIZUNO Kazunobu KONDO Takanori NISHINO Norihide KITAOKA Kazuya TAKEDA
Blind source separation is a technique that can separate sound sources without information such as the source locations, the number of sources, or the utterance content. Multi-channel source separation using many microphones separates signals with high accuracy, even if there are many sources. However, these methods have extremely high computational complexity, which must be reduced. In this paper, we propose a computational complexity reduction method for blind source separation based on frequency domain independent component analysis (FDICA) and examine which temporal data are effective for source separation. A frame with many sound sources is effective for FDICA source separation. We assume that a frame with a low kurtosis has many sound sources and preferentially select such frames. In our proposed method, we used the log power spectrum and the kurtosis of the magnitude distribution of the observed data as selection criteria and conducted source separation experiments using speech signals from twelve speakers. We evaluated the separation performance by the signal-to-interference ratio (SIR) improvement score. The SIR improvement score was 24.3dB when all the frames were used, and 23.3dB when the 300 frames selected by our criteria were used. These results show that our proposed selection criteria based on kurtosis and magnitude are effective. Furthermore, we significantly reduced the computational complexity because it is proportional to the number of selected frames.
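The kurtosis-based selection described above can be illustrated with a minimal Python sketch. It assumes each frame is given as a list of magnitude values, ranks the frames by the kurtosis of that list, and keeps the lowest-kurtosis frames. The names `kurtosis` and `select_frames` are hypothetical, and the sketch omits the log-power-spectrum criterion because the abstract does not state how the two criteria are combined.

```python
def kurtosis(values):
    """Non-excess kurtosis m4 / m2**2 (about 3 for Gaussian data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        return float("inf")  # constant frame: rank it last
    return m4 / (m2 * m2)


def select_frames(frames, count):
    """Return the indices of the `count` frames with the lowest kurtosis."""
    ranked = sorted(range(len(frames)), key=lambda i: kurtosis(frames[i]))
    return sorted(ranked[:count])


# Toy data (made up for illustration): frame 0 is dominated by one peak,
# frame 1 is spread evenly, which gives it the lower kurtosis.
frames = [
    [1, 1, 1, 1, 1, 1, 1, 10],
    [1, 2, 3, 4, 5, 6, 7, 8],
]
selected = select_frames(frames, 1)
```

Since the abstract states that the computational complexity is proportional to the number of selected frames, passing only the selected frames to the separation stage reduces the cost in the same ratio.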
Yosuke TANAKA Shun-ichi AZUMA Toshiharu SUGIE
This paper addresses a broadcast control problem of multi-agent systems with quantized measurements, where each agent moves based on a common broadcast signal and tries to minimize a given quadratic performance index. The problem is solved by introducing dither-type random movements into the agents' actions, which reduce the degradation caused by quantized measurements. A broadcast controller is derived, and it is proven that the controller approximately achieves given tasks with probability 1. The effectiveness of the proposed controller is demonstrated by numerical simulation.
Takashi SUDO Hirokazu TANAKA Ryuji KOHNO
In this paper, we study an objective quality measure that approximates the subjective mean opinion score (MOS) for bandwidth-extended wideband speech with respect to narrowband speech. Bandwidth-extended speech should be widely evaluated by a subjective quality assessment such as MOS. However, such subjective quality assessments are expensive and time-consuming. This paper proposes a new objective quality measure that combines the perceptual evaluation of speech quality (PESQ) and spectral distortion. We evaluated the correlation between our proposed scheme and MOS using AMR and AMR-WB speech codecs. The coefficient of correlation between the proposed scheme and the MOS value was found to be 0.973. We concluded that the proposed scheme is a valid and effective objective quality measure.
Bin XU Yi CUI Guangyi ZHOU Biao YOU Jian YANG Jianshe SONG
In this paper, a new method is proposed for unsupervised speckle level estimation in synthetic aperture radar (SAR) images. It is assumed that fully developed speckle intensity has a Gamma distribution. Based on this assumption, estimation of the equivalent number of looks (ENL) is transformed into noise variance estimation in the logarithmic SAR image domain. In order to improve estimation accuracy, texture analysis is also applied to exclude areas where speckle is not fully developed (e.g., urban areas). Finally, the noise variance is estimated by a 2-dimensional autoregressive (AR) model. The effectiveness of the proposed method is verified with several SAR images from different SAR systems and simulated images.
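The link between the ENL and the log-domain noise variance can be sketched as follows. For Gamma-distributed intensity with shape parameter L (the ENL), the variance of the log-intensity equals the trigamma function ψ'(L), so an estimate of that variance can be inverted to obtain L. The sketch below is an illustration under stated assumptions: it uses a plain sample variance over a simulated homogeneous region rather than the 2-dimensional AR model of the paper, and the function names are hypothetical.

```python
import math
import random
import statistics


def trigamma(x):
    """Trigamma function psi'(x) for x > 0."""
    total = 0.0
    # Recurrence psi'(x) = psi'(x + 1) + 1/x^2 shifts x into the range
    # where the asymptotic series below is accurate.
    while x < 6.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # Asymptotic series: 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)
    total += inv + 0.5 * inv2
    total += inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return total


def enl_from_log_variance(log_var, lo=1e-3, hi=1e6):
    """Solve trigamma(L) = log_var by bisection (trigamma is decreasing)."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect on a logarithmic scale
        if trigamma(mid) > log_var:
            lo = mid  # variance too large at mid, so L must be larger
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Simulated homogeneous region: unit-mean Gamma speckle with 4 looks.
random.seed(0)
looks = 4.0
intensity = [random.gammavariate(looks, 1.0 / looks) for _ in range(20000)]
log_var = statistics.pvariance([math.log(v) for v in intensity])
estimated_enl = enl_from_log_variance(log_var)
```

On real images the variance estimate would come only from regions where speckle is fully developed, which is the purpose of the texture analysis step mentioned in the abstract.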
Yuki YAMANASHI Nobuyuki YOSHIKAWA
A promising application of a single-flux-quantum (SFQ) circuit is read-out circuitry for a multi-channel superconductive sensor array. In such applications, the SFQ read-out circuit is expected to operate outside a magnetic shield. We investigated an SFQ circuit structure, which is tolerant to an external magnetic field, using the AIST 2.5kA/cm² Nb standard 2 process, which has four Nb wiring layers including the ground plane. By covering the entire circuit using an upper Nb wiring layer called the control (CTL) layer, the influences of the external magnetic field on the SFQ circuit operation can be avoided. We experimentally evaluated the sheet inductance of the wiring layer underneath the CTL shielding layer to design a magnetic-field-tolerant SFQ circuit. We implemented and measured test circuits comprising toggle flip-flops (TFFs) to evaluate their magnetic field tolerances. The operating margin and maximum operating frequency of the designed TFF did not deteriorate with increases in the magnetic field applied to the test circuit, whereas the operating margin of the conventional TFF was reduced by applying the magnetic field. We have also demonstrated the high-speed operation of the designed TFF operated in an unshielded environment at a frequency of up to 120GHz with a wide operating margin.
Yuka ITANO Shotaro MORIMOTO Sadayuki YOSHITOMI Nobuyuki ITOH
This paper presents a strategy for optimizing the Q factor of MOS varactors, a novel scalable model for quasi-millimeter-wave MOS varactors, and verification results from measurements of discrete MOS varactors and VCOs. To realize a high-Q MOS varactor in the quasi-millimeter-wave region, low capacitance and low series resistance of the unit cell are essential. Downsizing is the key to achieving both. However, downsizing simultaneously reduces Cmax/Cmin. A scalable MOS varactor model is therefore necessary to select the optimum MOS varactor for various application requirements within the same process. When the MOS varactor size was decreased from W/L = 2µm/2µm to 0.5µm/0.26µm, the Q factor increased sevenfold at f = 20GHz, but Cmax/Cmin was reduced by 60%, and the conventional PSP model showed an error of approximately 20%. For the minimum-size MOS varactor, the proposed model reduces the error from 18.9% to 0.2% for the N+ MOS varactor and from 22.1% to 0.8% for the P+ MOS varactor, even though the model covers a wide range of dimensions. The model has also been confirmed to cover two types of layouts. Oscillation frequency and phase noise were also verified with three types of 22GHz VCOs. The error in oscillation frequency is less than 2.5%, and that in phase noise at 1MHz offset from the carrier is less than 5dB.
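The role of capacitance and series resistance can be made concrete with the standard series-RC approximation Q = 1/(2πf·Rs·C). The sketch below is illustrative only: the series-RC expression is a textbook simplification rather than the scalable model of the paper, and the component values and the name `varactor_q` are hypothetical.

```python
import math


def varactor_q(freq_hz, series_res_ohm, cap_farad):
    """Q of a series R-C element: Q = 1 / (2*pi*f*Rs*C)."""
    return 1.0 / (2.0 * math.pi * freq_hz * series_res_ohm * cap_farad)


# Hypothetical unit-cell values at 20 GHz (not taken from the paper).
q_large = varactor_q(20e9, 5.0, 50e-15)
# Halving both Rs and C quadruples Q in this simple model.
q_small = varactor_q(20e9, 2.5, 25e-15)
```

In this simplified picture Q rises as either Rs or C falls, which is why a smaller unit cell helps, while the accompanying loss of tuning range (Cmax/Cmin) is not captured by the expression.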
Hiroshi KATAOKA Hiroaki HONDA Farhad MEHDIPOUR Nobuyuki YOSHIKAWA Akira FUJIMAKI Hiroyuki AKAIKE Naofumi TAKAGI Kazuaki MURAKAMI
The single flux quantum (SFQ) is expected to be a next-generation high-speed and low-power technology in the field of logic circuits. CMOS as the dominant technology for conventional processors cannot be replaced with SFQ technology due to the difficulty of implementing feedback loops and conditional branches using SFQ circuits. This paper investigates the applicability of a reconfigurable data-path (RDP) accelerator based on SFQ circuits. The authors introduce detailed specifications of the SFQ-RDP architecture and compare its performance and power/performance ratio with those of a graphics-processing unit (GPU). The results show at most 1600 times higher efficiency in terms of Flops/W (floating-point operations per second/Watt) for some high-performance computing application programs.
Daisuke ANZAI Takashi KOYA Jingjing SHI Jianqing WANG
Space diversity reception is well known as a technique that can improve the performance of wireless communication systems without any temporal and spectral resource expansion. Implant body area networks (BANs) require high-speed transmission and low energy consumption. Therefore, applying space diversity reception to implant BANs can be expected to fulfill these requirements. For this purpose, this paper presents a local frequency offset diversity system with π/4-differential quadrature phase shift keying (DQPSK) for implant BANs that offers improved communication performance with a simpler receiver structure, and evaluates the proposal's bit error rate (BER) performance by theoretical analysis. In the theoretical analysis, it is difficult to analytically derive the probability density function (pdf) of the combined signal-to-noise power ratio (SNR) at the local frequency offset diversity receiver output. Therefore, this paper adopts the moment generating function approximation method and demonstrates that the resulting theoretical analyses yield performances that basically match the results of computer simulations. We first confirm that the local frequency offset diversity reception can effectively improve the communication performance of implant BANs. Next, we analyze realistic communication performance, namely, we perform a link budget analysis based on the derived BER performance and evaluate the link parameters, including system margin, maximum link distance, and required transmit power. These analyses demonstrate that the local frequency offset diversity system can realize a reliable communication link in a realistic implant BAN scenario.
Masamitsu TANAKA Atsushi KITAYAMA Masakazu OKADA Tomohito KOUKETSU Takumi TAKINAMI Masato ITO Akira FUJIMAKI
We report the successful operation of a low-power arithmetic logic unit (ALU) based on a low-voltage rapid single-flux-quantum (LV-RSFQ) logic circuit, whereby a dc bias current is fed to circuits from lowered constant-voltage sources through small resistors. Both the static and dynamic energy consumptions are reduced because of the reduction in the amplitudes of voltage pulses across the Josephson junctions, with a trade-off of slightly slower switching speeds. The designed bias voltage was set to 0.25mV, which is one-tenth that of our standard RSFQ circuit design. We investigated several issues related to such low-voltage operation, including margins and timing design. To achieve successful operation, we tuned the circuit parameters in the logic gate design and carefully controlled the timing by considering the interference of pulse signals. We show test results for the low-voltage ALU in on-chip high-speed testing. The circuit was fabricated using the AIST Nb/AlOx/Nb Advanced Process with a critical current density of 10kA/cm². We verified that arithmetic and logical operations were correctly implemented and obtained dc bias margins of 18% at a target clock frequency of 20GHz and achieved a maximum clock frequency of 28GHz with a power consumption of 28µW. These experimental results indicate energy efficiency of 3.6 times that of the standard RSFQ circuit design.
Kyosuke SANO Yuki YAMANASHI Nobuyuki YOSHIKAWA
We have been developing a superconducting time-of-flight mass spectrometry (TOF-MS) system, which utilizes a superconductive strip ion detector (SSID) and a single-flux-quantum (SFQ) multi-stop time-to-digital converter (TDC). The SFQ multi-stop TDC can measure the time intervals between multiple input signals and directly convert them into binary data. In this study, we designed and implemented 24-bit SFQ multi-stop TDCs with a 3×24-bit FIFO buffer using the AIST Nb standard process (STP2), whose time resolution and dynamic range are 100ps and 1.6ms, respectively. The timing jitter of the TDC was investigated by comparing two types of TDCs: one uses an on-chip SFQ clock generator (CG) and the other uses a microwave oscillator at room temperature. We confirmed the correct operation of both TDCs and evaluated their timing jitter. The experimentally-obtained timing jitter is about 40ns and 700ps for the TDCs with and without the on-chip SFQ CG, respectively, for the measured time interval of 50µs, which linearly increases with increase of the measured time interval.
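The relation between word length, time resolution, and dynamic range can be checked with a short sketch. It assumes an ideal counter model in which the dynamic range equals 2^bits times the resolution; this is consistent with the stated figures (2^24 × 100ps is about 1.68ms) but is an assumption, not a description of the fabricated circuit. The function names are hypothetical.

```python
def tdc_dynamic_range(bits, resolution_s):
    """Longest interval an ideal `bits`-wide counter can represent."""
    return (2 ** bits) * resolution_s


def time_to_code(interval_s, bits, resolution_s):
    """Quantize an interval to the nearest count; reject overflow."""
    code = round(interval_s / resolution_s)
    if not 0 <= code < 2 ** bits:
        raise ValueError("interval outside the TDC dynamic range")
    return code


full_scale = tdc_dynamic_range(24, 100e-12)  # about 1.68e-3 s
code = time_to_code(50e-6, 24, 100e-12)      # the 50 µs interval from the text
binary = format(code, "024b")                # 24-bit binary output word
```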
Yoshitaka TAKAHASHI Hiroshi SHIMADA Masaaki MAEZAWA Yoshinao MIZUGAKI
We present our design and operation of a 6-bit quasi-triangle voltage waveform generator comprising three circuit blocks: an improved variable Pulse Number Multiplier (variable-PNM), a Code Generator (CG), and a Double-Flux-Quantum Amplifier (DFQA). They are integrated into a single chip using a niobium Josephson junction technology. While the multiplication factor of our previous m-bit variable-PNM was limited to between 2^(m-1) and 2^m, that of the improved one is extended to between 1 and 2^m. Correct operations of the 6-bit variable-PNM are confirmed in low-speed testing with respect to the codes from the CG, whereas generation of a 6-bit, 0.20mVpp quasi-triangle voltage waveform is demonstrated with the 10-fold DFQA in high-speed testing.
Mingfu XUE Wei LIU Aiqun HU Youdong WANG
Hardware Trojans (HTs) have emerged as an impending security threat to hardware systems. However, conventional functional tests fail to detect HTs because Trojans are triggered by rare events. Most existing side-channel-based HT detection techniques simply compare and analyze circuit parameters and offer no signal calibration or error correction, so they suffer from the large process variations (PV) and noise of modern nanotechnology, which can completely mask a Trojan's contribution to the circuit. This paper presents a novel HT detection method based on a subspace technique that can detect tiny HT characteristics under large PV and noise. First, we formulate the HT detection problem as a weak signal detection problem, and then we model it as a feature extraction model. After that, we propose a novel subspace HT detection technique based on a time-domain constrained estimator. It is proved that we can distinguish the weak HT from variations and noise through particular subspace projections and analysis of the reconstructed clean signal. The reconstructed clean signal of the proposed algorithm can also be used for accurate parameter estimation of circuits, e.g., power estimation. The proposed technique is a general method that related HT detection schemes can use to eliminate noise and PV. Both simulations on benchmarks and hardware implementation validations on FPGA boards show the effectiveness and high sensitivity of the new HT detection technique.
Fanxin ZENG Xiaoping ZENG Zhenyu ZHANG Guixin XUAN
This letter presents three methods for producing 8-QAM+ sequences. The first method transforms a ternary complementary sequence set (CSS) with an even number of sub-sequences into an 8-QAM+ periodic CSS, leaving both the period and the number of sub-sequences unaltered. The second method yields an 8-QAM+ aperiodic CSS without constraining either the period or the number of sub-sequences. The third method produces 8-QAM+ periodic sequences having the ideal autocorrelation property on the real part of the autocorrelation function. The proposed sequences can potentially be applied to the suppression of multiple access interference or to synchronization in a communication system.
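The defining property of a periodic CSS is that the periodic autocorrelations of its sub-sequences, summed over the set, vanish at every nonzero shift. The sketch below checks that property for arbitrary complex-valued sequences. It does not reproduce the construction methods of the letter: a binary Golay pair is used as a stand-in example, and the function names are hypothetical.

```python
def periodic_autocorrelation(seq, shift):
    """Periodic autocorrelation of a (possibly complex) sequence at `shift`."""
    n = len(seq)
    return sum(
        complex(seq[i]) * complex(seq[(i + shift) % n]).conjugate()
        for i in range(n)
    )


def is_periodic_css(sequences, tol=1e-9):
    """True if the summed autocorrelations vanish at all nonzero shifts."""
    n = len(sequences[0])
    return all(
        abs(sum(periodic_autocorrelation(s, shift) for s in sequences)) < tol
        for shift in range(1, n)
    )


# Stand-in example: a length-4 binary Golay pair, and a counterexample.
golay_pair = [[1, 1, 1, -1], [1, 1, -1, 1]]
not_complementary = [[1, 1, 1, 1], [1, 1, 1, 1]]
```

The same check applies unchanged to sequences over a QAM constellation, since the elements are handled as complex numbers.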
Shuichi NAGASAWA Kenji HINODE Tetsuro SATOH Mutsuo HIDAKA Hiroyuki AKAIKE Akira FUJIMAKI Nobuyuki YOSHIKAWA Kazuyoshi TAKAGI Naofumi TAKAGI
We describe the recent progress on a Nb nine-layer fabrication process for large-scale single-flux-quantum (SFQ) circuits. A device fabricated in this process is composed of an active layer including Josephson junctions (JJs) at the top, passive transmission line (PTL) layers in the middle, and a DC power layer at the bottom. We describe the process conditions and the fabrication equipment. We use both diagnostic chips and shift register (SR) chips to improve the fabrication process. The diagnostic chip was designed to evaluate the characteristics of basic elements such as junctions, contacts, resistors, and wiring, in addition to their defect evaluations. The SR chip was designed to evaluate defects depending on the size of the SFQ circuits. The results of a long-term evaluation of the diagnostic and SR chips showed that there was fairly good correlation between the defects of the diagnostic chips and yields of the SRs. We could obtain a yield of 100% for SRs including 70,000 JJs. These results show that considerable progress has been made in reducing the number of defects and improving reliability.
Kazuyoshi TAKAGI Nobutaka KITO Naofumi TAKAGI
Superconducting single-flux-quantum (SFQ) devices have attracted much attention as alternative devices for digital circuits because of their high switching speed and low power consumption. For large-scale circuit design, the role of the computer-aided design environment is significant. As the characteristics of SFQ devices differ from those of conventional devices, a new design environment is required. In this paper, we propose a new timing-aware circuit description method that can be used for SFQ circuit design. Based on the description and the dedicated algorithms we have been developing for SFQ logic circuit design, we propose an integrated design flow for SFQ logic circuits. We have designed a circuit using our design tools in accordance with the design flow and demonstrated its correct operation.
The estimation of the power spectral density (PSD) of noise is crucial for retrieving speech in noisy environments. In this study, we propose a novel method for estimating the non-white noise PSD from noisy speech on the basis of a generalized gamma distribution and the minimum mean square error (MMSE) approach. Because of the highly non-stationary nature of speech, deriving its actual spectral probability density function (PDF) using conventional modeling techniques is difficult. On the other hand, spectral components of noise are more stationary than those of speech and can be represented more accurately by a generalized gamma PDF. The generalized gamma PDF can be adapted to optimally match the actual distribution of the noise spectral amplitudes observed at each frequency bin by means of two parameters that are updated in real time and calculated in each frame based on the moment matching method. The MMSE noise PSD estimator is derived on the basis of the generalized gamma PDF and Gaussian PDF models for noise and speech spectral amplitudes, respectively. Combined with an improved Wiener filter, the proposed noise PSD estimation method exhibits the best performance compared with the minimum statistics, weighted noise estimation, and MMSE-based noise PSD estimation methods in terms of both subjective and objective measures.
Zilong ZHANG Baisheng DU Xiaodong XU
Broadband wireless channels are frequency selective in nature. In this paper, a novel precoder with finite impulse response (FIR) structure is proposed to maximize the throughput of the multiple-input multiple-output (MIMO) frequency-selective multicast channel. An iteration mechanism is investigated to obtain the desired FIR precoding matrix. In the iterative process, two associated parameters, namely the innovation orientation and the iteration step size, are jointly derived by the convex optimization program and the traditional Gauss-Newton algorithm. Convergence and complexity analyses are presented, and the numerical simulations indicate that the proposed method outperforms the existing schemes in the moderate to high signal to noise ratio (SNR) regime.