Nobutaka KITO Ryota ODAKA Kazuyoshi TAKAGI
A rapid single-flux-quantum (RSFQ) truncated multiplier based on bit-level processing is proposed. In the multiplier, the two operands are transformed into two serialized patterns of bits (pulses), and the multiplication is carried out by processing those bits. The result is obtained by counting bits. Because it computes at the bit level, the proposed multiplier can be implemented in a small area. The gate-level design of the multiplier is shown. The layout of a 4-bit multiplier was also designed.
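The abstract does not say how the pulse patterns are generated or combined, so the following is only an illustrative sketch of one well-known deterministic bit-stream scheme (clock division), not the circuit proposed in the paper. The function names and the slot ordering are assumptions. Each operand becomes a pulse train, the two trains are ANDed slot by slot, and the surviving pulses are counted; dropping the lower n bits of the count gives a truncated product.

```python
def pulse_streams(x, y, n_bits):
    """Yield (a, b) pulse pairs over 2**n_bits * 2**n_bits time slots.

    Hypothetical encoding: stream a carries x pulses in every period of
    2**n_bits slots; stream b stays "on" for the first y whole periods.
    """
    period = 1 << n_bits
    for slot in range(period * period):
        a = 1 if (slot % period) < x else 0
        b = 1 if (slot // period) < y else 0
        yield a, b


def bitstream_product(x, y, n_bits):
    """Count the slots in which both streams carry a pulse (equals x * y)."""
    return sum(a & b for a, b in pulse_streams(x, y, n_bits))


def truncated_product(x, y, n_bits):
    """Keep only the upper n_bits of the 2*n_bits-wide product."""
    return bitstream_product(x, y, n_bits) >> n_bits
```

For unsigned 4-bit operands, `bitstream_product(x, y, 4)` equals `x * y`, and `truncated_product` returns its upper four bits.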
Koki ISHIDA Masamitsu TANAKA Takatsugu ONO Koji INOUE
CMOS microprocessors are limited in their capacity for clock speed improvement because of increasing computing power, i.e., they face a power-wall problem. Single-flux-quantum (SFQ) circuits offer a solution with their ultra-fast-speed and ultra-low-power natures. This paper introduces our contributions towards ultra-high-speed cryogenic SFQ computing. The first step is to design SFQ microprocessors. From qualitatively and quantitatively evaluating past-designed SFQ microprocessors, we have found that revisiting the architecture of SFQ microprocessors and on-chip caches is the first critical challenge. On the basis of cross-layer discussions and analysis, we came to the conclusion that a bit-parallel gate-level pipeline architecture is the best solution for SFQ designs. This paper summarizes our current research results targeting SFQ microprocessors and on-chip cache architectures.
Hiroshi KATAOKA Hiroaki HONDA Farhad MEHDIPOUR Nobuyuki YOSHIKAWA Akira FUJIMAKI Hiroyuki AKAIKE Naofumi TAKAGI Kazuaki MURAKAMI
The single flux quantum (SFQ) is expected to be a next-generation high-speed and low-power technology in the field of logic circuits. CMOS as the dominant technology for conventional processors cannot be replaced with SFQ technology due to the difficulty of implementing feedback loops and conditional branches using SFQ circuits. This paper investigates the applicability of a reconfigurable data-path (RDP) accelerator based on SFQ circuits. The authors introduce detailed specifications of the SFQ-RDP architecture and compare its performance and power/performance ratio with those of a graphics-processing unit (GPU). The results show at most 1600 times higher efficiency in terms of Flops/W (floating-point operations per second/Watt) for some high-performance computing application programs.
Yoshitaka TAKAHASHI Hiroshi SHIMADA Masaaki MAEZAWA Yoshinao MIZUGAKI
We present our design and operation of a 6-bit quasi-triangle voltage waveform generator comprising three circuit blocks: an improved variable Pulse Number Multiplier (variable-PNM), a Code Generator (CG), and a Double-Flux-Quantum Amplifier (DFQA). They are integrated into a single chip using a niobium Josephson junction technology. While the multiplication factor of our previous m-bit variable-PNM was limited between 2^(m-1) and 2^m, that of the improved one is extended between 1 and 2^m. Correct operations of the 6-bit variable-PNM are confirmed in low-speed testing with respect to the codes from the CG, whereas generation of a 6-bit, 0.20 mVpp quasi-triangle voltage waveform is demonstrated with the 10-fold DFQA in high-speed testing.
Shuichi NAGASAWA Kenji HINODE Tetsuro SATOH Mutsuo HIDAKA Hiroyuki AKAIKE Akira FUJIMAKI Nobuyuki YOSHIKAWA Kazuyoshi TAKAGI Naofumi TAKAGI
We describe the recent progress on a Nb nine-layer fabrication process for large-scale single flux quantum (SFQ) circuits. A device fabricated in this process is composed of an active layer including Josephson junctions (JJ) at the top, passive transmission line (PTL) layers in the middle, and a DC power layer at the bottom. We describe the process conditions and the fabrication equipment. We use both diagnostic chips and shift register (SR) chips to improve the fabrication process. The diagnostic chip was designed to evaluate the characteristics of basic elements such as junctions, contacts, resistors, and wiring, in addition to their defect evaluations. The SR chip was designed to evaluate defects depending on the size of the SFQ circuits. The results of a long-term evaluation of the diagnostic and SR chips showed that there was fairly good correlation between the defects of the diagnostic chips and yields of the SRs. We could obtain a yield of 100% for SRs including 70,000 JJs. These results show that considerable progress has been made in reducing the number of defects and improving reliability.
Akira FUJIMAKI Isao NAKANISHI Shigeyuki MIYAJIMA Kohei ARAI Yukio AKITA Takekazu ISHIDA
We propose a neutron diffractometer system based on MgB2 thin film detectors and an SFQ signal processor. The small dimensions of MgB2 thin film detectors and the high processing capability of single flux quantum (SFQ) circuits enable us to handle several thousand or more detectors in a cryocooler, leading to a very compact system. In addition, the system can provide many diffraction patterns for different kinetic energies simultaneously. Kinetic energy is determined for individual neutrons by means of the time-of-flight method by using SFQ time-to-digital converters (TDCs). Digital outputs of the TDCs are multiplexed in the time domain and sent to room-temperature electronics with a reduced number of cables. A dual-input SFQ signal processor including TDCs and a multiplexer has been successfully demonstrated with a time resolution of 20 ns and power consumption of 400 µW. These values show high feasibility of the neutron diffraction system proposed here.
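The time-of-flight step can be illustrated with a short sketch. It assumes non-relativistic neutrons, for which E = (1/2) m (L/t)^2 with flight path L and flight time t. The function names and the 2.2 m flight path are hypothetical; only the 20 ns resolution comes from the abstract.

```python
# Physical constants (CODATA values).
NEUTRON_MASS_KG = 1.67492749804e-27
JOULE_PER_EV = 1.602176634e-19


def tdc_count(tof_ns, resolution_ns=20):
    """Quantize a flight time (integer ns) into TDC counts of resolution_ns."""
    return tof_ns // resolution_ns


def kinetic_energy_mev(flight_path_m, count, resolution_ns=20):
    """Non-relativistic kinetic energy in meV recovered from a TDC count."""
    if count <= 0:
        raise ValueError("TDC count must be positive")
    tof_s = count * resolution_ns * 1e-9
    velocity = flight_path_m / tof_s              # m/s
    energy_j = 0.5 * NEUTRON_MASS_KG * velocity ** 2
    return energy_j / JOULE_PER_EV * 1e3          # J -> eV -> meV


# Hypothetical geometry: 2.2 m flight path, 1 ms flight time (2200 m/s).
count = tdc_count(1_000_000)
energy = kinetic_energy_mev(2.2, count)
```

With these assumed numbers the neutron is a thermal neutron of roughly 25 meV. Because E scales as 1/t^2, a one-count (20 ns) error in a 1 ms flight time changes the energy estimate by about 0.004%.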
Yuki YAMANASHI Toshiki KAINUMA Nobuyuki YOSHIKAWA Irina KATAEVA Hiroyuki AKAIKE Akira FUJIMAKI Masamitsu TANAKA Naofumi TAKAGI Shuichi NAGASAWA Mutsuo HIDAKA
A single flux quantum (SFQ) logic cell library has been developed for the 10 kA/cm² Nb multi-layer fabrication process to efficiently design large-scale SFQ digital circuits. In the new cell library, the critical current density of Josephson junctions is increased from 2.5 kA/cm² to 10 kA/cm² compared to our conventional cell library, and the McCumber-Stewart parameter of each Josephson junction is increased to 2 in order to increase the circuit operation speed. More than 300 cells have been designed, including fundamental logic cells and wiring cells for passive interconnects. We have measured all cells and confirmed that they operate stably with wide operating margins. An on-chip high-speed test of the toggle flip-flop (TFF) cell has been performed by measuring the input and output voltages. The TFF cell was confirmed to operate correctly at input frequencies of up to 400 GHz. Also, several fundamental digital circuits, including a 4-bit concurrent-flow shift register and a bit-serial adder, have been designed using the new cell library, and the correct operations of the circuits have been demonstrated at high clock frequencies of more than 100 GHz.
Yoshihito HASHIMOTO Shinichi YOROZU Yoshio KAMEDA
A cryocooled system with I/O interface circuits, which enables high-speed system operation of superconductive single-flux-quantum (SFQ) circuits at over 40 GHz, and the demonstration of a 47-Gbps SFQ 2×2 switch system are presented. The cryocooled system has 32 I/Os and cools an SFQ multi-chip module (MCM) to 4 K with a two-stage 1-W Gifford-McMahon cryocooler. An SFQ 4:1 multiplexer (MUX) and an SFQ 1:4 demultiplexer (DEMUX) have been designed to interface the speed gap between the I/O (~10 Gbps/ch) and SFQ circuits (>40 GHz). An SFQ 2×2 switch chip, in which the MUX/DEMUX and an SFQ 2×2 switch are integrated, and an 8-channel superconductive voltage driver (SVD) chip have been designed with an advanced cell library for a junction critical current density of 10 kA/cm². An SFQ 2×2 switch MCM has been made by flip-chip bonding the switch chip and SVD chip on a superconductive MCM carrier with φ 50-µm InSn solder bumps. An SFQ 2×2 switch system, which is the switch MCM packaged in the cryocooled system, has been demonstrated up to a port speed of 47 Gbps for the first time.
Akira FUJIMAKI Masamitsu TANAKA Takahiro YAMADA Yuki YAMANASHI Heejoung PARK Nobuyuki YOSHIKAWA
We describe the development of single-flux-quantum (SFQ) microprocessors and related technologies such as design, circuit architecture, and microarchitecture. Since the microprocessors studied here aim for a general-purpose computing system, we employ the complexity-reduced (CORE) architecture, in which the high-speed nature of the SFQ circuits is used not for increasing processor performance but for reducing the circuit complexity. Bit-serial processing is the most suitable way to realize the CORE architecture. We assembled all the best technologies concerning SFQ integrated circuits and designed the SFQ microprocessors CORE1α, CORE1β, and CORE1γ. The CORE1β was made up of about 11000 Josephson junctions and was successfully demonstrated. The peak performance reached 1400 million operations per second with a power consumption of 3.4 mW. We showed that the SFQ microprocessors have an advantage in performance density over semiconductor ones, which leads to the potential for constructing a high-performance SFQ-circuit-based computing system.
Keiichi TANABE Hironori WAKANA Koji TSUBONE Yoshinobu TARUTANI Seiji ADACHI Yoshihiro ISHIMARU Michitaka MARUYAMA Tsunehiro HATO Akira YOSHIDA Hideo SUZUKI
We have developed the fabrication process, the circuit design technology, and the cryopackaging technology for high-Tc single flux quantum (SFQ) devices with the aim of application to an analog-to-digital (A/D) converter circuit for future wireless communication and a sampler system for high-speed measurements. Reproducibility of fabricating ramp-edge Josephson junctions with IcRn products above 1 mV at 40 K and small Ic spreads on a superconducting groundplane was much improved by employing smooth multilayer structures and optimizing the junction fabrication process. The separated base-electrode layout (SBL) method that suppresses the Jc spread for interface-modified junctions in circuits was developed. This method enabled low-frequency logic operations of various elementary SFQ circuits with relatively wide bias current margins and operation of a toggle-flip-flop (T-FF) above 200 GHz at 40 K. Operation of a 1:2 demultiplexer, one of the main elements of a hybrid-type Σ-Δ A/D converter circuit, was also demonstrated. We developed a sampler system in which a sampler circuit with a potential bandwidth over 100 GHz was cooled by a compact Stirling cooler, and waveform observation experiments confirmed the actual system bandwidth well over 50 GHz.
Mutsuo HIDAKA Shuichi NAGASAWA Kenji HINODE Tetsuro SATOH
We developed an Nb-based fabrication process for single flux quantum (SFQ) circuits in a Japanese government project that began in September 2002 and ended in March 2007. Our conventional process, called the Standard Process (SDP), was improved by overhauling all the process steps and routine process checks for all wafers. Wafer yield with the improved SDP dramatically increased from 50% to over 90%. We also developed a new fabrication process for SFQ circuits, called the Advanced Process (ADP). The specifications for ADP are nine planarized Nb layers, a minimum Josephson junction (JJ) size of 1×1 µm, a line width of 0.8 µm, a JJ critical current density of 10 kA/cm², a 2.4 Ω Mo sheet resistance, and vertically stacked superconductive contact holes. We fabricated an eight-bit SFQ shift register, a one million SQUID array and a 16-kbit RAM by using the ADP. The shift register was operated up to 120 GHz and no short or open circuits were detected in the one million SQUID array. We confirmed correct memory operations by the 16-kbit RAM and a 5.7 times greater integration level compared to that possible with the SDP.
Koji TSUBONE Hironori WAKANA Yoshinobu TARUTANI Seiji ADACHI Yoshihiro ISHIMARU Keiichi TANABE
Single flux quantum (SFQ) circuit elements have been designed and fabricated using the YBa2Cu3O7-δ ramp-edge junction technology. Logic operations of SFQ circuit elements, such as a toggle flip-flop (T-FF), a set-reset flip-flop (RS-FF), and a 96-junction Josephson transmission line (JTL), were successfully demonstrated, and dc supply current margins were confirmed up to temperatures higher than 30 K. The circuit layout was improved in order to suppress the critical current (Ic) spread that appears during the junction fabrication procedure. By employing the new circuit layout rule, correct operations at temperatures from 27 K to 34 K with dc supply current margins wider than 7% were confirmed for the T-FF with a single output. Moreover, the maximum operating frequencies of T-FFs were measured to be 360 GHz at 4.2 K and 210 GHz at 41 K, which are substantially higher than the values for the circuits with the conventional layout. According to the simulation result, the maximum operating frequency at 40 K was expected to be approximately 50% of the characteristic frequency at a bit error rate (BER) less than 10⁻⁶.
Yohei HORIMA Itsuhei SHIMIZU Masayuki KOBORI Takeshi ONOMI Koji NAKAJIMA
In this paper, we describe two approaches to optimizing the Phase-Mode pipelined parallel multiplier. One approach, named the hybrid structure, reforms the data distribution for the AND array. The other applies a Booth encoder as a substitute for the AND array in order to generate partial products. We design a 2-bit × 2-bit Phase-Mode Booth encoder and test the circuit by numerical simulation. The circuit consists of 21 ICF gates and operates correctly at a throughput of 37.0 GHz. The numbers of Josephson junctions and pipeline stages at each multiplier scale are reduced remarkably by using the encoder. According to our estimations, the Phase-Mode Booth encoder is an effective component for improving the performance of large-scale parallel multipliers.
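The abstract does not state which Booth recoding the encoder uses. As a hedged illustration of why a Booth encoder reduces the number of partial products relative to an AND array, the sketch below uses standard radix-4 (modified) Booth recoding of a two's-complement multiplier; the function names are hypothetical. Each pair of multiplier bits yields one digit in {-2, -1, 0, 1, 2}, so an n-bit multiplier needs n/2 partial products instead of n.

```python
def booth_radix4_digits(multiplier, n_bits):
    """Recode an n_bits-wide two's-complement integer into radix-4 Booth
    digits (least significant first), each in {-2, -1, 0, 1, 2}."""
    if n_bits % 2:
        raise ValueError("n_bits must be even for radix-4 recoding")
    bits = [(multiplier >> i) & 1 for i in range(n_bits)]
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, n_bits, 2):
        digits.append(bits[i] + prev - 2 * bits[i + 1])
        prev = bits[i + 1]
    return digits


def booth_multiply(multiplicand, multiplier, n_bits):
    """Sum the n_bits/2 Booth partial products, each weighted by 4**k."""
    digits = booth_radix4_digits(multiplier, n_bits)
    return sum(d * multiplicand * 4 ** k for k, d in enumerate(digits))
```

For a 2-bit signed multiplier the recoding yields a single digit, so a 2-bit × 2-bit product needs only one (possibly negated or doubled) copy of the multiplicand; for example, `booth_radix4_digits(-2, 2)` gives `[-2]`.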
Kazuo SAITOH Futoshi FURUTA Yoshihisa SOUTOME Tokuumi FUKAZAWA Kazumasa TAKAGI
The capability of a high-temperature superconducting sigma-delta modulator was studied by means of circuit simulation and FFT analysis. Parameters for the circuit simulation were extracted from experimental measurements. The present circuit simulation includes the thermal-noise effect. Successive FFT analyses were made to evaluate the dynamic range of the sigma-delta modulator. As a result, the dynamic range was evaluated as 60.1 dB at a temperature of 20 K and 56.9 dB at a temperature of 77 K.
Haruhiro HASEGAWA Tatsunori HASHIMOTO Shuichi NAGASAWA Satoru HIRANO Kazunori MIYAHARA Youichi ENOMOTO
We investigated single flux quantum sinc filters with a multistage decimation structure in order to realize high-speed sinc filter operation. Second- and third-order (k=2, 3) sinc filters with a decimation factor N=2 were designed, and their proper operation was confirmed. These sinc filters with N=2 are utilized as elementary circuit blocks of our multistage decimation sinc filters with N=2^M, where M indicates the number of decimation stages. As an example of the multistage decimation filter, we designed a k=2, N=4 sinc filter formed from a two-stage decimation structure using k=2, N=2 sinc filters, and confirmed its proper operation. The k=2, N=4 sinc filter consisted of 1372 Josephson junctions with a power consumption of 191 µW.
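The multistage structure can be checked numerically. The sketch below assumes the usual definition of a k-th order sinc filter with decimation factor N, that is, k cascaded N-sample moving sums followed by keeping every N-th output, and shows that M cascaded N=2 stages reproduce a single filter with N=2^M. The function names are hypothetical and the model ignores all circuit-level detail.

```python
def convolve(x, h):
    """Full discrete convolution of two integer sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def sinc_decimate(x, order, factor):
    """k-th order sinc filter (k cascaded moving sums of length `factor`)
    followed by decimation by `factor`."""
    h = [1]
    for _ in range(order):
        h = convolve(h, [1] * factor)
    return convolve(x, h)[::factor]


def multistage_sinc(x, order, stages):
    """Cascade `stages` sinc filters, each with decimation factor 2."""
    for _ in range(stages):
        x = sinc_decimate(x, order, 2)
    return x


samples = list(range(16))
direct = sinc_decimate(samples, 2, 4)     # k=2, N=4 in one stage
cascaded = multistage_sinc(samples, 2, 2) # two k=2, N=2 stages
```

The two results agree because (1 + z^-1)(1 + z^-2) = 1 + z^-1 + z^-2 + z^-3, so filtering and halving the rate twice is equivalent to one length-4 moving sum per order followed by decimation by 4.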
This article describes a simulation study on SQUID applications for Single-Flux-Quantum (SFQ) logic circuits. Here, a SQUID is compatible with a Quantum Flux Parametron (QFP). Several new circuits based on a SQUID are investigated. A cascaded SQUID is proposed whose signal amplitude is of the same order as an SFQ. SFQ-pulse driving circuits with the new SQUID are successfully simulated. An SFQ trap, which catches SFQs, is newly proposed. By focusing on the circulating current of a segment in a Josephson transmission line (JTL), an SFQ pulse is non-destructively detected by a SQUID. A conventional SQUID inserted in a JTL operates as a gate that controls SFQ-pulse transmission through it. Compatibility of SQUIDs and SFQ circuits is demonstrated.
We have quantitatively and systematically investigated the effect of parasitic inductance on rapid single flux quantum (RSFQ) circuits by numerical simulation. While a parasitic inductance in parallel to a junction has virtually no effect on the circuit performance, a parasitic inductance in series with a junction significantly reduces the operating margins and speeds of circuits that have been optimized with the assumption that no parasitic inductance exists. To improve the reduced margins and speeds we have re-optimized the circuits for operation with parasitic inductance. While the speeds are sufficiently improved by the re-optimization procedure, the margins do not reach those without the parasitics. This suggests that the parasitic inductance shrinks the operating regions of the circuits and improvement of the margins by changing only the values of the parameters is limited. For further improvement of the margins it is important to employ processes and layouts that minimize the series parasitic inductance.
This paper reviews the recent development of Boolean Single Flux Quantum (BSFQ) circuits. BSFQ circuits perform Boolean operations based on the superconducting flux level, and let digital bits propagate in the form of 'set' and 'reset' pulses using a dual-rail Josephson transmission line (JTL). Just like CMOS circuits, BSFQ circuits do not require any local clock system for the operation gates, and thus are delay insensitive and comparably simple in terms of the number of Josephson junctions. Implementation of basic BSFQ circuits, namely NOT, AND, OR, and XOR gates, is described. These circuits have been experimentally tested, and their workability has been proven.
Nobuyuki YOSHIKAWA Kaoru YONEYAMA
We have developed a parameter optimization tool, Monte Carlo Josephson simulator (MJSIM), for rapid single flux quantum (RSFQ) digital circuits based on a Monte Carlo yield analysis. MJSIM can generate a number of netlists for JSIM, where all parameter values are varied randomly according to a Gaussian distribution function, and calculate the circuit yields automatically. MJSIM can also produce an improved parameter set using the center-of-gravity method. In this algorithm, an improved parameter vector is derived by calculating the averages of the parameter vectors inside and outside the operating region. As a case study, we have optimized the circuit parameters of an RS flip-flop, and investigated the validity and efficiency of this optimization method by considering the convergence and the initial-condition dependence of the final results. We also propose a method for accelerating the optimization by increasing the 3σ spreads of the parameter distribution during the optimization.
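The Monte Carlo yield estimate and a center-of-gravity update can be sketched as follows. This is not MJSIM: the pass/fail test is a hypothetical box-shaped operating region standing in for a circuit simulation, and the update rule (move the nominal point along the direction from the centroid of failing samples to the centroid of passing samples, with an assumed damping factor `step`) is one common form of the center-of-gravity method, since the abstract does not give the exact formula.

```python
import random


def in_operating_region(params):
    """Hypothetical pass/fail check standing in for a circuit simulation:
    every normalized parameter must lie within 0.3 of 1.0."""
    return all(abs(p - 1.0) < 0.3 for p in params)


def sample(nominal, sigma, rng):
    """Draw one parameter vector with Gaussian spread around the nominal."""
    return [rng.gauss(p, sigma) for p in nominal]


def estimate_yield(nominal, sigma, n_samples, rng):
    """Fraction of random parameter vectors that pass."""
    passed = sum(in_operating_region(sample(nominal, sigma, rng))
                 for _ in range(n_samples))
    return passed / n_samples


def centroid(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def center_of_gravity_step(nominal, sigma, n_samples, rng, step=0.5):
    """Shift the nominal point away from failing and toward passing samples."""
    passing, failing = [], []
    for _ in range(n_samples):
        s = sample(nominal, sigma, rng)
        (passing if in_operating_region(s) else failing).append(s)
    if not passing or not failing:
        return list(nominal)  # no direction information in this batch
    g_pass, g_fail = centroid(passing), centroid(failing)
    return [p + step * (gp - gf)
            for p, gp, gf in zip(nominal, g_pass, g_fail)]


def optimize(nominal, sigma, n_samples, iterations, rng):
    for _ in range(iterations):
        nominal = center_of_gravity_step(nominal, sigma, n_samples, rng)
    return nominal


rng = random.Random(0)
start = [0.8, 1.15]                      # hypothetical off-center start
yield_before = estimate_yield(start, 0.1, 2000, rng)
improved = optimize(start, 0.1, 2000, 8, rng)
yield_after = estimate_yield(improved, 0.1, 2000, rng)
```

When the yield is already high, few samples fail and the update direction is poorly defined. Widening the sampling spread during optimization, as the abstract suggests, produces more failing samples per batch and therefore a better-defined direction.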
Hiroaki MYOREN Seiichiro ONO Susumu TAKADA
We propose a universal NAND logic gate based on single flux quantum (SFQ) logic. The NAND gate enables the construction of any logic circuits. In the proposed gate, three superconducting loops share two Josephson junctions (JJs). The critical currents of the JJs were designed to allow each of any two loops to trap an SFQ at the same time. We simulated dynamic operation of this NAND gate. The results show that the NAND gate can operate with a delay time of 45 ps, and the power consumption of this circuit is close to 0.06 µW/gate.