1-5hit |
Shuichi NAGASAWA Masamitsu TANAKA Naoki TAKEUCHI Yuki YAMANASHI Shigeyuki MIYAJIMA Fumihiro CHINA Taiki YAMAE Koki YAMAZAKI Yuta SOMEI Naonori SEGA Yoshinao MIZUGAKI Hiroaki MYOREN Hirotaka TERAI Mutsuo HIDAKA Nobuyuki YOSHIKAWA Akira FUJIMAKI
We developed a Nb 4-layer process for fabricating superconducting integrated circuits that involves using caldera planarization to increase the flexibility and reliability of the fabrication process. We call this process the planarized high-speed standard process (PHSTP). Planarization enables us to flexibly adjust most of the Nb and SiO2 film thicknesses; we can select reduced film thicknesses to obtain larger mutual coupling depending on the application. It also reduces the risk of intra-layer shorts due to etching residues at the step-edge regions. We describe the detailed process flows of the planarization for the Josephson junction layer and the evaluation of devices fabricated with PHSTP. The results indicated no short defects or degradation in junction characteristics and good agreement between designed and measured inductances and resistances. We also developed single-flux-quantum (SFQ) and adiabatic quantum-flux-parametron (AQFP) logic cell libraries and tested circuits fabricated with PHSTP. We found that the designed circuits operated correctly. The SFQ shift-registers fabricated using PHSTP showed a high yield. Numerical simulation results indicate that the AQFP gates with increased mutual coupling by the planarized layer structure increase the maximum interconnect length between gates.
A nonvolatile field-programmable gate array (NV-FPGA), where the circuit-configuration information still remains without power supply, offers a powerful solution against the standby power issue. In this paper, an NV-FPGA is proposed where the programmable logic and interconnect function blocks are described in a hardware description language and are pushed through a standard-cell-based design flow with nonvolatile flip-flops. The use of the standard-cell-based design flow makes it possible to migrate any arbitrary process technology and to perform architecture-level simulation with physical information. As a typical example, the proposed NV-FPGA is designed under 55nm CMOS/100nm magnetic tunnel junction (MTJ) technologies, and the performance of the proposed NV-FPGA is evaluated in comparison with that of a CMOS-only volatile FPGA.
Akira FUJIMAKI Masamitsu TANAKA Ryo KASAGI Katsumi TAKAGI Masakazu OKADA Yuhi HAYAKAWA Kensuke TAKATA Hiroyuki AKAIKE Nobuyuki YOSHIKAWA Shuichi NAGASAWA Kazuyoshi TAKAGI Naofumi TAKAGI
We describe a large-scale integrated circuit (LSI) design of rapid single-flux-quantum (RSFQ) circuits and demonstrate several reconfigurable data-path (RDP) processor prototypes based on the ISTEC Advanced Process (ADP2). The ADP2 LSIs are made up of nine Nb layers and Nb/AlOx/Nb Josephson junctions with a critical current density of 10kA/cm2, allowing higher operating frequencies and integration. To realize truly large-scale RSFQ circuits, careful design is necessary, with several compromises in the device structure, logic gates, and interconnects, balancing the competing demands of integration density, design flexibility, and fabrication yield. We summarize numerical and experimental results related to the development of a cell-based design in the ADP2, which features a unit cell size reduced to 30-µm square and up to four strip line tracks in the unit cell underneath the logic gates. The ADP LSIs can achieve ∼10 times the device density and double the operating frequency with the same power consumption per junction as conventional LSIs fabricated using the Nb four-layer process. We report the design and test results of RDP processor prototypes using the ADP2 cell library. The RDP processors are composed of many arrays of floating-point units (FPUs) and switch networks, and serve as accelerators in a high-performance computing system. The prototypes are composed of two-dimensional arrays of several arithmetic logic units instead of FPUs. The experimental results include a successful demonstration of full operation and reconfiguration in a 2×2 RDP prototype made up of 11.5k junctions at 45GHz after precise timing design. Partial operation of a 4×4 RDP prototype made up of 28.5k-junctions is also demonstrated, indicating the scalability of our timing design.
Yoshio KAMEDA Shinichi YOROZU Shuichi TAHARA
We describe the logic design of a single-flux-quantum (SFQ) 22 unit switch. It is the main component of the SFQ Banyan packet switch we are developing that enables a switching capacity of over 1 Tbit/s. In this paper, we focus on the design of the controller in the unit switch. The controller does not have a simple "off-the-shelf" conventional circuit, like those used in shift registers or adders. To design such a complicated random logic circuit, we need to adopt a systematic top-down design approach. Using a graphical technique, we first obtained logic functions. Next, to use the deep pipeline architecture, we broke down the functions into one-level logic operations that can be executed within one clock cycle. Finally, we mapped the functions on to the physical circuits using pre-designed SFQ standard cells. The 22 unit switch consists of 59 logic gates and needs about 600 Josephson junctions without gate interconnections. We tested the gate-level circuit by logic simulation and found that it operates correctly at a throughput of 40 GHz.
This paper presents a heuristic algorithm that minimizes the delay of the given circuit through a two-pass cell selection in cell-based design. First, we introduce a new graph, called candidate web, which conveniently represents all cell combinations available for the implementation of the given circuit. We, then, present an efficient method to obtain a tentative set of optimal cells, while estimating the delay of the longest path between each cell and the primary output on the candidate web. In this step, multiple cells are allowed to bind the same logic gate. Finally, we describe how the proposed approach actually selects the optimal cells from the tentative set, which would minimize the circuit delay. Experimental results on a set of benchmarks show that the proposed approach is effective and efficient in minimizing the delay of the given circuit.