Mitsuyoshi KISHIHARA Masaya TAKEUCHI Akinobu YAMAGUCHI Yuichi UTSUMI Isao OHTA
The microfabrication technique based on the synchrotron radiation (SR) direct etching process has recently been applied to construct PTFE microstructures. This paper proposes a PTFE substrate integrated waveguide (PTFE SIW). The PTFE SIW is expected to contribute to the improvement of the structural strength. A rectangular through-hole is introduced, taking advantage of the SR direct etching process. First, a PTFE SIW for the Q-band is designed. Then, a cruciform 3-dB directional coupler consisting of the PTFE SIW is designed and fabricated by the SR direct etching process. The validity of the PTFE SIW coupler is confirmed by measuring the frequency characteristics of the S-parameters. The mechanical strength of the PTFE SIW and the peeling strength of its Au film are also investigated.
Mutsuo HIDAKA Shuichi NAGASAWA
This review provides a current overview of the fabrication processes for superconducting digital circuits at CRAVITY (clean room for analog and digital superconductivity) at the National Institute of Advanced Industrial Science and Technology (AIST), Japan. CRAVITY routinely fabricates superconducting digital circuits using three types of fabrication processes and supplies several thousand chips to its collaborators each year. Researchers at CRAVITY have focused on improving the controllability and uniformity of device parameters and the reliability, which means reducing defects. These three aspects are important for the correct operation of large-scale digital circuits. The current technologies used at CRAVITY permit ±10% controllability over the critical current density (Jc) of Josephson junctions (JJs) with respect to the design values, while the critical current (Ic) uniformity is within 1σ=2% for JJs with areas exceeding 1.0 µm² and the defect density is on the order of one defect for every 100,000 JJs.
Akira ITO Rei UENO Naofumi HOMMA
This study presents a formal verification method for Galois-field (GF) arithmetic circuits with the characteristics of more than two values. The proposed method formally verifies the correctness of circuit functionality (i.e., the input-output relations given as GF-polynomials) by checking the equivalence between a specification and a gate-level netlist. We represent a netlist using simultaneous algebraic equations and solve them based on a novel polynomial reduction method that can be efficiently applied to arithmetic over extension fields $\mathbb{F}_{p^m}$, where the characteristic p is larger than two. By using the reverse topological term order to derive the Gröbner basis, our method can complete the verification, even when a target circuit includes bugs. In addition, we introduce an extension of the Galois-Field binary moment diagrams to perform the polynomial reductions faster. Our experimental results show that the proposed method can efficiently verify practical $\mathbb{F}_{p^m}$ arithmetic circuits, including those used in modern cryptography. Moreover, we demonstrate that the extended polynomial reduction technique can enable verification that is up to approximately five times faster than the original one.
Thao-Nguyen TRUONG Ryousei TAKANO
Data parallelism is the dominant method used to train deep learning (DL) models on High-Performance Computing systems such as large-scale GPU clusters. When training a DL model on a large number of nodes, inter-node communication becomes a bottleneck because of its relatively higher latency and lower link bandwidth compared with intra-node communication. Although some communication techniques have been proposed to cope with this problem, all of these approaches aim to deal with the large message size issue while diminishing the effect of the limitation of the inter-node network. In this study, we investigate the benefit of increasing inter-node link bandwidth by using hybrid switching systems, i.e., Electrical Packet Switching and Optical Circuit Switching. We found that the typical data transfer of synchronous data-parallel training is long-lived and rarely changes, so it can be sped up with optical switching. Simulation results on the Simgrid simulator show that our approach reduces the training time of deep learning applications, especially at large scale.
Tsutomu SASAO Takashi MATSUBARA Katsufumi TSUJI Yoshiaki KOGA
A universal interconnection network implements arbitrary interconnections among n terminals. This paper considers a problem to realize such a network using contact switches. When n=2, it can be implemented with a single switch. The number of different connections among n terminals is given by the Bell number B(n). The Bell number shows the total number of methods to partition n distinct elements. For n=2, 3, 4, 5 and 6, the corresponding Bell numbers are 2, 5, 15, 52, and 203, respectively. This paper shows a method to realize an n-terminal universal interconnection network with $\frac{3}{8}(n^2-1)$ contact switches when n=2m+1≥5, and $\frac{n}{8}(3n+2)$ contact switches when n=2m≥6. Also, it shows that a lower bound on the number of contact switches to realize an n-terminal universal interconnection network is $\lceil \log_2 B(n) \rceil$, where B(n) is the Bell number.
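The quantities in this abstract can be checked numerically. The following is a minimal Python sketch, not taken from the paper: the function names `bell`, `switch_count`, and `lower_bound` are hypothetical, and the code only evaluates the closed-form expressions and the Bell numbers quoted above.

```python
def bell(n: int) -> int:
    """Bell number B(n) for n >= 1, computed with the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        nxt = [row[-1]]          # each new row starts with the last entry of the previous row
        for v in row:
            nxt.append(nxt[-1] + v)
        row = nxt
    return row[-1]


def switch_count(n: int) -> int:
    """Number of contact switches given by the formulas quoted in the abstract."""
    if n % 2 == 1 and n >= 5:
        return 3 * (n * n - 1) // 8      # (3/8)(n^2 - 1), n = 2m+1 >= 5
    if n % 2 == 0 and n >= 6:
        return n * (3 * n + 2) // 8      # (n/8)(3n + 2), n = 2m >= 6
    raise ValueError("formulas are stated only for odd n >= 5 and even n >= 6")


def lower_bound(n: int) -> int:
    """ceil(log2 B(n)), computed exactly with integer arithmetic."""
    return (bell(n) - 1).bit_length()


if __name__ == "__main__":
    for n in range(5, 9):
        print(n, bell(n), lower_bound(n), switch_count(n))
```

For n=5 this gives B(5)=52, a lower bound of 6 switches, and 9 switches from the formula; for n=6 it gives B(6)=203, a lower bound of 8, and 15 switches.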
Shinpei OSHIMA Hiroto MARUYAMA
In this paper, we propose a design method for a diplexer using a surface acoustic wave (SAW) filter, a multilayer ceramic filter, chip inductors, and chip capacitors. A controllable transmission zero can be created in the stopband by designing matching circuits based on the out-of-band characteristics of the SAW filter using this method. The proposed method can achieve good attenuation performance and a compact size because it does not use an additional resonator for creating the controllable transmission zero and the matching circuits are composed of only five components. A diplexer is designed for 2.4 GHz wireless systems and a global positioning system receiver using the proposed method. It is compact (8.0 mm × 8.0 mm), and the measurement results indicate good attenuation performance with the controllable transmission zero.
Yuta UKON Shimpei SATO Atsushi TAKAHASHI
Advanced information-processing services such as computer vision require a high-performance digital circuit to perform high-load processing at high speed. To achieve high-speed processing, several image-processing applications use an approximate computing technique to reduce idle time of the circuit. However, it is difficult to design the high-speed image-processing circuit while controlling the error rate so as not to degrade service quality, and this technique is used for only a few applications. In this paper, we propose a method that achieves high-speed processing effectively in which processing time for each task is changed by roughly detecting its completion. Using this method, a high-speed processing circuit with a low error rate can be designed. The error rate is controllable, and a circuit design method to minimize the error rate is also presented in this paper. To confirm the effectiveness of our proposal, a ripple-carry adder (RCA), 2-dimensional discrete cosine transform (2D-DCT) circuit, and histogram of oriented gradients (HOG) feature calculation circuit are evaluated. Effective clock periods of these circuits obtained by our method with around 1% error rate are improved by about 64%, 6%, and 12%, respectively, compared with circuits without error. Furthermore, the impact of the miscalculation on a video monitoring service using an object detection application is investigated. As a result, more than 99% of detection points required to be obtained are detected, and it is confirmed that the miscalculation hardly degrades the service quality.
Toshishige SHIMAMURA Hiroki MORIMURA
A new threshold circuit technique is proposed for a vibration sensing circuit that operates at a nanowatt power level. The sensing circuits that use sample-and-hold require a clock signal, and they consume power to generate a signal. In the use of a Schmitt trigger circuit that does not use a clock signal, a sink current flows when thresholding the analog signal output. The requirements for millimeter-sized wireless sensor nodes are an average power on the order of a nanowatt and a signal transition time of less than 1 ms. To meet these requirements, our circuit limits the sink current with a nanoampere-level current source. The chattering caused by current limiting is suppressed by feeding back the change in output voltage to the limiting current. The increase in the signal transition time that is caused by current limiting is reduced by accelerating the discharge of the load capacitance. For a test chip fabricated in the 0.35-µm CMOS process, the proposed threshold circuits operate without chattering and the average powers are 0.7-3 nW. The signal transition times are estimated in a circuit simulation to be 65-97 µs. The proposed circuit has 1/150th the power-delay product with no time interval of the sensing operation under the condition that the time interval is 1 s. These results indicate that the proposed threshold circuits are suitable for vibration sensing in millimeter-sized wireless sensor nodes.
This paper proposes a pulse-width modulated (PWM) signaling[1] scheme to send clock and data over a pair of channels for an in-vehicle network in which a closed chain of point-to-point (P2P) interconnections between electronic control units (ECUs) has been established. To improve the detection speed and margin of the proposed receiver, we also propose a novel clock and data recovery (CDR) scheme with a 0.5 unit-interval (UI) tuning range and a PWM generator utilizing 10 equally spaced phases. The feasibility of the proposed system has been demonstrated by successfully detecting 1.25 Gb/s data delivered via 3 ECUs and inter-channels in 180 nm CMOS technology. Compared to a previous study, the proposed system achieved better efficiency in terms of power, cost, and reliability.
Robert Chen-Hao CHANG Wei-Chih CHEN Shao-Che SU
A switching-based Li-ion battery charger without any additional compensation circuit is proposed. The proposed charger adopts a dual-current sensor and a current window control to ensure system stability in different charge modes: trickle current, constant current, and constant voltage. The proposed Li-ion battery charger has less chip area and a simpler structure to design than a conventional Li-ion battery charger with pulse width modulation. Simulation with a 1000µF capacitor as the battery equivalent, a 5V input, and a 1A charge current resulted in a charging time of 1.47ms and a 91% power efficiency.
The circuit satisfiability problem has been intensively studied since Ryan Williams showed a connection between the problem and lower bounds for circuit complexity. In this letter, we present a #SAT algorithm for synchronous Boolean circuits of n inputs and s gates in time $2^{n\left(1 - \frac{1}{2^{O(s/n)}}\right)}$ if s=o(n log n).
Chao WANG Xianliang LUO Mohamed ATEF Pan TANG
In this paper, a low-noise balanced-operation Transimpedance Amplifier (TIA) has been implemented for optical receivers in 130 nm SiGe BiCMOS Technology, in which the optimal tradeoff emitter current density and the location of the high-frequency noise corner were analyzed to obtain low-noise performance. The Auto-Zero Feedback Loop (AZFL), which introduces no unnecessary noise at the input of the TIA, the highly symmetric tail current sink, and the balanced-operation TIA with the shared output of the Operational Amplifier (OpAmp) in the AZFL were designed to keep the TIA in balanced operation. Moreover, cascode and shunt-feedback were also employed to expand the bandwidth and decrease the input-referred noise. In addition, the formula for calculating the high-frequency noise corner in a Heterojunction Bipolar Transistor (HBT) TIA with shunt-feedback was derived. Electrical measurements were performed to validate the notions described in this work, showing an input-referred noise current Power Spectral Density (PSD) of 9.6 pA/√Hz, balanced operation (VIN1=896mV, VIN2=896mV, VOUT1=1.978V, VOUT2=1.979V), a bandwidth of 32GHz, an overall transimpedance gain of 68.6dBΩ, a total power consumption of 117mW, and a chip area of 484µm × 486µm.
Shogo SEMBA Hiroshi SAITO Masato TATSUOKA Katsuya FUJIMURA
In this paper, we propose four optimization methods during the Register Transfer Level (RTL) conversion from synchronous RTL models into asynchronous RTL models. The modularization of data-path resources and the use of appropriate D flip-flops reduce the circuit area. Fixing the control signal of the multiplexers and inserting latches for the data-path resources reduce the dynamic power consumption. In the experiment, we evaluated the effect of the proposed optimization methods. The combination of all optimization methods could reduce the energy consumption by 21.9% on average compared to the ones without the proposed optimization methods.
Tsutomu INAMOTO Yoshinobu HIGAMI
In this paper, we aim to develop technologies for circuit fault diagnosis and propose a formulation of a measure of a test pattern for circuit fault diagnosis. Given a faulty circuit, fault diagnosis is to deduce the locations of faults that have occurred in the circuit. Fault diagnosis is executed in software before the failure analysis, in which engineers inspect physical defects, and helps to improve the manufacturing process that yielded the faulty circuits. The heart of fault diagnosis is to distinguish between candidate faults by using test patterns, which are applied to the circuit-under-diagnosis (CUD), and thus test patterns that can distinguish as many faults as possible need to be generated. This fact motivates us to consider a test pattern measure based on the number of fault-pairs that become distinguished by a test pattern. To the best of the authors' knowledge, that measure requires computational time of complexity order $O(N_F^2)$, where $N_F$ denotes the number of candidate faults. Since $N_F$ is generally large for real industrial circuits, the computational time of the measure is long even when a high-performance computer is used. The formulation proposed in this paper makes it possible to calculate the measure with a computational complexity of $O(N_F \log N_F)$, and thus that measure is useful for test pattern selection in fault diagnosis. In computational experiments, the effectiveness of the formulation is demonstrated by sample computational times of the measure calculated by the traditional and the proposed formulae, and by thorough comparisons between several greedy heuristics that are based on the measure.
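As an illustration of why a pairwise measure need not cost quadratic time, the following Python sketch counts the fault pairs distinguished by a single test pattern in two ways. It is not the authors' formulation, which the abstract does not spell out; it assumes, hypothetically, that each candidate fault is summarized by its simulated output response to the pattern, and that two faults are distinguished exactly when their responses differ. The function names are illustrative.

```python
from collections import Counter
from itertools import combinations
from math import comb


def distinguished_pairs_naive(responses):
    """Compare every pair of faults directly: quadratic in the number of faults."""
    return sum(1 for a, b in combinations(responses, 2) if a != b)


def distinguished_pairs_grouped(responses):
    """Group faults by response, then subtract the pairs that stay together.

    All pairs:            C(N, 2)
    Undistinguished pairs: sum of C(g, 2) over groups of identical responses
    Grouping by hashing (or by sorting) avoids enumerating the pairs.
    """
    n = len(responses)
    groups = Counter(responses)
    undistinguished = sum(comb(g, 2) for g in groups.values())
    return comb(n, 2) - undistinguished


if __name__ == "__main__":
    # Hypothetical output responses of five candidate faults to one test pattern.
    responses = ["01", "01", "10", "11", "10"]
    print(distinguished_pairs_naive(responses))
    print(distinguished_pairs_grouped(responses))
```

Both functions return the same count; only the second avoids the pairwise loop, which is the kind of saving the reduction from quadratic to near-linear time refers to.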
In this paper, we propose a method to design asynchronous circuits with bundled-data implementation on commercial Field Programmable Gate Arrays using placement constraints. The proposed method uses two types of placement constraints to reduce the number of delay adjustments to fix timing violations and to improve the performance of the bundled-data implementation. We also propose a floorplan algorithm to reduce the control-path delays specific to the bundled-data implementation. Using the proposed method, we could design asynchronous circuits whose performance is close to, and whose energy consumption is lower than, that of the synchronous counterparts, with fewer delay adjustments.
Keijiro SUZUKI Ryotaro KONOIKE Satoshi SUDA Hiroyuki MATSUURA Shu NAMIKI Hitoshi KAWASHIMA Kazuhiro IKEDA
We review our research progress of multi-port optical switches based on the silicon photonics platform. Up to now, the maximum port-count is 32 input ports×32 output ports, in which transmissions of all paths were demonstrated. The switch topology is path-independent insertion-loss (PILOSS) which consists of an array of 2×2 element switches and intersections. The switch presented an average fiber-to-fiber insertion loss of 10.8 dB. Moreover, -20-dB crosstalk bandwidth of 14.2 nm was achieved with output-port-exchanged element switches, and an average polarization-dependent loss (PDL) of 3.2 dB was achieved with a non-duplicated polarization-diversity structure enabled by SiN overpass waveguides. In the 8×8 switch, we demonstrated wider than 100-nm bandwidth for less than -30-dB crosstalk with double Mach-Zehnder element switches, and less than 0.5 dB PDL with polarization diversity scheme which consisted of two switch matrices and fiber-type polarization beam splitters. Based on the switch performances described above, we discuss further improvement of switching performances.
Roberto PROIETTI Xian XIAO Marjan FARIBORZ Pouya FOTOUHI Yu ZHANG S. J. Ben YOO
This paper summarizes our recent studies on architecture, photonic integration, system validation and networking performance analysis of a flexible low-latency interconnect optical network switch (Flex-LIONS) for datacenter and high-performance computing (HPC) applications. Flex-LIONS leverages the all-to-all wavelength routing property in arrayed waveguide grating routers (AWGRs) combined with microring resonator (MRR)-based add/drop filtering and multi-wavelength spatial switching to enable topology and bandwidth reconfigurability to adapt the interconnection to different traffic profiles. By exploiting the multiple free spectral ranges of AWGRs, it is also possible to provide reconfiguration while maintaining minimum-diameter all-to-all interconnectivity. We report experimental results on the design, fabrication, and system testing of 8×8 silicon photonic (SiPh) Flex-LIONS chips demonstrating error-free all-to-all communication and reconfiguration exploiting different free spectral ranges (FSR0 and FSR1, respectively). After reconfiguration in FSR1, the bandwidth between the selected pair of nodes is increased from 50Gb/s to 125Gb/s while all-to-all interconnectivity at 25Gb/s is maintained using FSR0. Finally, we investigate the use of Flex-LIONS in two different networking scenarios. First, networking simulations for a 256-node datacenter inter-rack communication scenario show the potential latency and energy benefits when using Flex-LIONS for optical reconfiguration based on different traffic profiles (a legacy fat-tree architecture is used for comparison). Second, we demonstrate the benefits of leveraging two FSRs in an 8-node 64-core computing system to provide reconfiguration for the hotspot nodes while maintaining minimum-diameter all-to-all interconnectivity.
Daichi FURUBAYASHI Yuta KASHIWAGI Takanori SATO Tadashi KAWAI Akira ENOKIHARA Naokatsu YAMAMOTO Tetsuya KAWANISHI
A new electro-optic modulator structure that compensates for the third-order intermodulation distortion (IMD3) is introduced. The modulator includes two Mach-Zehnder modulators (MZMs) operating with frequency chirp, and the two modulated outputs are combined with an adequate phase difference. We revealed by theoretical analysis and numerical calculations that the IMD3 components in the receiver output could be selectively suppressed when the two MZMs operate with chirp parameters of opposite signs to each other. The spectral power of the IMD3 components in the proposed modulator was more than 15dB lower than that in a normal Mach-Zehnder modulator at modulation indices between 0.15π and 0.25π rad. The IMD3 compensation properties of the proposed modulator were experimentally confirmed by using a dual parallel Mach-Zehnder modulator (DPMZM) structure. We designed and fabricated the modulator with a single-chip structure and single-input operation by integrating a 180° hybrid coupler on the modulator substrate. Modulation signals were applied to each modulation electrode by the 180° hybrid coupler to set the chirp parameters of the two MZMs of the DPMZM. The properties of the fabricated modulator were measured by using 10GHz two-tone signals. The performance of the IMD3 compensation agreed with that in the calculation. It was confirmed that the IMD3 compensation could be realized even by the fabricated modulator structure.
Ai YANAGIHARA Keita YAMAGUCHI Takashi GOH Kenya SUZUKI
We demonstrated a compact 16×16 multicast switch (MCS) made from a silica-based planar lightwave circuit (PLC). The switch utilizes a new electrical connection method based on surface mount technology (SMT). Five electrical connectors are soldered directly to the PLC by using the standard reflow process used for electrical devices. We reduced the chip size to half of one made with conventional wire bonding technology. We obtained satisfactory solder contacts and excellent switching properties. These results indicate that the proposed method is suitable for large-scale optical switches including MCSs, variable optical attenuators, dispersion compensators, and so on.
Toshiya MURAI Yuya SHOJI Nobuhiko NISHIYAMA Tetsuya MIZUMOTO
Magneto-optical (MO) switches operate with a dynamically applied magnetic field. The MO devices presented in this paper consist of microring resonators (MRRs) fabricated on amorphous silicon-on-garnet platform. Two types of MO switches with MRRs were developed. In the first type, the switching state is controlled by an external magnetic field component included in the device. By combination of MO and thermo-optic effects, wavelength tunable operation is possible without any additional heater, and broadband switching is achievable. The other type of switch is a self-holding optical switch integrated with an FeCoB thin-film magnet. The switching state is driven by the remanence of the integrated thin-film magnet, and the state is maintained without any power supply.