The reference current used in sense amplifiers is a crucial factor in single-ended read schemes for emerging memories. The dummy cell average read scheme uses multiple pairs of dummy cells inside the array to generate an accurate reference current for data sensing. Previous research adopted the current mirror sense amplifier (CMSA), which is compatible with the dummy cell average read scheme. However, the clamped bit-line sense amplifier (CBLSA) offers higher sensing speed and lower power consumption than the CMSA. Applying the CBLSA to the dummy cell average read scheme is therefore expected to enhance performance. This paper reveals that directly combining the CBLSA with the dummy cell average read scheme degrades the sense margin. To solve this problem, a new array design is proposed that makes the CBLSA compatible with the dummy cell average read scheme. A current mirror structure is employed to prevent the CBLSA from being short-circuited directly. Simulation results show that the minimum sensible tunnel magnetoresistance ratio (TMRR) can be extended from 14.3% down to 1%. The access time of the proposed sensing scheme is less than 2 ns when the TMRR is 70% or larger, which is about twice as fast as the previous research. The circuit also consumes only half the energy per read cycle compared with the previous research. In the proposed array architecture, all the dummy cells can be permanently short-circuited in a totally isolated area by low-resistance metal wiring instead of by control transistors. This structure strengthens the dummy cell averaging effect. In addition, array-level simulation validates that every data cell in the array can be accessed. The design is generally applicable to any kind of resistance-variable emerging memory, including STT-MRAM.
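The idea of averaging dummy cells to obtain a reference current can be illustrated with a highly idealized sketch. It assumes ideal resistive cells read at a fixed clamp voltage and the common definition TMRR = (R_AP - R_P) / R_P; the function names, the 0.1 V read voltage, and the resistance values are hypothetical and are not taken from the paper, whose circuit is not modeled here.

```python
from typing import List, Tuple


def tmr_ratio(r_p: float, r_ap: float) -> float:
    """Tunnel magnetoresistance ratio, assuming the usual (R_AP - R_P) / R_P definition."""
    return (r_ap - r_p) / r_p


def reference_current(v_read: float, dummy_pairs: List[Tuple[float, float]]) -> float:
    """Mean current over all dummy cells at a fixed read voltage.

    dummy_pairs holds (R_P, R_AP) resistances in ohms, one tuple per dummy pair.
    Averaging over several pairs reduces the effect of cell-to-cell variation.
    """
    currents = []
    for r_p, r_ap in dummy_pairs:
        currents.append(v_read / r_p)   # low-resistance (parallel) dummy cell
        currents.append(v_read / r_ap)  # high-resistance (antiparallel) dummy cell
    return sum(currents) / len(currents)


def sense_margin(v_read: float, r_p: float, r_ap: float, i_ref: float) -> float:
    """Smaller of the two current gaps between a data cell and the reference."""
    return min(v_read / r_p - i_ref, i_ref - v_read / r_ap)


# Hypothetical values: nominal 5 kOhm / 8.5 kOhm cells (TMRR = 70%) with some spread.
pairs = [(5000.0, 8500.0), (5200.0, 8300.0), (4900.0, 8700.0)]
i_ref = reference_current(0.1, pairs)
print(f"TMRR = {tmr_ratio(5000.0, 8500.0):.0%}, I_ref = {i_ref * 1e6:.2f} uA")
print(f"margin = {sense_margin(0.1, 5000.0, 8500.0, i_ref) * 1e6:.2f} uA")
```

In this simplified picture, a smaller TMRR narrows the gap between the two cell currents, so the reference has to sit closer to the midpoint for both states to remain distinguishable.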
Kento HASEGAWA Masao YANAGISAWA Nozomu TOGAWA
Cybersecurity has become a serious concern in our daily lives. The malicious functions inserted into hardware devices have been well known as hardware Trojans. In this letter, we propose a hardware-Trojan classification method at gate-level netlists utilizing boundary net structures. We first use a machine-learning-based hardware-Trojan detection method and classify the nets in a given netlist into a set of normal nets and a set of Trojan nets. Based on the classification results, we investigate the net structures around the boundary between normal nets and Trojan nets, and extract the features of the nets mistakenly identified to be normal nets or Trojan nets. Finally, based on the extracted features of the boundary nets, we again classify the nets in a given netlist into a set of normal nets and a set of Trojan nets. The experimental results demonstrate that our proposed method outperforms an existing machine-learning-based hardware-Trojan detection method in terms of its true positive rate.
Keisuke KAYANO Yojiro MORI Hiroshi HASEGAWA Ken-ichi SATO Shoichiro ODA Setsuo YOSHIDA Takeshi HOSHIDA
The spectral efficiency of photonic networks can be enhanced by the use of higher modulation orders and narrower channel bandwidth. Unfortunately, these solutions are precluded by the margins required to offset uncertainties in system performance. Furthermore, as recently highlighted, the disaggregation of optical transport systems increases the required margin. We propose here highly spectrally efficient networks, whose margins are minimized by transmission-quality-aware adaptive modulation-order/channel-bandwidth assignment enabled by optical performance monitoring (OPM). Their effectiveness is confirmed by experiments on 400-Gbps dual-polarization quadrature phase shift keying (DP-QPSK) and 16-ary quadrature amplitude modulation (DP-16QAM) signals with the application of recently developed Q-factor-based OPM. Four-subcarrier 32-Gbaud DP-QPSK signals within 150/162.5/175GHz and two-subcarrier 32-Gbaud DP-16QAM signals within 75/87.5/100GHz are experimentally analyzed. Numerical network simulations in conjunction with the experimental results demonstrate that the proposed scheme can drastically improve network spectral efficiency.
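The figures quoted above can be turned into a rough spectral-efficiency comparison. The sketch below assumes that 400 Gbps is the net rate per channel and that the listed values are the channel bandwidths; attributing the gap between raw and net rate to coding and framing overhead is an assumption, since the abstract does not say. The function names are illustrative.

```python
NET_RATE_GBPS = 400.0  # net channel rate stated in the abstract


def raw_rate_gbps(subcarriers: int, baud_gbaud: float, bits_per_symbol: int,
                  polarizations: int = 2) -> float:
    """Raw line rate: subcarriers x symbol rate x bits per symbol x polarizations."""
    return subcarriers * baud_gbaud * bits_per_symbol * polarizations


def spectral_efficiency(net_rate_gbps: float, bandwidth_ghz: float) -> float:
    """Net spectral efficiency in bit/s/Hz."""
    return net_rate_gbps / bandwidth_ghz


formats = {
    # name: (subcarriers, bits per symbol per polarization, channel bandwidths in GHz)
    "DP-QPSK": (4, 2, (150.0, 162.5, 175.0)),
    "DP-16QAM": (2, 4, (75.0, 87.5, 100.0)),
}

for name, (subcarriers, bits, bandwidths) in formats.items():
    raw = raw_rate_gbps(subcarriers, 32.0, bits)
    for bw in bandwidths:
        se = spectral_efficiency(NET_RATE_GBPS, bw)
        print(f"{name}: raw {raw:.0f} Gbps, {bw} GHz -> {se:.2f} bit/s/Hz")
```

Under these assumptions both formats carry the same raw rate, and the narrower DP-16QAM channels roughly double the spectral efficiency, which is why assigning the modulation order and bandwidth per path matters.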
Hideo FUJIWARA Katsuya FUJIWARA Toshinori HOSOKAWA
Linear feed-forward/feedback shift registers are used as an effective tool for testing circuits in various fields, including built-in self-test and secure scan design. In this paper, we consider the problem of testing linear feed-forward/feedback shift registers themselves. To test linear feed-forward/feedback shift registers, it is necessary to generate a test sequence for each register. We first present an experimental result showing that a commercial ATPG (automatic test pattern generator) cannot always generate a test sequence with high fault coverage, even for 64-stage linear feed-forward/feedback shift registers. We then show that there exists a universal test sequence with 100% fault coverage for the class of linear feed-forward/feedback shift registers, so that no test generation is required, i.e., the cost of test generation is zero. We prove an existence theorem for universal test sequences for the class of linear feed-forward/feedback shift registers.
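For readers unfamiliar with the circuits under test, the following minimal sketch models the two register types in software. The tap positions, the 4-stage size, and the convention that the feed-forward output also XORs in the input bit are illustrative assumptions; the sketch does not implement the universal test sequence of the paper.

```python
from typing import List, Optional, Sequence, Tuple


def feedback_step(state: Sequence[int], taps: Sequence[int]) -> List[int]:
    """One clock of a linear feedback shift register.

    The XOR of the tapped stages is shifted into stage 0; the last stage drops out.
    """
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    return [feedback] + list(state[:-1])


def feedforward_step(state: Sequence[int], taps: Sequence[int],
                     in_bit: int) -> Tuple[List[int], int]:
    """One clock of a linear feed-forward shift register.

    The input bit is shifted into stage 0; the output is the XOR of the input
    and the tapped stages (an assumed convention).
    """
    out = in_bit
    for t in taps:
        out ^= state[t]
    return [in_bit] + list(state[:-1]), out


def feedback_period(seed: Sequence[int], taps: Sequence[int]) -> Optional[int]:
    """Number of clocks until the register returns to its seed, or None if it never does."""
    state = list(seed)
    for steps in range(1, 2 ** len(seed) + 1):
        state = feedback_step(state, taps)
        if state == list(seed):
            return steps
    return None


# 4-stage register with taps on the last two stages (polynomial x^4 + x^3 + 1).
print(feedback_period([1, 0, 0, 0], taps=(2, 3)))  # 15
```

A period of 15 means the register cycles through every nonzero 4-bit state, the property that makes such registers useful as pattern generators in built-in self-test.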
Keisuke NAKASHIMA Takahiro MATSUDA Masaaki NAGAHARA Tetsuya TAKINE
Wireless networked control systems (WNCSs) are control systems whose components are connected through wireless networks. In WNCSs, a controlled object (CO) could become unstable due to bursty packet losses in addition to random packet losses and round-trip delays on wireless networks. In this paper, to reduce these network-induced effects, we propose a new design for multihop TDMA-based WNCSs with two-disjoint-path switching, where two disjoint paths are established between a controller and a CO, and they are switched if bursty packet losses are detected. In this system, we face the following two difficulties: (i) link scheduling in TDMA should be done in such a way that the two paths can be switched without rescheduling, taking into account the constraints of control systems; and (ii) the conventional cross-layer design method for control systems is not directly applicable because round-trip delays may vary according to the path being used. Therefore, to overcome the difficulties raised by the two-path approach, we reformulate link scheduling in multihop TDMA and cross-layer design for control systems. Simulation results confirm that the proposed WNCS achieves better performance in terms of the 2-norm of the CO's states.
Meiting XUE Huan ZHANG Weijun LI Feng YU
Sorting is one of the most fundamental problems in mathematics and computer science. Because high-throughput and flexible sorting is a key requirement in modern databases, this paper presents efficient techniques for designing a high-throughput sorting matrix that supports continuous data sequences. There have been numerous studies on the optimization of sorting circuits on FPGA (field-programmable gate array) platforms. These studies focused on attaining high throughput for a single command with fixed data width. However, the architectures proposed do not meet the requirement of diversity for database data types. A sorting matrix architecture is thus proposed to overcome this problem. Our design consists of a matrix of identical basic sorting cells. The sorting cells work in a pipeline and in parallel, and the matrix can simultaneously process multiple data streams, which can be combined into a high-width single-channel data stream or low-width multiple-channel data streams. It can handle continuous sequences and allows for sorting variable-length data sequences. Its maximum throughput is approximately 1.4 GB/s for 32-bit sequences and approximately 2.5 GB/s for 64-bit sequences on our platform.
Yuma ABE Masaki OGURA Hiroyuki TSUJI Amane MIURA Shuichi ADACHI
Satellite communications (SATCOM) systems play important roles in wireless communication systems. In the future, they will be required to accommodate rapidly increasing communication requests from various types of users. Therefore, we propose a framework for efficient resource management in large-scale SATCOM systems that integrate multiple satellites. Such systems contain hundreds of thousands of communication satellites, user terminals, and gateway stations; thus, our proposed framework enables simpler and more reliable communication between users and satellites. To manage and control this system efficiently, we formulate an optimization problem that designs the network structure and allocates communication resources for a large-scale SATCOM system. In this mixed integer programming problem, we allow the cost function to be a combination of various factors so that SATCOM operators can design the network according to their individual management strategies. These factors include the total allocated bandwidth to users, the number of satellites and gateway stations to be used, and the number of total satellite handovers. Our numerical simulations show that the proposed management strategy outperforms a conventional strategy in which a user can connect to only one specific satellite determined in advance. Furthermore, we determine the effect of the number of satellites in the system on overall system performance.
Tung Thanh VU Duy Trong NGO Minh N. DAO Quang-Thang DUONG Minoru OKADA Hung NGUYEN-LE Richard H. MIDDLETON
This paper studies the joint optimization of precoding, transmit power and data rate allocation for energy-efficient full-duplex (FD) cloud radio access networks (C-RANs). A new nonconvex problem is formulated, where the ratio of total sum rate to total power consumption is maximized, subject to the maximum transmit powers of remote radio heads and uplink users. An iterative algorithm based on successive convex programming is proposed with guaranteed convergence to the Karush-Kuhn-Tucker solutions of the formulated problem. Numerical examples confirm the effectiveness of the proposed algorithm and show that the FD C-RANs can achieve a large gain over half-duplex C-RANs in terms of energy efficiency at low self-interference power levels.
Daijoon HYUN Younggwang JUNG Youngsoo SHIN
Multiple patterning lithography enables fine patterns beyond the lithography limit, but it suffers from a large process cost. In this paper, we present a method to reduce the number of V0 masks; it consists of two sub-problems. First, stitch-induced via (SIV) is introduced to reduce the number of V0 masks. It involves the redesign of standard cells to replace some vias in the V0 layer with SIVs, such that the remaining vias can be assigned to the reduced number of masks. Since SIV formation requires metal stitches in different masks, SIV replacement and metal mask assignment should be solved simultaneously. This sub-problem is formulated as integer linear programming (ILP). In the second sub-problem, inter-row via conflict aware detailed placement is addressed. Single row placement optimization is performed for each row to remove metal and inter-row via conflicts while minimizing cell displacements. Since it is time consuming to consider many cell operations at once, we apply a few operations iteratively, where different operations are applied in each iteration and to each cell depending on whether the cell had a conflict in the previous iteration. Remaining conflicts are then removed by mapping conflict cells to white spaces. To this end, we minimize the number of cells to move and maximize the number of large white spaces before mapping. Experimental results demonstrate that cell placement with two V0 masks is completed by the proposed methods, with a 7-fold speedup and a 21% reduction in total cell displacement compared to conventional detailed placement.
Kai NAKAMURA Kenta IWAI Yoshinobu KAJIKAWA
In this paper, we propose an automatic design support system for compact acoustic devices such as microspeakers inside smartphones. The proposed design support system outputs the dimensions of compact acoustic devices with the desired acoustic characteristic. This system uses a deep neural network (DNN) to obtain the relationship between the frequency characteristic of the compact acoustic device and its dimensions. The training data are generated by the acoustic finite-difference time-domain (FDTD) method so that many training data can be easily obtained. We demonstrate the effectiveness of the proposed system through some comparisons between desired and designed frequency characteristics.
Shimpei SATO Eijiro SASSA Yuta UKON Atsushi TAKAHASHI
To obtain high-performance circuits in advanced technology nodes, the design methodology has to take the existence of large delay variations into account. Clock scheduling and speculative execution both incur implementation overheads, but they have the potential to improve performance by averaging the imbalance of maximum delay among paths and by utilizing valid data that become available earlier than in worst-case scenarios, respectively. In this paper, we propose a high-performance digital circuit design method that uses speculative execution with less overhead by effectively utilizing clock scheduling with delay insertion. Clock scheduling with delay insertion effectively reduces the need for speculation, which is the source of the overhead. Experiments show that a generated circuit achieves a 26% performance improvement with a 1.3% area overhead compared to a circuit without clock scheduling and without speculative execution.
Jun XU Dongming BIAN Chuang WANG Gengxin ZHANG Ruidong LI
Owing to the rapid development of small-satellite technology and the lower delay and propagation loss of LEO satellites compared with traditional GEO satellites, broadband LEO constellation satellite communication systems have become one of the hottest topics in the field of satellite communications. Many countries and satellite communication companies are planning broadband satellite communication systems. A broadband satellite communication system differs from a traditional satellite communication system in that it requires a higher transmission rate. For high-speed transmission, if a constellation with a low elevation angle is adopted, the number of satellite beams becomes too large, which increases the complexity of the satellite and makes a low-cost satellite difficult to realize. By using link computation to compare the complexity of satellite realization at different elevation angles while meeting the terminal rate requirement, this paper puts forward the concept of building a broadband LEO constellation satellite communication system with a high elevation angle. The constraint relation between the satellite orbit altitude and the user-edge communication elevation angle is derived theoretically, and simulations are carried out for the satellite orbit altitude and the edge communication elevation angle.
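The trade-off between orbit altitude and edge elevation angle can be explored with textbook spherical-Earth geometry. The sketch below is not the constraint relation derived in the paper; it only evaluates the standard expressions for the Earth central angle and the slant range, assuming a spherical Earth of radius 6371 km. The function names and the 1000 km example altitude are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical-Earth assumption


def central_angle_rad(altitude_km: float, elevation_deg: float) -> float:
    """Earth central angle between the sub-satellite point and a user at the coverage edge."""
    eps = math.radians(elevation_deg)
    ratio = EARTH_RADIUS_KM / (EARTH_RADIUS_KM + altitude_km)
    return math.acos(ratio * math.cos(eps)) - eps


def coverage_radius_km(altitude_km: float, elevation_deg: float) -> float:
    """Ground-arc radius of the coverage area for a given minimum elevation angle."""
    return EARTH_RADIUS_KM * central_angle_rad(altitude_km, elevation_deg)


def slant_range_km(altitude_km: float, elevation_deg: float) -> float:
    """Distance from the edge user to the satellite."""
    eps = math.radians(elevation_deg)
    orbit_radius = EARTH_RADIUS_KM + altitude_km
    return (math.sqrt(orbit_radius ** 2 - (EARTH_RADIUS_KM * math.cos(eps)) ** 2)
            - EARTH_RADIUS_KM * math.sin(eps))


for elevation in (10.0, 30.0, 50.0):
    print(f"elevation {elevation:.0f} deg: "
          f"coverage radius {coverage_radius_km(1000.0, elevation):.0f} km, "
          f"slant range {slant_range_km(1000.0, elevation):.0f} km")
```

Raising the minimum elevation angle shrinks the coverage area of each satellite and shortens the worst-case slant range, so more satellites are needed but each one has a smaller area to serve.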
A robust transceiver design constrained by quality-of-service (QoS) is proposed for multiple-input multiple-output (MIMO) interference channels with imperfect channel state information (CSI) under a bounded error model. The QoS is measured by the signal-to-interference-plus-noise ratio (SINR) of each user with a single data stream. The problem is formulated as sum-power minimization to reduce the total power consumption for energy efficiency. In a centralized manner, alternating optimization is performed at each node. For fixed transmitters, a closed-form expression for the receive beamforming vectors is derived. For fixed receivers, the sum-power minimization problem is recast as a semidefinite program with linear matrix inequality constraints. Simulation results demonstrate the convergence and robustness of the proposed algorithm, which is important for practical applications in future wireless networks.
Yuhei FUKUI Aleksandar SHURBEVSKI Hiroshi NAGAMOCHI
In the obnoxious facility game, we design mechanisms that output a location of an undesirable facility based on the locations of players reported by themselves. The benefit of a player is defined to be the distance between her location and the facility. A player may try to manipulate the output of the mechanism by strategically misreporting her location. We wish to design a λ-group strategy-proof mechanism, i.e., a mechanism in which, for every group of players, at least one player in the group cannot gain strictly more than λ times her primary benefit by having the entire group change their reports simultaneously. In this paper, we design a k-candidate λ-group strategy-proof mechanism for the obnoxious facility game in the metric defined by k half-lines with a common endpoint, such that the candidates are points, one on each half-line, all at the same distance from the common endpoint. Then, we show that the benefit ratio of the mechanism is at most 1+2/(k-1)λ. Finally, we prove that the bound is nearly tight.
Takuya KOYANAGI Jun SHIOMI Tohru ISHIHARA Hidetoshi ONODERA
Body bias generators are useful circuits that can reduce variability and power dissipation in LSI circuits. However, the amplifier implemented in the body bias generator is difficult to design because of its complexity. To overcome this difficulty, this paper proposes a cell-based design method for the amplifier that is clearer than existing cell-based design methods. The proposed method is based on a simple analytical model, which makes it easy to design amplifiers under various operating conditions. First, we introduce a small-signal equivalent circuit of a two-stage amplifier, by which we approximate a three-stage amplifier, and introduce a method for determining its design parameters based on the analytical model. Second, we propose a method of tuning parameters such as cell-based phase compensation elements and the drive strength of the output stage. Finally, based on test chip measurements, we show the advantage of the body bias generator we designed in a cell-based flow over existing designs.
In this paper, to make asynchronous circuit design easy, we propose a conversion method from synchronous Register Transfer Level (RTL) models to asynchronous RTL models with bundled-data implementation. The proposed method consists of the generation of an intermediate representation from a given synchronous RTL model and the generation of an asynchronous RTL model from the intermediate representation. This allows us to deal with different representation styles of synchronous RTL models. We use the eXtensible Markup Language (XML) as the intermediate representation. In addition to the asynchronous RTL model, the proposed method generates a simulation model when the target implementation is a Field Programmable Gate Array and a set of non-optimization constraints for the control circuit used in logic synthesis and layout synthesis. In the experiment, we demonstrate that the proposed method can convert synchronous RTL models specified manually and obtained by a high-level synthesis tool to asynchronous ones.
Arnab MUKHOPADHYAY Tapas Kumar MAITI Sandip BHATTACHARYA Takahiro IIZUKA Hideyuki KIKUCHIHARA Mitiko MIURA-MATTAUSCH Hafizur RAHAMAN Sadayuki YOSHITOMI Dondee NAVARRO Hans Jürgen MATTAUSCH
This report focuses on an optimization scheme of advanced MOSFETs for designing CMOS circuits with high power efficiency. For this purpose the physics-based compact model HiSIM2 is applied so that the relationship between device and circuit characteristics can be investigated properly. It is demonstrated that the short-channel effect, which is usually measured by the threshold-voltage shift relative to long-channel MOSFETs, provides a consistent measure for device-performance degradation with reduced channel length. However, performance degradations of CMOS circuits such as the power loss cannot be predicted by the threshold-voltage shift alone. Here, the subthreshold swing is identified as an additional important measure for power-efficient CMOS circuit design. The increase of the subthreshold swing is verified to become obvious when the threshold-voltage shift is larger than 0.15V.
Cheng LUO Wei CAO Lingli WANG Philip H. W. LEONG
With the continuous refinement of Deep Neural Networks (DNNs), a series of deep and complex networks such as Residual Networks (ResNets) show impressive prediction accuracy in image classification tasks. Unfortunately, the structural complexity and computational cost of residual networks make hardware implementation difficult. In this paper, we present the quantized and reconstructed deep neural network (QR-DNN) technique, which first inserts batch normalization (BN) layers in the network during training, and later removes them to facilitate efficient hardware implementation. Moreover, an accurate and efficient residual network accelerator (RNA) is presented based on QR-DNN with batch-normalization-free structures and weights represented in a logarithmic number system. RNA employs a systolic array architecture to perform shift-and-accumulate operations instead of multiplication operations. QR-DNN is shown to achieve a 1∼2% improvement in accuracy over existing techniques, and RNA achieves the same improvement over the previous best fixed-point accelerators. An FPGA implementation on a Xilinx Zynq XC7Z045 device achieves 804.03 GOPS, 104.15 FPS and 91.41% top-5 accuracy for the ResNet-50 benchmark, and state-of-the-art results are also reported for AlexNet and VGG.
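The replacement of multiplications by shifts follows from storing each weight as a signed power of two. The sketch below illustrates that general idea only; the rounding rule, the helper names, and the restriction to non-negative integer activations are assumptions and do not reproduce the QR-DNN quantizer or the RNA datapath.

```python
import math
from typing import Sequence, Tuple


def quantize_to_power_of_two(weight: float) -> Tuple[int, int]:
    """Approximate a weight as sign * 2**exponent; returns (0, 0) for a zero weight."""
    if weight == 0:
        return 0, 0
    sign = 1 if weight > 0 else -1
    exponent = round(math.log2(abs(weight)))  # nearest exponent in the log domain
    return sign, exponent


def shift_multiply(activation: int, sign: int, exponent: int) -> int:
    """Multiply a non-negative integer activation by sign * 2**exponent using shifts only."""
    if sign == 0:
        return 0
    if exponent >= 0:
        shifted = activation << exponent
    else:
        shifted = activation >> -exponent  # right shift truncates low-order bits
    return sign * shifted


def shift_accumulate(activations: Sequence[int], weights: Sequence[float]) -> int:
    """Dot product computed with shifts and additions instead of multiplications."""
    acc = 0
    for x, w in zip(activations, weights):
        sign, exponent = quantize_to_power_of_two(w)
        acc += shift_multiply(x, sign, exponent)
    return acc


print(shift_accumulate([8, 16, 4], [0.5, -0.25, 2.0]))  # 8
```

Here the weights are exact powers of two, so the result matches the ordinary dot product; for other weights the quantization introduces an error that training has to absorb.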
Seungtaek SONG Namhyun KIM Sungkil LEE Joyce Jiyoung WHANG Jinkyu LEE
Smartphone users often want to customize the positions and functions of physical buttons to accommodate their own usage patterns; however, this is unfeasible for electronic mobile devices based on COTS (Commercial Off-The-Shelf) due to high production costs and hardware design constraints. In this letter, we present the design and implementation of customized virtual buttons that are localized using only common built-in sensors of electronic mobile devices. We develop sophisticated strategies firstly to detect when a user taps one of the virtual buttons, and secondly to locate the position of the tapped virtual button. The virtual-button scheme is implemented and demonstrated in a COTS-based smartphone. The feasibility study shows that, with up to nine virtual buttons on five different sides of the smartphone, the proposed virtual buttons can operate with greater than 90% accuracy.
Yuka ISHII Naobumi MICHISHITA Hisashi MORISHITA Yuki SATO Kazuhiro IZUI Shinji NISHIWAKI
Radar-absorbent materials (RAM) with various characteristics, such as broadband, oblique-incidence, and polarization characteristics, have been developed for different applications in recent years. This paper presents an optimized design method for two-flat-layer RAM with both broadband and oblique-incidence characteristics that meets the required RAM performance. Here, oblique-incidence characteristics mean that the RAM can absorb radio waves continuously up to the maximum incidence angle. The target wave-absorption level is 20 dB, corresponding to an absorption rate of 99%. Because determining the electrical material constants of each layer with respect to the received frequency and the incidence angle is the most important task, we optimized these values using the non-dominated sorting genetic algorithm II (NSGA-II). Two types of flat-layer RAM, composed of dielectric and magnetic materials, were designed and their characteristics were evaluated. The results confirmed that the oblique-incidence characteristics were better for the RAM composed of dielectric materials. The dielectric RAM achieved an incidence angle of up to 60° with broadband characteristics and a relative bandwidth of 77.01% at transverse-magnetic (TM) wave incidence. In addition, the magnetic RAM could lower the minimum frequency of the system more than the dielectric RAM. The minimum frequency of the magnetic RAM was 1.38 GHz with a relative bandwidth of 174.18% at TM-wave incidence and an incidence angle of 45°. We confirmed that it is possible to design RAM with broadband characteristics and continuous oblique-incidence characteristics by using the proposed method.
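The correspondence between 20 dB and a 99% absorption rate follows from the decibel definition for power ratios. The short sketch below assumes that all power not reflected is absorbed (no transmission through the absorber); the function name is illustrative.

```python
def absorbed_fraction(absorption_db: float) -> float:
    """Fraction of incident power absorbed for a given absorption level in dB.

    Assumes no transmission, so absorbed = 1 - reflected, where the reflected
    power fraction is 10 ** (-absorption_db / 10).
    """
    reflected_fraction = 10 ** (-absorption_db / 10)
    return 1.0 - reflected_fraction


for level_db in (10.0, 20.0, 30.0):
    print(f"{level_db:.0f} dB -> {absorbed_fraction(level_db):.1%} absorbed")
```

At 20 dB only 1% of the incident power is reflected, which is the 99% absorption rate quoted in the abstract.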