Rei UENO Naofumi HOMMA Takafumi AOKI Sumio MORIOKA
This paper presents an automatic hierarchical formal verification method for arithmetic circuits over Galois fields (GFs), which are dedicated digital circuits for GF arithmetic operations used in cryptographic processors. The proposed verification method combines a word-level computer algebra procedure with a bit-level PPRM (Positive Polarity Reed-Muller) expansion procedure. While the application of the proposed verification method is not limited to cryptographic processors, these processors are important targets because complicated implementation techniques, such as field conversions, are frequently used for side-channel-resistant, compact, and low-power design. In the proposed method, the correctness of the entire datapath is verified at the GF(2^m) level, i.e., the word level. A datapath implementation is represented hierarchically as a set of components' functional descriptions over GF(2^m) and their wiring connections. We verify that the implementation satisfies a given total-functional specification over GF(2^m) by using an automatic algebraic method based on the Gröbner basis and polynomial reduction. Then, to verify that each component circuit is correctly implemented by a combination of GF(2) operations, i.e., bit-level logic gates, we use our fast PPRM expansion procedure, which is customized for handling large-scale Boolean expressions with many variables. We have applied the proposed method to a complicated AES (Advanced Encryption Standard) circuit with a masking countermeasure against side-channel attacks. The results show that the proposed method can verify such a practical circuit automatically within 4 minutes, whereas each conventional verification method used alone fails to complete the verification within a day or more.
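As a rough illustration of the bit-level step, the sketch below computes the PPRM (XOR-of-ANDs) form of a Boolean function from its truth table and uses it as a canonical form for comparing a gate-level implementation against a specification. It uses the textbook Möbius (Reed-Muller) transform, not the authors' customized fast procedure, and the function names `pprm_coefficients`, `truth_table`, and `pprm_equal` are hypothetical.

```python
def pprm_coefficients(table):
    """Return PPRM coefficients of a Boolean function given as a truth table.

    table[i] is the output for the input whose variable k equals bit k of i.
    In the result, coeffs[m] == 1 means the AND of the variables selected by
    mask m appears in the XOR sum (m == 0 is the constant term 1).
    """
    n = len(table)
    if n == 0 or n & (n - 1):
        raise ValueError("truth table length must be a power of two")
    coeffs = list(table)
    step = 1
    while step < n:
        # Butterfly pass of the Moebius transform over GF(2) for one variable.
        for i in range(n):
            if i & step:
                coeffs[i] ^= coeffs[i ^ step]
        step <<= 1
    return coeffs


def truth_table(func, num_vars):
    """Tabulate a Python function of num_vars bits (illustrative helper)."""
    return [
        func(*[(i >> k) & 1 for k in range(num_vars)]) & 1
        for i in range(1 << num_vars)
    ]


def pprm_equal(spec, impl, num_vars):
    """Two functions are equivalent iff their PPRM forms coincide."""
    return pprm_coefficients(truth_table(spec, num_vars)) == pprm_coefficients(
        truth_table(impl, num_vars)
    )


# Specification: a OR b. Implementation: NAND of the inverted inputs.
spec = lambda a, b: a | b
impl = lambda a, b: 1 - ((1 - a) & (1 - b))
print(pprm_coefficients(truth_table(spec, 2)))  # a XOR b XOR ab
print(pprm_equal(spec, impl, 2))
```

Because the PPRM form is unique for a given function, equality of coefficient lists is enough to decide equivalence. This tabulating version is exponential in the number of variables, which is presumably why a procedure tailored to large expressions is needed in practice.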
Feng LIU Conggai LI Chen HE Xuan GENG
This letter considers the robust transceiver design for multiple-input multiple-output interference channels under channel state information mismatch. According to alternating schemes, an adaptive algorithm is proposed to solve the minimum SINR maximization problem. Simulation results show the convergence and the effectiveness of the proposed algorithm.
Ho Huu Minh TAM Hoang Duong TUAN Duy Trong NGO Ha Hoang NGUYEN
For a multiuser multi-input multi-output (MU-MIMO) multicell network, the Han-Kobayashi strategy aims to improve the achievable rate region by splitting the data information intended to a serviced user (UE) into a common message and a private message. The common message is decodable by this UE and another UE from an adjacent cell so that the corresponding intercell interference is cancelled off. This work aims to design optimal precoders for both common and private messages to maximize the network sum-rate, which is a highly nonlinear and nonsmooth function in the precoder matrix variables. Existing approaches are unable to address this difficult problem. In this paper, we develop a successive convex quadratic programming algorithm that generates a sequence of improved points. We prove that the proposed algorithm converges to at least a local optimum of the considered problem. Numerical results confirm the advantages of our proposed algorithm over conventional coordinated precoding approaches where the intercell interference is treated as noise.
This paper discusses the use of a common computer mouse as a pointing interface for tabletop displays. When a common computer mouse is used with a tabletop display, there may be an angular distance between the screen coordinates and the mouse control coordinates. To align those coordinates, this paper introduces a screen coordinate calibration technique using a shadow cursor. The shadow cursor is based on the idea of manipulating a mouse cursor without any visual feedback. It plays an important role in obtaining the angular distance between the two coordinate systems. It enables the user to complete the screen coordinate calibration with a simple mouse manipulation in less than a second.
An output voltage-current equation of a charge pump DC-DC voltage multiplier using diodes is provided to cover wide clock frequency and output current ranges for designing energy harvesters operating at a near-threshold voltage or in the sub-threshold region. Equivalent circuits in the slow and fast switching limits are extracted. The effective threshold voltage of the diode in the slow switching limit is also derived as a function of electrical characteristics of the diodes, such as the saturation current and voltage slope parameter, and design parameters such as the number of stages, capacitance per stage, parasitic capacitance at the top plate of the main boosting capacitor, and the clock frequency. The model is verified by comparison with SPICE simulation.
Yingjing QIAN Ni ZHOU Dajiang HE
Device-to-device (D2D) communication enables two local users to communicate with each other directly instead of relaying through a third party, e.g., a base station. In this paper, we study a subchannel sharing strategy underlaying a multichannel cellular network for D2D pairs and existing cellular users (CUs). In the investigated scenario, we try to improve the spectrum efficiency of D2D pairs, which inevitably introduces cross interference between the two user groups. To combat interference, we attempt to assign each D2D pair appropriate subchannels, which may belong to different CUs, and adjust the transmission power of all users so as to maximize the sum rate of all D2D pairs while assuring each CU a minimum data rate on its subchannel set. The formulated problem is nonconvex, and thus obtaining its optimal solution is difficult. However, we can find the optimal power and subchannel assignment for a special case by setting an independent data rate constraint on each subchannel. Then we find an efficient method to calculate a gradient for our original problem. Finally, we propose a gradient-based search method to address the problem with the coupled minimum data rate constraint. The performance of our proposed subchannel sharing strategy is illustrated via extensive simulation results.
Tingting CHEN Weijun LI Feng YU Qianjian XING
A modular serial pipelined sorting architecture for continuous input sequences is presented. It supports continuous sequences, whose lengths can be dynamically changed, and does so using a very simple control strategy. It consists of identical serial cascaded sorting cells, and lends itself to high frequency implementation with any number of sorting cells, because both data and control signals are pipelined. With L cascaded sorting cells, it produces a fully sorted result for sequences whose length N is equal to or less than L+1; for longer sequences, the largest L elements are sorted out. Being modularly designed, several independent smaller sorters can be dynamically configured to form a larger sorter.
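The length condition in the abstract above can be illustrated with a purely behavioral model. The sketch assumes each cell holds one value, keeps the larger of its held value and the incoming one, and forwards the smaller to the next cell; this cell behavior, the function name `cascade_sort`, and the descending order are assumptions for illustration and do not model the pipelining or control signals of the actual architecture.

```python
def cascade_sort(sequence, num_cells):
    """Behavioral model of a chain of compare-and-keep sorting cells.

    Returns (held, overflow): the values held by the cells in descending
    order, and the values that fell out of the last cell in arrival order.
    """
    cells = [None] * num_cells
    overflow = []
    for value in sequence:
        moving = value
        for i in range(num_cells):
            if cells[i] is None:
                # Empty cell: store the travelling value and stop.
                cells[i] = moving
                moving = None
                break
            if moving > cells[i]:
                # Keep the larger value, pass the smaller one onward.
                cells[i], moving = moving, cells[i]
        if moving is not None:
            overflow.append(moving)
    held = [c for c in cells if c is not None]
    return held, overflow


# N = L + 1: the single overflow value is the minimum, so the result is fully sorted.
print(cascade_sort([3, 1, 4, 1, 5], 4))     # ([5, 4, 3, 1], [1])
# N > L + 1: only the largest L values are sorted.
print(cascade_sort([2, 9, 4, 7, 1, 8], 3))  # ([9, 8, 7], [2, 1, 4])
```

In this model, cell i ends up holding the (i+1)-th largest value, so L cells plus at most one overflow value give a complete ordering, while longer inputs leave the overflow unsorted.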
Junji YAMADA Ushio JIMBO Ryota SHIOYA Masahiro GOSHIMA Shuichi SAKAI
An 8-issue superscalar core generally requires a 24-port RAM for the register file. The area and energy consumption of a multiported RAM increase in proportion to the square of the number of ports. A register cache can reduce the area and energy consumption of the register file. However, earlier register cache systems suffer from lower IPC caused by register cache misses. Thus, we proposed the Non-Latency-Oriented Register Cache System (NORCS) to solve the IPC problem with a modified pipeline. In the original article, we evaluated NORCS mainly from the viewpoint of microarchitecture and showed that NORCS maintains almost the same IPC as conventional register files. Researchers at NVIDIA adopted the same idea for their GPUs. However, the evaluation was not sufficient from the viewpoint of LSI design. In the original article, we used CACTI to evaluate the area and energy consumption. CACTI is a design space exploration tool for cache design and adopts some rough approximations. Therefore, this paper presents a design of NORCS with FreePDK45, an open-source process design kit for 45nm technology. We performed manual layout of the memory cells and arrays of NORCS and executed SPICE simulation with RC parasitics extracted from the layout. The results show that, compared with a full-port register file, an 8-entry NORCS achieves 75.2% and 48.2% reductions in area and energy consumption, respectively. The results also include the latency, which we did not present in our original article. The critical-path latencies are 307ps and 318ps for an 8-entry NORCS and a conventional multiported register file, respectively, when the same two cycles are allocated to register file read.
Wenzhen YUE Yan ZHANG Jingwen XIE
The problem of radar constant-modulus (CM) waveform design for the detection of multiple targets is considered in this paper. The CM constraint is imposed from the perspective of hardware realization and full utilization of the transmitter's power. Two types of CM waveforms — the arbitrary-phase waveform and the quadrature phase shift keying waveform — are obtained by maximizing the minimum of the signal-to-clutter-plus-noise ratios of the various targets. Numerical results show that the designed CM waveforms perform satisfactorily, even when compared with their counterparts without constraints on the peak-to-average ratio.
Shuping ZHANG Jinjia ZHOU Dajiang ZHOU Shinji KIMURA Satoshi GOTO
In this paper, a hamburger architecture with a 3D stacked reconfigurable memory is proposed for a 4K motion estimation (ME) processor. By positioning the memory dies on both the top and bottom sides of the processor die, the proposed hamburger architecture can reduce the usage of signal through-silicon vias (TSVs) and balance the power delivery network and the clock tree of the entire system. This results in a 1/3 reduction in the usage of signal TSVs. Moreover, a stacked reconfigurable memory architecture is proposed to reduce the fabrication complexity and further reduce the number of signal TSVs by more than 1/2. The reduction of signal TSVs in the entire design is 71.24%. Finally, we address unique issues that occur in electronic design automation (EDA) tools during 3D large-scale integration (LSI) designs. As a result, a 4K ME processor with a 7-die stacked 3D system-on-chip design is implemented. The proposed design can support real-time 3840 × 2160 @ 120 fps encoding at 130 MHz with less than 540 mW.
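The figures in the abstract above can be cross-checked with a little arithmetic. The sketch assumes the two TSV reductions compound multiplicatively (the second applies to what remains after the first), which the abstract does not state explicitly; the function names are hypothetical.

```python
def implied_second_reduction(first, total):
    """Second-stage reduction implied by a first-stage and a total reduction,
    assuming the stages compound multiplicatively (an assumption)."""
    return 1 - (1 - total) / (1 - first)


def pixels_per_cycle(width, height, fps, clock_hz):
    """Average number of pixels that must be processed per clock cycle."""
    return width * height * fps / clock_hz


# 1/3 reduction from the hamburger layout, 71.24% in total.
second = implied_second_reduction(1 / 3, 0.7124)
print(f"implied second-stage reduction: {second:.4f}")  # about 0.5686

# 3840 x 2160 at 120 fps with a 130 MHz clock.
rate = pixels_per_cycle(3840, 2160, 120, 130e6)
print(f"pixels per clock cycle: {rate:.2f}")  # about 7.66
```

Under that assumption, the implied second-stage reduction of roughly 57% is consistent with the stated "more than 1/2", and the throughput target works out to between 7 and 8 pixels per cycle on average.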
Takashi OGURA Kentaro KOBAYASHI Hiraku OKADA Masaaki KATAYAMA
This paper studies H∞ control for networked control systems with packet loss. In networked control systems, packet loss is one of the major weaknesses because it deteriorates control performance. H∞ control, which is a type of robust control, can be used to design a controller that reduces the influence of disturbances acting on the controlled object. This paper proposes an H∞ control design that considers packet loss as a disturbance. Numerical examples show that the proposed H∞ control design reduces the control performance deterioration due to packet loss more effectively than the conventional H∞ control design. In addition, this paper provides control performance comparisons of H∞ control and Linear Quadratic (LQ) control. Numerical examples show that the control performance of the proposed H∞ control design is better than that of the LQ control design.
Longye WANG Xiaoli ZENG Hong WEN
An uncorrelated asymmetric ZCZ (UA-ZCZ) sequence set is a special version of an asymmetric ZCZ (A-ZCZ) sequence set, which contains multiple subsets, each of which is a typical ZCZ sequence set. One of the most important properties of a UA-ZCZ sequence set is that two arbitrary sequences from different sequence subsets are uncorrelated, i.e., their cross-correlation function (CCF) is zero at all shifts. Based on the interleaving technique and an uncorrelated sequence set, a new UA-ZCZ sequence set is obtained by interleaving a perfect sequence. The uncorrelated property of UA-ZCZ sequence sets is expected to be useful for avoiding inter-cell interference in QS-CDMA systems.
Kha HOANG HA Thanh TUNG VU Trung QUANG DUONG Nguyen-Son VO
In this paper, we propose two secure multiuser multiple-input multiple-output (MIMO) transmission approaches based on interference alignment (IA) in the presence of an eavesdropper. To deal with the information leakage to the eavesdropper as well as the interference signals from undesired transmitters (Txs) at desired receivers (Rxs), our approaches aim to design the transmit precoding and receive subspace matrices to minimize both the total inter-main-link interference and the wiretapped signals (WSs). The first proposed IA scheme focuses on aligning the WSs into proper subspaces, while the second one imposes a new structure on the precoding matrices to force the WSs to zero. In each proposed IA scheme, the precoding matrices and the receive subspaces at the legitimate users are alternately selected to minimize the cost function of a convex optimization problem in every iteration. We provide the feasibility conditions and the proofs of convergence for both IA approaches. The simulation results indicate that our two IA approaches outperform the conventional IA algorithm in terms of the average secrecy sum rate.
Qiusheng WANG Xiaolan GU Yingyi LIU Haiwen YUAN
Multiple notch filters are used to suppress narrow-band or sinusoidal interferences in digital signals. In this paper, we propose a novel optimization design technique of an infinite impulse response (IIR) multiple notch filter. It is based on the Nelder-Mead simplex method. Firstly, the system function of the desired notch filter is constructed to form the objective function of the optimization technique. Secondly, the design parameters of the desired notch filter are optimized by Nelder-Mead simplex method. A weight function is also introduced to improve amplitude response of the notch filter. Thirdly, the convergence and amplitude response of the proposed technique are compared with other Nelder-Mead based design methods and the cascade-based design method. Finally, the practicability of the proposed notch filter design technique is demonstrated by some practical applications.
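To make the idea of an objective function over the filter's system function concrete, the sketch below evaluates the amplitude response of a multiple notch filter built as a cascade of standard second-order IIR notch sections and scores its deviation from the ideal response. The cascade form, the single shared `pole_radius` parameter, the uniform default weight, and all function names are assumptions for illustration; the abstract does not give the authors' parameterization or weight function.

```python
import cmath
import math


def notch_response(omega, notch_freqs, pole_radius):
    """Frequency response at omega (rad/sample) of cascaded second-order
    notch sections with zeros on the unit circle at each notch frequency."""
    z_inv = cmath.exp(-1j * omega)
    response = 1.0 + 0j
    for w0 in notch_freqs:
        c = 2.0 * math.cos(w0)
        numerator = 1 - c * z_inv + z_inv ** 2
        denominator = 1 - pole_radius * c * z_inv + (pole_radius * z_inv) ** 2
        response *= numerator / denominator
    return response


def amplitude_error(pole_radius, notch_freqs, grid_size=256,
                    weight=lambda omega: 1.0):
    """Weighted squared deviation of |H| from 1 on a grid over [0, pi].

    The ideal response is 1 everywhere except exactly at the notches, so a
    smaller value means narrower notches and a flatter passband.
    """
    total = 0.0
    for k in range(grid_size):
        omega = math.pi * k / (grid_size - 1)
        magnitude = abs(notch_response(omega, notch_freqs, pole_radius))
        total += weight(omega) * (magnitude - 1.0) ** 2
    return total


notches = [0.3 * math.pi, 0.7 * math.pi]
print(abs(notch_response(notches[0], notches, 0.95)))  # essentially zero
print(amplitude_error(0.80, notches), amplitude_error(0.99, notches))
```

A function like `amplitude_error` is the kind of scalar objective a derivative-free simplex search can minimize, for example through `scipy.optimize.minimize` with `method="Nelder-Mead"`; a practical design would also constrain the poles to stay inside the unit circle.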
Ryo HAMAMOTO Chisa TAKANO Hiroyasu OBATA Kenji ISHIDA
Wireless Local Area Networks (WLANs) based on the IEEE 802.11 standard have been increasingly used. Access Points (APs) are being established in various public places, such as railway stations and airports, as well as private residences. Moreover, the rate of public WLAN services continues to increase. Throughput prediction of an AP in a multi-rate environment, i.e., predicting the amount of data received at an AP (including retransmitted packets), is an important issue for wireless network design. It is also important for solving AP placement and selection problems. To realize the throughput prediction, we previously proposed an AP throughput prediction method that considers terminal distribution. We compared the predicted throughput of the proposed method with that of a method using linear order computation and confirmed the performance of the proposed method by numerical computation rather than by a network simulator. However, it is necessary to consider the impact of CSMA/CA in the MAC layer, because throughput is greatly influenced by frame collisions. In this paper, we derive an effective transmission rate considering CSMA/CA and frame collisions. We then compare the throughput obtained using the network simulator NS2 with the value predicted by the proposed method. Simulation results show that the maximum relative error of the proposed method is approximately 6% and 15% for UDP and TCP, respectively, whereas that of the existing method is approximately 17% and 21%.
Masaru OYA Noritaka YAMASHITA Toshihiko OKAMURA Yukiyasu TSUNOO Masao YANAGISAWA Nozomu TOGAWA
Since digital ICs today are often designed and fabricated by third parties at various phases, we must eliminate the risk that malicious attackers implement Hardware Trojans (HTs) in them. In particular, attackers can easily insert HTs during the design phase. This paper proposes an HT rank, a new quantitative analysis criterion against HTs in gate-level netlists. We have carefully analyzed all the gate-level netlists in the Trust-HUB benchmark suite and found several Trojan net features in them. We then design three types of Trojan points: feature point, count point, and location point. By assigning these points to every net and summing them up, we obtain the maximum Trojan point in a gate-level netlist. This point gives our HT rank. The HT rank can be calculated from net features alone, and we perform neither logic simulation nor random testing. When all the gate-level netlists in the Trust-HUB, ISCAS85, ISCAS89, and ITC99 benchmark suites, as well as several OpenCores designs and HT-free and HT-inserted AES netlists, are ranked by our HT rank, we can completely distinguish HT-inserted ones (whose HT rank is ten or more) from HT-free ones (whose HT rank is nine or less). The HT rank is the world's first quantitative criterion that distinguishes HT-inserted netlists from HT-free ones in all the gate-level netlists in Trust-HUB, ISCAS85, ISCAS89, and ITC99.
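The scoring scheme described above can be sketched as follows: each net receives the three kinds of points, the points are summed per net, and the maximum over all nets is taken as the HT rank. Only the threshold of ten comes from the abstract; the net names, the point values, and the function names are hypothetical, since the abstract does not list the actual net features or their scores.

```python
HT_RANK_THRESHOLD = 10  # from the abstract: a rank of ten or more indicates an HT


def trojan_point(net_points):
    """Sum the feature, count, and location points of one net."""
    return net_points["feature"] + net_points["count"] + net_points["location"]


def ht_rank(netlist_points):
    """HT rank of a netlist: the maximum Trojan point over all its nets."""
    return max(trojan_point(points) for points in netlist_points.values())


def is_ht_inserted(netlist_points):
    """Classify a netlist by comparing its HT rank with the threshold."""
    return ht_rank(netlist_points) >= HT_RANK_THRESHOLD


# Hypothetical per-net points, chosen only to exercise both outcomes.
clean_netlist = {
    "n1": {"feature": 2, "count": 1, "location": 0},
    "n2": {"feature": 4, "count": 3, "location": 2},
}
suspicious_netlist = {
    "n1": {"feature": 2, "count": 1, "location": 0},
    "trigger": {"feature": 6, "count": 3, "location": 2},
}
print(ht_rank(clean_netlist), is_ht_inserted(clean_netlist))            # 9 False
print(ht_rank(suspicious_netlist), is_ht_inserted(suspicious_netlist))  # 11 True
```

Since the rank depends only on static per-net features, it needs a single pass over the netlist and no simulation, which matches the abstract's description of the criterion.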
Tatsuya KAMAKARI Jun SHIOMI Tohru ISHIHARA Hidetoshi ONODERA
In synchronous LSI circuits, memory subsystems such as Flip-Flops and SRAMs are essential components and latches are the base elements of the common memory logics. In this paper, a stability analysis method for latches operating in a low voltage region is proposed. The butterfly curve of latches is a key for analyzing a retention failure of latches. This paper discusses a modeling method for retention stability and derives an analytical stability model for latches. The minimum supply voltage where the latches can operate with a certain yield can be accurately derived by a simple calculation using the proposed model. Monte-Carlo simulation targeting 65nm and 28nm process technology models demonstrates the accuracy and the validity of the proposed method. Measurement results obtained by a test chip fabricated in a 65nm process technology also demonstrate the validity. Based on the model, this paper shows some strategies for variation tolerant design of latches.
This paper proposes a low power single-ended successive approximation register (SAR) analog-to-digital converter (ADC) to replace the only analog active circuit, the comparator, with a digital circuit, which is an inverter-based comparator. The replacement helps possible design automation. The inverter threshold voltage variation impact is minimal because an SAR ADC has only one comparator, and many applications are either insensitive to the resulting ADC offset or easily corrected digitally. The proposed resetting approach mitigates leakage when the input is close to the threshold voltage. As an intrinsic headroom-free, and thus low-rail-voltage, friendly structure, an inverter-based comparator also occupies a small area. Furthermore, an 11-bit ADC was designed and manufactured through a 0.35-µm CMOS process by adopting a low-power switching procedure. The ADC achieves an FOM of 181fJ/Conv.-step at a 25kS/s sampling rate when the supply voltage VDD is 1.2V.
Widiant Masaki HASHIZUME Shohei SUENAGA Hiroyuki YOTSUYANAGI Akira ONO Shyue-Kung LU Zvi ROTH
In this paper, a built-in test circuit for an electrical interconnect test method is proposed to detect an open defect occurring at an interconnect between an IC and a printed circuit board. The test method is based on measuring the supply current of an inverter gate in the test circuit. A time-varying signal is provided to an interconnect as a test signal by the built-in test circuit. In this paper, the test circuit is evaluated by SPICE simulation and by experiments with a prototyping IC. The experimental results reveal that a hard open defect is detectable by the test method in addition to a resistive open defect and a capacitive open one at a test speed of 400 kHz.
A color scheme design method is proposed that considers the luminance contrast between adjacent regions and design properties. This method aims to set the luminance contrast high in order to make the image understandable to visually handicapped people. It also realizes a preferable color design for visually normal people by assigning color components from color combination samples. Interactive evolutionary computing is adopted to design the luminance and the color, so that the luminance and color components are assigned to each region appropriately on the basis of human subjective criteria. Here, the luminance is designed first, and then color components are assigned while keeping the luminance unchanged. Since samples of fine color combinations are applied, the obtained color design is also fine and harmonious. Computer simulations verify the high performance of this system.