This paper presents two power-saving designs for the Quadratic Polynomial Permutation (QPP) interleaver address generator, one for a fixed interleave length K and one for a variable K. The designs are based on our observation that the quadratic term f2·x^2 mod K of the QPP address-generating function f(x) = (f1·x + f2·x^2) mod K has a short period and is symmetric within that period. Compared with existing designs, power consumption is reduced on average by 27.4% in the fixed-K design and by 5.4% in the variable-K design over various values of K.
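The periodicity and symmetry of the quadratic term can be checked numerically. The sketch below is only an illustration of that observation, not the hardware design from the paper; the parameter values K, f1 and f2 and all function names are assumptions chosen for the example.

```python
def quadratic_term(x, f2, K):
    """Quadratic part of the QPP address: (f2 * x^2) mod K."""
    return (f2 * x * x) % K


def quadratic_period(f2, K):
    """Smallest P > 0 with g(x + P) == g(x) for every x (brute-force search)."""
    for P in range(1, K + 1):
        if all(quadratic_term(x + P, f2, K) == quadratic_term(x, f2, K)
               for x in range(K)):
            return P
    return K  # not reached: P = K always satisfies the condition


def qpp_address(x, f1, f2, K):
    """Full QPP address f(x) = (f1*x + f2*x^2) mod K."""
    return (f1 * x + quadratic_term(x, f2, K)) % K


# Illustrative parameters (assumed for this sketch, not taken from the paper).
K, f1, f2 = 6144, 263, 480

P = quadratic_period(f2, K)
# Symmetry within one period: g(P - x) == g(x) for 0 <= x <= P.
is_symmetric = all(quadratic_term(P - x, f2, K) == quadratic_term(x, f2, K)
                   for x in range(P + 1))
addresses = [qpp_address(x, f1, f2, K) for x in range(K)]

print("period of quadratic term:", P)
print("symmetric within period:", is_symmetric)
```

For these assumed parameters the quadratic term repeats every 32 addresses although K is 6144, so only half a period of values would need to be computed or stored; the linear term f1·x mod K can be accumulated separately.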
Tohlu MATSUSHIMA Tetsushi WATANABE Yoshitaka TOYOTA Ryuji KOGA Osami WADA
Placing a guard trace next to a signal line is the conventional technique for reducing the common-mode radiation from a printed circuit board. In this paper, the suppression of common-mode radiation from printed circuit boards with guard traces is estimated and evaluated using the imbalance difference model previously proposed by the authors. To reduce common-mode radiation further, a procedure for designing a transmission line with guard traces is proposed. Guard traces connected to a return plane through vias are placed near the signal line, and they decrease the current division factor (CDF). The CDF represents the degree of imbalance of a transmission line, and the common-mode electromotive force depends on the CDF. Thus, by calculating the CDF, we can estimate the reduction in common-mode radiation. The radiation is reduced not only by placing guard traces but also by narrowing the signal line to compensate for the change in characteristic impedance caused by the guard traces. Experimental results showed a maximum reduction in common-mode radiation of about 14 dB, achieved by placing guard traces on both sides of the signal line, and the calculated reduction agreed with the measured one within 1 dB. According to the CDF and characteristic impedance calculations, common-mode radiation can be reduced by about 25 dB while keeping the characteristic impedance constant, by adjusting the gap between the signal line and the guard trace and narrowing the signal line.
Nobuaki TOJO Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
In an embedded system where a single application or a class of applications is repeatedly executed on a processor, the cache configuration can be customized for that workload. An optimal cache configuration, one that minimizes overall memory access time, can be found by varying three cache parameters: the number of sets, the line size, and the associativity. In this paper, we first propose two cache simulation algorithms, CRCB1 and CRCB2, based on the Cache Inclusion Property. They perform exact cache simulation but dramatically decrease the number of cache hit/miss judgments. We further propose three cache design space exploration algorithms, CRMF1, CRMF2, and CRMF3, based on our experimental observations. They can find an almost optimal cache configuration from the viewpoint of access time. With our approach, the number of cache hit/miss judgments required for optimizing cache configurations is reduced to 1/10-1/50 of that required by conventional approaches. As a result, the proposed approach runs 3.2 times faster on average, and up to 5.3 times faster, than the fastest approach proposed so far. Our proposed cache simulation approach achieves the world's fastest cache design space exploration when optimizing total memory access time.
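To make the three cache parameters and the inclusion property concrete, here is a minimal set-associative cache simulator. It is not CRCB1/CRCB2 or any CRMF algorithm; it assumes LRU replacement (the abstract does not state the policy), and the address trace and function name are invented for illustration.

```python
from collections import OrderedDict


def simulate_cache(addresses, num_sets, line_size, associativity):
    """Return one hit/miss flag per access for an LRU set-associative cache."""
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = []
    for addr in addresses:
        block = addr // line_size          # memory block (line) number
        index = block % num_sets           # set selected by the block number
        tag = block // num_sets
        ways = sets[index]
        if tag in ways:
            ways.move_to_end(tag)          # mark as most recently used
            hits.append(True)
        else:
            if len(ways) >= associativity:
                ways.popitem(last=False)   # evict the least recently used line
            ways[tag] = None
            hits.append(False)
    return hits


# Hypothetical byte-address trace, invented for illustration.
trace = [0, 0, 64, 0, 64, 128, 0, 128, 64, 0]

direct = simulate_cache(trace, num_sets=2, line_size=32, associativity=1)
two_way = simulate_cache(trace, num_sets=2, line_size=32, associativity=2)
four_way = simulate_cache(trace, num_sets=2, line_size=32, associativity=4)

print(sum(direct), sum(two_way), sum(four_way))
```

With the number of sets and the line size held fixed, every access that hits in the lower-associativity LRU cache also hits in the higher-associativity one. An inclusion-based simulator can exploit this to skip hit/miss judgments whose outcome is already implied by a smaller configuration.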
Sergio SAPONARA Pierluigi NUZZO Claudio NANI Geert VAN DER PLAS Luca FANUCCI
Time-interleaved (TI) analog-to-digital converters (ADCs) are frequently advocated as a power-efficient way to realize the high sampling rates required in single-chip transceivers for emerging communication schemes: ultra-wideband, fast serial links, cognitive radio and software-defined radio. However, the combined effects of multiple distortion sources due to channel mismatches (bandwidth, offset, gain and timing) severely affect the system performance and power consumption of a TI ADC and need to be accounted for from the early design phases. In this paper, system-level design of TI ADCs is addressed through a platform-based methodology, enabling effective investigation of different speed/resolution scenarios as well as the impact of parallelism on accuracy, yield, sampling rate, area and power consumption. Design space exploration of a TI successive approximation ADC is performed top-down via Monte Carlo simulations, by exploiting behavioral models built bottom-up after characterizing feasible implementations of the main building blocks in a 90-nm 1-V CMOS process. As a result, two implementations of the TI ADC are proposed that are capable of providing an outstanding figure-of-merit below 0.15 pJ/conversion-step.
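A figure of merit expressed in pJ/conversion-step is commonly computed as power divided by 2^ENOB times the sampling rate (the Walden definition). The abstract does not say which definition the authors use, so the sketch below assumes that one; the operating point is hypothetical and is not a result from the paper.

```python
def walden_fom_pj(power_w, enob_bits, sample_rate_hz):
    """Energy per conversion step in pJ: P / (2**ENOB * fs)."""
    joules_per_step = power_w / (2 ** enob_bits * sample_rate_hz)
    return joules_per_step * 1e12  # convert J to pJ


# Hypothetical operating point, chosen only to exercise the formula.
fom = walden_fom_pj(power_w=10e-3, enob_bits=7.0, sample_rate_hz=1e9)
print(f"{fom:.4f} pJ/conversion-step")
```

Because the effective number of bits enters exponentially, mismatch-induced distortion that costs even half a bit of ENOB raises this figure by roughly 40%, which is why channel mismatch matters at the system level.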
Yu-Lung LO Wei-Bin YANG Ting-Sheng CHAO Kuo-Hsing CHENG
A high-speed, ultra-low-voltage divide-by-4/5 counter with a dynamic floating input D flip-flop (DFIDFF) is presented in this paper. The proposed DFIDFF and the control logic gates are merged to reduce the effective capacitance of internal and external nodes and to increase the operating speed of the divide-by-4/5 counter. The proposed divide-by-4/5 counter is fabricated in a 0.13-µm CMOS process. The measured maximum operating frequency and power consumption of the counter are 600 MHz and 8.35 µW at a 0.5 V supply voltage. HSPICE simulations demonstrate that the proposed counter (in divide-by-4 mode) reduces the power-delay product (PDP) by 37%, 71%, and 57% relative to the TGFF counter, Yang's counter [1], and the E-TSPC counter [2], respectively.
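For readers unfamiliar with dual-modulus counters, the following behavioral sketch shows what a divide-by-4/5 counter does at the clock level. It says nothing about the DFIDFF circuit itself; the function name and the way the modulus control is sampled are assumptions made for illustration.

```python
def dual_modulus_counter(num_input_clocks, select_five):
    """Behavioral divide-by-4/5 counter.

    Returns a list with one entry per input clock: 1 on the clock that
    ends an output period, 0 otherwise. `select_five(i)` is sampled at the
    start of output period i (True -> divide by 5, False -> divide by 4).
    """
    out = []
    period_index = 0
    modulus = 5 if select_five(period_index) else 4
    count = 0
    for _ in range(num_input_clocks):
        count += 1
        if count == modulus:
            out.append(1)
            count = 0
            period_index += 1
            modulus = 5 if select_five(period_index) else 4
        else:
            out.append(0)
    return out


div4 = dual_modulus_counter(20, lambda i: False)       # always divide by 4
div5 = dual_modulus_counter(20, lambda i: True)        # always divide by 5
mixed = dual_modulus_counter(18, lambda i: i % 2 == 1)  # alternate 4, 5, 4, 5
```

Alternating the two moduli, as in `mixed`, is how such a counter is typically used inside a frequency synthesizer to obtain division ratios other than 4 or 5.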
Meilong JIANG Narayan PRASAD Yan XIN Guosen YUE Amir KHOJASTEPOUR Le LIU Takamichi INOUE Kenji KOYANAGI Yoshikazu KAKURA
The 3GPP Long Term Evolution Advanced (LTE-A) system, as compared with the LTE system, is anticipated to include several new features and enhancements, such as channel bandwidths beyond 20 MHz (up to 100 MHz), higher-order multiple-input multiple-output (MIMO) for both downlink and uplink transmissions, larger capacity, especially for cell-edge user equipment and voice over IP (VoIP) users, and wider coverage. This paper presents some key enabling technologies for the LTE-A system, including flexible uplink access schemes, advanced uplink MIMO receiver designs, cell search, adaptive hybrid ARQ, and multi-resolution MIMO precoding.
Ittetsu TANIGUCHI Praveen RAGHAVAN Murali JAYAPALA Francky CATTHOOR Yoshinori TAKEUCHI Masaharu IMAI
Low-energy, high-performance embedded processors are crucial to the design of future nomadic embedded systems. Improving memory accesses, especially their spatial and temporal locality, is a well-known technique for reducing energy and increasing performance. However, after transformations that improve locality, address calculation often becomes a bottleneck. In this paper, we propose a novel AGU (Address Generation Unit) exploration and mapping technique based on a reconfigurable AGU model. Experimental results show that the proposed techniques help explore AGU architectures effectively and that designers can obtain trade-offs for real-life applications in about 10 hours.
Yuji KUNITAKE Kazuhiro MIMA Toshinori SATO Hiroto YASUURA
Deep submicron semiconductor technology has increased process variations, which makes it difficult to estimate the worst-case design margin. In order to realize robust designs, we are investigating a typical-case design methodology, which we call Constructive Timing Violation (CTV). In a CTV-based design, timing constraints can be relaxed. However, relaxing timing constraints may cause timing errors. When we applied the CTV-based design to a processor, timing error recovery had a serious impact on processor performance. In this paper, we investigate enhancement techniques for the CTV-based design. In addition, in order to evaluate the CTV-based design accurately, we build a co-simulation framework that considers circuit delay at the architectural level. The co-simulation results show that the enhancement techniques significantly reduce the performance penalty.
Gi-Ho PARK Jung-Wook PARK Hoi-Jin LEE Gunok JUNG Sung-Bae PARK Shin-Dug KIM
This paper presents a cache way enabling mechanism using branch target addresses. The mechanism uses branch prediction information to enable only the cache way(s) that should be accessed, avoiding the power consumed by unnecessary cache way accesses. The proposed mechanism reduces the power consumption of the instruction cache by 63% without any degradation in processor performance. An ARM1136 processor simulator and Synopsys PrimeTime are used for the performance/power simulation and the static timing analysis of the proposed mechanism, respectively.
Recent high-speed, low-power system LSIs critically need power supply noise to be managed so that it does not substantially affect circuit functionality and performance. Decoupling capacitance is known to be an effective measure for suppressing power supply noise. In this paper, we propose a design methodology for decoupling capacitance budgeting, in which the decoupling capacitance is distributed appropriately over the LSI chip area in order to suppress the power supply noise of each local region. For efficient budgeting, we introduce a new concept, the power-capacitance ratio, which is the ratio of power dissipation to capacitance. The proposed method first performs a simplified power supply noise analysis using a lumped circuit model to determine the total required on-chip capacitance and calculates the power-capacitance ratio. Then, in the layout design phase, decoupling capacitance budgeting is performed using this power-capacitance ratio as a guideline. The effectiveness of the proposed method was verified by SPICE simulations on example chip models in a 90 nm technology node. The verification results show that, even for a chip with very wide on-chip variation in power density, the proposed method can suppress the power supply noise of each local region effectively.
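One plausible reading of budgeting with the power-capacitance ratio is that each local region receives enough capacitance for its own power-to-capacitance ratio to match the chip-level value. The abstract does not give the exact allocation rule, so the sketch below is an assumption; the region names, power figures and total capacitance are hypothetical.

```python
def budget_decoupling_capacitance(region_power_w, total_capacitance_f):
    """Split the total capacitance so every region has the same P/C ratio.

    region_power_w: mapping of region name -> power dissipation in watts.
    total_capacitance_f: total required on-chip capacitance in farads,
    assumed to come from a separate lumped-model noise analysis.
    Returns (power-capacitance ratio in W/F, per-region capacitance in F).
    """
    total_power = sum(region_power_w.values())
    ratio = total_power / total_capacitance_f
    budget = {name: power / ratio for name, power in region_power_w.items()}
    return ratio, budget


# Hypothetical chip with uneven power density across three regions.
regions = {"cpu": 0.5, "dsp": 0.25, "io": 0.25}
ratio, budget = budget_decoupling_capacitance(regions, total_capacitance_f=20e-9)

for name, cap in budget.items():
    print(f"{name}: {cap * 1e9:.1f} nF")
```

Under this assumed rule the high-power region receives proportionally more capacitance, which matches the stated goal of suppressing noise in each local region even when power density varies widely across the chip.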
Yousuke NARUSE Jun-ichi TAKADA
We address the issue of MIMO channel estimation with the aid of a priori temporal correlation statistics of the channel as well as its spatial correlation. The temporal correlations are incorporated into the estimation scheme by assuming the Gauss-Markov channel model. Under the MMSE criterion, the Kalman filter performs iterative optimal estimation. To take advantage of the enhanced estimation capability, we focus on the problem of channel estimation from partial channel measurements in a MIMO antenna selection system. We discuss the optimal training sequence design and the optimal selection of the antenna subset for channel measurement based on the statistics. In a highly correlated channel, the estimation works even when the measurements from some antenna elements are omitted in each fading block.
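The role of the Gauss-Markov model in Kalman-based channel tracking can be seen in a scalar, real-valued toy case. The sketch below is far simpler than the MIMO estimator described in the abstract (no spatial correlation, no antenna selection, a single coefficient); the parameter values and the synthetic channel are assumptions for illustration.

```python
import random


def kalman_track(observations, pilots, a, q, r, p0=1.0):
    """Scalar Kalman filter for h[k] = a*h[k-1] + w[k], y[k] = x[k]*h[k] + v[k].

    a: temporal correlation coefficient, q: process-noise variance,
    r: measurement-noise variance, p0: prior variance of the channel.
    Returns a list of (estimate, posterior variance) pairs.
    """
    h_est, p = 0.0, p0
    history = []
    for y, x in zip(observations, pilots):
        # Predict with the Gauss-Markov model.
        h_pred = a * h_est
        p_pred = a * a * p + q
        # Update with the new pilot observation.
        gain = p_pred * x / (x * x * p_pred + r)
        h_est = h_pred + gain * (y - x * h_pred)
        p = (1.0 - gain * x) * p_pred
        history.append((h_est, p))
    return history


# Synthetic real-valued channel, invented for illustration.
random.seed(0)
a, r = 0.95, 0.1
q = 1.0 - a * a              # keeps the channel variance at 1
h, ys, xs = random.gauss(0.0, 1.0), [], []
for _ in range(200):
    h = a * h + random.gauss(0.0, q ** 0.5)
    xs.append(1.0)           # unit-amplitude pilot symbol
    ys.append(h + random.gauss(0.0, r ** 0.5))

track = kalman_track(ys, xs, a, q, r)
steady_state_variance = track[-1][1]
print(f"posterior error variance: {steady_state_variance:.4f}")
```

With a unit pilot, a single observation alone would leave an error variance of r/(1 + r), about 0.091 in this setup. The filter settles to a lower value because each estimate also draws on earlier blocks through the temporal correlation a.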
The process of designing analogue circuits is formulated as a controlled dynamic system. To analyze the properties of such a system, we suggest using the concept of a Lyapunov function for a dynamic system. Various forms of the Lyapunov function are suggested. Analyzing the behavior of the Lyapunov function and its first derivative allowed us to identify a significant correlation between the properties of this function and the processor time needed to design the circuit. Numerical results demonstrate that the behavior of various design strategies, and the processor time they require, can be forecast from the properties of the Lyapunov function of the circuit design process.
Huanfei MA Haibin KAN Hideki IMAI
The construction of quaternion designs for Space-Time-Polarization Block Codes (STPBCs) is a hot but difficult topic. This letter introduces a novel way to construct high-dimensional quaternion designs from any existing low-dimensional quaternion orthogonal designs (QODs) for STPBCs, while preserving the merits of the original QODs such as full diversity and simple decoding. Furthermore, it provides a specific scheme for achieving full diversity and maximized coding gain by rotating the signal constellation on the polarization plane.
Kentaroh KATOH Kazuteru NAMBA Hideo ITO
This paper presents a scan design for delay fault testability of 2-rail logic circuits. The flip flops used in the scan design are based on master-slave ones. The proposed scan design provides complete fault coverage in delay fault testing of 2-rail logic circuits. In two-pattern testing with the proposed scan design, initial vectors are set using the set-reset operation, and the scan-in operation for initial vectors is not required. Hence, the test application time is reduced to about half that of the enhanced scan design. Because the additional function is only the set-reset operation of the slave latch, the area overhead is small. The evaluation shows that the differences in the area overhead of the proposed scan design from those of the standard scan design and the enhanced scan design are 2.1 and -14.5 percent on average, respectively.
Chang Ha LEE Youngmin KIM Amitabh VARSHNEY
The comprehensibility of large and complex 3D models can be greatly enhanced by guiding the viewer's attention to important regions. Lighting is crucial to our perception of shape, and careful lighting has been widely used in art, scientific illustration, and computer graphics to guide visual attention. In this paper, we explore how the saliency of 3D objects can be used to guide lighting so as to emphasize important regions and suppress less important ones.
Naohiro KAWABATA Hisao KOGA Osamu MUTA Yoshihiko AKAIWA
Power-line communication (PLC) is a known method for realizing high-speed communication in home networks. One problem with PLC is that leakage radiation interferes with existing systems. When OFDM is used in a PLC system, the leakage radiation is not sufficiently reduced even if the subcarriers corresponding to the frequency band of the existing system are never used, because the signal is not strictly band-limited. To solve this problem, each subcarrier must be band-limited. In this paper, we apply OQAM-based multi-carrier transmission (OQAM-MCT), in which each subcarrier is individually band-limited, to a high-speed PLC system. We also propose a pilot-symbol sequence suitable for frequency offset estimation, symbol-timing detection and channel estimation in the OQAM-MCT system. In this method, the pilot sequence consists of a repeated series of the same data symbol. The pilot sequence then becomes approximately equivalent to an OFDM sequence, and therefore existing pilot-assisted methods for OFDM are also applicable to the OQAM-MCT system. Computer simulation results show that the OQAM-MCT system achieves both good transmission rate performance and low out-of-band radiation in PLC channels. It is also shown that the proposed pilot sequence improves frequency offset estimation, symbol-timing detection and channel estimation performance compared with a pseudo-noise sequence.
Wei MIAO Xiang CHEN Ming ZHAO Shidong ZHOU Jing WANG
This paper addresses the problem of joint transceiver design for Tomlinson-Harashima Precoding (THP) in the multiuser multiple-input multiple-output (MIMO) downlink under both perfect and imperfect channel state information at the transmitter (CSIT). For the case of perfect CSIT, we differ from previous work by performing stream-wise (both inter-user and intra-user) interference pre-cancellation at the transmitter. A minimum total mean square error (MT-MSE) criterion is used to formulate our optimization problem. Through convex analysis of the problem, we obtain the necessary conditions for the optimal solution. An iterative algorithm is proposed to handle this problem, and its convergence is proved. We then extend the algorithm to a robust version by minimizing the conditional expectation of the T-MSE under imperfect CSIT. Simulation results are given to verify the efficacy of the proposed schemes and to show their superiority over existing MMSE-based THP schemes.
In this letter we provide a steering law for redundant single-gimbal control moment gyros. The proposed steering law is an extended version of the singular direction avoidance (SDA) steering law based on the singular value decomposition (SVD). All internal singularities are escapable for any non-zero constant torque command using the proposed steering law.
Yoshiaki SONE Wataru IMAJUKU Naohide NAGATSU Masahiko JINNO
Bolstering survivable backbone networks against multiple failures is becoming a common concern among telecom companies that need to continue services even when disasters occur. This paper presents a multiple-failure recovery scheme that considers the operation and management of optical paths. The presented scheme employs scheme escalation from pre-planned restoration to full rerouting. First, the survivability of this scheme against multiple failures is evaluated considering operational constraints such as route selection, resource allocation, and the recovery order of failed paths. The evaluation results show that scheme escalation provides a high level of survivability even under operational constraints, and this paper quantitatively clarifies the impact of these various operational constraints. In addition, the fundamental functions of the scheme escalation are implemented in the Generalized Multi-Protocol Label Switching control plane and verified in an optical-cross-connect-based network.
Jinhyun CHO Doowon LEE Sangyong YOON Sanggyu PARK Soo-Ik CHAE
In this paper, we present a high-performance VC-1 main-profile decoder for high-definition (HD) video applications, which can decode HD 720p video streams at 30 fps with an 80 MHz clock. We implemented the decoder in a one-poly eight-metal 0.13 µm CMOS process. The decoder contains about 261,900 logic gates and on-chip memories of 13.9 KB SRAM and 13.1 KB ROM, and occupies an area of about 5.1 mm^2. In designing the VC-1 decoder, we used a template-based SoC design flow, with which we performed the design space exploration of the decoder by trying various configurations of communication channels. We also describe the architectures of the computation blocks, which are optimized to satisfy the requirements of VC-1 HD applications.
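The stated operating point implies a per-macroblock cycle budget, which the short calculation below works out. It assumes that 720p means a 1280x720 frame and that macroblocks are 16x16 pixels; neither figure is stated in the abstract, and the function name is invented for illustration.

```python
def cycles_per_macroblock(width, height, fps, clock_hz, mb_size=16):
    """Return (macroblocks per frame, average clock cycles per macroblock)."""
    mbs_wide = -(-width // mb_size)    # ceiling division
    mbs_high = -(-height // mb_size)
    mbs_per_frame = mbs_wide * mbs_high
    return mbs_per_frame, clock_hz / (mbs_per_frame * fps)


# 720p at 30 fps with an 80 MHz clock (frame size is an assumption).
mbs, budget = cycles_per_macroblock(1280, 720, 30, 80e6)
print(f"{mbs} macroblocks/frame, about {budget:.0f} cycles per macroblock")
```

Under these assumptions the decoder has roughly 740 cycles on average to process each macroblock, which gives a sense of why the computation blocks and communication channels need to be tuned for HD operation.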