Youichi FUKADA Takeshi YASUDA Shuji KOMATSU Koichi SAITO Yoichi MAEDA Yasuyuki OKUMURA
This paper describes a novel adaptive clock recovery method that uses proportional-integral-derivative (PID) control. The adaptive clock method is a clock recovery technique that synchronizes connected terminals via packet networks, and will be indispensable for circuit emulation services in next-generation Ethernet. Using the PID control technique, our adaptive clock method simultaneously achieves a short start-up time, an accurate and stable recovered clock frequency, and low buffer delay. We explain the numerical simulations, experimental results, and circuit designs.
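As a rough illustration of PID-based adaptive clock recovery, the sketch below steers a local clock so that a receive buffer stays at its target fill level. The abstract does not say which error signal or gains the authors use. The buffer-fill error, the function name `recover_clock`, and the gain values are assumptions made only for this example.

```python
def recover_clock(source_hz, start_hz, kp=0.5, ki=0.05, kd=0.1,
                  steps=500, dt=1.0):
    """Track source_hz with a PID loop on buffer fill (hypothetical model)."""
    fill = 0.0          # buffer occupancy relative to its target level
    integral = 0.0      # accumulated error for the I term
    prev_error = 0.0    # previous error for the D term
    local_hz = start_hz
    history = []
    for _ in range(steps):
        # The buffer fills at the source rate and drains at the recovered rate.
        fill += (source_hz - local_hz) * dt
        error = fill
        integral += error * dt
        derivative = (error - prev_error) / dt
        prev_error = error
        # A fuller buffer means the local clock is too slow, so speed it up.
        local_hz = start_hz + kp * error + ki * integral + kd * derivative
        history.append(local_hz)
    return history


if __name__ == "__main__":
    trace = recover_clock(source_hz=1000.5, start_hz=1000.0)
    print(trace[0], trace[-1])
```

In this toy model the integral term removes the steady-state frequency offset, while the proportional and derivative terms shape how quickly the buffer level settles. Real gains would have to be tuned against packet delay variation, which the model ignores.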
Hossein SHAMSI Omid SHOAEI Roghayeh DOOST
Using an exact analytic approach, this paper models the clock jitter in the feedback path of continuous-time Delta-Sigma modulators (CT DSMs) as an additive jitter noise, providing a time-invariant model for a jittery CT DSM. Then, for various DAC waveforms, the power spectral density (PSD) of the clock jitter at the output of the DAC is derived, and an approximation is used to extract the in-band power of the clock jitter at the output of the modulator. The simplicity and generality of the proposed approach are its main advantages. MATLAB and HSPICE simulation results confirm the validity of the proposed formulas.
Yongqiang LU Chin-Ngai SZE Xianlong HONG Qiang ZHOU Yici CAI Liang HUANG Jiang HU
As VLSI design advances, the increasingly severe power problem demands that clock routing wirelength be minimized so that both power consumption and power supply noise can be alleviated. In contrast to most traditional works, which handle this problem only in clock routing, we propose guiding standard cell register placement to locations that enable a further reduction in clock routing wirelength and power. To minimize adverse impacts on conventional cell placement goals such as signal net wirelength and critical path delay, the register placement is carried out in the context of a quadratic placement. The proposed technique is particularly effective for the recently popular prescribed skew clock routing. Experiments on benchmark circuits show encouraging results.
Takashi SATO Junji ICHIMIYA Nobuto ONO Koutaro HACHIYA Masanori HASHIMOTO
This paper quantitatively analyzes the thermal gradient of an SoC and proposes a thermal flattening procedure. First, the impact of dominant parameters, such as the area occupancy of memory/logic blocks, power density, and floorplan, on the thermal gradient is studied quantitatively. The temperature difference is also evaluated from timing and reliability standpoints. The important results obtained are that 1) the maximum temperature difference increases with higher memory area occupancy and 2) the difference is very sensitive to the floorplan. Then, we propose a procedure to reduce the thermal gradient. A slight floorplan modification using the proposed procedure improves the on-chip thermal gradient significantly.
Masanori HASHIMOTO Tomonori YAMAMOTO Hidetoshi ONODERA
This paper discusses clock skew due to manufacturing variability and environmental change. In clock tree design, the transition time constraint is an important design parameter that controls clock skew and power dissipation. In this paper, we evaluate clock skew under several variability models, and demonstrate the relationship among clock skew, transition time constraint, and power dissipation. Experimental results show that a small transition time constraint reduces clock skew under manufacturing and supply voltage variability, whereas there is an optimum constraint value for temperature gradient. Our experiments in a 0.18 µm technology indicate that clock skew is minimized when the clock buffer is sized such that the ratio of output to input capacitance is four.
Shinsaku KIYOMOTO Toshiaki TANAKA Kouichi SAKURAI
Guess-and-Determine (GD) attacks have recently been proposed for the effective analysis of word-oriented stream ciphers. This paper discusses GD attacks on clock-controlled stream ciphers, which use irregular clocking for a non-linear function. The main focus is the analysis of irregular clocking for GD attacks. We propose GD attacks on a typical clock-controlled stream cipher, AA5, and calculate the process complexity of our proposed GD attacks. In the attacks, we assume that the clocking of linear feedback shift registers (LFSRs) is truly random. An important consideration affecting the practicality of these attacks is whether this assumption is realistic, because in practice the clocking is determined by the internal states. We implement miniature ciphers to evaluate the proposed attacks, and show that they are applicable. We also apply the GD attacks to other clock-controlled stream ciphers and compare them. Finally, we discuss some properties of GD attacks on clock-controlled stream ciphers and the effectiveness of the clock controllers. Our research results contain information that is useful in the design of clock-controlled stream ciphers.
Hossein SHAMSI Omid SHOAEI Roghayeh DOOST
In this paper, the spectral density of the additive jitter noise in continuous-time (CT) Delta-Sigma modulators (DSMs) is derived analytically. Using these analytic results, a novel method for eliminating the damaging effects of clock jitter in CT DSMs is proposed. In this method, an impulse waveform is employed instead of the conventional waveforms used in the feedback path of CT DSMs, such as non-return-to-zero, return-to-zero, and half-delay return-to-zero.
Minseok KIM Aiko KIYONO Koichi ICHIGE Hiroyuki ARAI
When phase-modulated signals are undersampled (bandpass sampled) directly in a high frequency band, the harmful effects of the aperture jitter of ADCs (analog-to-digital converters) and the sampling clock instability of the system cannot be ignored. In communication systems, sampling jitter adds phase noise to the constellation pattern on top of thermal noise, and thus degrades the BER (bit error rate) performance. This paper examines the relationship between the ADC input frequency and the sampling jitter in digital IF (intermediate frequency) downconversion receivers that use an undersampling scheme. It presents measurement results from a real hardware prototype system as well as computer simulation results from a theoretically modeled IF sampling receiver. We evaluated EVM (error vector magnitude) in various clock jitter configurations with commonly used, reasonably priced ADCs whose sampling rate was 40 MHz. According to the results, the IF input frequencies of QPSK (16 QAM) signals were limited to below around 290 (210) MHz for the wireless LAN standard, and 730 (450) MHz for the W-CDMA standard, in our best configuration.
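The dependence on input frequency can be illustrated with the widely used rule of thumb for the jitter-limited SNR of a sampled sinusoid, SNR = -20 log10(2π f_in σ_j). This is a textbook approximation for a single tone, not the EVM model used in the paper. The function names and the 1 ps jitter figure below are assumptions chosen only for the example.

```python
import math


def jitter_limited_snr_db(f_in_hz, jitter_rms_s):
    """SNR ceiling set by RMS sampling jitter for a full-scale sine input."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)


def max_input_freq_hz(required_snr_db, jitter_rms_s):
    """Highest input frequency that still meets required_snr_db."""
    return 10.0 ** (-required_snr_db / 20.0) / (2.0 * math.pi * jitter_rms_s)


if __name__ == "__main__":
    # Assumed example: 100 MHz IF input with 1 ps RMS jitter (about 64 dB).
    print(round(jitter_limited_snr_db(100e6, 1e-12), 2))
    print(max_input_freq_hz(40.0, 1e-12))
```

The formula depends on the input frequency rather than the sampling rate, which is why undersampling a high IF with a slow ADC remains sensitive to clock jitter.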
Yi ZOU Yici CAI Qiang ZHOU Xianlong HONG Sheldon X.-D. TAN
This paper presents a novel approach to reducing the complexity of the transient linear circuit analysis for a hybrid structured clock network. Topology reduction is first used to reduce the complexity of the circuits and a preconditioned Krylov-subspace iterative method is then used to perform the nodal analysis on the reduced circuits. By proper selection of the simulation time step and interval based on Elmore delays, the delay of the clock signal between the clock source and the sink node as well as the clock skews between the sink nodes can be computed efficiently and accurately. Our experimental results show that the proposed algorithm is two orders of magnitude faster than HSPICE without loss of accuracy and stability. The maximum error is within 0.4% of the exact delay time.
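For readers unfamiliar with the Elmore delay that guides the time-step selection, the following is a minimal sketch of computing it on an RC tree. The tree encoding (a `parent` list with the root at index 0 and parents listed before children) and the function name are assumptions of this example, not part of the paper's algorithm, which also handles hybrid mesh/tree structures.

```python
def elmore_delays(parent, resistance, capacitance):
    """Elmore delay from the root (node 0) to every node of an RC tree.

    parent[i]      index of node i's parent (None for the root)
    resistance[i]  resistance of the wire from parent[i] to node i
    capacitance[i] capacitance lumped at node i
    Nodes must be ordered so that each parent index is below its child's.
    """
    n = len(parent)
    # Total capacitance at or below each node, accumulated leaves first.
    downstream = list(capacitance)
    for node in range(n - 1, 0, -1):
        downstream[parent[node]] += downstream[node]
    # Each wire contributes its resistance times the capacitance it drives.
    delay = [0.0] * n
    for node in range(1, n):
        delay[node] = delay[parent[node]] + resistance[node] * downstream[node]
    return delay


if __name__ == "__main__":
    parent = [None, 0, 1, 1]
    delays = elmore_delays(parent, [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 2.0])
    print(delays)                    # delays to nodes 0..3
    print(delays[3] - delays[2])     # skew between the two sinks
```

In this small example the two sinks (nodes 2 and 3) see delays of 6 and 10 units, so the skew is 4 units.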
Suk-Jin KIM Jeong-Gun LEE Kiseon KIM
This letter presents a synchronizer and its handshake interface for bridging clock domains in an SoC. The proposed scheme uses two two-flop synchronizers, each operated on a different clock edge, based on a two-phase handshake protocol. Performance analysis shows that the proposed design reduces latency by up to one clock cycle while keeping safety at a tolerable level.
Takefumi YOSHIKAWA Tsuyoshi EBUCHI Yukio ARIMA Toru IWATA
A Spread Spectrum Clock Generator (SSCG) using a Digital Tracking scheme (DT-SSCG) is described. By using digital tracking control outside a PLL, the DT-SSCG can realize stable modulation characteristics independent of the PLL constants. Moreover, the DT-SSCG can be applied easily to various modulation profiles by a simple change of the digital tracking parameters. A test chip has realized a 5000 ppm down-spread with 6.02 dB and 8.02 dB spectrum peak reduction for triangular and non-linear modulation, respectively.
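To make the term "triangular down-spread" concrete, the sketch below generates one period of a triangular frequency profile that dips from the nominal frequency to 5000 ppm below it and back. It illustrates only the modulation profile, not the digital tracking scheme. The function name, the 100 MHz nominal frequency, and the number of points are assumptions for this example.

```python
def triangular_downspread(nominal_hz, spread_ppm, points_per_period):
    """One period of a triangular down-spread frequency profile.

    points_per_period is assumed to be even so the profile is symmetric.
    """
    depth_hz = nominal_hz * spread_ppm * 1e-6
    half = points_per_period // 2
    profile = []
    for i in range(points_per_period):
        # Phase ramps 0 -> 1 over the first half, then 1 -> 0 over the second.
        phase = i / half if i < half else (points_per_period - i) / half
        profile.append(nominal_hz - depth_hz * phase)
    return profile


if __name__ == "__main__":
    for f in triangular_downspread(100e6, 5000, 8):
        print(f)
```

A down-spread profile never exceeds the nominal frequency, so timing margins set at the nominal rate are not violated while the spectral peak is spread over the 5000 ppm band.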
A wide-range multiphase delay-locked loop (DLL) using mixed-mode voltage-controlled delay lines (VCDLs) is presented. An edge-triggered duty cycle corrector is introduced to generate output clocks with a 50% duty cycle. Using an analog 3-state phase-frequency detector (PFD) and the proposed digital PFD, this DLL can achieve low-jitter operation over a wide frequency range without harmonic locking problems. It has been fabricated in a standard 0.25-µm CMOS technology and occupies a core area of 1 mm² including the on-chip regulator and loop filter. For reference clocks from 20 MHz to 550 MHz, all the measured rms and peak-to-peak jitters are below 10 ps and 78 ps, respectively.
Yukihide KOHIRA Atsushi TAKAHASHI
Under the assumption that the clock can be input to each register at an arbitrary timing, the minimum feasible clock period can be determined if the delays between registers are given. This minimum feasible clock period might be reduced if the delays between some registers are increased by delay insertion. In this paper, we propose a delay insertion algorithm to reduce the minimum clock period. First, the proposed algorithm determines a clock schedule while ignoring some constraints. Second, the algorithm inserts delays to recover the ignored constraints according to the delay-slack and delay-demand of the obtained clock schedule. We show that the proposed algorithm achieves the minimum clock period by delay insertion if the delay of each element in the circuit is unique. Experiments show that the amount of inserted delay and the computation time are smaller than those of the conventional algorithm.
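The first sentence of this abstract refers to the standard clock scheduling problem, which can be sketched as follows. For a register pair (u, v) with minimum and maximum path delays d_min and d_max, and clock arrival times s, the setup constraint is s[u] + d_max <= s[v] + T and the hold constraint is s[v] <= s[u] + d_min. A period T is feasible exactly when this system of difference constraints has no negative cycle. The code below checks feasibility with Bellman-Ford and bisects on T. It shows only this baseline, not the proposed delay insertion algorithm, and the function names and data layout are assumptions of the example.

```python
def is_feasible(num_regs, paths, period):
    """True if some clock schedule satisfies all setup and hold constraints.

    paths is a list of (u, v, d_min, d_max) for each register-to-register path.
    """
    edges = []  # (a, b, w) encodes the constraint s[b] - s[a] <= w
    for u, v, d_min, d_max in paths:
        edges.append((v, u, period - d_max))  # setup: s[u] - s[v] <= T - d_max
        edges.append((u, v, d_min))           # hold:  s[v] - s[u] <= d_min
    dist = [0.0] * num_regs  # same as a virtual source with zero-weight edges
    for _ in range(num_regs):
        changed = False
        for a, b, w in edges:
            if dist[a] + w < dist[b] - 1e-12:
                dist[b] = dist[a] + w
                changed = True
        if not changed:
            return True   # converged, so no negative cycle exists
    return False          # still relaxing after n passes: negative cycle


def min_period(num_regs, paths, tol=1e-6):
    """Bisect for the smallest feasible period (assumes d_min >= 0)."""
    lo, hi = 0.0, max(d_max for _, _, _, d_max in paths)  # hi: zero-skew bound
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if is_feasible(num_regs, paths, mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    loop = [(0, 1, 2.0, 6.0), (1, 0, 2.0, 2.0)]
    print(min_period(2, loop))
```

For the two-register loop in the example, a zero-skew clock needs a period of 6, while scheduling the clock arrival times brings the minimum feasible period down to about 4. In this example the hold constraint on the first path is what stops the period at that value, which is the kind of limit that increasing minimum delays by delay insertion is meant to relax.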
Takahiro SEKI Satoshi AKUI Katsunori SENO Masakatsu NAKAI Tetsumasa MEGURO Tetsuo KONDO Akihiko HASHIGUCHI Hirokazu KAWAHARA Kazuo KUMANO Masayuki SHIMURA
This paper reports a Dynamic Voltage and Frequency Management (DVFM) scheme introduced in a microprocessor for handheld devices with wideband embedded DRAM. Our DVFM scheme reduces power consumption effectively through the cooperation of autonomous clock frequency control and adaptive supply voltage control. The clock frequency is controlled using hardware activity information to determine the minimum value required by the current processor load. This clock frequency control is realized without special power management software. The supply voltage is controlled according to the delay information provided by a delay synthesizer circuit, which consists of three programmable delay components: a gate delay, an RC delay, and a rise/fall delay. The delay synthesizer circuit emulates the critical-path delay within 4% voltage accuracy over the full range of process deviation and voltage. This accurate tracking ability allows the supply voltage to be scaled according to the fluctuation of the LSI's characteristics caused by temperature and process deviation. The DVFM contributes not only to dynamic power reduction but also to leakage power reduction. This microprocessor, fabricated in a 0.18 µm CMOS embedded DRAM technology, achieves 82% power reduction in a Personal Information Management (PIM) scheduler application and 40% power reduction in an MPEG4 movie playback application. As process technology shrinks, the DVFM scheme with its leakage power compensation effect will become more important in realizing high-performance and low-power mobile consumer applications.
An all-digital CMOS duty cycle correction (DCC) circuit with a fixed rising edge is proposed to achieve wide correction ranges of input duty cycle and PVT variations, low standby power, and fast recovery from standby mode for use in multi-phase clock systems. SPICE simulations showed that this DCC adjusts the output duty cycle to 50±0.7% for a wide range of input duty cycles from 15% to 85% at an input frequency of 1 GHz, within the commercial range of PVT corners. The all-digital implementation and the use of a toggle flip-flop at the input stage enabled the wide correction ranges of PVT variations and input duty cycle, respectively.
This paper proposes a decentralized and asynchronous replica control method based on a fair assignment of the variation in numerical data that has weak consistency for loosely coupled database systems managed or used by different organizations of human activity. Our method eliminates the asynchronous abort of already committed transactions even if replicas in all network partitions continue to process transactions when network partitioning occurs. A decentralized and asynchronous approach is needed because it is difficult to keep a number of loosely coupled systems in working order, and replica operations performed in a centralized and synchronous way can degrade the performance of transaction processing. We eliminate the transaction abort by fairly distributing the variation in numerical data to replicas according to their demands and updating the distributed variation using only asynchronously propagated update transactions without calculating the precise global state among reachable replicas. In addition, fairly assigning the variation of data to replicas equalizes the disadvantages of processing update transactions among replicas. Fairness control for assigning the data variation is performed by averaging the variation requested by the replicas. A simulation showed that our system can achieve extremely high performance for processing update transactions and fairness among replicas.
Hirokazu TAKENOUCHI Tatsushi NAKAHARA Kiyoto TAKAHATA Ryo TAKAHASHI Hiroyuki SUZUKI
Asynchronous optical packet switching (OPS) is a promising solution to support the continuous growth of transmission capacity demand. It has been, however, quite difficult to implement key functions needed at the node of such networks with all-optical approaches. We have proposed a new optoelectronic system composed of a packet-by-packet optical clock-pulse generator (OCG), an all-optical serial-to-parallel converter (SPC), a photonic parallel-to-serial converter (PSC), and CMOS circuitry. The system makes it possible to carry out various required functions such as buffering (random access memory), optical packet compression/decompression, and optical label swapping for high-speed asynchronous optical packets.
Takahito MIYAZAKI Masanori HASHIMOTO Hidetoshi ONODERA
This paper discusses performance prediction of clock generation PLLs using a ring oscillator based VCO (RingVCO) and an LC oscillator based VCO (LCVCO). For clock generation, we generally design PLLs using RingVCOs because of their superiority in tunable frequency range, chip area, and power consumption, in spite of their poor noise characteristics. In the future, it is predicted that operating frequency will rapidly increase and supply voltage will dramatically decrease. In addition, stringent noise performance will be required. Under these conditions, it is clear neither how the performance of each PLL will change nor how the performance difference between the two PLLs will change. This paper predicts and compares the future performance of PLLs using a RingVCO and an LCVCO, with a qualitative evaluation by an analytical approach and with design experiments based on predicted process parameters. Our discussion reveals that the relative performance difference between the two PLLs will be unchanged. As technology advances, the power dissipation and chip area of both PLLs favorably decrease, while the noise characteristics of both PLLs degrade, which indicates that low-noise PLL circuit design will become more important.
Suk-Jin KIM Jeong-Gun LEE Kiseon KIM
Inter-domain communications on a chip require a synchronizer to resolve the timing problems between an input and the clock of the destination domain. This paper presents a parallel flop synchronizer and its interface circuit for transferring asynchronous data to the clock domain. The proposed scheme uses a bank of independent two-flop synchronizers in parallel and supports a two-phase handshake protocol. Performance analysis shows that, compared to the conventional two-flop synchronizer, the proposed scheme can reduce latency by up to one and a half clock cycles while keeping safety at a tolerable level. All designs have been implemented in a 0.25 µm CMOS technology to verify the performance analysis of the proposed synchronization.
Jumpei UCHIDA Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
This paper proposes a thread partitioning algorithm for low-power high-level synthesis. The algorithm is applied to high-level synthesis systems in which parallel-behaving circuit blocks (threads) can be described explicitly. It first focuses on a local register file RF in a thread and partitions the thread into two sub-threads, one of which has RF and the other of which does not. The partitioned sub-threads need to be synchronized with each other to keep the data dependency of the original thread. Since the partitioned sub-threads have waiting time for synchronization, gated clocks can be applied to each sub-thread. We can then synthesize a low-power circuit with a low area overhead compared to the original circuit. Experimental results demonstrate the effectiveness and efficiency of the algorithm.