Young-Hee KIM Jong-Ki NAM Young-Soo SOHN Hong-June PARK Ki-Bong KU Jae-Kyung WEE Joo-Sun CHOI Choon-Sung PARK
A fully on-chip current controlled open-drain output driver using a bandgap reference current generator was designed for high bandwidth DRAMs. It reduces the overhead of receiving a digital code from an external source for the compensation of the temperature and supply voltage variations. The correct value of the current control register is updated at the end of every auto refresh cycle. The operation at the data rate up to 0.8 Gb/s was verified by SPICE simulation using a 0.22 µm triple-well CMOS technology.
Keiichi YAMAGUCHI Yasuhiko KURIYAMA Eiji TAKAGI Mitsuo KONNO
The optimum bias point for a drain mixer operating on low local oscillator (LO) power was investigated. The bias voltage dependence of the required LO power and the conversion gain in the drain mixer was clarified by a simplified nonlinear model which the drain current characteristics around knee voltage is approximated by two straight line segments. It was found that an optimum gate bias voltage Vgs exists for a given applied LO power, and the optimum gate bias voltage moves toward the pinch-off voltage as the injection LO power level decreases. In order to verify the variation of the optimum gate bias voltage, a 60 GHz MMIC drain mixer adopting the optimum gate bias voltage for low LO power level was fabricated. The fabricated drain mixer exhibited a conversion gain of 0 dB with the injection LO power level of 0 dBm. This value of 0 dBm is the best performance yet obtained for a 60 GHz MMIC drain mixer. The measured optimum gate bias voltage was near the pinch-off voltage. This result was in good agreement with the theoretical analysis. The LO power level of a drain mixer has been improved so that it is on a par with that of a gate mixer.
Asheesh KHARE Preeti R. PANDA Nikil D. DUTT Alexandru NICOLAU
Newer off-chip DRAM families, including Synchronous DRAMs (SDRAMs) and RAMBUS DRAMs (RDRAMs), are becoming standard choices for the design of high-performance systems. Although previous work in High-Level Synthesis (HLS) has addressed exploiting features of page-mode DRAMs, techniques do not exist for exploiting the two key features of these newer DRAM families that boost memory performance and help overcome bandwidth limitations: (1) burst mode access, and (2) interleaved access through multiple banks. We address pre-synthesis optimizations on the input behavior that extract and exploit the burst mode and multiple bank interleaved access modes of these newer DRAM families, so that these features can be exploited fully during the HLS trajectory. Our experiments, run on a suite of memory-intensive benchmarks using a contemporary SDRAM library, demonstrate significant performance improvements of up to 62.5% over the naive approach, and improvements of up to 16.7% over the previous approach that considered only page-mode or extended-data-out (EDO) DRAMS.
Tetsuya UEMURA Pinaki MAZUMDER
A resonant-tunneling-diode (RTD) based sense amplifier circuit design has been proposed for the first time to envision a very high-speed and low-power memory system that also includes refresh-free, compact RTD-based memory cells. By combining RTDs with n-type transistors of conventional complementary metal oxide semiconductor (CMOS) devices, a new quantum MOS (Q-MOS) family of logic circuits, having very low power-delay product and good noise immunity, has recently been developed. This paper introduces the design and analysis of a new QMOS sense amplifier circuit, consisting of a pair of RTDs as pull-up loads in conjunction with n-type pull-down transistors. The proposed QMOS sensing circuit exhibits nearly 20% faster sensing time in comparison to the conventional design of a CMOS sense amplifier. The stability analysis done using phase-plot diagram reveals that the pair of back-to-back connected static QMOS inverters, which forms the core of the sense amplifier, has meta-stable and unstable states which are closely related to the I-V characteristics of the RTDs. The paper also analyzes in details the refresh-free memory cell design, known as tunneling static random access memory (TSRAM). The innovative cell design adds a stack of two RTDs to the conventional one-transistor dynamic RAM (DRAM) cell and thereby the cell can indefinitely hold its charge level without any further periodic refreshing. The analysis indicates that the TSRAM cell can achieve about two orders of magnitude lower stand-by power than a conventional DRAM cell. The paper demonstrates that RTD-based circuits hold high promises and are likely to be the key candidates for the future high-density, high-performance and low-power memory systems.
Tadaaki YAMAUCHI Lance HAMMOND Oyekunle A. OLUKOTUN Kazutami ARIMOTO
A microprocessor integrated with DRAM on the same die has the potential to improve system performance by reducing memory latency and improving memory bandwidth. In this paper we evaluate the performance of a single chip multiprocessor integrated with DRAM when the DRAM is organized as on-chip main memory and as on-chip cache. We compare the performance of this architecture with that of a more conventional chip which only has SRAM-based on-chip cache. The DRAM-based architecture with four processors outperforms the SRAM-based architecture on floating point applications which are effectively parallelized and have large working sets. This performance difference is significantly better than that possible in a uniprocessor DRAM-based architecture, which performs only slightly faster than an SRAM-based architecture on the same applications. In addition, on multiprogrammed workloads, in which independent processes are assigned to every processor in a single chip multiprocessor, the large bandwidth of on-chip DRAM can handle the inter-access contention better. These results demonstrate that a multiprocessor takes better advantage of the large bandwidth provided by the on-chip DRAM than a uniprocessor.
Tae Jung SUH Woong Soon KIM Chang Hun KIM
An efficient algorithm for reconstructing all polyhedral 3D objects from two orthographic views is presented. Since the two-view orthographic representation of a 3D object is ambiguous, it requires a numerous amount of combinatorial searches in the process of reconstruction. Also, multiple number of solutions in contrast to the designers intention can be existed in the problem. This paper proposes a two phase algorithm to reduce the search space and to select the most plausible solution described by the given projections. First, the partially constructed objects are reconstructed from the restricted candidate faces corresponding to each area on the two-view drawings in its first phase. Then the complete objects are obtained from the partially constructed objects by adding other candidates with geometrical validity in the second phase. The algorithm performs a combinatorial search based on the face decision rules along with two heuristics. Decision rules check geometrical validity and heuristic rules enhance the search speed. In addition, the reconstruction finds the most plausible 3D object that human observers are most likely to select first among the given multiple solutions. Several examples from a working implementation are given to show the completeness of the algorithm.
Sakda UNAWONG Shinichi MIYAMOTO Norihiko MORINAGA
In this paper, we investigate the bit error rate (BER) performance of Direct Sequence-Code Division Multiple Access (DS-CDMA) systems under impulsive radio noise environments, and propose a novel DS-CDMA receiver which is designed to be robust against impulsive noise. At first, employing the Middleton's Class-A impulsive noise model as a typical model of impulsive radio noise, we discuss the statistical characteristics of impulsive radio noise and demonstrate that the quadrature components of impulsive noise are statistically dependent. Next, based on the computer simulation, we evaluate the BER performance of a conventional DS-CDMA system under a Class-A impulsive noise environment, and illustrate that the performance of the conventional DS-CDMA system is drastically degraded by the effects of the impulsive noise. To deal with this problem, motivated by the statistical dependence between the quadrature components of impulsive radio noise, we propose a new DS-CDMA receiver which can eliminate the effects of the channel impulsive noise. The numerical result shows that the performance of the DS-CDMA system under the impulsive noise environment is significantly improved by using this proposed receiver. Finally, to confirm the effectiveness of this proposed receiver against actual impulsive radio noise, we evaluate the BER performance of the DS-CDMA system employing the proposed receiver under a microwave oven (MWO) noise environment and discuss the robustness of the proposed receiver against MWO noise.
An advanced characterization method for sub-micron DRAM cell transistors has been proposed for the analysis of transistor test structures using memory cell patterns. When the actual memory cell layout is used as a test structure, the parasitic source and drain resistance of the test structure affected conventional transistor parameters such as threshold voltage. To solve this problem, reduced drain current measurement methods have been proposed to suppress the parasitic resistance voltage drop. In these measurements, two new transistor parameters, Vgoff and Vgsat, have been proposed which are related to off-leakage and full writing, respectively. These parameters are found to be good parameters for monitoring DRAM bit failures. A new threshold voltage measurement methodology has also been proposed for test structures with high parasitic resistance.
Yasuhiro KONISHI Yasunobu NAKASE Katsushi ASAHINA Makoto TANIGUCHI Michihiro YAMADA
Various I/O interface technologies in today's PC platform are classified into four categories, (1) ASIC (memory Controller) from / to Main Memory, (2) MPU from /to ASIC (Memory Controller), (3) ASIC (Memory Controller) from / to ASIC (Graphic Controller) and (4) ASIC from / to Peripherals. As to Category 1, effectiveness of SSTL is shown in DIMM application of SDRAM and DDR SDRAM over 100 MHz frequency. Furthermore a comparison is made between SLDRAM and D- RDRAM from the technology point of view. Concerning Categories 2 through 4, several interfaces such as PCI, AGP, GTL, HSTL and LVDS are reviewed. Interface technologies will keep playing an important role since computer systems require higher and higher speeds.
Fukashi MORISHITA Yasuo YAMAGUCHI Takahisa EIMORI Toshiyuki OASHI Kazutami ARIMOTO Yasuo INOUE Tadashi NISHIMURA Michihiro YAMADA
It is confirmed by simulation that SOI-DRAMs can be operated at high speed by using the floating body structures. Several floating body effects are analyzed. It is described that the dynamic retention characteristics are not dominated by capacitive coupling and hole redistribution. And it is described that those characteristics are determined by the leakage current in the small pn-junction region of the floating body. Reducing pn junction leakage current is important for realizing a long retention time. If the pn junction leakage is suppressed to 10-18 A/µm, a dynamic retention time of 520 sec at a VBSG of 0.5 V can be achieved at 27. On those conditions, the refresh current of SOI-DRAMs is reduced by 54% compared with bulk-Si DRAMs.
Shoji OTAKA Ryuichi FUJIMOTO Hiroshi TANIMOTO
A direct conversion transmitter IC including a proposed frequency doubler, a quadrature modulator, and a 3-bit variable attenuator was fabricated using BiCMOS technology with fT of 12 GHz. This architecture employing frequency doubler is intended for realizing wireless terminals that are low in cost and small in size. The architecture is effective for reducing serious interference between PA and VCO by making the VCO frequency different from that of PA. The proposed frequency doubler comprises a current-driven 90 phase-shifter and an ECL-EXOR circuit for both low power operation and wide input power range of local oscillator (LO). The proposed frequency doubler keeps high output power even when rectangular wave from LO is applied owing to use of the current-driven 90 phase-shifter instead of a voltage-driven 90 phase-shifter. An LO leakage of less than -25 dBc, an image rejection ratio in excess of 45 dBc, and a maximum attenuation of 21 dB were measured. The transmitter IC successfully operates at LO power above -15 dBm and consumes 68 mA from 2.7 V power supply voltage. An active die size is 1.5 mm3 mm.
Hiroyuki KURINO Keiichi HIRANO Taizo ONO Mitsumasa KOYANAGI
We describe a new multiport memory which is called Shared DRAM (SHDRAM) to overcome bus-bottle neck problem in parallel processor system with shared memory. The processors are directly connected to this SHDRAM without conventional common bus. The test chip with 32 kbit memory cells is fabricated using a 1. 5 µm CMOS technology. The basic operation is confirmed by the circuit simulation and experimental results. In addition, it is confirmed by the computer simulation that the system performance with SHDRAM is superior to that with conventional common buses.
Koji INOUE Koji KAI Kazuaki MURAKAMI
Merged DRAM/logic LSIs could provide high on-chip memory bandwidth by interconnecting logic portions and DRAM with wider on-chip buses. For merged DRAM/logic LSIs with the memory hierarchy including cache memory, we can exploit such high on-chip memory bandwidth by means of replacing a whole cache line (or cache block) at a time on cache misses. This approach tends to increase the cache-line size if we attempt to improve the attainable memory bandwidth. Larger cache lines, however, might worsen the system performance if programs running on the LSIs do not have enough spatial locality of references and cache misses frequently take place. This paper describes a novel cache architecture suitable for merged DRAM/logic LSIs, called variable line-size cache or VLS cache, for resolving the above-mentioned dilemma. The VLS cache can make good use of the high on-chip memory bandwidth by means of larger cache lines and, at the same time, alleviate the negative effects of larger cache-line size by partitioning each large cache line into multiple sub-lines and allowing every sub-line to work as an independent cache line. The number of sub-lines involved when a cache replacement occurs can be determined depending on the characteristics of programs. This paper also evaluates the cost/performance improvements attainable by the VLS cache and compares it with those of conventional cache architectures. As a result, it is observed that a VLS cache reduces the average memory-access time by 16. 4% while it increases the hardware cost by only 13%, compared to a conventional direct-mapped cache with fixed 32-byte lines.
Koji KAI Akihiko INOUE Taku OHSAWA Kazuaki MURAKAMI
In merged DRAM/logic LSIs, the DRAM portion could suffer from shorter data retention time because of heat and noise caused by the logic portion. In order to reconsider the DRAM data retention characteristics, this paper formulates and evaluates the performance degradation due to conflicts between normal DRAM accesses and refresh operations. Next, this paper proposes a new DRAM refresh architecture which intends to reduce unnecessary refreshes. This architecture exploits multiple refresh periods. Each row is refreshed with the most appropriate period of them. Reducing the number of refreshes improves the accessibility to DRAM. It is shown that the method reduces the number of refreshes and the degree of the performance degradation of the logic portion.
Taku OHSAWA Koji KAI Kazuaki MURAKAMI
In merged DRAM/logic LSIs, it is necessary to reduce the number of DRAM refreshes because of higher heat dissipation caused by the logic portion on the same chip. In order to overcome this problem, we propose several DRAM refresh architectures. The basic is to eliminate unnecessary DRAM refreshes. In addition to this, we propose a method for reducing the number of DRAM refreshes by relocating data. In order to evaluate these architectures and method, we have estimated the DRAM refresh count in executing benchmark programs under several models which simulate each combination of them. As a result, in the most effective combination, we have obtained more than 80% reduction against a conventional DRAM refresh architecture for most of benchmark programs. In addition to it, we have taken normal DRAM access into account, even then we have obtained more than 50% reduction for several benchmarks.
Tetsuo ENDOH Katsuhisa SHINMEI Hiroshi SAKURABA Fujio MASUOKA
This paper describes the analysis of the Stacked-Surrounding Gate Transistor (S-SGT) DRAM for the high speed and low voltage operation. The S-SGT DRAM is based on the new three dimensional (3D)-building memory array technology. In terms of the bit-lines signal voltage for read operation, it is found that the signal voltage of the S-SGT DRAM is larger than that of the conventional planar DRAM, the NAND-structured DRAM, and the SGT DRAM. The signal voltage of the S-SGT DRAM was found to depend on the pillar radius, the distance between the bit-line and the substrate, and the number of cells connected to one bit-line in comparison with the above three kinds of conventional DRAMs. Especially, with reducing the pillar radius (R), the signal voltage of the S-SGT DRAM becomes larger. In the concrete, in case that R is 0. 25 µm, the signal voltage of the S-SGT DRAM is about 160%, 160% and 120% in comparison with the planar DRAM, the SGT DRAM and the NAND-structured DRAM, respectively. Therefore, the S-SGT DRAM can realize larger S/N ratio. This advantage can realize the high speed and low voltage operation. Moreover, in case that the signal voltage is constant (0.15 V), the maximum number of cells connected to one bit-line for the S-SGT DRAM is about 2 times in comparison with the planar DRAM. This advantage makes it possible to reduce the number of both sense amplifiers and bit-lines. This is very suitable for reducing the total chip size of the S-SGT DRAM. Above all, it was found that the S-SGT DRAM is one of candidates for the high speed and low voltage operation DRAM in the future.
Yoshiharu AIMOTO Tohru KIMURA Yoshikazu YABE Hideki HEIUCHI Youetsu NAKAZAWA Masato MOTOMURA Takuya KOGA Yoshihiro FUJITA Masayuki HAMADA Takaho TANIGAWA Hajime NOBUSAWA Kuniaki KOYAMA
We have developed a parallel image processing RAM (PIP-RAM) which integrates a 16-Mb DRAM and 128 processor elements (PEs) by means of 0. 38-µm CMOS 64-Mb DRAM process technology. It achieves 7. 68-GIPS processing performance and 3. 84-GB/s memory bandwidth with only 1-W power dissipation (@ 30-MHz), and the key to this performance is the DRAM design. This paper presents the key circuit techniques employed in the DRAM design: 1) a paged-segmentation accessing scheme that reduces sense amplifier power dissipation, and 2) a clocked low-voltage-swing differential-charge-transfer scheme that reduces data line power dissipation with the help of a multi-phase synchronization DRAM control scheme. These techniques have general importance for the design of LSIs in which DRAMs and logic are tightly integrated on single chips.
In a secure partially blind signature scheme, the signer assures that the blind signatures issued by him contains the information he desires. The techniques make it possible to minimize the unlimited growth of the bank's database which storing all spent electronic cash in an anonymous electronic cash system. In this paper we propose an efficient partially blind signature scheme for electronic cash. In our scheme, only several modular additions and modular multiplications are required for a signature requester to obtain and verify a signature. It turns out that the proposed scheme is suitable for mobile clients and smart-card applications because no time-consuming computations are required, such as modular exponentiation and inverse computations. Comparing with the existing blind signature schemes proposed in the literatures, our method reduces the amount of computations for signature requesters by almost 98%.
Multi-recast techniques make it possible for a voter to participate in a sequence of different designated votings by using only one ticket. In a multi-recastable ticket scheme for electronic voting, every voter of a group can obtain an m-castable ticket (m-ticket), and through the m-ticket, the voter can participate in a sequence of m different designated votings held in this group. The m-ticket contains all possible intentions of the voter in the sequence of votings, and in each of the m votings, a voter casts his vote by just making appropriate modifications to his m-ticket. The authority cannot produce both the opposite version of a vote cast by a voter in one voting and the succeeding uncast votes of the voter. Only one round of registration action is required for a voter to request an m-ticket from the authority. Moreover, the size of such an m-ticket is not larger than that of an ordinary vote. It turns out that the proposed scheme greatly reduces the network traffic between the voters and the authority during the registration stages in a sequence of different votings, for example, the proposed method reduces the communication traffic by almost 80% for a sequence of 5 votings and by nearly 90% for a sequence of 10 votings.
Akira YAMAZAKI Tadato YAMAGATA Yutaka ARITA Makoto TANIGUCHI Michihiro YAMADA
The features for the integration of 1Tr/1C DRAM and logic for graphic and multimedia applications are surveyed. The key circuit/process technology for large scale embedded DRAM cores is described. The methods to improve transistor performance and gate density are shown. Noise immunity design and easy customization techniques are also introduced.