IEICE global.ieice.org Site

Keyword Search Result

[Keyword] DDR(124hit)

101-120hit(124hit)

Memory Allocation Method for Indirect Addressing DSPs with ±2 Update Operations
Nakaba KOGURE Nobuhiko SUGINO Akinori NISHIHARA

PAPER

Vol:
E81-A No:3
Page(s):
420-428
Digital signal processors (DSPs) usually employ indirect addressing using an address register (AR) to indicate their memory addresses, which often introduces overhead codes in AR updates for next memory accesses. In this paper, the AR update scheme is extended such that an address can be efficiently modified by ±2 in addition to conventional ±1 updates. An automatic address allocation method of program variables for this new addressing model is presented. The method formulates program variables and AR modifications by a graph, and extracts a maximum chained triangle graph, which is accessed only by AR ±1 and ±2 operations, so that the estimated number of overhead codes is minimized. The proposed methods are applied to a DSP compiler, and memory allocations derived for several examples are compared with memory allocations by other methods.
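The overhead that this abstract refers to can be illustrated with a minimal sketch. It assumes a single AR and counts one overhead code whenever two consecutive accesses are further apart in memory than the free update range. The function name `count_overhead`, the `layout` mapping and the sample access sequence are hypothetical and are not taken from the paper.

```python
def count_overhead(access_sequence, layout, max_step=1):
    """Count accesses whose AR move exceeds the free update range.

    layout maps each variable name to its memory address (assumed model).
    max_step=1 models a conventional AR; max_step=2 models the extended one.
    """
    overhead = 0
    for prev, curr in zip(access_sequence, access_sequence[1:]):
        if abs(layout[curr] - layout[prev]) > max_step:
            overhead += 1  # an extra instruction is needed to adjust the AR
    return overhead


# Hypothetical access sequence and address allocation.
seq = ["a", "b", "c", "a", "c", "b"]
layout = {"a": 0, "b": 1, "c": 2}

print(count_overhead(seq, layout, max_step=1))
print(count_overhead(seq, layout, max_step=2))
```

An allocation method in this setting would search over `layout` to minimize the returned count; the sketch only evaluates a given allocation.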
Design of a New Multicast Addressing Scheme for Self-Routing ATM Tree Networks
Jin-Seek CHOI Kye-Sang LEE Soo-Hyeon SOHN

PAPER-Multicasting in ATM switch

Vol:
E81-B No:2
Page(s):
297-299
In this paper, we propose a new multicast address scheme based on bit map address (BA) and vertex isolation address (VIA) schemes. The proposed scheme can be utilized by the self-routing switch in a speedy manner, while preserving the multicast capability. We analyze the processing delay of the proposed scheme and show its efficiency.
An Address-Based Queue Mechanism for Shared Buffer ATM Switches with Multicast Function
Hiroshi INAI Jiro YAMAKITA

LETTER-Switching and Communication Processing

Vol:
E81-B No:1
Page(s):
104-106
The address-based queues are widely used in shared buffer ATM switches to guarantee the order of the cell delivery. In this paper, we propose an address-based queue mechanism to achieve an efficient use of the shared memory under a multicast service. In the switch, both cells and the address queues share the common memory. Each queue length changes flexibly according to the number of the stored cells. Our approach significantly reduces the cell loss probability as compared with the previously proposed approaches.
A Switched Virtual-GND Level Technique for Fast and Low Power SRAM's
Nobutaro SHIBATA

PAPER-Integrated Electronics

Vol:
E80-C No:12
Page(s):
1598-1607
Fast and low-power circuit techniques suitable for size-configurable SRAM macrocells are described. An SRAM cell architecture using virtual-GND lines along bitlines is proposed; each virtual-GND line switches the potential by inner read-enable and column-address-decoded signals. Reducing the active power dissipation in the memory array and shortening the time for writing data are simultaneously accomplished. The range of available supply voltages is enhanced by adoptive higher virtual-GND level control with a simple voltage limiter. An SRAM-macrocell test chip is designed and fabricated with 0.5-µm CMOS technology. A 4K-word × 6-bit organization SRAM demonstrates 186-MHz operation at a 3.3-V typical power supply. Its power dissipation at a practical operating frequency, 100-MHz, is reduced to 29% (25-mW) by the proposed virtual-GND line techniques.
DSP Code Optimization Methods Utilizing Addressing Operations at the Codes without Memory Accesses
Nobuhiko SUGINO Hironobu MIYAZAKI Akinori NISHIHARA

PAPER-Digital Signal Processing

Vol:
E80-A No:12
Page(s):
2562-2571
Many digital signal processors (DSPs) employ indirect addressing using address registers (ARs) to indicate their memory addresses, which often leads to overhead. This paper presents methods to efficiently allocate addresses for variables in a given program so that overhead in AR update operations is reduced. The memory addressing model is generalized in such a way that an AR can be updated at the codes without memory accesses. An efficient memory address allocation is obtained by a method based on the graph linearization algorithm, which takes account of the number of possible AR update operations for every memory access. In order to utilize multiple ARs, methods to assign variables into ARs are also investigated. The proposed methods are applied to the compiler for µPD77230 (NEC), and generated codes for several examples demonstrate the effectiveness of these methods.
The Stability of Randomly Addressed Polling Protocol
Jiang-Whai DAI

PAPER-Communication protocol

Vol:
E80-B No:10
Page(s):
1502-1508
In this paper, we first prove that the Randomly Addressed Polling (RAP) protocol is unstable under the random access channel with heavy traffic. We also show that network stability can be ensured by controlling the arrival rate λ or by expanding the available addresses p on the assumption that there are M finite stations within the coverage of the controller (the base station). From analyses and results, we see the equilibrium of arrival rate is inversely proportional to the product of users (stations) and the exponent of stations. We also see that the maximum throughput can be derived at the point of λ1/M. This maximum performance can be easily obtained under the consideration of RAP protocol's stability. It also implies that the maximum throughput is independent of the available addresses of RAP protocol when pM.
CAM-Based Highly-Parallel Image Processing Hardware
Takeshi OGURA Mamoru NAKANISHI

INVITED PAPER

Vol:
E80-C No:7
Page(s):
868-874
This paper describes content addressable memory (CAM) -based hardware that serves as a highly parallel, compact and real-time image-processing system. The novel concept of a highly-parallel integrated circuits and system (HiPIC), in which a large-capacity CAM tuned for parallel data processing is a key element, is introduced. Several hardware algorithms for highly-parallel image processing based on a HiPIC with a CAM are presented in order to demonstrate that the HiPIC concept is effective for compact and real-time image processing. Two kinds of HiPIC-dedicated CAM have been developed. One is embedded on a 0.5-µm CMOS gate array. An embedded CAM up to 64 kbit and logic up to 40 kgate can be integrated on a single chip. The other is a 0.5-µm CMOS full-custom CAM LSI tuned for parallel data processing. A fully-parallel 336-kbit CAM LSI has been successfully developed. The HiPIC concept and CAM-based hardware described here promises to be an important step towards the realization of a compact and real-time image-processing system.
Hiding Data Cache Latency with Load Address Prediction
Toshinori SATO Hiroshige FUJII Seigo SUZUKI

PAPER-Computer Systems

Vol:
E79-D No:11
Page(s):
1523-1532
A new prediction method for the effective address is presented. This method works with the buffer named the address prediction buffer, and allows the data cache to be accessed speculatively. As a consequence of the trend toward increasing clock frequency, the internal cache is no longer able to fill the speed gap between the processor and the external memory, and the data cache latency degrades the processor performance. In order to hide this latency, the prediction method is proposed. By this method, the load address is predicted, and the data is fetched earlier than the memory access stage. In the case that the prediction is correct, the latency is hidden. Even if the prediction is incorrect, the performance is not degraded by any miss penalties. We have found that the prediction accuracy is 81.9% on average, and thus the performance is improved by 6.6% on average and a maximum of 12.1% for the integer programs.
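As an illustration of how an address prediction buffer can work, the following is a minimal sketch. The abstract does not state the prediction rule, so the last-address-plus-stride rule, the class name `AddressPredictionBuffer`, the unbounded table and the sample trace are all assumptions made for this example.

```python
class AddressPredictionBuffer:
    """Toy predictor indexed by the load instruction's PC (assumed design)."""

    def __init__(self):
        # load PC -> (last effective address, last observed stride)
        self.table = {}

    def predict(self, pc):
        """Return the predicted address, or None if the load is unknown."""
        entry = self.table.get(pc)
        if entry is None:
            return None
        last, stride = entry
        return last + stride

    def update(self, pc, actual):
        """Record the real address once it has been computed."""
        entry = self.table.get(pc)
        stride = actual - entry[0] if entry else 0
        self.table[pc] = (actual, stride)


def accuracy(trace):
    """Fraction of loads in (pc, address) pairs that are predicted correctly."""
    buf = AddressPredictionBuffer()
    hits = 0
    for pc, addr in trace:
        if buf.predict(pc) == addr:
            hits += 1  # a speculative cache access would have been useful
        buf.update(pc, addr)
    return hits / len(trace)


# Hypothetical trace: one load walking an array with a 4-byte stride.
trace = [(0x40, 100), (0x40, 104), (0x40, 108), (0x40, 112)]
print(accuracy(trace))
```

A hardware buffer would be finite and would need a replacement policy; the dictionary here ignores that.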
DSP Code Optimization Utilizing Memory Addressing Operation
Nobuhiko SUGINO Satoshi IIMURO Akinori NISHIHARA Nobuo FUJII

PAPER

Vol:
E79-A No:8
Page(s):
1217-1224
In this paper, DSPs whose memory addresses are pointed to by special-purpose registers (address registers: ARs) are assumed, and methods to derive an efficient memory access pattern for those DSPs are proposed. In such DSPs, programmers must take care of efficient allocation of memory space as well as effective use of registers, in order to derive an efficient program in the sense of execution period. In this paper, memory addresses and AR update operations are modeled by an access graph, and a novel memory allocation method is presented. This method removes cycles and forks in a given access graph, and decides an address location of variables in memory space with less overhead. In order to utilize multiple ARs, methods to assign variables into ARs are investigated. The proposed methods are applied to the compiler for DSP56000 and are proved to be effective by generated codes for several examples.
A CAM-Based Parallel Fault Simulation Algorithm with Minimal Storage Size
Shinsuke OHNO Masao SATO Tatsuo OHTSUKI

PAPER

Vol:
E78-A No:12
Page(s):
1755-1764
CAMs (Content Addressable Memories) are functional memories which have functions such as word-parallel equivalence search, bilateral 1-bit data shifting between consecutive words, and word-parallel writing. Since CAMs can be integrated because of their regular structure, massively parallel CAM functions can be executed. Taking advantage of CAMs, Ishiura and Yajima have proposed a parallel fault simulation algorithm using a CAM. This algorithm, however, requires a large amount of CAM storage to simulate large-scale circuits. In this paper, we propose a new massively parallel fault simulation algorithm requiring less CAM storage, and compare it with Ishiura and Yajima's algorithm. Experimental results of the algorithm on CHARGE --the CAM-based hardware engine developed in our laboratory--are also reported.
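The CAM functions named in this abstract (word-parallel equivalence search and word-parallel writing) can be modeled in software as a minimal sketch. The class `CAM`, its method names and the mask convention (a 1 bit means "compare this bit") are assumptions for illustration; a real CAM evaluates every word at once in hardware, whereas this model loops over the words.

```python
class CAM:
    """Software model of a content addressable memory (illustrative only)."""

    def __init__(self, words, width):
        self.width = width
        self.words = list(words)

    def search(self, key, mask=None):
        """Return one match flag per word; only bits set in mask are compared."""
        if mask is None:
            mask = (1 << self.width) - 1  # compare all bits
        return [((w ^ key) & mask) == 0 for w in self.words]

    def write_matching(self, flags, value):
        """Write value into every word whose flag is set."""
        self.words = [value if f else w for w, f in zip(self.words, flags)]


cam = CAM([0b1010, 0b1000, 0b0110], width=4)
print(cam.search(0b1010))               # exact match on all four bits
flags = cam.search(0b1000, mask=0b1100)  # compare the upper two bits only
cam.write_matching(flags, 0b0000)
print(cam.words)
```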
MFSK/FH-CDMA System with Two-Stage Address Coding and Error Correcting Coding and Decoding
Weidong MAO Ryuji KOHNO Hideki IMAI

PAPER

Vol:
E78-A No:9
Page(s):
1117-1126
In this paper we propose a two-stage address coding scheme to transmit two data symbols at once within a frame in an MFSK/FH-CDMA system. We compare it with the conventional system using single-stage address coding. It is assumed that the address codes of all users are known at the receiver. A multiuser detection scheme is applied and the performance is evaluated by computer simulations to show the improvement in bit error rate (BER) compared to the conventional system. We also investigate the performance of error-correcting coding and decoding in the two-stage address coded MFSK/FH-CDMA system. An erasure decoding scheme is modified for the two-stage address coded system and is utilized to improve spectral efficiency or to increase user capacity in the MFSK/FH-CDMA system. Finally, we investigate a hybrid scheme combining the multiuser detection scheme and the error-correcting decoding scheme for the two-stage address coded MFSK/FH-CDMA system. The performance is evaluated by computer simulations.
Light Scattering and Reflection Properties in Polymer Dispersed Liquid Crystal Cells with Memory Effects
Rumiko YAMAGUCHI Susumu SATO

PAPER-Electronic Displays

Vol:
E78-C No:1
Page(s):
106-110
Memory type polymer dispersed liquid crystal (PDLC) can be applied to a thermal addressing display device cell. Making use of its easy fabrication of large area display using flexible film substrate, the PDLC film can be used as reusable paper for direct-view mode display. In this study, memory type PDLC cells are prepared with an aluminum reflector deposited onto one side of the substrate and the reflection property in the PDLC cell with the reflector is clarified and compared to that without the reflector in the off-, on- and memory-states. The increase of contrast ratio and the decrease of driving voltage can be concurrently realized by decreasing the cell thickness by attaching the reflector. In addition, the reflected light in the off-state is bright and colorless due to the reflector, as compared with the weak, bluish reflected light in the cell without the reflector. Reflected light in the on-state and the memory-state are tinged with blue.
A Flexible Search Managing Circuitry for High-Density Dynamic CAMs
Takeshi HAMAMOTO Tadato YAMAGATA Masaaki MIHARA Yasumitsu MURAI Toshifumi KOBAYASHI Hideyuki OZAKI

PAPER-General Technology

Vol:
E77-C No:8
Page(s):
1377-1384
New circuit techniques are proposed to realize a high-density and high-performance content addressable memory (CAM). A dynamic register which functions as a status flag and some logic circuits are organically combined and flexibly perform complex search operations, despite the compact layout area. Any kind of logic operation on the search results, namely AND, OR, INVERT, and combinations of them, can be implemented in every word simultaneously. These circuits are implemented in an experimental 288 kbit dynamic CAM using 0.8 µm CMOS process technology. We consider these techniques to be indispensable for high-density and high-performance dynamic CAM.
Design of a CAM-Based Collision Detection VLSI Processor for Robotics
Masanori HARIYAMA Michitaka KAMEYAMA

PAPER

Vol:
E77-C No:7
Page(s):
1108-1115
Real-time collision detection is one of the most important intelligent processing tasks in robotics. In collision detection, a large storage capacity is usually required to store the 3-dimensional information on the obstacles located in a workspace. Moreover, high computational power is essential not only in coordinate transformation but also in matching operation. In the proposed collision detection VLSI processor, the matching operation is drastically accelerated by using a content-addressable memory (CAM). A new obstacle representation based on a union of rectangular solids is also used to reduce the obstacle memory capacity, so that the collision detection can be performed by only magnitude comparison in parallel. Parallel architecture using several identical processor elements (PEs) is employed to perform the coordinate transformation at high speed, and each PE performs coordinate transformation at high speed based on the COordinate Rotation DIgital Computation (CORDIC) algorithms. When the 16 PEs and 144-kb CAM are used, the performance is evaluated to be 90 ms.
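The CORDIC-based coordinate transformation mentioned here can be sketched for the 2-D rotation case. This is the textbook rotation-mode CORDIC iteration, written with floating-point arithmetic for readability; a hardware PE would use fixed-point shifts and adds, and the abstract does not describe the processor's actual datapath. The function name `cordic_rotate` and the iteration count are assumptions.

```python
import math


def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by angle (radians, roughly |angle| <= 1.74) using CORDIC."""
    # Elementary rotation angles atan(2^-i) and the accumulated gain.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # In hardware the multiplications by 2^-i are right shifts.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x / gain, y / gain


print(cordic_rotate(1.0, 0.0, math.pi / 4))
```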
Datagram Delivery in an ATM-Internet
Hiroshi ESAKI Yoshiyuki TSUDA Takeshi SAITO Shigeyasu NATSUBORI

PAPER

Vol:
E77-B No:3
Page(s):
314-326
This paper proposes a datagram delivery (class D service) architecture in an ATM-Internet, which is the network interconnecting ATM-LANs through Inter-Working Units (IWUs). We can provide a fast datagram delivery system through the following techniques. The datagram delivery to the destination terminal is performed by the datagram delivery server, the so-called CLS, which is located in the ATM-LAN to which the destination terminal belongs. Each CLS only manages the addresses for the terminals belonging to the corresponding ATM-LAN. The cells belonging to a certain datagram are transferred through a single (seamless) ATM connection from the source terminal to the CLS in the ATM-LAN to which the destination terminal belongs. The source terminal only resolves the access point address corresponding to the ATM-LAN to which the destination terminal belongs when it submits the cells to the network to transfer the datagram to the corresponding destination terminal. The proposed datagram delivery architecture can easily be applied to an ATM-LAN system based on the VPI routing architecture. The number of ATM connections required to provide datagram delivery through the proposed architecture is less than 1.0% of the ATM connections that the ATM-Internet can provide. Also, the address space required at the UNI to provide datagram delivery is less than 1.0% of the UNI address space available for use as an ATM connection identifier.
A Collision Detection Processor for Intelligent Vehicles
Masanori HARIYAMA Michitaka KAMEYAMA

PAPER

Vol:
E76-C No:12
Page(s):
1804-1811
Since carelessness in driving can cause terrible traffic accidents, it is important for a vehicle to avoid collisions autonomously. Real-time collision detection between a vehicle and obstacles will be a key target for the next-generation car electronics system. In collision detection, a large storage capacity is usually required to store the 3-D information on the obstacles located in a workspace. Moreover, high computational power is essential not only in coordinate transformation but also in matching operation. In the proposed collision detection VLSI processor, the matching operation is drastically accelerated by using a Content-Addressable Memory (CAM) which evaluates the magnitude relationships between an input word and all the stored words in parallel. A new obstacle representation based on a union of rectangular solids is also used to reduce the obstacle memory capacity, so that the collision detection can be performed only by parallel magnitude comparison. Parallel architecture using several identical processor elements (PEs) is employed to perform the coordinate transformation at high speed based on the COordinate Rotation DIgital Computation (CORDIC) algorithms. The collision detection time becomes 5.2 ms using 20 PEs and five CAMs with a 42-kbit capacity.
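The idea of detecting collisions purely by magnitude comparison against a union of rectangular solids can be sketched as follows. The data layout (each solid as three `(low, high)` coordinate ranges), the function name `collides` and the sample obstacles are assumptions for illustration; the CAM in the paper performs the comparisons against all stored words in parallel, while this model checks them one after another.

```python
def collides(point, solids):
    """Return True if the 3-D point lies inside any rectangular solid.

    Each solid is ((x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi)); the test uses
    only magnitude comparisons, one pair per coordinate.
    """
    return any(
        all(lo <= c <= hi for c, (lo, hi) in zip(point, solid))
        for solid in solids
    )


# Hypothetical obstacle map: two axis-aligned solids.
obstacles = [
    ((0, 2), (0, 2), (0, 1)),
    ((5, 6), (1, 4), (0, 3)),
]

print(collides((1, 1, 0.5), obstacles))
print(collides((3, 3, 3), obstacles))
```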
The Trend of Functional Memory Development
Keikichi TAMARU

INVITED PAPER

Vol:
E76-C No:11
Page(s):
1545-1554
The concept of functional memory was proposed nearly four decades ago. However, actually usable products did not appear until the 1980s despite the long history of development. Functional memory is classified into three categories: a general functional memory, a processing element array with small-size memory, and a special-purpose memory. Today a majority of functional memory is an associative memory or a content addressable memory (CAM) and a special-purpose memory based on CAM. Due to advances in fabrication capability, the capacity of CAM LSI has increased to over 100 K bits. General-purpose CAMs have been developed based on the SRAM cell and the DRAM cell, respectively. The typical CAM LSIs of both types, a 20 K bit SRAM-based CAM and a 288 K bit DRAM-based CAM, are introduced. DRAM-based CAM is attractive for its large capacity. A parallel processor architecture based on the CAM cell is proposed, which is called a Functional Memory Type Parallel Processor (FMPP). Its basic feature is the dual character of a higher-performance CAM and a tiny processor array. It can perform a highly parallel operation on the stored data.
A Bitline Control Circuit Scheme and Redundancy Technique for High-Density Dynamic Content Addressable Memories
Tadato YAMAGATA Masaaki MIHARA Takeshi HAMAMOTO Yasumitsu MURAI Toshifumi KOBAYASHI Michihiro YAMADA Hideyuki OZAKI

PAPER-Application Specific Memory

Vol:
E76-C No:11
Page(s):
1657-1664
This paper describes a bitline control circuit and redundancy technique for high-density dynamic content addressable memories (CAMs). The proposed bitline control circuit can efficiently manage a dynamic CAM cell accompanied by complex operations; that is, a refresh operation, a masked search operation, and partial writing, in addition to normal read/write/search operations. By adding a small supplementary circuit to the bitline control circuit, a circuit scheme with redundancy which prevents disabled column circuits from affecting a match operation can also be obtained. These circuit technologies achieve higher-density dynamic CAMs than conventional static CAMs. These technologies have been successfully applied to a 288-kbit CAM with a typical cycle time of 150 ns.
A High-Density Multiple-Valued Content-Addressable Memory Based on One Transistor Cell
Satoshi ARAGAKI Takahiro HANYU Tatsuo HIGUCHI

PAPER-Application Specific Memory

Vol:
E76-C No:11
Page(s):
1649-1656
This paper presents a high-density multiple-valued content-addressable memory (MVCAM) based on a floating-gate MOS device. In the proposed CAM, a basic operation performed in each cell is a threshold function that is a kind of inverter whose threshold value is programmable. Various multiple-valued operations for data retrieval can be easily performed using threshold functions. Moreover, each cell circuit in the MVCAM can be implemented using only a single floating-gate MOS transistor. As a result, the cell area of the four-valued CAM is reduced to 37% in comparison with that of the conventional dynamic CAM cell.
Hardware Architecture for Kohonen Network
Hidetoshi ONODERA Kiyoshi TAKESHITA Keikichi TAMARU

PAPER-Neural Networks and Chips

Vol:
E76-C No:7
Page(s):
1159-1166
We propose a fully digital architecture for Kohonen network suitable for VLSI implementation. The proposed architecture adopts a functional memory type parallel processor (FMPP) architecture which has a structure similar to a content addressable memory (CAM). One word of CAM is regarded as a processing element and a group of elements forms a neuron. All processing elements execute the same operation in bit-serial but in processor-parallel. Thus the number of instructions for realizing the network algorithm is independent of the number of neurons in the network. With reference to a previously reported CAM, we estimate a network with 96 neurons for speech recognition could be integrated on three chips using a 1.2 µm process, and it operates 50 times faster than a sequential hardware. Owing to its highly regular structure of memories, the proposed hardware architecture is well compatible with current VLSI technology.