Tsuyoshi KAWAGUCHI Yoshinori TAMURA Kouichi UTSUMIYA
The linear array processor architecture is an important class of interconnection structures that are suitable for VLSI. In this paper we study the problem of mapping a task tree onto a linear array to minimize the total execution time. First, an optimization algorithm is presented for a message scheduling problem which occurs in the task tree mapping problem. Next, we give a heuristic algorithm for the task tree mapping problem. The algorithm partitions the node set of a task tree into clusters and maps these clusters onto processors. Simulation experiments showed that the proposed algorithm is much more efficient than a conventional algorithm.
Kunio SAKAKIBARA Jiro HIROKAWA Makoto ANDO Naohisa GOTO
Resonant slots are widely used in conventional slotted waveguide arrays. Reflection from each slot causes a standing wave in the waveguide, and a beam tilting technique is essential to suppress the reflection at the antenna input port. However, the slot reflection narrows the overall frequency bandwidth, and a design that takes it into account is complicated. This paper proposes a reflection cancelling slot pair as an array element, which consists of two slots spaced by 1/4λg. The round trip path-length difference between them is 1/2λg, so the reflected waves from a pair cancel and traveling-wave excitation in the waveguide is realized. The full wave analysis reveals that mutual coupling between paired slots is large and seriously reduces the radiation from a pair. An offset arrangement of the slots in a pair is recommended to decrease the mutual coupling and to realize strong coupling. In practical array design, the mutual couplings from other pairs were simulated by imposing periodic boundary conditions above the aperture. To clarify the advantages of the slot pair over a conventional resonant slot, the predicted characteristics are compared. The reflection characteristics of the array using the slot pair are excellent and a boresight beam array can be realized. In addition, a slot pair can realize stronger coupling than the conventional resonant slot, while the bandwidth of the former in terms of the aperture field phase illumination is narrower than that of the latter. These results suggest that the slot pair array is much more suitable for a small array than the conventional one. Finally, the predicted characteristics are confirmed by experiments.
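The cancellation argument in the abstract above can be checked with a short numerical sketch. It assumes two identical, weakly reflecting slots and ignores mutual coupling and multiple reflections, which the abstract itself reports as significant in practice. The function name `pair_reflection` and the reflection value 0.1 are illustrative assumptions, not taken from the paper.

```python
import cmath
import math

def pair_reflection(spacing_in_wavelengths, slot_reflection=0.1):
    """Sum of the reflections from two identical slots, referenced to the first slot.

    Hypothetical single-reflection model: the wave reflected by the second slot
    travels the slot spacing twice, so it is delayed by 2 * beta * d.
    """
    beta_d = 2 * math.pi * spacing_in_wavelengths   # one-way phase delay
    round_trip_phase = 2 * beta_d                   # out to the second slot and back
    return slot_reflection * (1 + cmath.exp(-1j * round_trip_phase))

# Quarter guide-wavelength spacing: round trip is half a wavelength, reflections cancel.
quarter = abs(pair_reflection(0.25))
# Half guide-wavelength spacing: round trip is a full wavelength, reflections add.
half = abs(pair_reflection(0.5))
```

Under these assumptions `quarter` is zero to within rounding error, while `half` equals twice the single-slot reflection.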
Takanori SAEKI Eiichiro KAKEHASHI Hidemitu MORI Hiroki KOGA Kenji NODA Mamoru FUJITA Hiroshi SUGAWARA Kyoichi NAGATA Shozo NISHIMOTO Tatsunori MUROTANI
A design rule relaxation approach is one of the most important requirements for high density DRAMs. The approach relaxes the design rule of an element in comparison with the memory cell size and provides high density DRAMs with the minimum development of a scaled-down MOS structure and a fine patterning lithography process. This paper describes two design rule relaxation approaches, a close-packed folded (CPF) bit-line cell array layout and a Boosted Dual Word-Line scheme. The CPF cell array provides 1.26 times wider active area pitch and maximum 1.5 times wider isolation width. The Boosted Dual Word-Line scheme provides 2n times wider 1st Al pitch on the memory cell array, double word-line driver pitch and 1.5 times larger design rule for 1st Al and contacts under 1st Al. In particular, the wide design rule of the Boosted Dual Word-Line scheme provides several times the depth of focus (DOF) for 1st Al wiring, which allows a several times higher storage node and larger capacitance for capacitor over bit-line (COB) stacked capacitor cells. These approaches are successfully implemented in a 4 Mb DRAM test chip with a 0.9 × 1.8 µm² memory cell.
Shin'ichi TAKEYA Mitsuyoshi SHINONAGA Yoshitaka SASAKI Hiroshi MIYAUCHI Masanori MATSUMURA Tasuku MOROOKA
This paper describes a DBF (Digital Beamforming) technique for spatial filtering in radar systems. DBF for a beamformer and for an adaptive processor is discussed. An architecture for the beamformer is proposed. The beamformer consists of systolic arrays that can form beams arbitrarily. Antenna radiation patterns measured at an open site are shown. For the adaptive processor, the Gram-Schmidt transformation method is implemented using systolic arrays. A means is proposed to prevent target signals from being suppressed in cells of the systolic arrays and to achieve convergence characteristics independent of the magnitude of the undesired signal power. To demonstrate the performance of the proposed processor, a test model of the adaptive processor was developed and tested in an environment with multiple undesired signals. Test results are presented.
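As a rough illustration of the Gram-Schmidt idea mentioned in the abstract above, the following sketch removes from a main channel every component correlated with a set of auxiliary channels. It is a plain batch computation, not the systolic array implementation of the paper, and it does not model the paper's measure against target suppression. The function names and the sample data are assumptions made for illustration.

```python
def inner(u, v):
    """Complex inner product <u, v> with the first argument conjugated."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def gram_schmidt_cancel(main, auxiliaries):
    """Subtract from `main` its projection onto the span of `auxiliaries`.

    Assumes the auxiliary channels are non-zero and linearly independent;
    otherwise a division by zero occurs.
    """
    basis = []
    for aux in auxiliaries:
        residual = list(aux)
        # Decorrelate this auxiliary channel from the ones already processed.
        for q in basis:
            weight = inner(q, residual) / inner(q, q)
            residual = [r - weight * s for r, s in zip(residual, q)]
        basis.append(residual)
    output = list(main)
    # Remove each orthogonalized auxiliary component from the main channel.
    for q in basis:
        weight = inner(q, output) / inner(q, q)
        output = [o - weight * s for o, s in zip(output, q)]
    return output

# Illustrative data: a target signal plus interference that also appears
# in one auxiliary channel.
target = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]
interference = [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]
main_channel = [t + 0.5 * i for t, i in zip(target, interference)]
cleaned = gram_schmidt_cancel(main_channel, [interference])
```

In this example the target is orthogonal to the interference, so `cleaned` recovers the target. When the target is correlated with the auxiliary channels, part of it is removed as well, which is the suppression problem the abstract refers to.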
Masayasu YAMAGUCHI Ken-ichi YUKIMATSU
This paper briefly reviews recent studies on free-space photonic switches, and discusses classifications, applications and technical issues to be solved. The free-space photonic switch is a switch that uses light beam interconnections based on free-space optics instead of guided-wave optics. A feature of the free-space switch is its high-density three-dimensional structure that enables compact large-scale switches to be created. In this paper, the free-space switches are classified by their various attributes such as logical network configuration, path-establishment method, number of physical stages, signal-waveform transmission form, interconnection optics and so on. The logical network configuration (topological geometry or topology) is strongly related to the advantages of the free-space switches over the guided-wave switches. The path-establishment method (path-shifting/branching-and-gating) and the number of physical stages (single-stage/multistage) are related to physical switching characteristics. Signal-waveform transmission form (analog/digital) is related to switch application. Interconnection optics (imaging system/micro-beam system) is related to the density and volume of the switching fabric. Examples of the free-space switches (single-stage, analog multistage, digital multistage and photonic ATM switches) are described. Possible applications for analog switches are subscriber-line concentrators, inter-module connectors, and switching networks for parallel or distributed computer systems. Those for digital switches include multistage space-division switches in time-division circuit-switching or packet switching systems (including asynchronous transfer mode [ATM] switching system) for both communications switching systems and parallel/distributed computer systems. Technical issues of the free-space switches (system, device, assembly technique) must be solved before creating practical systems. 
In particular, the assembly technique is a key issue of the free-space switches.
Hideki HAYASHI Goro SASAKI Hiroshi YANO Naoki NISHIYAMA Michio MURATA
Ultrahigh speed and low crosstalk four-channel receiver optoelectronic integrated circuit (OEIC) arrays comprising GaInAs pin PDs and AlInAs/GaInAs HEMTs have been successfully fabricated on an InP substrate. These arrays were designed to have good crosstalk characteristics, which are the most critical issue in array devices. The resistive-load OEIC arrays exhibited high speed operation up to 5 Gb/s and low crosstalk of less than -38 dB between all adjacent channels over the entire frequency range below 4.0 GHz. The average sensitivity of the resistive-load OEIC arrays was -18.5 dBm at 3 Gb/s for a bit-error-rate of 10⁻⁹ over four channels. Good uniformity of device characteristics was obtained over a 2-inch InP wafer. These results suggest that receiver OEIC arrays are quite promising for application to high-speed multi-channel optical interconnections.
Toshio KONDO Yoshimasa KIMURA Noboru SONEHARA
We have developed an SIMD processor on a double-height VME board. We achieved a good balance between cost and performance by combining four identical gate-array LSIs in the processor array with a 16-bit digital signal processor (DSP), standard dynamic random-access memories (DRAMs) and other peripherals. The gate-array LSIs have 16 8-bit processing elements (PEs), each containing a one-bit processing block and a serial multiplier. This PE structure offers high-level bit processing capability and peak performance of 512 million operations per second (MOPS) for 8-bit multiply and accumulate operations. Effective performance of more than 300 MOPS for 8-bit array data processing is achieved by using an LSI structure tuned to the DRAM access rate, although the processing speed is reduced by the DRAM access bottleneck. The LSIs also have two unique additional hardware structures that speed up various array data processes. One is an inter-PE routing register array for supporting a transmission, rotation and memory access path. The other is a tree-structure network for propagating operations among PEs. With these cost-effective structures, the SIMD processor is expected to be widely used for two-dimensional data processing, such as image processing and pattern recognition.
Mohammed HIMDI Jean-Pierre DANIEL Koichi ITO
A conical beam pattern is well suited for low mobile or maritime mobile antennas used in cheap, low G/T satellite communication systems. Various solutions have already been proposed to generate circularly polarized conical patterns; some authors use a single microstrip patch working on higher order modes [1], [2], while others have built arrays of patches [3]-[5]. The present letter describes the design of an array of slot fed patches with its feed network, and the experimental results obtained in S-band.
This paper presents a parallel sorting algorithm which sorts n elements in O(n/w + n log n/p) time using p (≤ n) processors arranged in a 1-dimensional grid with w (≤ n^(1-ε)) buses for every fixed ε > 0. Furthermore, it is shown that np elements can be sorted in O(n/w + n log n/p) time on p × p (p ≤ n) processors arranged in a 2-dimensional grid with w (≤ n^(1-ε)) buses in each column and in each row. These algorithms are optimal because their time complexities are equal to the lower bounds.
The concept of functional memory was proposed nearly four decades ago. However, despite the long history of development, actually usable products did not appear until the 1980s. Functional memory is classified into three categories: general functional memory, processing element arrays with small memories, and special purpose memory. Today the majority of functional memory is associative memory, also called content addressable memory (CAM), or special purpose memory based on CAM. Due to advances in fabrication capability, the capacity of CAM LSIs has increased to over 100 K bits. General purpose CAMs have been developed based on SRAM cells and on DRAM cells. Typical CAM LSIs of both types, a 20 K bit SRAM based CAM and a 288 K bit DRAM based CAM, are introduced. DRAM based CAM is attractive for its large capacity. A parallel processor architecture based on the CAM cell, called a Functional Memory Type Parallel Processor (FMPP), is proposed. Its basic feature is a dual character as a higher performance CAM and a tiny processor array. It can perform highly parallel operations on the stored data.
Yoshihiko KUWAHARA Toru ISHITA Yoshihiko MATSUZAWA Yasunori KADOWAKI
The monopulse technique is widely used for tracking radars. For tracking at a low elevation angle, a narrow beam is required in the elevation plane to reduce multipath signals such as ground reflections. In this case, an elliptical aperture is desired. We have developed an antenna with a high tracking accuracy and a high aperture efficiency which is composed of a monopulse feed and an elliptical aperture. In this paper we discuss the design of a feed-through lens array with an elliptical aperture and a new monopulse feed. Evaluation test results of a production model proved the validity of our design and showed good performance.
Masataka AJIRO Hiroyuki MIYATA Takashi KAN Masakazu SOGA Makoto ONO
Since its successful launch in February of 1992, the Japan Earth Resources Satellite-1 (JERS-1) has been sending back high resolution images of the earth for various studies, including the investigation of earth resources, the preservation of environments and the observation of coastlines. Currently, received images are processed using the Earth Resources Satellite Data Information System (ERSDIS). The ERSDIS is a high speed image processing system utilizing an extended cellular array processor as its main processing module. The extended cellular array processor (CAP), consisting of 4096 processing elements configured into a two-dimensional array, is designed with many capabilities for optimizing parallel processing, targeting high-speed, large-scale image processing. This paper describes a typical image processing flow, the structure of the ERSDIS, and the details of the CAP design.
Mitsuhisa SATO Masayuki SUGANO Kazuo IKEBA Koichi FUKUTANI Atushi TERADA Tsugio YAMAZAKI
A cylindrical active phased array antenna was developed. A primary surveillance radar (PSR) antenna and a secondary surveillance radar (SSR) antenna are integrated conformally. The PSR antenna employs two-dimensional electronic beam scanning. The SSR antenna employs electronic beam scanning in azimuth. Advantages of this antenna, design architecture employed and measured characteristics are described.
Dao Heng YU Jiyou JIA Shinsaku MORI
In this paper, a definite relation between the TSP's optimal solution and the attracting region in the parameter space of the TSP's energy function is discovered. Many attracting regions related to the global optimal solution of the TSP are found. Then a neural network algorithm with parameters optimized by the Orthogonal Array Table Method is proposed and used to solve the Travelling Salesman Problem (TSP) for 30, 31 and 300 cities and the Map-coloring Problem (MCP). The results are very satisfactory.
Farhad Fuad ISLAM Keikichi TAMARU
High speed multiplication of two n-bit numbers plays an important role in many digital signal processing applications. Traditional array and Wallace multipliers are the most widely used multipliers implemented in VLSI. The area and time (= latency) of these two multipliers depend on the operand bit-size, n. For a particular bit-size, they occupy fixed positions in a graph which has area and time along the x- and y-axes respectively. However, many applications require a multiplier with an 'intermediate' area-time characteristic, with the above two traditional multipliers occupying the two extreme ends of this area-time curve. In this paper, we propose such an intermediate multiplier, which trades off area for time. It has higher speed (i.e., less latency) but more area than a traditional array multiplier, whereas compared with a traditional Wallace multiplier it has lower speed and less area. The attractive point of our multiplier is that it resembles an array multiplier in terms of regularity in the placement and interconnection of unit computation cells. Its interesting feature is that, in contrast to a traditional array multiplier, it computes by introducing multiple computation wave fronts among its computation cells. In this paper, we investigate the area-time complexity of our proposed multiplier and discuss its characteristics in comparison with some contemporary multipliers in terms of latency, area and wiring complexity.
Kazuhiro UEHARA Kenichi KAGOSHIMA
We analyze the mutual coupling between two microstrip antennas (MSAs) with the finite-difference time-domain (FDTD) method. The method is suitable for substrates which have a complex configuration or include feed line structures. The mutual coupling between two MSAs on discontinuous orthogonal substrates is successfully calculated.
Wataru CHUJO Masayuki FUJISE Hiroyuki ARAI Naohisa GOTO
In a two-layer self-diplexing antenna fed at two ports, theoretical analysis has already shown that the isolation characteristics can be improved by adjusting the angle between the feed locations of the transmitting and receiving antennas. In this letter, we experimentally investigate the isolation characteristics of the self-diplexing array antenna. First, calculated and experimental results for each feed location of the element antenna are compared and good agreement is found. Second, experimental results with a 19-element planar array indicate that a self-diplexing antenna with suitably chosen feed configuration is effective in improving the isolation in a phased array antenna.
Masaharu TAKAHASHI Jun-ichi TAKADA Makoto ANDO Naohisa GOTO
A radial line slot antenna (RLSA) is a high gain and high efficiency planar array. A single-layered RLSA (SL-RLSA) is much simpler in structure, but the slot length must be varied to synthesize uniform aperture illumination. These antennas are now commercialized for 12 GHz band DBS reception. In RLSAs, considerable power is dissipated in the termination, as is common to other traveling wave antennas; uniform aperture illumination is therefore not the optimum condition for high gain in RLSAs. The authors have proposed a theoretical method for reducing the termination loss for further efficiency enhancement. This paper presents the measured performance of SL-RLSAs of this design with non-uniform aperture illumination. An efficiency enhancement of about 10% is observed; the measured gains of 36.7 dBi (87%) and 32.9 dBi (81%) for 0.6 mφ and 0.4 mφ antennas, respectively, verify this technique.
This paper presents a hardware architecture design methodology for hidden Markov model (HMM) based recognition systems. With the aim of realizing more advanced and user-friendly systems, an effective architecture has been studied not only for decoding but also for learning, so that the system can adapt itself to the user. Considering real-time decoding and efficient learning procedures, a bi-directional ring array processor is proposed that can handle various kinds of data and perform a large number of computations efficiently using parallel processing. With the array architecture, the HMM sub-algorithms, namely the forward-backward and Baum-Welch algorithms for learning and the Viterbi algorithm for decoding, can be performed in a highly parallel manner. The indispensable HMM implementation techniques of scaling, smoothing, and estimation for multiple observations can also be carried out in the array without disturbing the regularity of parallel processing. Based on the array processor, we propose the configuration of a system that can realize all HMM processes including vector quantization. This paper also shows that a high PE utilization efficiency of about 70% to 90% can be achieved for practical left-to-right type HMMs.
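For readers unfamiliar with the decoding step named in the abstract above, here is a minimal sequential sketch of the Viterbi algorithm for a discrete HMM. It is a textbook formulation, not the ring array mapping of the paper, and it omits the scaling the abstract mentions. The function name, the two-state left-to-right model and its probabilities are all illustrative assumptions.

```python
def viterbi(observations, initial, transition, emission):
    """Return the most likely state path and its probability for a discrete HMM."""
    n_states = len(initial)
    # delta[s]: probability of the best path that ends in state s at the current time.
    delta = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    backpointers = []
    for obs in observations[1:]:
        new_delta, pointers = [], []
        for s in range(n_states):
            # Best predecessor of state s at this time step.
            best_prev = max(range(n_states), key=lambda r: delta[r] * transition[r][s])
            new_delta.append(delta[best_prev] * transition[best_prev][s] * emission[s][obs])
            pointers.append(best_prev)
        delta = new_delta
        backpointers.append(pointers)
    # Trace the best path backwards from the most likely final state.
    state = max(range(n_states), key=lambda s: delta[s])
    best_prob = delta[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_prob

# Hypothetical two-state left-to-right model (state 1 cannot return to state 0).
initial = [1.0, 0.0]
transition = [[0.5, 0.5], [0.0, 1.0]]
emission = [[0.9, 0.1], [0.2, 0.8]]
path, prob = viterbi([0, 0, 1, 1], initial, transition, emission)
```

The inner loop over states is independent for each state, which is the kind of regularity a processor array can exploit. For long observation sequences a practical version would work with logarithms or scaled probabilities to avoid underflow.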
Somchai KITTICHAIKOONKIT Michitaka KAMEYAMA
In applications of the fast Fourier transform (FFT) to real-world computation such as robot vision, high-speed processing with small latency is an important issue. In this paper, we propose a linear array processor for minimum-latency FFT computation. The processor is constructed from identical butterfly elements (BE's). The key concept for minimizing the latency is that each BE generates its output data immediately after its input data become available, with 100% utilization of its arithmetic unit. We also introduce the real-valued FFT to perform the complex-valued FFT. We utilize a double linear array structure so that the parallel processing can be realized without communication between the linear arrays. As a result, the hardware amount of a single BE is reduced to half that of conventional designs. The latency of the proposed FFT processor is greatly reduced in comparison with conventional linear array FFT processors.
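The butterfly operation that each BE performs can be sketched in software as follows. This is a standard recursive radix-2 decimation-in-time FFT, shown only to illustrate the butterfly itself; it does not model the pipelined scheduling, the real-valued FFT decomposition, or the double linear array of the paper. The function names are assumptions, and the input length is assumed to be a power of two.

```python
import cmath

def butterfly(a, b, twiddle):
    """Radix-2 butterfly: combine two inputs using one complex multiplication."""
    t = twiddle * b
    return a + t, a - t

def fft(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])   # transform of the even-indexed samples
    odd = fft(x[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor W_n^k
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out

impulse_spectrum = fft([1, 0, 0, 0])    # an impulse has a flat spectrum
constant_spectrum = fft([1, 1, 1, 1])   # a constant has only a DC component
```

An n-point transform uses (n/2) log2 n such butterflies, so the latency of a hardware design depends largely on how soon each butterfly can start once its two inputs exist.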