1-16hit |
Tadayoshi HORITA Yuuji KATOU Itsuo TAKANAMI
This paper deals with redundant 3D mesh processor arrays using 1.5-track switches, considering track and switch faults together with processor faults. Four variants are defined based on the distributions of spare PEs, and arrays of three variants have the same PE redundancies among them, but the fabrication-time costs are different. We investigate in detail how the reliability of a total system changes according to the reliabilities of tracks and switches as well as PEs, and show the concrete values of Mt and Ms, when the reliability of array are almost the same even if its variant is changed, and when it is not so, respectively, where Mt and Ms are the ratio of the hardware complexities of a PE and a track, and that of a PE and a contact point of a switch, respectively. Other results which are effective basis for the design of fault-tolerant 3D PE arrays using 1.5-TSs are given.
Tadayoshi HORITA Itsuo TAKANAMI
The E-1-track switch torus array model and the "EAR" reconfiguration method are proposed for fault tolerance of mesh or torus-connected processor arrays, where the original idea of EAR is in EAM. The comparison among these and others is described in terms of the (run-time) array reliability, hardware overhead, and/or reconfiguration time. When a designer chooses one among fault tolerant methods, he should consider their features synthetically case by case, and we consider that the results given by this paper are useful for the choice.
Takashi MAEBA Mitsuyoshi SUGAYA Shoji TATSUMI Ken'ichi ABE
This letter presents parallel algorithms for matrix multiplication and the fast Fourier transform (FFT) that are significant problems arising in engineering and scientific applications. The proposed algorithms are designed on a 3-dimensional processor array with separable buses (PASb). We show that a PASb consisting of N N h processors can compute matrix multiplication of size N N and the FFT of size N in O(N/h+log N) time, respectively. In order to examine ease of hardware implementation, we also evaluate the VLSI complexity of the algorithms. A result obtained achieves an optimal bound on area-time complexity when h=O(N/log N).
Noritaka SHIGEI Hiromi MIYAJIMA
This paper considers a reconfiguration problem on a processor array model based on single-and-half-track switches, which is proposed for a fault tolerance technique at the fabrication time. The focus of this paper is to achieve the optimal reconfigurability, which means that whenever there exists a solution for successful reconfiguration, the designed method can find the solution. The paper consists of two parts. In the first part, we show two essential constraints that have been assumed in most of the previous studies, and make four reconfiguration classes that differ in the assumed essential constraints. Then, we present some inclusion relations among the four reconfiguration classes. As a result, it becomes clear that the most restrictive class including most of the previous methods never achieves the truly optimal reconfigurability. In the second part, we present a reconfiguration method based on sequential routing (RMSR). Although the worst-case time complexity of the RMSR is exponential in the number of processing elements, the reconfigurability of the RMSR is optimal within the most restrictive reconfiguration class. The effectiveness of the RMSR is shown by a computer simulation.
First, we give a graph-theoretic formalization for the spare assignment problems for two cases of reconfiguring NN mesh-connected processor arrays with spares on a diagonal line in the array or two orthogonal lines at the edges of the array. Second, we discuss the problems for minimizing the numbers of "dangerous processors" for the cases. Here, a dangerous processor is a nonfaulty one for which there remains no spare processor to be assigned if it becomes faulty, without modifying the spare assignments to other faulty processors. The problem for the latter case, originally presented by Melhem, has already been discussed and solved by the O(N2) algorithm in [3], but it's procedure is very complicated. Using the above graph-theoretic formalization, we give efficient plain algorithms for minimizing the numbers of dangerous processors by which the problems for both the cases can be solved in O(N) time.
Tadayoshi HORITA Itsuo TAKANAMI
We gave in [1] the software and hardware algorithms for reconfiguring 1 1/2-track switch 2-D mesh arrays with faults of processing elements, avoiding them. This paper shows an implementation of the hardware algorithm using an FPGA device, and by the logical simulation confirms the correctness of the behavior and evaluates reconfiguration time. From the result it is found that a self-repairable system is realizable and the system is useful for the run-time as well as fabrication-time reconfiguration because it requires no host computer to execute the reconfiguration algorithm and the reconfiguration time is very short.
Noritaka SHIGEI Hiromi MIYAJIMA
A reconfiguration method for processor array is proposed in this paper. In the method, genetic algorithm (GA) is used for searching effective spare arrangement, which leads to successful reconfiguration. The effectiveness of the method is demonstrated by computer simulations.
Noritaka SHIGEI Hiromi MIYAJIMA Sadayuki MURASHIMA
To enhance fabrication yield for processor arrays, many reconfiguration schemes for replacing faulty processing elements (PE's) with spare PE's have been proposed. An array grid model based on single-track switches is one of such models. For this model, some algorithms for reconfiguring processor arrays have been proposed. However, any algorithm which can reconfigure the array, whenever the array is reconfigurable, has not been proposed as yet. This paper describes reconfiguration methods of processor arrays with faulty PE's. The methods use indirect replacements for reconfiguring arrays. First, we introduce a concept of fatal fault pattern, which makes an array unreconfigurable. Then, for the reconfiguration method with fixed spare arrangement, efficient spare arrangements are given by evaluating the probability of an occurring fatal fault pattern. Furher, we present reconfiguration algorithm with relocating spare. In the algorithm, fatal fault patterns are eliminated by relocating spare. Computer simulations show that the method has good performance of reconfiguration.
In contrast to previous algorithms for reconfiguring processor arrays under the assumption that spare rows and columns are placed on the perimeter of the array or on fixed positions, our new algorithm employs movable and partitionable spare rows and columns. The objective of moving and partitioning spare rows and/or columns is the elimination of faulty processors each of which is blocked in all directions to spare processors. The results of our computer simulation indicate that reconfigurability can significantly be improved.
Memory Sharing Processor Array (MSPA) architecture has been developed as an effective array processing architecture for both reduced data storages and increased processor cell utilization efficiency [1]. In this paper, the MSPA design methodology is extended to the VLSI synthesis of a serial input processor array (Pa). Then, a new bit-serial input multiplier and a new data serial input matrix multiplier are derived from the new PA. These multipliers are superior to the conventional multipliers by their smaller number of logic-gate count.
In this paper, a design of a new processor array architecture with effective data storage schemes which meets the practical requirement of a reduced number of processor elements is proposed. Its design method is shown to be drastically simpler than the popular systolic arrays. This processor array which we call Memory Sharing Processor Array (MSPA) consists of a processor array, several memory units, and some address generation hardware units used to minimize the number of I/O ports. MSPA architecture with its design methodology tries to overcome overlapping data storages, idle processing time and I/O bottleneck problems, which mostly degrade the performance of systolic architecture. It has practical advantages over the systolic array in the view of area-efficiency, high throughput and practical input schemes.
Noritaka SHIGEI Hiromi MIYAJIMA Takayuki ISHIZAKA Sadayuki MURASHIMA
To enhance fabrication yield for processor arrays, many reconfiguration schemes for replacing faulty processing elements (PE's) with spare PE's have been proposed. An array grid model based on single-tracks is one of such models. For this model, some algorithms for reconfiguring processor arrays have been proposed. However, an algorithm which can reconfigure the array, whenever the array is reconfigurable, has not been proposed yet. This paper presents two types of methods for reconfiguration of processor arrays. Both the types use indirect replacements for reconfiguring arrays. For an indirect replacement of a faulty non-spare PE, one has a fixed direction, the other has at most four directions among which one is chosen. For the former, we consider the several distribution of spare PE's, and computer simulations show a tendency in the term of difference in the distributions. The latter algorithms consist of two phases. In the first phase, rows and columns of spare PE's are decided in accordance with a rule. Several rules for deciding spare PE's are considered in this paper. In the second phase, faulty non-spare PE's are replaced with healthy spare PE's. By simulations the performance of the algorithms are evaluated and a tendency is shown in the terms of difference in disposition of spare PE's.
Kensuke MIYASHITA Yoshihiro TSUJINO Nobuki TOKURA
Processor networks connected by buses have attracted considerable attention. Since a reconfigurable array is more powerful than a PRAM and more practical, it becomes the focus of attention. The Processor Array with Reconfigurable Bus System (PARBS) and the Reconfigurable Multiple Bus Machine (RMBM) are both models of parallel computation based on reconfigurable bus and processor array. The PARBS is a processor array that consists of processors arranged to a 2-dimensional grid with a reconfigurable bus system. The RMBM is also made of processors and reconfigurable bus system, but the processors are located in a row and the number of processors and the number of buses are independent of each other. Four versions of RMBM has been proposed and Extended RMBM (E-RMBM) is regarded as the most powerful one among them. In this paper, we describe that a PARBS of size N M can be simulated in constant time by a E-RMBM of 4NM processors, M + 3 buses and 1 read buffer, and that a E-RMBM of P processors, B buses and D read buffers can be also simulated in constant time by a PARBS of size B P. A PARBS or RMBM that solves a computational problem of size n is polynomially bounded iff the product of the number of processors and buses and red and write ports is O (nc), for some constant c. When a PARBS is polynomially bounded, the E-RMBM which simulates it is also polynomially bounded, and vice versa.
The concept of functional memory was proposed over nearly four decades ago. However, the actually usable products have not appeared until the 1980s instead of the long history of development. Functional memory is classified into three categories; there are a general functional memory, a processing element array with small size memory and a special purpose memory. Today a majority of functional memory is an associative memory or a content addressable memory (CAM) and a special purpose memory based on CAM. Due to advances in fablication capability,the capacity of CAM LSI has increased over 100 K bits. A general purpose CAM was developed based on SRAM cell and DRAM cell, respectively. The typical CAM LSI of both types, 20 K bits SRAM based CAM and 288 K bits DRAM based CAM, are introduced. DRAM based CAM is attractive for the large capacity. A parallel processor architecture based on CAM cell is proposed which is called a Functional Memory Type Parallel Processor (FMPP). The basic feature is a dual character of a higher performance CAM and a tiny processor array. It can perform a highly parallel operation to the stored data.
This paper presents a parallel sorting algorithm which sorts n elements on O(n/w+n log n/p) time using p(n) processors arranged in a 1-dimensional grid with w(n1-ε) buses for every fixed ε>0. Furthermore, it is shown that np elements can be sorted in O(n/w+n log n/p) time on pp (pn) processors arranged in a 2-dimensional grid with w(n1-ε) buses in each column and in each row. These algorithms are optimal because their time complexities are equal to the lower bounds.
This paper addresses fault tolerance of a processor array that is reconfigurable by replacing faulty processors with spare processors. The fault tolerance of such a reconfigurable array depends on not only an algorithm for spare processor assignment but also the folloving factor of an organization of spare processors in the reconfigurable array: the number of spare processors; the number of processors that can be replaced by each spare processor; and how spare processors are connected with processors. We discuss a relationship between fault tolerance of reconfigurable arrays and their organizations of spare processors in terms of the smallest size of fatal sets and the reliability function. The smallest size of fatal sets is the smallest number of faulty processors for which the reconfigurable array cannot be failure-free as a processor array system no matter what reconfiguration is used. The reliability function is a function of time t whose value is the probability that the reconfigurable array is failure-free as a processor array system by time t when the best possible reconfiguration is used. First, we show that the larger smallest size of fatal sets a reconfigurable array has, the larger reliability function it has by some time. It suggests that it is important to maximize the smallest size of fatal sets in orer to improve the reliability function as well. Second, we present the best possible smallest size of fatal sets for nn reconfigurable arrays using 2n spare processor each of which is connected with n processors. Third, we show that the nn reconfigurable array previously presented in a literature achieves the best smallest size of fatal sets. That is, it is optimum with respect to the smallest size of fatal sets. Fourth, we present an uppr bound of the reliability function of the optimum nn reconfigurable array using 2n spare processors.