IEICE global.ieice.org Site

Keyword Search Result

[Keyword] Y(22683hit)

20461-20480hit(22683hit)

Message-Based Efficient Remote Memory Access on a Highly Parallel Computer EM-X
Yuetsu KODAMA Hirohumi SAKANE Mitsuhisa SATO Hayato YAMANA Shuichi SAKAI Yoshinori YAMAGUCHI

PAPER-Architectures

Vol:
E79-D No:8
Page(s):
1065-1071
Communication latency is central to multiprocessor design. This study presents the design principles of the EM-X distributed-memory multiprocessor for tolerating communication latency. The EM-X tolerates latency by using multithreading to overlap computation with communication. In particular, we present two types of hardware support for remote memory access: (1) priority-based packet scheduling for thread invocation, and (2) direct remote memory access. The priority-based scheduling policy extends a FIFO-ordered thread invocation policy to adapt to different computational needs. The direct remote memory access is designed to overlap remote memory operations with thread execution. An 80-processor prototype of the EM-X has been developed and has been operational since December 1995. We execute several programs on the machine and evaluate how effectively the EM-X overlaps computation with communication to tolerate communication latency in high-performance parallel computing.
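The idea of extending FIFO-ordered invocation with priorities can be sketched in software as follows. This is only an analogy, not the EM-X hardware mechanism: the class name `PacketQueue`, the helper `drain`, and the convention that a lower number is served first are all assumptions made for illustration. Packets of equal priority keep their arrival (FIFO) order.

```python
import heapq
from itertools import count


class PacketQueue:
    """Priority queue of packets; FIFO order is kept within each priority level."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # monotonically increasing arrival stamp

    def push(self, packet, priority=1):
        # Hypothetical convention: a lower number is served first.
        # The arrival stamp breaks ties, which preserves FIFO order.
        heapq.heappush(self._heap, (priority, next(self._arrival), packet))

    def pop(self):
        _, _, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Pop every packet and return them in service order."""
    order = []
    while len(queue):
        order.append(queue.pop())
    return order
```

With a single priority level the queue degenerates to plain FIFO, which is why the scheme can be read as an extension of FIFO ordering.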
Striping in a Disk Array with Data/Parity Placement Scheme RM2 Tolerating Double Disk Failures^*
Chan-Ik PARK

PAPER-Disk array

Vol:
E79-D No:8
Page(s):
1072-1085
There is a growing demand for high reliability beyond what current RAID can provide and there are various levels of user demand for data reliability. An efficient data placement scheme called RM2 has been proposed in [10], which makes a disk array system resistant to double disk failures. In this paper, we consider how to choose an optimal striping unit for RM2 particularly when no workload information is available except read/write ratio. For experimental purposes, we develop a disk array simulator incorporating RM2 as one of the data placement schemes including other schemes of RAID levels. In the case of disk read operations, it is shown that RM2 has an optimal striping unit of 4/3T for large requests and 8/3T for small requests, where T represents the size of a single track. We have also shown that, if any disk write operations are involved, an optimal striping unit becomes 1/3T for large requests and 8/3T for small requests.
Virtual Striping: A Storage Management Scheme with Dynamic Striping
Kazuhiko MOGI Masaru KITSUREGAWA

PAPER-Disk array

Vol:
E79-D No:8
Page(s):
1086-1092
RAID5 disk arrays provide high performance and high reliability for reasonable cost. However, RAID5 suffers a performance penalty during block updates. In this paper, we propose a method to improve the small write performance of RAID5 disk arrays, named Virtual Striping. Instead of updating each block independently, this method buffers a number of updates, generates a new stripe composed of the newly updated blocks, then writes the full stripe back to disk. In order to make free space for write operations, a new garbage collection strategy is employed, where the linkage of blocks in a parity stripe is changed in Virtual Striping. The LFS (log-structured file system) based storage management scheme also writes new blocks onto a large free area, and it uses copying garbage collection. In this paper, we compare the performance of both methods through simulation. Although the write cost of Virtual Striping is more than that of the LFS based method, Virtual Striping has better performance than the LFS based method. This is due to the high efficiency of garbage collection in Virtual Striping.
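The small write penalty mentioned above comes from parity maintenance: updating one block in a conventional RAID5 stripe requires reading the old data and old parity before writing the new data and new parity, whereas a full-stripe write computes parity from the new blocks alone. A minimal sketch of the XOR arithmetic, with hypothetical function names and no claim to reproduce the Virtual Striping implementation:

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(data_blocks):
    """Parity for a full-stripe write: XOR of all data blocks, no reads needed."""
    return reduce(xor_blocks, data_blocks)


def small_write_parity(old_parity: bytes, old_block: bytes, new_block: bytes) -> bytes:
    """Parity after updating one block: needs the old data and the old parity."""
    return xor_blocks(xor_blocks(old_parity, old_block), new_block)
```

Both routes yield the same parity; they differ in how many disk reads precede the write, which is what buffering updates into a full stripe avoids.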
hMDCE: The Hierarchical Multidimensional Directed Cycles Ensemble Network
Takashi YOKOTA Hiroshi MATSUOKA Kazuaki OKAMOTO Hideo HIRONO Shuichi SAKAI

PAPER-Interconnection Networks

Vol:
E79-D No:8
Page(s):
1099-1106
This paper discusses a massively parallel interconnection scheme for multithreaded architecture and introduces a new class of direct interconnection networks called the hierarchical Multidimensional Directed Cycles Ensemble (hMDCE). Its suitability for massively parallel systems is discussed. The network is evolved from the Multidimensional Directed Cycles Ensemble (MDCE) network, where each node is substituted by lower-level sub-networks. The new network addresses some serious problems caused by the increasing scale of parallel systems, such as longer latency, limited throughput and high implementation cost. This paper first introduces the MDCE network and then presents and examines in detail the hierarchical MDCE network. The bisection bandwidth of hMDCE is considerably reduced from that of its ancestor MDCE, and the network achieves significantly higher throughput and lower latency under some practical implementation constraints. The gate count and delay time of the compiled circuit for the routing function are insignificant. These results reveal that the hMDCE network is an important candidate for massively parallel systems interconnection.
An Acoustically Oriented Vocal-Tract Model
Hani C. YEHIA Kazuya TAKEDA Fumitada ITAKURA

PAPER-Speech Processing and Acoustics

Vol:
E79-D No:8
Page(s):
1198-1208
The objective of this paper is to find a parametric representation for the vocal-tract log-area function that is directly and simply related to basic acoustic characteristics of the human vocal-tract. The importance of this representation is associated with the solution of the articulatory-to-acoustic inverse problem, where a simple mapping from the articulatory space onto the acoustic space can be very useful. The method is as follows: Firstly, given a corpus of log-area functions, a parametric model is derived following a factor analysis technique. After that, the articulatory space, defined by the parametric model, is filled with approximately uniformly distributed points, and the corresponding first three formant frequencies are calculated. These formants define an acoustic space onto which the articulatory space maps. In the next step, an independent component analysis technique is used to determine acoustic and articulatory coordinate systems whose components are as independent as possible. Finally, using singular value decomposition, acoustic and articulatory coordinate systems are rotated so that each of the first three components of the articulatory space has major influence on one, and only one, component of the acoustic space. An example showing how the proposed model can be applied to the solution of the articulatory-to-acoustic inverse problem is given at the end of the paper.
A Local Property of the Phasor Model of Neural Networks
Masahiro AGU Kazuo YAMANAKA Hiroki TAKAHASHI

LETTER-Bio-Cybernetics and Neurocomputing

Vol:
E79-D No:8
Page(s):
1209-1211
"Stable phase locked states" are found amongst the equilibria of the phasor model, known as a generalized Hopfield model having complex-valued local states on the unit circle with centre at the origin. The asynchronous updating rule is assumed, and the energy decreasing characteristic is used to investigate a property of the equilibrium states. Some of the equilibria are shown to be "fragile" in the sense that the energy is not locally convex. It is also shown that the local convexity of the energy is assured by a sort of consistency between the equilibrium and the connection weights.
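The energy decreasing characteristic under asynchronous updating can be illustrated with a small sketch. The abstract does not state the formulas, so the following are assumptions: a Hermitian weight matrix with zero diagonal, the energy E = -1/2 Σ conj(s_j) w_jk s_k, and an update that aligns one unit-circle state with its local field. All function names are hypothetical.

```python
import cmath
import random


def energy(weights, states):
    """Assumed energy: E = -1/2 * sum_jk conj(s_j) * w_jk * s_k (real for Hermitian w)."""
    n = len(states)
    total = sum(
        states[j].conjugate() * weights[j][k] * states[k]
        for j in range(n)
        for k in range(n)
    )
    return -0.5 * total.real


def async_update(weights, states, j):
    """Update unit j only: project its local field back onto the unit circle."""
    field = sum(weights[j][k] * states[k] for k in range(len(states)) if k != j)
    if abs(field) > 1e-12:  # leave the state unchanged if the field vanishes
        states[j] = field / abs(field)


def random_hermitian(n, rng):
    """Random Hermitian weight matrix with zero diagonal (illustrative data only)."""
    w = [[0j] * n for _ in range(n)]
    for j in range(n):
        for k in range(j + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            w[j][k] = z
            w[k][j] = z.conjugate()
    return w


def random_phasors(n, rng):
    """Random states on the unit circle."""
    return [cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)) for _ in range(n)]
```

Under these assumptions each single-unit update minimizes the energy with respect to that unit's phase, so repeated updates never increase the energy. Whether a reached equilibrium is locally convex, the question the letter studies, is not checked by this sketch.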
A Built-In Self-Reconstruction Approach for Partitioned Mesh-Arrays Using Neural Algorithm
Tadayoshi HORITA Itsuo TAKANAMI

PAPER-Fault Diagnosis/Tolerance

Vol:
E79-D No:8
Page(s):
1160-1167
Various reconfiguration schemes against faults of mesh-connected processor arrays have been proposed. As one of them, the mesh-connected processor array model based on single-track switches was proposed in [1]. The model has the advantage of inherently simple routing hardware. Furthermore, the 2 track switch model [2] and the multiple track switch model [3] were proposed to enhance yields and reliabilities of arrays. However, in these models, the simplicity of the routing hardware is somewhat lost because multiple tracks are used for each row and column. In this paper, we present a built-in self-reconstruction approach for mesh-connected processor arrays which are partitioned into sub-arrays, each using single-track switches. Spare PEs which are located on the boundaries of the sub-arrays compensate for faulty PEs in these sub-arrays. First, we formulate a reconfiguration algorithm for partitioned mesh-arrays using a Hopfield-type neural network, and then its reconfiguration performance in terms of survival rates, reliabilities of arrays and processing time is investigated by computer simulations. From the results, we can see that high reliabilities are achieved while the processing time is small and the hardware overhead (links and switches) required for reconstruction is the same as that for the track switch model. Next, we present a hardware implementation of the neural algorithm so that a built-in self-reconfigurable scheme may be realized.
Implantable Temperature Measurement System Using the Parametron Phenomenon
Yoshiaki SAITOH Akira KANKE Isamu SHINOZAKI Tohru KIRYU Jun'ichi HORI

PAPER-Measurement and Metrology

Vol:
E79-B No:8
Page(s):
1129-1134
Adapting the principle of parametron oscillation, a small implantable temperature sensor requiring no internal power supply is described. Since this sensor's oscillation frequency is half that of the excitation frequency, the oscillated signal can be measured from the reception side, free of any signal interference, simply by positioning the sensor and the excitation antenna so that: 1) they are separated by up to 95 cm in the air; 2) a 41 cm gap, the phantom equivalent of the thickness of the human abdomen, is maintained between them. In the temperature-dependent quartz resonator sensor, oscillation occurs only when frequency and temperature correspond. The excitation power is then adjusted so that the frequency bandwidth narrows. As a result, the margin of error in measuring the temperature is minimized (0.07).
Optically Compensated Bend Mode(OCB Mode) with Wide Viewing Angle and Fast Response
Tetsuya MIYASHITA Tatsuo UCHIDA

INVITED PAPER

Vol:
E79-C No:8
Page(s):
1076-1082
To overcome the problem of narrow viewing angle in active matrix liquid crystal displays (LCDs) in the twisted nematic mode (TN mode), we have proposed a new LCD mode using a bend-alignment cell with an optical compensator. In this new mode, we have successfully obtained a black state with almost no leakage over a wide viewing angle range with very fast response. We describe the fundamental principle and design rule of the optical compensator and discuss the properties obtained in theoretical and experimental terms.
Stability of Terminated Two Port Networks
Yoshihiro MIWA

LETTER-Electronic Circuits

Vol:
E79-C No:8
Page(s):
1171-1176
The purpose of this letter is to investigate the stability of the active two port networks having some restrictions on load and source terminations, and the stability conditions having two inequalities have been obtained. As the terminations making the active two port networks stable can be obtained from these inequalities, these stability conditions are very useful for designing high frequency amplifiers, especially, tuned amplifiers.
Multiplierless Arrays for Realization of Lowpass and Highpass Linear Phase FIR Digital Filters
Saed SAMADI Akinori NISHIHARA Nobuo FUJII

PAPER

Vol:
E79-A No:8
Page(s):
1112-1119
A class of type 1 linear phase FIR digital filters is proposed. The filter can be realized using a parallel, modular and regular array structure. It is shown that, under some simple constraints, the constituent modules of the array can be realized free of multiplier coefficients. Such two dimensional mesh arrays are especially suitable for realization with special-purpose systolic hardware for high-speed digital signal processing tasks. Compared to the array structure proposed by the authors for multiplierless realization of maximally flat FIR digital filters, this class needs fewer adders to fulfill the same magnitude response requirements. Another attractive property of the proposed array is that a number of highpass or lowpass filters with different passband widths can be realized simultaneously in a very economical way.
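For readers unfamiliar with the terminology, a type 1 linear phase FIR filter has an odd-length, symmetric impulse response. The sketch below checks that property and evaluates the real-valued amplitude response for a simple lowpass built by cascading (1 + z^-1)/2 sections, each of which needs one adder and one halving (a shift in fixed-point hardware). This is a generic textbook example chosen for illustration, not the array structure proposed in the paper; all function names are hypothetical.

```python
import math


def is_type1_linear_phase(h, tol=1e-12):
    """Type 1: odd length and symmetric coefficients, h[i] == h[N-1-i]."""
    n = len(h)
    return n % 2 == 1 and all(abs(h[i] - h[n - 1 - i]) <= tol for i in range(n))


def amplitude_response(h, omega):
    """Zero-phase amplitude A(w) = h[M] + 2 * sum_k h[M-k] * cos(k*w), M = (N-1)/2."""
    m = (len(h) - 1) // 2
    return h[m] + 2 * sum(h[m - k] * math.cos(k * omega) for k in range(1, m + 1))


def binomial_lowpass(order):
    """Impulse response of ((1 + z^-1) / 2) ** order, built with adds and halvings only."""
    h = [1.0]
    for _ in range(order):
        # Convolve with [0.5, 0.5]: add neighbouring taps, then halve.
        h = [0.5 * (a + b) for a, b in zip([0.0] + h, h + [0.0])]
    return h
```

An even `order` gives an odd-length response and therefore a type 1 filter; for example, `binomial_lowpass(2)` has taps 1/4, 1/2, 1/4, with unit gain at zero frequency and a null at ω = π.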
Software Cache Techniques for Memory Nodes in Distributed Memory Parallel Production Systems
Jun MIYAZAKI Haruo YOKOTA

PAPER-Architectures

Vol:
E79-D No:8
Page(s):
1046-1054
Because the match phase in OPS5-type production systems accounts for most of the system's execution time and memory accesses, we proposed hash-based parallel production systems, CPPS (Clustered Parallel Production Systems), based on the RETE algorithm for distributed memory parallel computers, or multicomputers, to reduce this bottleneck. CPPS was effective in speeding up the match phase, but still left room for optimizations. In this paper, we introduce software cache techniques to memory nodes in the CPPS as one of the optimizations, and implement it on a multicomputer, nCUBE2. The benchmark results show that the CPPS with the software cache is about 2-fold faster than the original, and more than 7-fold faster than the simple hash method proposed by Acharya et al. for a large scale problem. The speed-up can be attributed to decreased communication costs.
Attenuation Correction for X-Ray Emission Computed Tomography of Laser-Produced Plasma
Yen-Wei CHEN Zensho NAKAO Shinichi TAMURA

LETTER-Image Theory

Vol:
E79-A No:8
Page(s):
1287-1290
An attenuation correction method was proposed for laser-produced plasma emission computed tomography (ECT), which is based on a relation of the attenuation coefficient and the emission coefficient in plasma. Simulation results show that the reconstructed images are dramatically improved in comparison to the reconstructions without attenuation correction.
Fluorinated Liquid Crystalline Materials for AM-LCD Applications
Hideo SAITO Etsuo NAKAGAWA Tetsuya MATSUSHITA Fusayuki TAKESHITA Yasuhiro KUBO Shuichi MATSUI Kazutoshi MIYAZAWA Yasuyuki GOTO

PAPER

Vol:
E79-C No:8
Page(s):
1027-1034
Fluorinated liquid crystal compounds having fluorophenyl, difluorophenyl and trifluorophenyl moieties combined with ester linkages, 1,2-ethylenes and covalent bonds were prepared and checked for their physical properties, i.e. mesophases, dielectric and optical anisotropy, viscosity, pretilt angle and threshold voltage. By introducing fluorine atom(s) into the molecules, optical anisotropy and threshold voltage decreased, though the nematic temperature range diminished. The investigated compounds were all chemically stable, and by using them, nematic liquid crystalline mixtures having low threshold voltage, low viscosity, large optical anisotropy and wide nematic ranges, suitable for AM-LCDs, could be obtained.
On Methods for Reconfiguring Processor Arrays
Noritaka SHIGEI Hiromi MIYAJIMA Takayuki ISHIZAKA Sadayuki MURASHIMA

PAPER-Interconnection Networks

Vol:
E79-D No:8
Page(s):
1139-1146
To enhance fabrication yield for processor arrays, many reconfiguration schemes for replacing faulty processing elements (PE's) with spare PE's have been proposed. An array grid model based on single-tracks is one such model. For this model, some algorithms for reconfiguring processor arrays have been proposed. However, an algorithm which can reconfigure the array whenever the array is reconfigurable has not been proposed yet. This paper presents two types of methods for reconfiguration of processor arrays. Both types use indirect replacements for reconfiguring arrays. For an indirect replacement of a faulty non-spare PE, one has a fixed direction, and the other has at most four directions among which one is chosen. For the former, we consider several distributions of spare PE's, and computer simulations show a tendency in terms of the difference in the distributions. The latter algorithms consist of two phases. In the first phase, rows and columns of spare PE's are decided in accordance with a rule. Several rules for deciding spare PE's are considered in this paper. In the second phase, faulty non-spare PE's are replaced with healthy spare PE's. By simulations, the performance of the algorithms is evaluated and a tendency is shown in terms of the difference in disposition of spare PE's.
Fault-Tolerant Graphs for Hypercubes and Tori^*
Toshinori YAMADA Koji YAMAMOTO Shuichi UENO

PAPER-Fault Diagnosis/Tolerance

Vol:
E79-D No:8
Page(s):
1147-1152
Motivated by the design of fault-tolerant multiprocessor interconnection networks, this paper considers the following problem: Given a positive integer t and a graph H, construct a graph G from H by adding a minimum number Δ(t, H) of edges such that even after deleting any t edges from G the remaining graph contains H as a subgraph. We estimate Δ(t, H) for the hypercube and torus, which are well-known as important interconnection networks for multiprocessor systems. If we denote the hypercube and the square torus on N vertices by QN and DN respectively, we show, among others, that Δ(t, QN) = O(tN log(log N/t + log 2e)) for any t and N (t 2), and Δ(1, DN) = N/2 for N even.
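The problem definition above can be checked by brute force on very small graphs. The sketch below tests whether a graph G, obtained from H by adding edges on the same vertex set, still contains H as a subgraph after any t edges are deleted. It is exponential in the graph size and serves only to make the definition concrete; it is not the construction used in the paper, and the function names are hypothetical.

```python
from itertools import combinations, permutations


def norm(edges):
    """Represent undirected edges as a set of frozensets."""
    return {frozenset(e) for e in edges}


def contains_subgraph(g_edges, h_edges, n):
    """True if some relabelling of vertices 0..n-1 maps every edge of H onto an edge of G."""
    g = norm(g_edges)
    h = [tuple(e) for e in h_edges]
    for perm in permutations(range(n)):
        if all(frozenset((perm[u], perm[v])) in g for u, v in h):
            return True
    return False


def is_t_edge_tolerant(g_edges, h_edges, n, t):
    """True if G still contains H after deleting any t of its edges."""
    g = list(norm(g_edges))
    for removed in combinations(g, t):
        remaining = [e for e in g if e not in removed]
        if not contains_subgraph(remaining, h_edges, n):
            return False
    return True
```

As a usage example, take H to be the 4-cycle `[(0, 1), (1, 2), (2, 3), (3, 0)]`, which is the hypercube on 4 vertices. With t = 1, adding only the diagonal `(0, 2)` is not enough, while adding both diagonals `(0, 2)` and `(1, 3)` passes the check.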
Periodic Boundary Condition for Evaluation of External Mutual Couplings in a Slotted Waveguide Array
Kunio SAKAKIBARA Jiro HIROKAWA Makoto ANDO Naohisa GOTO

PAPER-Antennas and Propagation

Vol:
E79-B No:8
Page(s):
1156-1164
In the design of a large slotted waveguide array, evaluation of mutual couplings between the slots is time consuming. This paper proposes an effective approximate analysis of the external mutual couplings using a periodic boundary condition. A simple design procedure is verified for a two-dimensional slot array.
Efficient Parallel Algorithms on Proper Circular Arc Graphs
Selim G. AKL Lin CHEN

PAPER-Algorithms

Vol:
E79-D No:8
Page(s):
1015-1020
Efficient parallel algorithms for several problems on proper circular arc graphs are presented in this paper. These problems include finding a maximum matching, partitioning into a minimum number of induced subgraphs each of which has a Hamiltonian cycle (path), partitioning into induced subgraphs each of which has a Hamiltonian cycle (path) with at least k vertices for a given k, and adding a minimum number of edges to make the graph contain a Hamiltonian cycle (path). It is shown here that the above problems can all be solved in logarithmic time with a linear number of EREW PRAM processors, or in constant time with a linear number of BSR processors. A more important part of this work is perhaps the extension of basic BSR to allow simultaneous multiple BROADCAST instructions.
A 50 MHz CMOS Pipelined Majority Logic Decoder for (1057, 813) Difference-Set Cyclic Code
Kazumasa KOBAYASHI Kouji YAMANO Hideki KOKUBUN Kiichi KOBAYASHI

PAPER-VLSI Design Technology and CAD

Vol:
E79-A No:7
Page(s):
1060-1067
A new high-speed decoding algorithm for Difference-set cyclic codes, and the design and implementation of a 50 MHz CMOS LSI for decoding the (1057, 813) DSCC, are presented. The algorithm, called modified threshold decoding, makes it possible to introduce an arbitrary number of pipeline stages into feedback loops in decoding circuits. A prototype LSI containing about 13k logic gates was fabricated using 1 µm CMOS gate-array technology. The power consumption is less than 750 mW at a 50 MHz clock rate. It is available for digital data transmission systems having an I/O data rate of up to 25 MBPS. It is being used in experimental set-ups targeted at future digital broadcasting systems. The proposed algorithm has an important advantage for much longer codes as it has the potential to be used in the high-speed decoding of DSCCs having a code length longer than 1057.
CDMA Unslotted ALOHA Systems with Packet Retransmission Control
Hiraku OKADA Takeshi SATO Takaya YAMAZATO Masaaki KATAYAMA Akira OGAWA

PAPER

Vol:
E79-A No:7
Page(s):
1004-1010
In this paper, we analyze the throughput and delay performance of the CDMA unslotted ALOHA system considering packet retransmission. We also clarify the stability of the system. Based on these results, we propose the optimal retransmission control (ORC) to improve the performance. The ORC is a scheme that prevents the system from drifting to an undesirable operating point by controlling the birth rate of retransmitted packets. As a result, it is shown that the throughput and delay performance of the system with the ORC is better than without the ORC, and the system does not drift to an undesirable operating point.
