The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] multicomputer(9hit)

1-9hit
  • Node-Disjoint Paths Problems in Directed Bijective Connection Graphs

    Keiichi KANEKO  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2019/09/26
      Vol:
    E103-D No:1
      Page(s):
    93-100

    In this paper, we extend the notion of bijective connection graphs to introduce directed bijective connection graphs. We propose algorithms that solve the node-to-set node-disjoint paths problem and the node-to-node node-disjoint paths problem in a directed bijective connection graph. The time complexities of the algorithms are both O(n4), and the maximum path lengths are both 2n-1.

  • Probabilistic Fault Diagnosis and its Analysis in Multicomputer Systems

    Manabu KOBAYASHI  Toshiyasu MATSUSHIMA  Shigeichi HIRASAWA  

     
    PAPER-Coding theory and techniques

      Vol:
    E101-A No:12
      Page(s):
    2072-2081

    F.P. Preparata et al. have proposed a fault diagnosis model to find all faulty units in the multicomputer system by using outcomes which each unit tests some other units. In this paper, for probabilistic diagnosis models, we show an efficient diagnosis algorithm to obtain a posteriori probability that each of units is faulty given the test outcomes. Furthermore, we propose a method to analyze the diagnostic error probability of this algorithm.

  • Stochastic Fault-Tolerant Routing in Dual-Cubes

    Junsuk PARK  Nobuhiro SEKI  Keiichi KANEKO  

     
    LETTER-Dependable Computing

      Pubricized:
    2017/05/10
      Vol:
    E100-D No:8
      Page(s):
    1920-1921

    In the topologies for interconnected nodes, it is desirable to have a low degree and a small diameter. For the same number of nodes, a dual-cube topology has almost half the degree compared to a hypercube while increasing the diameter by just one. Hence, it is a promising topology for interconnection networks of massively parallel systems. We propose here a stochastic fault-tolerant routing algorithm to find a non-faulty path from a source node to a destination node in a dual-cube.

  • Node-to-Node Disjoint Paths Problem in Möbius Cubes

    David KOCIK  Keiichi KANEKO  

     
    PAPER-Dependable Computing

      Pubricized:
    2017/04/25
      Vol:
    E100-D No:8
      Page(s):
    1837-1843

    The Möbius cube is a variant of the hypercube. Its advantage is that it can connect the same number of nodes as a hypercube but with almost half the diameter of the hypercube. We propose an algorithm to solve the node-to-node disjoint paths problem in n-Möbius cubes in polynomial-order time of n. We provide a proof of correctness of the algorithm and estimate that the time complexity is O(n2) and the maximum path length is 3n-5.

  • Node-to-Set Disjoint Paths Problem in a Möbius Cube

    David KOCIK  Yuki HIRAI  Keiichi KANEKO  

     
    PAPER-Dependable Computing

      Pubricized:
    2015/12/14
      Vol:
    E99-D No:3
      Page(s):
    708-713

    This paper proposes an algorithm that solves the node-to-set disjoint paths problem in an n-Möbius cube in polynomial-order time of n. It also gives a proof of correctness of the algorithm as well as estimating the time complexity, O(n4), and the maximum path length, 2n-1. A computer experiment is conducted for n=1,2,...,31 to measure the average performance of the algorithm. The results show that the average time complexity is gradually approaching to O(n3) and that the maximum path lengths cannot be attained easily over the range of n in the experiment.

  • Essential Cycle Calculation Method for Irregular Array Redistribution

    Sheng-Wen BAI  Chu-Sing YANG  

     
    PAPER-Computation and Computational Models

      Vol:
    E89-D No:2
      Page(s):
    789-797

    In many parallel programs, run-time array redistribution is usually required to enhance data locality and reduce remote memory access on the distributed memory multicomputers. In general, array distribution can be classified into regular distribution and irregular distribution according to the distribution fashion. Many methods for performing regular array redistribution have been presented in the literature. However, for the heterogeneous computation environment, irregular array redistributions can be used to adjust data assignment at run-time. In this paper, an Essential Cycle Calculation method for unequal block sizes array redistribution is presented. In the ECC method, a processor first computes the source/destination processor/data sets of array elements in the first essential cycle of the local array it owns. From the source/destination processor/data sets of array elements in the first essential cycle, we can construct packing/unpacking pattern tables. Since each essential cycle has the same communication pattern, based on the packing/unpacking pattern tables, a processor can pack/unpack array elements efficiently. To evaluate the performance of the ECC method, we have implemented this method on an IBM SP2 parallel machine and compare it with the Sequence method. The cost models for these methods are also presented. The experimental results show that the ECC method greatly outperforms the Sequence method for all test samples.

  • Efficient Loop Partitioning for Parallel Codes of Irregular Scientific Computations

    Minyi GUO  

     
    PAPER-Software Systems

      Vol:
    E86-D No:9
      Page(s):
    1825-1834

    In most cases of distributed memory computations, node programs are executed on processors according to the owner computes rule. However, owner computes rule is not best suited for irregular application codes. In irregular application codes, use of indirection in accessing left hand side array makes it difficult to partition the loop iterations, and because of use of indirection in accessing right hand side elements, we may reduce total communication by using heuristics other than owner computes rule. In this paper, we propose a communication cost reduction computes rule for irregular loop partitioning, called least communication computes rule. We partition a loop iteration to a processor on which the minimal communication cost is ensured when executing that iteration. Then, after all iterations are partitioned into various processors, we give global vs. local data transformation rule, indirection arrays remapping and communication optimization methods. The experimental results show that, in most cases, our approaches achieved better performance than other loop partitioning rules.

  • Round Optimal Parallel Algorithms for the Convex Hull of Sorted Points

    Naoki OSHIGE  Akihiro FUJIWARA  

     
    PAPER

      Vol:
    E84-A No:5
      Page(s):
    1152-1160

    In this paper, we present deterministic parallel algorithms for the convex hull of sorted points and their application to a related problem. The algorithms are proposed for the coarse grained multicomputer (CGM) model. We first propose a cost optimal parallel algorithm for computing the problem with a constant number of communication rounds for n/p p2, where n is the size of an input and p is the number of processors. Next we propose a cost optimal algorithm, which is more complicated, for n/p pε, where 0 < ε < 2. From the above two results, we can compute the convex hull of sorted points with O(n/p) computation time and a constant number of communication rounds for n/p pε, where ε > 0. Finally we show an application of our convex hull algorithms. We solve the convex layers for d lines in O((n log n)/p) computation time with a constant number of communication rounds. The algorithm is also cost optimal for the problem.

  • An Efficient Task Scheduling Scheme for Mesh Multicomputers

    Oh Han KANG  

     
    PAPER-Computer Systems

      Vol:
    E80-D No:6
      Page(s):
    646-652

    In this paper, we propose an efficient task scheduling scheme, called CTS (Class-based Task Scheduling), to obtain high performance in terms of high system utilization and low waiting times for tasks. While a better submesh allocation scheme can improve system performance, an allocation policy alone cannot improve performance significantly. This is due to the fact that the FCFS task scheduling policy leads to large external fragmentation. The CTS strategy maintains four separate queues, one for each incoming task class. This avoids the blacking property incurred in the FCFS scheduling. To reduce the external fragmentation, a job tends to wait for an occupied submesh of the same size instead of using a new submesh in the CTS strategy. Simulation results indicate that the proposed scheduling strategy improves the performance compared to the FCFS scheduling policy by reducing the average waiting delay significantly.