1-16hit |
Ryuta KAWANO Hiroshi NAKAHARA Seiichi TADE Ikki FUJIWARA Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO
Inter-switch networks for HPC systems and data-centers can be improved by applying random shortcut topologies with a reduced number of hops. With minimal routing in such networks; however, deadlock-freedom is not guaranteed. Multiple Virtual Channels (VCs) are efficiently used to avoid this problem. However, previous works do not provide good trade-offs between the number of required VCs and the time and memory complexities of an algorithm. In this work, a novel and fast algorithm, named ACRO, is proposed to endorse the arbitrary routing functions with deadlock-freedom, as well as consuming a small number of VCs. A heuristic approach to reduce VCs is achieved with a hash table, which improves the scalability of the algorithm compared with our previous work. Moreover, experimental results show that ACRO can reduce the average number of VCs by up to 63% when compared with a conventional algorithm that has the same time complexity. Furthermore, ACRO reduces the time complexity by a factor of O(|N|⋅log|N|), when compared with another conventional algorithm that requires almost the same number of VCs.
WonHee LEE Samuel Sangkon LEE Dong-Un AN
Clustering methods are divided into hierarchical clustering, partitioning clustering, and more. K-Means is a method of partitioning clustering. We improve the performance of a K-Means, selecting the initial centers of a cluster through a calculation rather than using random selecting. This method maximizes the distance among the initial centers of clusters. Subsequently, the centers are distributed evenly and the results are more accurate than for initial cluster centers selected at random. This is time-consuming, but it can reduce the total clustering time by minimizing allocation and recalculation. Compared with the standard algorithm, F-Measure is more accurate by 5.1%.
Masahiro KIMOTO Tatsuhiro TSUCHIYA Tohru KIKUNO
The exact time complexity of Hsu and Huan's self-stabilizing maximal matching algorithm is provided. It is n2 + n - 2 if the number of nodes n is even and n2 + n - if n is odd.
Meiling ZHANG Weiguo ZHANG Jingmei LIU Xinmei WANG
Impossible differential attack (IDA) uses impossible differential characteristics extracted from enough plaintext pairs to retrieve subkeys of the first and the last several rounds of AES. In this paper, a general IDA on 7-round AES is proposed. Such attack takes the number of all-zero columns of the 7th and the 6th round as parameters (α,β). And a trade-off relation between the number of plaintexts and times of encryptions in the process of the attack is derived, which makes only some values of (α,β) allowed in the attack for different key length.
Shinobu NAGAYAMA Tsutomu SASAO Yukihiro IGUCHI Munehiro MATSUURA
This paper considers Quasi-Reduced ordered Multi-valued Decision Diagrams with k bits (QRMDD(k)s) to represent binary logic functions. Experimental results show relations between the values of k and the numbers of nodes, the memory sizes, the numbers of memory accesses, and area-time complexity for QRMDD(k). For many benchmark functions, the numbers of nodes and memory accesses for QRMDD(k)s are nearly equal to of the corresponding Quasi-Reduced ordered Binary Decision Diagrams (QRBDDs), and the memory sizes and the area-time complexities for QRMDD(k)s are minimum when k = 2 and k = 3-6, respectively.
This paper presents a strictly time- and communication-optimal distributed sorting algorithm in a line network. A strictly time-optimal distributed sorting algorithm in a line network has already been designed. However, its communication complexity is not strictly optimal and it seems to be difficult to extend it to other problems, such as that related to multiple elements in a process, and also the dynamic sorting problem where the number of elements each process should have as its solution is not the same as that in the initial state. Therefore, the algorithm in this paper was designed by an alternative approach to make it strictly time- and communication-optimal. Moreover, an extension to the dynamic sorting problem is described.
This paper defines the distributed arrangement selection problem in a line network in a distributed context and describes the design of a strictly-time-optimal algorithm which solves the problem with a limited local memory space. The problem is regarded as a combined distributed sorting and k-selection problem, namely a problem of sorting elements that are not larger than the kth minimum element in predetermined processes. The algorithm also provides a solution to a resource allocation problem in a line network in a strictly-optimal time.
Takashi MAEBA Mitsuyoshi SUGAYA Shoji TATSUMI Ken'ichi ABE
This letter presents parallel algorithms for matrix multiplication and the fast Fourier transform (FFT) that are significant problems arising in engineering and scientific applications. The proposed algorithms are designed on a 3-dimensional processor array with separable buses (PASb). We show that a PASb consisting of N N h processors can compute matrix multiplication of size N N and the FFT of size N in O(N/h+log N) time, respectively. In order to examine ease of hardware implementation, we also evaluate the VLSI complexity of the algorithms. A result obtained achieves an optimal bound on area-time complexity when h=O(N/log N).
The subject of the paper is to give an overview and latest results on the Legal Firing Sequence Problem of Petri nets (LFS for short). LFS is very fundamental in the sense that it appears as a subproblem or a simpler form of various basic problems in Petri net theory, such as the well-known marking reachability problem, the minimum initial resource allocation problem, the liveness (of level 4) problem, the scheduling problem and so on. However, solving LFS generally is not easy: it is NP -hard even for Petri nets having very simple structures. This intractability of LFS may have been preventing us from producing efficient algorithms for those problems. So research on LFS from computational complexity point of view seems to be rewarding.
Informally, private information retrieval for k 1 databases (k-PIR) is an interactive scheme that enables a user to make access to (separated) k replicated copies of a database and privately retrieve any single bit out of the n bits of data stored in the database. In this model, "privacy" implies that the user retrieves the bit he is interested in but releases to each database nothing about which bit he really tries to get. Chor et. al. proposed 2-PIR with communication complexity 12 n1/32 that is based on the covering codes. Then Ambainis recursively extended the scheme by Chor et. al. and showed that for each k 2, there exists k-PIR with communication complexity at most ckn1/(2k-1) some constant ck > 0. In this paper, we relax the condition for the covering codes and present time-efficient 2-PIR with communication complexity 12 n1/3. In addition, we generally formulate the recursive scheme by Ambainis and show that for each k 4, there exists k-PIR with communication complexity at most ck' n1/(2k-1) for some constant ck' << ck.
The time-memory trade-off cryptanalysis for block ciphers with a search space of size 2N (N: key length) cannot achieve a success probability excceding 63%. This is caused by some unavoidable overlapping of keys in the space. For elavating the success probability of finding the correct key, a larger search space is necessary. That is, the increase of time complexity for precomputation would be inevitable. This paper theoretically shows, however, no further price is required for the size of look-up tables for the number of encryptions for searching for the key that matches the given ciphertext - plaintext pairs. This theory is confirmed by some empilical results.
The firing squad synchronization problem is considered for defective cellular automata. A lower bound of time tf for the problem is derived. The state complexity to solve the problem is investigated and it is shown that the state set has to be arbitrary large to obtain solutions of time complexity
Toshimasa WATANABE Takenobu TANIDA Masahiro YAMAUCHI Kenji ONAGA
The subject of the paper is the minimum initial marking problem for scheduling in timed Petri net PN: given a vector X of nonnegative integers, a P-invariant Y of PN and a nonnegative integer π, find an initial marking M minimizing the value Ytr
In this paper we study the problem of scheduling parallel program modules onto an MPS (message passing multiprocessor system) so as to minimize the total execution time. Each node in the interconnection network of the MPS has buffers at its input ports to store messages waiting for the transmission. An algorithm for finding a route which minimizes the communication delay of a message to be sent between a processor-pair is first given. Next, we present heuristic algorithms for scheduling program modules onto the MPS. These algorithms use the above routing algorithm. The performances of the proposed algorithms are estimated by using simulation experiments.
In this paper we study the problem of scheduling a tree-structured program on multiprocessors so as to minimize the total execution time, which includes communication delay between processors. It is assumed in the problem that a sufficiently large number of processors are available. It is known that if the program structures are restricted to be out-trees, the problem can be solved in O(n2) time, where n denotes the number of modules of a program. However, this problem is known to be NP-hard if the program structures are allowed to be in-trees. Up to now, no optimal algorithm, except an obvious one, was known for the latter case while some approximation algorithms were shown. We present an optimization algorithm with a nontrivial time bound O((1.52)nn log n) for the in-tree case.
There have been several studies related to a reduction of the amount of computational resources used by Turing machines. As consequences, Linear speed-up theorem", tape compression theorem" and reversal reduction theorem" have been obtained. In this paper, we discuss a leaf reduction theorem on alternating Turing machines. Recently, the result that one can reduce the number of leaves by a constant factor without increasing the space complexity was shown for space- and leaf-bounded alternating Turing machines. We show that for time- and leaf-bounded alternating Turing machines, the number of leaves can be reduced by a constant factor without increasing time used by the machine. Therefore, our result says that a constant factor on the leaf complexity does not affect the power of time- and leaf-bounded alternating Turing machines.