Rikuya SASAKI Hiroyuki ICHIDA Htoo Htoo Sandi KYAW Keiichi KANEKO
The increasing demand for high-performance computing in recent years has led to active research on massively parallel systems. The interconnection network in a massively parallel system interconnects hundreds of thousands of processing elements so that they can process large tasks while communicating among others. By regarding the processing elements as nodes and the links between processing elements as edges, respectively, we can discuss various problems of interconnection networks in the framework of the graph theory. Many topologies have been proposed for interconnection networks of massively parallel systems. The hypercube is a very popular topology and it has many variants. The cross-cube is such a topology, which can be obtained by adding one extra edge to each node of the hypercube. The cross-cube reduces the diameter of the hypercube, and allows cycles of odd lengths. Therefore, we focus on the cross-cube and propose an algorithm that constructs disjoint paths from a node to a set of nodes. We give a proof of correctness of the algorithm. Also, we show that the time complexity and the maximum path length of the algorithm are O(n3 log n) and 2n - 3, respectively. Moreover, we estimate that the average execution time of the algorithm is O(n2) based on a computer experiment.
Nowadays, a rapid increase of demand on high-performance computation causes the enthusiastic research activities regarding massively parallel systems. An interconnection network in a massively parallel system interconnects a huge number of processing elements so that they can cooperate to process tasks by communicating among others. By regarding a processing element and a link between a pair of processing elements as a node and an edge, respectively, many problems with respect to communication and/or routing in an interconnection network are reducible to the problems in the graph theory. For interconnection networks of the massively parallel systems, many topologies have been proposed so far. The hypercube is a very popular topology and it has many variants. The bicube is a such topology and it can interconnect the same number of nodes with the same degree as the hypercube while its diameter is almost half of that of the hypercube. In addition, the bicube keeps the node-symmetric property. Hence, we focus on the bicube and propose an algorithm that gives a minimal or shortest path between an arbitrary pair of nodes. We give a proof of correctness of the algorithm and demonstrate its execution.
The paper deals with the shortest path-based in-trees on a grid graph. There a root is supposed to move among all vertices. As such a spanning mobility pattern, root trajectories based on a Hamilton path or cycle are discussed. Along such a trajectory, each vertex randomly selects the next hop on the shortest path to the root. Under those assumptions, this paper shows that a root trajectory termed an S-path provides the minimum expected symmetric difference. Numerical experiments show that another trajectory termed a Right-cycle also provides the minimum result.
Hiroyuki GOTO Yohei KAKIMOTO Yoichi SHIMAKAWA
Given a network G(V,E), a lightweight method to calculate overlaid origin-destination (O-D) traffic flows on all edges is developed. Each O-D trip shall select the shortest path. While simple implementations for single-source/all-destination and all-pair trips need O(L·n) and O(L·n2) in worst-case time complexity, respectively, our technique is executed with O(m+n) and O(m+n2), where n=|V|, m=|E|, and L represents the maximum arc length. This improvement is achieved by reusing outcomes of priority queue-based algorithms. Using a GIS dataset of a road network in Tokyo, Japan, the effectiveness of our technique is confirmed.
Jinfa WANG Siyuan JIA Hai ZHAO Jiuqiang XU Chuan LIN
Detecting anomalies, such as network failure or intentional attack in Internet, is a vital but challenging task. Although numerous techniques have been developed based on Internet traffic, detecting anomalies from the perspective of Internet topology structure is going to be possible because the anomaly detection of structured datasets based on complex network theory has become a focus of attention recently. In this paper, an anomaly detection method for the large-scale Internet topology is proposed to detect local structure crashes caused by the cascading failure. In order to quantify the dynamic changes of Internet topology, the network path changes coefficient (NPCC) is put forward which highlights the Internet abnormal state after it is attacked continuously. Furthermore, inspired by Fibonacci Sequence, we proposed the decision function that can determine whether the Internet is abnormal or not. That is the current Internet is abnormal if its NPCC is out of the normal domain calculated using the previous k NPCCs of Internet topology. Finally the new Internet anomaly detection method is tested against the topology data of three Internet anomaly events. The results show that the detection accuracy of all events are over 97%, the detection precision for three events are 90.24%, 83.33% and 66.67%, when k=36. According to the experimental values of index F1, larger values of k offer better detection performance. Meanwhile, our method has better performance for the anomaly behaviors caused by network failure than those caused by intentional attack. Compared with traditional anomaly detection methods, our work is more simple and powerful for the government or organization in items of detecting large-scale abnormal events.
Teruaki KITASUKA Takayuki MATSUZAKI Masahiro IIDA
The order/degree problem consists of finding the smallest diameter graph for a given order and degree. Such a graph is beneficial for designing low-latency networks with high performance for massively parallel computers. The average shortest path length (ASPL) of a graph has an influence on latency. In this paper, we propose a novel order adjustment approach. In the proposed approach, we search for Cayley graphs of the given degree that are close to the given order. We then adjust the order of the best Cayley graph to meet the given order. For some order and degree pairs, we explain how to derive the smallest known graphs from the Graph Golf 2016 and 2017 competitions.
Given a sequence of k convex polygons in the plane, a start point s, and a target point t, we seek a shortest path that starts at s, visits in order each of the polygons, and ends at t. We revisit this touring polygons problem, which was introduced by Dror et al. (STOC 2003), by describing a simple method to compute the so-called last step shortest path maps, one per polygon. We obtain an O(kn)-time solution to the problem for a sequence of pairwise disjoint convex polygons and an O(k2n)-time solution for possibly intersecting convex polygons, where n is the total number of vertices of all polygons. A major simplification is made on the operation of locating query points in the last step shortest path maps. Our results improve upon the previous time bounds roughly by a factor of log n.
Fumito TAKEUCHI Masaaki NISHINO Norihito YASUDA Takuya AKIBA Shin-ichi MINATO Masaaki NAGATA
This paper deals with the constrained DAG shortest path problem (CDSP), which finds the shortest path on a given directed acyclic graph (DAG) under any logical constraints posed on taken edges. There exists a previous work that uses binary decision diagrams (BDDs) to represent the logical constraints, and traverses the input DAG and the BDD simultaneously. The time and space complexity of this BDD-based method is derived from BDD size, and tends to be fast only when BDDs are small. However, since it does not prioritize the search order, there is considerable room for improvement, particularly for large BDDs. We combine the well-known A* search with the BDD-based method synergistically, and implement several novel heuristic functions. The key insight here is that the ‘shortest path’ in the BDD is a solution of a relaxed problem, just as the shortest path in the DAG is. Experiments, particularly practical machine learning applications, show that the proposed method decreases search time by up to 2 orders of magnitude, with the specific result that it is 2,000 times faster than a commercial solver. Moreover, the proposed method can reduce the peak memory usage up to 40 times less than the conventional method.
Guillermo IBÁÑEZ Iván MARSÁ-MAESTRE Miguel A. LOPEZ-CARMONA Ignacio PÉREZ-IBÁÑEZ Jun TANAKA Jon CROWCROFT
This paper describes Path-Moose, a scalable tree-based shortest path bridging protocol. Both ARP-Path and Path-Moose protocols belong to a new category of bridges that we name All-path, because all paths of the network are explored simultaneously with a broadcast frame distributed over all network links to find a path or set a multicast tree. Path-Moose employs the ARP-based low latency routing mechanism of the ARP-Path protocol on a bridge basis instead of a per-single-host basis. This increases scalability by reducing forwarding table entries at core bridges by a factor of fifteen times for big data center networks and achieves a faster reconfiguration by an approximate factor of ten. Reconfiguration time is significantly shorter than ARP-Path (zero in many cases) because, due to the sharing of network paths by the hosts connected to same edge bridges, when a host needs the path it has already been recovered by another user of the path. Evaluation through simulations shows protocol correctness and confirms the theoretical evaluation results.
Kazuya MATSUMOTO Naohito NAKASATO Stanislav G. SEDUKHIN
This paper presents a blocked united algorithm for the all-pairs shortest paths (APSP) problem. This algorithm simultaneously computes both the shortest-path distance matrix and the shortest-path construction matrix for a graph. It is designed for a high-speed APSP solution on hybrid CPU-GPU systems. In our implementation, two most compute intensive parts of the algorithm are performed on the GPU. The first part is to solve the APSP sub-problem for a block of sub-matrices, and the other part is a matrix-matrix “multiplication” for the APSP problem. Moreover, the amount of data communication between CPU (host) memory and GPU memory is reduced by reusing blocks once sent to the GPU. When a problem size (the number of vertices in a graph) is large enough compared to a block size, our implementation of the blocked algorithm requires CPU
Cloud data center services, such as video on demand (VoD) and sensor data monitoring, have become popular. The quality of service (QoS) between a client and a cloud data center should be assured by satisfying each service's required bandwidth and delay. Multipath traffic engineering is effective for dispersing traffic flows on a network; therefore, an improved k-shortest paths first (k-SPF) algorithm is applied to these cloud data center services to satisfy their required QoS. k-SPF can create a set of multipaths between a cloud data center and all edge routers, to which client nodes are connected, within one algorithm process. Thus, k-SPF can produce k shortest simple paths between a cloud data center and every access router faster than with conventional Yen's algorithm. By using a parameter in the algorithm, k-SPF can also impartially use links on a network and shorten the average hop-count and number of necessary MPLS labels for multiple paths that comprise a multipath.
Kouji SUGISONO Hirofumi YAMAZAKI Hideaki IWATA Atsushi HIRAMATSU
A packet network architecture called “functionally distributed transport networking” is being studied, where control elements (CEs) are separated from the forwarding elements (FEs) of all routers in a network, and a centralized CE manages the control functions for all FEs. A crucial issue to be addressed in this network architecture is the occurrence of bottlenecks in the CE performance, and rapid network restoration after failures is the main problem to be solved. Thus, we propose here a fast backup route determination method suitable for this network architecture, and we also show the practicality of this architecture. Most failures can be categorized as single-node or single-link failures. The proposed method prepares backup routes for all possible single-node failures in advance and computes backup routes for single-link failures after the failure occurs. The number of possible single-node failures is much less than that of possible single-link failures, and the preparation of backup routes for single-node failures is practical under the memory requirements. Two techniques are used in computing backup routes for single-link failures in order to reduce the computation time. One is to calculate only the routes affected by the link failure. The other is to use an algorithm to compute backup routes for single-link failures based on preplanned backup routes for single-node failures. To demonstrate the practicality of our method, we evaluated the amount of memory and computation time needed to prepare backup routes for all single-node failures, and we carried out simulations with various network topologies to evaluate the route computation time required for a single-link failure.
Yusuke SAKUMOTO Hiroyuki OHSAKI Makoto IMASE
In this paper, we investigate the performance of Thorup's algorithm by comparing it to Dijkstra's algorithm for large-scale network simulations. One of the challenges toward the realization of large-scale network simulations is the efficient execution to find shortest paths in a graph with N vertices and M edges. The time complexity for solving a single-source shortest path (SSSP) problem with Dijkstra's algorithm with a binary heap (DIJKSTRA-BH) is O((M + N) log N). An sophisticated algorithm called Thorup's algorithm has been proposed. The original version of Thorup's algorithm (THORUP-FR) has the time complexity of O(M + N). A simplified version of Thorup's algorithm (THORUP-KL) has the time complexity of O(M α(N) + N) where α(N) is the functional inverse of the Ackerman function. In this paper, we compare the performances (i.e., execution time and memory consumption) of THORUP-KL and DIJKSTRA-BH since it is known that THORUP-FR is at least ten times slower than Dijkstra's algorithm with a Fibonaccii heap. We find that (1) THORUP-KL is almost always faster than DIJKSTRA-BH for large-scale network simulations, and (2) the performances of THORUP-KL and DIJKSTRA-BH deviate from their time complexities due to the presence of the memory cache in the microprocessor.
Qing CHANG Yongqiang LIU Huagang XIONG
Research of the shortest path problem in time-dependent networks has important practical value. An improved pheromone update strategy suitable for time-dependent networks was proposed. Under this strategy, the residual pheromone of each road can accurately reflect the change of weighted value of each road. An improved selection strategy between adjacent cities was used to compute the cities' transfer probabilities, as a result, the amount of calculation is greatly reduced. To avoid the algorithm converging to the local optimal solution, the ant colony algorithm was combined with genetic algorithm. In this way, the solutions after each traversal were used as the initial species to carry out single-point crossover. An improved ant colony algorithm for the shortest path problem in time-dependent networks based on these improved strategies was presented. The simulation results show that the improved algorithm has greater probability to get the global optimal solution, and the convergence rate of algorithm is better than traditional ant colony algorithm.
Kazuya MATSUMOTO Stanislav G. SEDUKHIN
The All-Pairs Shortest Paths (APSP) problem is a graph problem which can be solved by a three-nested loop program. The Cell Broadband Engine (Cell/B.E.) is a heterogeneous multi-core processor that offers the high single precision floating-point performance. In this paper, a solution of the APSP problem on the Cell/B.E. is presented. To maximize the performance of the Cell/B.E., a blocked algorithm for the APSP problem is used. The blocked algorithm enables reuse of data in registers and utilizes the memory hierarchy. We also describe several optimization techniques for effective implementation of the APSP problem on the Cell/B.E. The Cell/B.E. achieves the performance of 8.45 Gflop/s for the APSP problem by using one SPE and 50.6 Gflop/s by using six SPEs.
Satoshi TAOKA Daisuke TAKAFUJI Takashi IGUCHI Toshimasa WATANABE
An edge-weighted directed graph is referred to as a network in this paper, and an edge operation is an operation that increases or decreases an edge weight. Decreasing an edge weight from the infinite to a finite value or increasing any edge weight from a finite one to the infinite corresponds to addition or deletion of this edge, respectively. The dynamic shortest path problem (DSPP for short) is defined by "Given any network with a specified vertex (denoted as s), and any sequence of edge operations, construct a shortest path tree of each network obtained by executing those edge operations one by one in the order of the sequence." As an application, fast routing for an interior network using link state protocols, such as OSPF and IS-IS, requires solving DSPP efficiently. In this paper, among as many existing algorithms as possible, including those which execute several edge operations simultaneously, fundamental and/or important algorithms are implemented and their capability is evaluated based on the results of computational experiments.
In this paper, a practical clock-scheduling engine is introduced. The minimum feasible clock-period is obtained by using a modified Bellman-Ford shortest path algorithm. Then an optimum cost clock-schedule is obtained by using a bipartite matching algorithm. It also provides useful information to circuit synthesis tools. The experiment to a circuit with about 10000 registers and 100000 signal paths shows that a result is obtained within a few minutes. The computation time is almost linear to the circuit size in practice.
Koichi ITO Masahiko HIRATSUKA Takafumi AOKI Tatsuo HIGUCHI
This paper presents a shortest path search algorithm using a model of excitable reaction-diffusion dynamics. In our previous work, we have proposed a framework of Digital Reaction-Diffusion System (DRDS)--a model of a discrete-time discrete-space reaction-diffusion system useful for nonlinear signal processing tasks. In this paper, we design a special DRDS, called an "excitable DRDS," which emulates excitable reaction-diffusion dynamics and produces traveling waves. We also demonstrate an application of the excitable DRDS to the shortest path search problem defined on two-dimensional (2-D) space with arbitrary boundary conditions.
Paola FLOCCHINI Antonio Mesa ENRIQUES Linda PAGLI Giuseppe PRENCIPE Nicola SANTORO
We consider the problem of computing the optimal swap edges of a shortest-path tree. This problem arises in designing systems that offer point-of-failure shortest-path rerouting service in presence of a single link failure: if the shortest path is not affected by the failed link, then the message will be delivered through that path; otherwise, the system will guarantee that, when the message reaches the node where the failure has occurred, the message will then be re-routed through the shortest detour to its destination. There exist highly efficient serial solutions for the problem, but unfortunately because of the structures they use, there is no known (nor foreseeable) efficient distributed implementation for them. A distributed protocol exists only for finding swap edges, not necessarily optimal ones. We present two simple and efficient distributed algorithms for computing the optimal swap edges of a shortest-path tree. One algorithm uses messages containing a constant amount of information, while the other is tailored for systems that allow long messages. The amount of data transferred by the protocols is the same and depends on the structure of the shortest-path spanning-tree; it is no more, and sometimes significantly less, than the cost of constructing the shortest-path tree.
Kazuaki YAMAGUCHI Sumio MASUDA
The 2-terminal shortest path problem is to find a shortest path between two specified vertices in a given graph G. In this paper, we consider this problem in the following situation: G is given before the two vertices are specified so that some preprocessing is allowed to reduce the response time. We present a method for calculating lower bounds of the length of the shortest path for any pair of vertices. Experimental results show that the A* algorithm with our method performs much better than previous methods.