Keyword Search Result

[Keyword] distributed (734 hits)

Hits 41-60 of 734

  • Joint Multi-Layered User Clustering and Scheduling for Ultra-Dense RAN Using Distributed MIMO

    Ryo TAKAHASHI  Hidenori MATSUO  Fumiyuki ADACHI  

     
    PAPER

    Publicized: 2021/03/29
    Vol: E104-B No:9
    Page(s): 1097-1109

    Ultra-densification of the radio access network (RAN) is essential to efficiently handle the ever-increasing mobile data traffic. In this paper, a joint multi-layered user clustering and scheduling scheme is proposed as an inter-cluster interference coordination scheme for ultra-dense RAN using cluster-wise distributed MIMO transmission/reception. The proposed scheme consists of user clustering using the K-means algorithm, user-cluster layering (called multi-layering) based on the interference-offset-distance (IOD), cluster-antenna association on each layer, and layer-wise round-robin-type scheduling. The user capacity, the sum capacity, and the fairness are evaluated by computer simulations to show the effectiveness of the proposed scheme. Also shown are uplink and downlink capacity comparisons and the optimal IOD setting considering the trade-off between inter-cluster interference mitigation and transmission opportunity.
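
    The first step named in the abstract, user clustering with the K-means algorithm, can be sketched as follows. This is a minimal illustration on 2-D user positions, not the authors' implementation: the function name `kmeans`, the choice of the first k points as initial centroids, and the sample coordinates are all assumptions made for the example. The IOD-based layering and scheduling steps are not shown.

```python
def kmeans(points, k, iterations=20):
    """Plain K-means on 2-D points; returns (labels, centroids)."""
    # Assumption: deterministic initialisation from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each user joins the nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical user positions forming two well-separated groups.
users = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(users, k=2)
```

    With these sample positions the two nearby pairs end up in separate clusters; real deployments would typically use a randomised initialisation and many more users.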

  • Hybrid Electrical/Optical Switch Architectures for Training Distributed Deep Learning in Large-Scale

    Thao-Nguyen TRUONG  Ryousei TAKANO  

     
    PAPER-Information Network

    Publicized: 2021/04/23
    Vol: E104-D No:8
    Page(s): 1332-1339

    Data parallelism is the dominant method used to train deep learning (DL) models on High-Performance Computing systems such as large-scale GPU clusters. When training a DL model on a large number of nodes, inter-node communication becomes a bottleneck due to its relatively higher latency and lower link bandwidth (compared with intra-node communication). Although some communication techniques have been proposed to cope with this problem, all of these approaches aim to deal with the large-message-size issue while reducing the effect of the inter-node network's limitations. In this study, we investigate the benefit of increasing inter-node link bandwidth by using hybrid switching systems, i.e., Electrical Packet Switching and Optical Circuit Switching. We found that the typical data transfer of synchronous data-parallel training is long-lived and rarely changes, so it can be sped up with optical switching. Simulation results on the Simgrid simulator show that our approach speeds up the training of deep learning applications, especially at large scale.

  • Cyclic LRCs with Availability from Linearized Polynomials

    Pan TAN  Zhengchun ZHOU   Haode YAN  Yong WANG  

     
    LETTER-Coding Theory

    Publicized: 2021/01/18
    Vol: E104-A No:7
    Page(s): 991-995

    Locally repairable codes (LRCs) with availability have received considerable attention in recent years since they are able to solve many problems in distributed storage systems, such as repairing multiple node failures and managing hot data. Constructing LRCs with locality r and availability t (also called (r, t)-LRCs) with new parameters has become an interesting research subject in coding theory. The objective of this paper is to propose two generic constructions of cyclic (r, t)-LRCs via linearized polynomials over finite fields. These two constructions include two earlier ones of cyclic LRCs from trace functions and truncated trace functions as special cases and lead to LRCs with new parameters that cannot be produced by the earlier ones.

  • Preliminary Performance Analysis of Distributed DNN Training with Relaxed Synchronization

    Koichi SHIRAHATA  Amir HADERBACHE  Naoto FUKUMOTO  Kohta NAKASHIMA  

     
    BRIEF PAPER

    Publicized: 2020/12/01
    Vol: E104-C No:6
    Page(s): 257-260

    Scalability of distributed DNN training can be limited by slowdown of specific processes due to unexpected hardware failures. We propose a dynamic process exclusion technique so that training throughput is maximized. Our evaluation using 32 processes with ResNet-50 shows that our proposed technique reduces slowdown by 12.5% to 50% without accuracy loss through excluding the slow processes.

  • Action Recognition Using Pose Data in a Distributed Environment over the Edge and Cloud

    Chikako TAKASAKI  Atsuko TAKEFUSA  Hidemoto NAKADA  Masato OGUCHI  

     
    PAPER

    Publicized: 2021/02/02
    Vol: E104-D No:5
    Page(s): 539-550

    With the development of cameras and sensors and the spread of cloud computing, life logs can be easily acquired and stored in general households for the various services that utilize the logs. However, it is difficult to analyze moving images that are acquired by home sensors in real time using machine learning because the data size is too large and the computational complexity is too high. Moreover, collecting and accumulating in the cloud moving images that are captured at home and can be used to identify individuals may invade the privacy of application users. We propose a method of distributed processing over the edge and cloud that addresses the processing latency and the privacy concerns. On the edge (sensor) side, we extract feature vectors of human key points from moving images using OpenPose, which is a pose estimation library. On the cloud side, we recognize actions by machine learning using only the feature vectors. In this study, we compare the action recognition accuracies of multiple machine learning methods. In addition, we measure the analysis processing time at the sensor and the cloud to investigate the feasibility of recognizing actions in real time. Then, we evaluate the proposed system by comparing it with the 3D ResNet model in recognition experiments. The experimental results demonstrate that the action recognition accuracy is the highest when using LSTM and that the introduction of dropout in action recognition using 100 categories alleviates overfitting because the models can learn more generic human actions by increasing the variety of actions. In addition, it is demonstrated that preprocessing using OpenPose on the sensor side can substantially reduce the transfer quantity from the sensor to the cloud.

  • Distributed Observer Design on Sensor Networks with Random Communication

    Yuh YAMASHITA  Haruka SUMITA  Ryosuke ADACHI  Koichi KOBAYASHI  

     
    PAPER-Systems and Control

    Publicized: 2020/09/09
    Vol: E104-A No:3
    Page(s): 613-621

    This paper proposes a distributed observer on a sensor network, where communication on the network is performed randomly. This work is a natural extension of the Kalman consensus filter approach to cases involving random communication. For both bidirectional and unidirectional communication, gain conditions are obtained that guarantee improved convergence of the estimation error compared to the case with no communication. The obtained conditions are more practical than those of previous studies and give appropriate cooperative gains for a given communication probability. The effectiveness of the proposed method is confirmed by computer simulations.

  • Time Synchronization Method for ARM-Based Distributed Embedded Linux Systems Using CCNT Register

    Young-Woo KWON  Sung-Mun PARK  Joon-Young CHOI  

     
    LETTER-Software System

    Publicized: 2020/10/29
    Vol: E104-D No:2
    Page(s): 322-326

    We propose a system time synchronization method between ARM-based embedded Linux systems. The master Linux with reference clock sends its own system time to the slave Linux via Transmission Control Protocol communication along with a general-purpose input/output (GPIO) signal, and then the slave Linux corrects its own system time by the difference between its own system time at receiving the GPIO signal and the received reference time. The synchronization performance is significantly improved by compensating for the GPIO signal detection latency and the system time acquisition and setting latencies in Linux. These latencies are precisely measured by exploiting the function of Cycle Counter register in ARM coprocessor. Extensive experiments are performed with two ARM-based embedded Linux systems, and the results demonstrate the validity and performance of the proposed synchronization method.
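
    The correction step described in this abstract reduces to a small piece of arithmetic, sketched below. This is only an illustration under stated assumptions: the function names, the nanosecond units, and the sign with which each latency enters are hypothetical, and in practice the signs depend on where each latency is measured. Reading the ARM cycle counter itself is not shown.

```python
def clock_offset_ns(reference_ns, slave_at_gpio_ns,
                    gpio_latency_ns=0, read_latency_ns=0):
    """Offset to add to the slave clock so that it matches the master.

    Assumption: the slave timestamp is taken late by the GPIO detection
    latency plus the time-acquisition latency, so both are subtracted to
    estimate the slave time at the actual signal edge.
    """
    slave_at_edge_ns = slave_at_gpio_ns - gpio_latency_ns - read_latency_ns
    return reference_ns - slave_at_edge_ns


def corrected_time_ns(slave_now_ns, offset_ns, set_latency_ns=0):
    """New slave system time, compensating for the time-setting latency.

    Assumption: setting the clock takes set_latency_ns, so the value
    written is advanced by that amount.
    """
    return slave_now_ns + offset_ns + set_latency_ns


# Hypothetical numbers: the slave clock runs 400 ns ahead of the master.
offset = clock_offset_ns(1_000_000, 1_000_700,
                         gpio_latency_ns=200, read_latency_ns=100)
new_time = corrected_time_ns(2_000_000, offset, set_latency_ns=50)
```

    Without the two latency terms the example would report an offset of -700 ns instead of -400 ns, which is the kind of error the latency compensation is meant to remove.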

  • Effectiveness and Limitation of Blockchain in Distributed Optimization: Applications to Energy Management Systems Open Access

    Daiki OGAWA  Koichi KOBAYASHI  Yuh YAMASHITA  

     
    INVITED PAPER

    Vol: E104-A No:2
    Page(s): 423-429

    A blockchain, which is well known as one of the distributed ledgers, has attracted attention in many research fields. In this paper, we discuss the effectiveness and limitation of a blockchain in distributed optimization. In distributed optimization, the original problem is decomposed, and the local problems are solved by multiple agents. In this paper, ADMM (Alternating Direction Method of Multipliers) is utilized as one of the powerful methods in distributed optimization. In ADMM, an aggregator is basically required for collecting the computation result of each agent. Using blockchains, the function of an aggregator can be contained in a distributed ledger, and an aggregator may not be required. As a result, tampering by attackers can be prevented. As an application, we consider energy management systems (EMSs). Numerical experiments clarify the effectiveness and limitation of blockchain-based distributed optimization.
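
    To make the role of the aggregator concrete, here is a minimal sketch of consensus ADMM in scaled form on a toy problem: each agent i holds a private number a_i and the agents must agree on the value z that minimises the sum of 0.5 * (x_i - a_i)^2. The toy objective, the function name `consensus_admm`, and the parameter values are assumptions for illustration; this is neither the paper's EMS formulation nor its blockchain mechanism.

```python
def consensus_admm(local_targets, rho=1.0, iterations=60):
    """Scaled-form consensus ADMM for min sum_i 0.5*(x_i - a_i)^2 s.t. x_i = z."""
    n = len(local_targets)
    x = [0.0] * n   # local variables, one per agent
    u = [0.0] * n   # scaled dual variables, one per agent
    z = 0.0         # shared consensus variable
    for _ in range(iterations):
        # Local step: each agent solves its own small problem in closed form.
        x = [(a + rho * (z - ui)) / (1.0 + rho)
             for a, ui in zip(local_targets, u)]
        # Aggregation step: this averaging is the aggregator's job; the
        # abstract suggests a distributed ledger could hold this function.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: each agent updates its own multiplier locally.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z, x


# Hypothetical private values held by three agents.
z, x = consensus_admm([1.0, 2.0, 6.0])
```

    For this toy objective the consensus value converges to the mean of the private values (3.0 here), and only the aggregation step needs information from every agent.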

  • An Actual Stadium Verification of WLAN Using a Distributed Smart Antenna System (D-SAS) Open Access

    Tomoki MURAKAMI  Koichi ISHIHARA  Hirantha ABEYSEKERA  Yasushi TAKATORI  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

    Publicized: 2020/07/14
    Vol: E104-B No:1
    Page(s): 109-117

    Dense deployments of wireless local area network (WLAN) access points (APs) are accelerating to accommodate the massive wireless traffic from various mobile devices. AP densification improves the received power at mobile devices; however, total throughput in a target area is saturated by inter-cell interference (ICI) because of the limited number of frequency channels available for WLANs. To substantially mitigate ICI, we developed a distributed smart antenna system (D-SAS) for dense WLAN AP deployment, which we describe in this paper. We also describe a system configuration based on our D-SAS approach. In this approach, the distributed antennas externally attached to each AP can be switched so as to make the transmit power match the mobile device's conditions (received power and packet type). The gains obtained by the antenna switching effectively minimize the transmission power required of each AP. We also describe experimental measurements taken in a stadium using a system prototype; the results show that D-SAS offers double the total throughput attained by a centralized smart antenna system (C-SAS).

  • An Efficient Method for Training Deep Learning Networks Distributed

    Chenxu WANG  Yutong LU  Zhiguang CHEN  Junnan LI  

     
    PAPER-Fundamentals of Information Systems

    Publicized: 2020/09/07
    Vol: E103-D No:12
    Page(s): 2444-2456

    Training deep learning (DL) models is a computationally intensive process; as a result, training time can become so long that it impedes the development of DL. High performance computing clusters, especially supercomputers, are equipped with a large amount of computing resources, storage resources, and efficient interconnection ability, which can train DL networks better and faster. In this paper, we propose a method to train DL networks in a distributed manner with high efficiency. First, we propose a hierarchical synchronous Stochastic Gradient Descent (SGD) strategy, which can make full use of hardware resources and greatly increase computational efficiency. Second, we present a two-level parameter synchronization scheme which can reduce communication overhead by transmitting parameters of the first layer models in shared memory. Third, we optimize the parallel I/O by making each reader read data as continuously as possible to avoid the high overhead of discontinuous data reading. Finally, we integrate the LARS algorithm into our system. The experimental results demonstrate that our approach has tremendous performance advantages relative to unoptimized methods. Compared with the native distributed strategy, our hierarchical synchronous SGD strategy (HSGD) can increase computing efficiency by about 20 times.
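
    The general idea behind a two-level synchronization can be sketched as averaging gradients first within each node and then across nodes. The code below is only an illustration of that idea, assuming equally sized groups of workers; the function names and sample gradients are hypothetical, and the paper's actual HSGD strategy and shared-memory mechanism are not reproduced here.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def hierarchical_average(groups):
    """Two-level average: within each node first, then across nodes.

    Assumption: every group (node) holds the same number of workers, in
    which case the result equals the flat average over all workers.
    """
    node_means = [mean_vector(group) for group in groups]  # intra-node level
    return mean_vector(node_means)                          # inter-node level


# Hypothetical gradients from two nodes with two workers each.
groups = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
two_level = hierarchical_average(groups)
flat = mean_vector([g for group in groups for g in group])
```

    Only one vector per node crosses the inter-node network in the second level, which is where a scheme of this shape saves communication. With unequal group sizes the node means would need to be weighted by worker count.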

  • DVNR: A Distributed Method for Virtual Network Recovery

    Guangyuan LIU  Daokun CHEN  

     
    LETTER-Information Network

    Publicized: 2020/08/26
    Vol: E103-D No:12
    Page(s): 2713-2716

    Restoring a virtual network after a substrate network failure (e.g., a link cut) is one of the key challenges of network virtualization. Traditional virtual network recovery (VNR) methods are mostly based on the idea of centralized control. However, if multiple virtual networks fail at the same time, their recovery processes are usually queued according to a specific priority, which may increase the average waiting time of users. In this letter, we study a distributed virtual network recovery (DVNR) method to improve virtual network recovery efficiency. We establish an exclusive virtual machine (VM) for each virtual network and process the recovery requests of multiple virtual networks in parallel. Simulation results show that the proposed DVNR method achieves a recovery success rate close to that of the centralized VNR method while yielding ~70% less average recovery time.

  • Example Phrase Adaptation Method for Customized, Example-Based Dialog System Using User Data and Distributed Word Representations

    Norihide KITAOKA  Eichi SETO  Ryota NISHIMURA  

     
    PAPER-Speech and Hearing

    Publicized: 2020/07/30
    Vol: E103-D No:11
    Page(s): 2332-2339

    We have developed an adaptation method which allows the customization of example-based dialog systems for individual users by applying “plus” and “minus” operations to the distributed representations obtained using the word2vec method. After retrieving user-related profile information from the Web, named entity extraction is applied to the retrieval results. Words with a high term frequency-inverse document frequency (TF-IDF) score are then adopted as user-related words. Next, we calculate the similarity between the distributed representations of selected user-related words and nouns in the existing example phrases, using word2vec embedding. We then generate phrases adapted to the user by substituting user-related words for highly similar words in the original example phrases. Word2vec also has a special property which allows the arithmetic operations “plus” and “minus” to be applied to distributed word representations. By applying these operations to words used in the original phrases, we are able to determine which user-related words can be used to replace the original words. The user-related words are then substituted to create customized example phrases. We evaluated the naturalness of the generated phrases and found that the system could generate natural phrases.
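
    The “plus” and “minus” operations and the similarity comparison mentioned above can be sketched with plain vector arithmetic and cosine similarity. The three-dimensional vectors below are hand-made toy values chosen for the example, not word2vec output, and the function names are hypothetical; real embeddings would come from a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def plus_minus(vectors, plus, minus):
    """Sum the vectors of the `plus` words and subtract those of `minus`."""
    dim = len(next(iter(vectors.values())))
    out = [0.0] * dim
    for word in plus:
        out = [o + x for o, x in zip(out, vectors[word])]
    for word in minus:
        out = [o - x for o, x in zip(out, vectors[word])]
    return out


def most_similar(vectors, query, exclude=()):
    """Word whose vector has the highest cosine similarity to `query`."""
    candidates = [w for w in vectors if w not in exclude]
    return max(candidates, key=lambda w: cosine(vectors[w], query))


# Toy embeddings (assumption): axes loosely mean "Japan", "capital", "France".
TOY = {
    "tokyo": [1.0, 1.0, 0.0],
    "japan": [1.0, 0.0, 0.0],
    "france": [0.0, 0.0, 1.0],
    "paris": [0.0, 1.0, 1.0],
    "sushi": [1.0, 0.0, 0.5],
}

query = plus_minus(TOY, plus=["tokyo", "france"], minus=["japan"])
replacement = most_similar(TOY, query, exclude=("tokyo", "japan", "france"))
```

    In this toy space "tokyo" minus "japan" plus "france" lands exactly on "paris", which is the kind of relation the adaptation method relies on when choosing a user-related word to substitute.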

  • Algorithms for Distributed Server Allocation Problem

    Takaaki SAWA  Fujun HE  Akio KAWABATA  Eiji OKI  

     
    PAPER-Network

    Publicized: 2020/05/08
    Vol: E103-B No:11
    Page(s): 1341-1352

    This paper proposes two algorithms, namely Server-User Matching (SUM) algorithm and Extended Server-User Matching (ESUM) algorithm, for the distributed server allocation problem. The server allocation problem is to determine the matching between servers and users to minimize the maximum delay, which is the maximum time to complete user synchronization. We analyze the computational time complexity. We prove that the SUM algorithm obtains the optimal solutions in polynomial time for the special case that all server-server delay values are the same and constant. We provide the upper and lower bounds when the SUM algorithm is applied to the general server allocation problem. We show that the ESUM algorithm is a fixed-parameter tractable algorithm that can attain the optimal solution for the server allocation problem parameterized by the number of servers. Numerical results show that the computation time of ESUM follows the analyzed complexity while the ESUM algorithm outperforms the approach of integer linear programming solved by our examined solver.
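
    For intuition about the problem being solved, the sketch below enumerates every user-to-server assignment and keeps the one with the smallest maximum delay. It is a brute-force baseline, not the SUM or ESUM algorithm. The delay model is an assumption made for the example: the delay between two users is taken as the first user's access delay, plus the delay between their servers, plus the second user's access delay. Server capacities are ignored and all names are hypothetical.

```python
from itertools import combinations, product


def max_delay(assignment, user_server, server_server):
    """Largest user-to-user delay under the assumed delay model."""
    worst = 0
    for u, v in combinations(range(len(assignment)), 2):
        su, sv = assignment[u], assignment[v]
        delay = user_server[u][su] + server_server[su][sv] + user_server[v][sv]
        worst = max(worst, delay)
    return worst


def brute_force_allocation(user_server, server_server):
    """Try every assignment; return (best maximum delay, assignment)."""
    n_users = len(user_server)
    n_servers = len(server_server)
    best = None
    for assignment in product(range(n_servers), repeat=n_users):
        cost = max_delay(assignment, user_server, server_server)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical delays: 3 users, 2 servers.
user_server = [[1, 5], [5, 1], [2, 4]]   # user_server[u][s]
server_server = [[0, 3], [3, 0]]         # server_server[s][t]
best_cost, best_assignment = brute_force_allocation(user_server, server_server)
```

    The search visits every one of the (number of servers)^(number of users) assignments, so it is only usable on tiny instances; that exponential cost is what motivates polynomial-time and fixed-parameter tractable algorithms such as those the paper proposes.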

  • Reach Extension of 10G-EPON Upstream Transmission Using Distributed Raman Amplification and SOA

    Ryo IGARASHI  Masamichi FUJIWARA  Takuya KANAI  Hiro SUZUKI  Jun-ichi KANI  Jun TERADA  

     
    PAPER

    Publicized: 2020/06/08
    Vol: E103-B No:11
    Page(s): 1257-1264

    Effective user accommodation will become increasingly important in passive optical networks (PONs) in the next decade, since the number of subscribers has been leveling off and it is becoming more difficult for network operators to retain sufficient numbers of maintenance workers. Drastically reducing the number of small-scale communication buildings while keeping the number of accommodated users is one of the most attractive solutions to address this situation. To achieve this, we propose two types of long-reach repeater-free upstream transmission configurations for PON systems: (i) one utilizes a semiconductor optical amplifier (SOA) as a pre-amplifier and (ii) the other utilizes distributed Raman amplification (DRA) in addition to the SOA. Our simulations assuming 10G-EPON specifications and transmission experiments on a 10G-EPON prototype confirm that configuration (i) can add a 17 km trunk fiber to a normal PON system with 10 km access reach and 1:64 split (total 27 km reach), while configuration (ii) can further expand the trunk fiber distance to 37 km (total 47 km reach). Network operators can select between these configurations depending on their service areas.

  • Phase Selection in Round-Robin Scheduling Sequence for Distributed Antenna System Open Access

    Go OTSURU  Yukitoshi SANADA  

     
    PAPER-Wireless Communication Technologies

    Publicized: 2020/03/25
    Vol: E103-B No:10
    Page(s): 1155-1163

    One of the key technologies in fifth-generation mobile communications is a distributed antenna system (DAS). As DAS creates tightly packed antenna arrangements, inter-user interference degrades its spectrum efficiency. Round-robin (RR) scheduling is known as a scheme that achieves a good trade-off between computational complexity and spectrum efficiency. This paper proposes a user equipment (UE) allocation scheme for RR scheduling. The proposed scheme offers low complexity as the phases of the UE allocation sequences are predetermined. Four different phase selection criteria are compared in this paper. Numerical results obtained through computer simulation show that maximum selection, which sequentially searches for the phase with the maximum tentative throughput, realizes the best spectrum efficiency next to full search. There is an optimum number of UEs that yields the largest throughput in single-user allocation, while the system throughput improves as the number of UEs increases in 2-user RR scheduling.

  • Construction of an Efficient Divided/Distributed Neural Network Model Using Edge Computing

    Ryuta SHINGAI  Yuria HIRAGA  Hisakazu FUKUOKA  Takamasa MITANI  Takashi NAKADA  Yasuhiko NAKASHIMA  

     
    PAPER-Fundamentals of Information Systems

    Publicized: 2020/07/02
    Vol: E103-D No:10
    Page(s): 2072-2082

    Modern deep learning has significantly improved performance and has been used in a wide variety of applications. Since the amount of computation required for the inference process of a neural network is large, it is processed not at the data acquisition location, such as a surveillance camera, but on a server with abundant computing power installed in a data center. Edge computing is getting considerable attention as a way to solve this problem. However, edge computing can provide only limited computation resources. Therefore, we assumed a divided/distributed neural network model using both the edge device and the server. By processing part of the convolution layers on the edge, the amount of communication becomes smaller than that of the sensor data. In this paper, we have evaluated AlexNet and eight other models in the distributed environment and estimated FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduced a compression process before communication. This compression may degrade the object recognition accuracy. As necessary conditions, we set FPS to 30 or faster and object recognition accuracy to 69.7% or higher. This value is determined based on that of an approximation model that binarizes the activation of the neural network. We constructed performance and energy models to find the optimal configuration that consumes minimum energy while satisfying the necessary conditions. Through the comprehensive evaluation, we found the optimal configurations of all nine models. For small models, such as AlexNet, processing entire models on the edge was the best. On the other hand, for huge models, such as VGG16, processing entire models on the server was the best. For medium-size models, the distributed models were good candidates. We confirmed that our model found the most energy-efficient configuration while satisfying the FPS and accuracy requirements, and the distributed models successfully reduced the energy consumption by up to 48.6%, and by 6.6% on average. We also found that HEVC compression is important before transferring the input data or the feature data between the distributed inference processes.

  • A Reactive Reporting Scheme for Distributed Sensing in Multi-Band Wireless LAN System

    Rui TENG  Kazuto YANO  Yoshinori SUZUKI  

     
    PAPER-Wireless Communication Technologies

    Publicized: 2020/02/18
    Vol: E103-B No:8
    Page(s): 860-871

    A multi-band wireless local area network (WLAN) enables flexible use of multiple frequency bands. To efficiently monitor radio resources in multi-band WLANs, a distributed-sensing system that employs a number of stations (STAs) is considered to alleviate sensing constraints at access points (APs). This paper examines distributed sensing that expands the sensing coverage area and monitors multiple object channels by employing STA-based sensing. To avoid issuing unnecessary reports, each STA autonomously judges whether it should make a report by comparing the importance of its own sensing result with that of the overheard report. We address how to efficiently collect the necessary sensing information from a large number of STAs. We propose a reactive reporting scheme that scales well with the number of STAs to collect sensing results such as the channel occupancy ratio. Evaluation results show that the proposed scheme keeps the number of reports low even if the number of STAs increases. Our proposed sensing scheme provides large sensing coverage.

  • Participating-Domain Segmentation Based Server Selection Scheme for Real-Time Interactive Communication Open Access

    Akio KAWABATA  Bijoy CHAND CHATTERJEE  Eiji OKI  

     
    PAPER-Network

    Publicized: 2020/01/17
    Vol: E103-B No:7
    Page(s): 736-747

    This paper proposes an efficient server selection scheme in a successive participation scenario with participating-domain segmentation. The scheme is utilized by distributed processing systems for real-time interactive communication to suppress the communication latency of a wide-area network. In the proposed scheme, users participate in server selection one after another. The proposed scheme determines a recommended server, and a new user selects the recommended server first. Before each user participates, the recommended servers are determined assuming that users exist in the considered regions. A recommended server is determined for each divided region to minimize the latency. The new user selects the recommended available server for the region where the user is located. We formulate an integer linear programming problem to determine the recommended servers. Numerical results indicate that, at the cost of additional computation, the proposed scheme offers smaller latency than the conventional scheme. We investigate different policies to divide the users' participation for the recommended server finding process in the proposed scheme.

  • A Server-Based Distributed Storage Using Secret Sharing with AES-256 for Lightweight Safety Restoration

    Sanghun CHOI  Shuichiro HARUTA  Yichen AN  Iwao SASASE  

     
    PAPER-Data Engineering, Web Information Systems

    Publicized: 2020/04/20
    Vol: E103-D No:7
    Page(s): 1647-1659

    Since the owner's data might be leaked from centralized server storage, distributed storage schemes combined with server storage have been investigated. To protect the owner's data, those schemes use Reed-Solomon codes. However, those schemes incur a data-capacity burden, since the amount of parity data grows with the amount of disconnected data that must be restorable. Moreover, the calculation time for restoration is longer, since many parity data are needed to restore the disconnected data. In order to reduce the data-capacity burden and the calculation time, we propose a server-based distributed storage using Secret Sharing with AES-256 for lightweight safety restoration. Although we use Secret Sharing, the owner's data will be safely kept in the distributed storage, since all of the divided data are divided into two pieces with AES-256 and stored in the peer storage and the server storage. Even though the server storage keeps the divided data, and the server and peer storages might learn the pair of divided data via Secret Sharing, the owner's data remain secure in the proposed scheme against an inner attack on Secret Sharing. Furthermore, the owner's data can be restored from a small amount of parity data. The evaluations show that our proposed scheme is improved in terms of lightness, stability, and safety.

  • An Efficient Routing Method for Range Queries in Skip Graph

    Ryohei BANNO  Kazuyuki SHUDO  

     
    PAPER

    Publicized: 2019/12/09
    Vol: E103-D No:3
    Page(s): 516-525

    Skip Graph is a promising distributed data structure for large-scale systems and is known for its capability of handling range queries. Although several methods of routing range queries in Skip Graph have been proposed, they have inefficiencies such as a long path length or a large number of messages. In this paper, we propose a novel routing method for range queries named Split-Forward Broadcasting (SFB). SFB introduces a divide-and-conquer approach, enabling nodes to make full use of their routing tables to forward a range query. It brings about a shorter average path length than existing methods, as well as a smaller number of messages by avoiding duplicate transmission. We clarify the characteristics and effectiveness of SFB through both analytical and experimental comparisons. The results show that SFB can reduce the average path length by roughly 30% or more compared with a state-of-the-art method.
