Author Search Result

[Author] Masato OGUCHI(8hit)

1-8hit
  • A Proposition and Evaluation of DSM Models Suitable for a Wide Area Distributed Environment Realized on High Performance Networks

    Masato OGUCHI  Hitoshi AIDA  Tadao SAITO  

     
    PAPER-Communication Networks and Services

    Vol: E79-B No:2
    Page(s): 153-162

    Distributed shared memory is an attractive option for realizing functionally distributed computing in a wide area distributed environment, because of its simplicity and flexibility in software programming. However, until now, distributed shared memory has mainly been studied in a local environment. In a widely distributed environment, communication latency greatly affects system performance. Moreover, the bandwidth of networks available in a wide area has been increasing dramatically in recent years. A DSM architecture that uses high performance networks must therefore differ from one designed for low speed networks. In this paper, distributed shared memory models in a widely distributed environment are discussed and evaluated. First, two existing distributed shared memory models are examined: shared virtual memory and replicated shared memory. Next, an improved replicated shared memory model, which uses internal machine memory, is proposed. In this model, we assume the existence of a seamless, multi-cast wide area network infrastructure, such as an ATM network. A prototype of this model using multi-thread programming has been implemented on multi-CPU SPARCstations and an ATM-LAN. These DSM models are compared with SCRAMNet™, whose mechanism is based on replicated shared memory. Results from this evaluation show the superiority of replicated shared memory over shared virtual memory when the length of the network is large. While replicated shared memory using external memory is influenced by the ratio of local to global accesses, replicated shared memory using internal machine memory is suitable for a wide variety of cases. The replicated shared memory model is considered particularly suitable for applications that require real time operation in a widely distributed environment, since latency hiding techniques such as context switching or data prefetching are not effective for real time demands.

  • Capacity Control of Social Media Diffusion for Real-Time Analysis System

    Miki ENOKI  Issei YOSHIDA  Masato OGUCHI  

     
    PAPER

    Publicized: 2017/01/17
    Vol: E100-D No:4
    Page(s): 776-784

    In Twitter-like services, countless messages are posted in real time every second all around the world. Timely knowledge about what kinds of information are diffusing in social media is quite important. For example, in emergency situations such as earthquakes, users provide instant information on their situation through social media. The collective intelligence of social media is useful as a means of information detection complementary to conventional observation. We have developed a system for monitoring and analyzing information diffusion data in real time by tracking retweeted tweets. A tweet retweeted by many users indicates that they find the content interesting and impactful. Analysts who use this system can find tweets retweeted by many users and identify the key people who are retweeted frequently by many users or who have retweeted tweets about particular topics. However, bursting situations occur when thousands of social media messages are suddenly posted simultaneously, and the lack of machine resources to handle such situations lowers the system's query performance. Since our system is designed to be used interactively in real time by many analysts, waiting more than one second for a query result is simply not acceptable. To maintain acceptable query performance, we propose a capacity control method for filtering incoming tweets using extra attribute information from the tweets themselves. Conventionally, there is a trade-off between query performance and the accuracy of the analysis results. We show that query performance is improved by our proposed method and that our method is better than the existing methods in terms of maintaining query accuracy.

  • Power-Effective File Layout Based on Large Scale Data-Intensive Application in Virtualized Environment

    Shunsuke YAGAI  Masato OGUCHI  Miyuki NAKANO  Saneyasu YAMAGUCHI  

     
    PAPER-Database system

    Publicized: 2017/07/14
    Vol: E100-D No:12
    Page(s): 2761-2770

    In data centers, large numbers of computers are run simultaneously. These computers consume an enormous amount of energy. Several attempts to address this issue have been published. An energy-efficient storage management method that cooperates with applications is one effective approach. In this method, data and storage devices are managed using application support and the power consumption of storage devices is significantly decreased. However, existing studies do not take the virtualized environment into account. Recently, many data-intensive applications have been run in a virtualized environment, such as the cloud computing environment. In this paper, we focus on a virtualized environment wherein multiple virtual machines run on a physical computer and a data-intensive application runs on each virtual machine. We discuss a method for reducing storage device power consumption using application support. First, we propose two storage management methods using application information. One method optimizes the inter-HDD file layout. This method removes frequently accessed files from a certain HDD and switches the HDD to power-off mode. To balance loads and reduce seek distances, this method separates a heavily accessed file and consolidates files in a virtual machine with low access frequency. The other method optimizes the intra-HDD file layout, in addition to performing inter-HDD optimization. This method places frequently accessed files near each other. Second, we present our experimental results and demonstrate that the proposed methods can create HDD access intervals long enough for power-off mode to be used, thereby reducing the power consumption of storage devices.

  • Performance Evaluation of Pipeline-Based Processing for the Caffe Deep Learning Framework

    Ayae ICHINOSE  Atsuko TAKEFUSA  Hidemoto NAKADA  Masato OGUCHI  

     
    PAPER

    Publicized: 2018/01/18
    Vol: E101-D No:4
    Page(s): 1042-1052

    Many life-log analysis applications, which transfer data from cameras and sensors to a Cloud and analyze them in the Cloud, have been developed as the use of various sensors and Cloud computing technologies has spread. However, difficulties arise because of the limited network bandwidth between such sensors and the Cloud. In addition, sending raw sensor data to a Cloud may introduce privacy issues. Therefore, we propose a pipelined method for distributed deep learning processing between sensors and the Cloud to reduce the amount of data sent to the Cloud and protect the privacy of users. In this study, we measured the processing times and evaluated the performance of our method using two different datasets. In addition, we performed experiments using three types of machines with different performance characteristics on the client side and compared the processing times. The experimental results show that the accuracy of deep learning with coarse-grained data is comparable to that achieved with the default parameter settings, and the proposed distributed processing method has performance advantages in cases of insufficient network bandwidth between realistic sensors and a Cloud environment. In addition, it is confirmed that the process that most affects the overall processing time varies depending on the machine performance on the client side, and the most efficient distribution method similarly differs.

  • Deeply Programmable Application Switch for Performance Improvement of KVS in Data Center (Open Access)

    Satoshi ITO  Tomoaki KANAYA  Akihiro NAKAO  Masato OGUCHI  Saneyasu YAMAGUCHI  

     
    PAPER

    Publicized: 2024/01/17
    Vol: E107-D No:5
    Page(s): 659-673

    The concepts of programmable switches and software-defined networking (SDN) give developers flexible and deep control over the behavior of switches. We expect these concepts to dramatically improve the functionality of switches. In this paper, we focus on the concept of Deeply Programmable Networks (DPN), where data planes are programmable, and on application switches based on DPN. We then propose a method to improve the performance of a key-value store (KVS) through an application switch. First, we explain the DPN and application switches. The DPN is a network that makes not only control planes but also data planes programmable. An application switch is a switch that implements some functions of network applications, such as a database management system (DBMS). Second, we propose a method to improve the performance of Cassandra, one of the most popular key-value based DBMSs, by implementing a caching function in a switch in a dedicated network such as a data center. The proposed method is expected to be effective, even though it is simple and traditional, because it operates in the data path and at the center of the network application. Third, we implement a switch with the caching function, which monitors the accessed data described in packets (Ethernet frames) and dynamically replaces the cached data in the switch, and then show that the proposed caching switch can significantly improve the KVS transaction performance with this implementation. In our evaluation, our method improved the KVS transaction throughput by up to 47%.
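
    The caching idea in this abstract can be illustrated with a small sketch. It is not the authors' switch implementation: the class name `CachingSwitch`, the dictionary standing in for the KVS server, and the least-recently-used replacement policy are all assumptions made for illustration (the abstract says only that cached data is replaced dynamically).

```python
from collections import OrderedDict


class CachingSwitch:
    """Toy model of a switch that caches KVS read results on the data path.

    Hypothetical illustration: the LRU policy and this interface are
    assumptions, not details taken from the paper.
    """

    def __init__(self, backend, capacity):
        self.backend = backend      # dict standing in for the KVS server
        self.capacity = capacity    # maximum number of cached entries
        self.cache = OrderedDict()  # key -> value, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            # Hit: answer from the switch without contacting the server.
            self.hits += 1
            self.cache.move_to_end(key)
            return self.cache[key]
        # Miss: forward to the server, then cache the observed result.
        self.misses += 1
        value = self.backend[key]
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used key
        return value

    def write(self, key, value):
        # Writes always reach the server; refresh the cached copy if present.
        self.backend[key] = value
        if key in self.cache:
            self.cache[key] = value
            self.cache.move_to_end(key)


backend = {"a": 1, "b": 2, "c": 3}
switch = CachingSwitch(backend, capacity=2)
for k in ["a", "b", "a", "c"]:
    switch.read(k)
```

    In this run the third read of "a" is served from the cache, and reading "c" evicts "b", the least recently used key.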

  • A Study of Effective Replica Reconstruction Schemes for the Hadoop Distributed File System

    Asami HIGAI  Atsuko TAKEFUSA  Hidemoto NAKADA  Masato OGUCHI  

     
    PAPER-Data Engineering, Web Information Systems

    Publicized: 2015/01/13
    Vol: E98-D No:4
    Page(s): 872-882

    Distributed file systems, which manage large amounts of data over multiple commercially available machines, have attracted attention as management and processing systems for Big Data applications. A distributed file system consists of multiple data nodes and provides reliability and availability by holding multiple replicas of data. Due to system failure or maintenance, a data node may be removed from the system, and the data blocks held by the removed data node are lost. If data blocks are missing, the access load of the other data nodes that hold the lost data blocks increases, and as a result, the performance of data processing over the distributed file system decreases. Replica reconstruction, which reallocates the missing data blocks, is therefore important for preventing such performance degradation. The Hadoop Distributed File System (HDFS) is a widely used distributed file system. In the HDFS replica reconstruction process, source and destination data nodes for replication are selected randomly. We find that this replica reconstruction scheme is inefficient because data transfer is biased. Therefore, we propose two more effective replica reconstruction schemes that aim to balance the workloads of replication processes. Our proposed replication scheduling strategy assumes that nodes are arranged in a ring, and data blocks are transferred based on this one-directional ring structure to minimize the difference in the amount of transfer data for each node. Based on this strategy, we propose two replica reconstruction schemes: an optimization scheme and a heuristic scheme. We have implemented the proposed schemes in HDFS and evaluated them on an actual HDFS cluster. We also conducted experiments on a large-scale environment by simulation. From the experiments in the actual environment, we confirm that the replica reconstruction throughputs of the proposed schemes show a 45% improvement compared to the HDFS default scheme. We also verify that the heuristic scheme is effective because it shows performance comparable to the optimization scheme. Furthermore, the experimental results on the large-scale simulation environment show that, while the optimization scheme is unrealistic because a long time is required to find the optimal solution, the heuristic scheme is very efficient because it scales, and that it improved replica reconstruction throughput by up to 25% compared to the default scheme.
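
    The one-directional ring strategy described in this abstract can be sketched as follows. This is a simple greedy illustration under stated assumptions, not the paper's optimization or heuristic scheme: the function name `schedule_ring_transfers`, the input format, and the tie-breaking rule are all hypothetical.

```python
def schedule_ring_transfers(ring, missing_blocks):
    """Plan re-replication so each block moves one hop around a ring.

    ring: surviving node names in ring order.
    missing_blocks: dict of block id -> set of nodes still holding a replica.
    Returns a list of (block_id, source, destination) tuples.
    Greedy sketch only; the paper's actual schemes are not reproduced here.
    """
    successor = {node: ring[(i + 1) % len(ring)] for i, node in enumerate(ring)}
    sent = {node: 0 for node in ring}  # blocks each node has been asked to send
    plan = []
    for block_id, holders in sorted(missing_blocks.items()):
        # A node can act as source only if its ring successor lacks the block.
        candidates = [n for n in ring
                      if n in holders and successor[n] not in holders]
        if not candidates:
            continue  # no valid one-hop transfer exists in this toy model
        # Pick the least-loaded source to keep per-node transfer amounts even.
        source = min(candidates, key=lambda n: sent[n])
        sent[source] += 1
        plan.append((block_id, source, successor[source]))
    return plan


ring = ["n1", "n2", "n3", "n4"]
missing = {
    "b1": {"n1", "n3"},
    "b2": {"n1", "n2"},
    "b3": {"n3", "n4"},
    "b4": {"n2", "n3"},
}
plan = schedule_ring_transfers(ring, missing)
```

    With this input every surviving node sends exactly one block to its ring successor, which is the balanced outcome the ring structure is meant to encourage.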

  • Action Recognition Using Pose Data in a Distributed Environment over the Edge and Cloud

    Chikako TAKASAKI  Atsuko TAKEFUSA  Hidemoto NAKADA  Masato OGUCHI  

     
    PAPER

    Publicized: 2021/02/02
    Vol: E104-D No:5
    Page(s): 539-550

    With the development of cameras and sensors and the spread of cloud computing, life logs can be easily acquired and stored in general households for the various services that utilize the logs. However, it is difficult to analyze moving images that are acquired by home sensors in real time using machine learning because the data size is too large and the computational complexity is too high. Moreover, collecting and accumulating in the cloud moving images that are captured at home and can be used to identify individuals may invade the privacy of application users. We propose a method of distributed processing over the edge and cloud that addresses the processing latency and the privacy concerns. On the edge (sensor) side, we extract feature vectors of human key points from moving images using OpenPose, which is a pose estimation library. On the cloud side, we recognize actions by machine learning using only the feature vectors. In this study, we compare the action recognition accuracies of multiple machine learning methods. In addition, we measure the analysis processing time at the sensor and the cloud to investigate the feasibility of recognizing actions in real time. Then, we evaluate the proposed system by comparing it with the 3D ResNet model in recognition experiments. The experimental results demonstrate that the action recognition accuracy is the highest when using LSTM and that the introduction of dropout in action recognition using 100 categories alleviates overfitting because the models can learn more generic human actions by increasing the variety of actions. In addition, it is demonstrated that preprocessing using OpenPose on the sensor side can substantially reduce the transfer quantity from the sensor to the cloud.

  • High Performance Parallel Query Processing on a 100 Node ATM Connected PC Cluster

    Takayuki TAMURA  Masato OGUCHI  Masaru KITSUREGAWA  

     
    PAPER-Query Processing

    Vol: E82-D No:1
    Page(s): 54-63

    We developed a PC cluster system which consists of 100 PCs as a test bed for massively parallel query processing. Each PC employs a 200 MHz Pentium Pro CPU and is connected with the others through an ATM switch. Because query processing applications are insensitive to communication latency and mainly perform integer operations, the ATM connected PC cluster approach can be considered a reasonable solution for high performance database servers at low cost. However, as far as the authors know, there has been no attempt to construct large scale PC clusters for database applications. Though we employed commodity components as much as possible, we developed the DBMS itself, because it was a key component for obtaining high performance in parallel query processing, and there seemed to be no system which could meet our demand. On each PC node, a server program which acts as a database kernel runs to process the queries in cooperation with other nodes. The kernel was designed to execute pipelined operators and handle voluminous data efficiently, to achieve high performance on complex decision support type queries. We used the standard benchmark, TPC-D, on a 100 GB database to verify the feasibility of our approach, through comparison of our system with commercial parallel systems. As a whole, our system exhibited sufficiently high performance which was competitive with the current TPC-D top records, in spite of not using indices. For some heavy queries in the benchmark, which have high selectivity and joinability, our system performed much better. In addition, we applied transposed file organization to the database for further performance improvement. The transposed file organization vertically partitions the tuples, enabling attribute-by-attribute access to the relations. This resulted in significant performance improvement by reducing the amount of disk I/O and shifting the bottleneck to computation.
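
    The transposed file organization mentioned at the end of this abstract can be illustrated with a minimal in-memory sketch. The function names `transpose` and `scan_sum` and the sample relation are hypothetical; the paper's on-disk layout is not described in the abstract and is not reproduced here.

```python
def transpose(tuples, attributes):
    """Vertically partition row-oriented tuples into one list per attribute."""
    return {name: [row[i] for row in tuples]
            for i, name in enumerate(attributes)}


def scan_sum(columns, value_attr, filter_attr, predicate):
    """Sum one attribute over rows whose filter attribute passes the predicate.

    Only the two columns involved are read; all other attributes are skipped,
    which is the source of the I/O saving in a transposed layout.
    """
    return sum(value
               for value, flag in zip(columns[value_attr], columns[filter_attr])
               if predicate(flag))


# Hypothetical sample relation: (id, flag, price)
rows = [(1, "A", 10.0), (2, "B", 20.0), (3, "A", 5.0)]
columns = transpose(rows, ("id", "flag", "price"))
total = scan_sum(columns, "price", "flag", lambda flag: flag == "A")
```

    A query that touches two of the three attributes reads two columns instead of every full tuple, so the saving grows with the number of attributes the query does not need.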