IEICE TRANSACTIONS on Information

Impact Factor

0.59
Eigenfactor

0.002
article influence

0.1
Cite Score

1.4

To the Advance publication
To the Archives

Advance publication (published online immediately after acceptance)

Lightweight Neural Data Sequence Modeling by Scale Causal Blocks
Hiroaki AKUTSU Ko ARAI

Pubricized:
2024/11/08
PAPER
- Summary
- Free PDF (1.6MB)
Recaptured Image Detection Based on Multi-Scale Residual Features of Discriminative Regions
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI

Pubricized:
2024/11/08
PAPER
- Summary
- Free PDF (5.9MB)
Learn Discriminative Features for Small Object Detection through Multi-scale Image Degradation with Contrastive Learning
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG

Pubricized:
2024/11/05
PAPER
- Summary
- Free PDF (1.4MB)
Joint Distribution-Aligned Dual-Sparse Linear Regression for Cross-Stimulus Speech-Based Depression Detection
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG

Pubricized:
2024/11/01
LETTER
- Summary
- Free PDF (229.4KB)
Multi-grained Guaranteeable Requirement Analysis for Iterative Adaptation
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI

Pubricized:
2024/10/31
PAPER
- Summary
- Free PDF (567.6KB)
A fully digital transmitting-receiving platform for MIMO radar waveform diversity experiment
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI

Pubricized:
2024/10/30
PAPER
- Summary
- Free PDF (7MB)
Leveraging Different Boolean Function Decompositions to Reduce T-Count in LUT-based Quantum Circuit Synthesis
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA

Pubricized:
2024/10/30
PAPER
- Summary
- Free PDF (1.3MB)
Criticality and Tolerance in Injection Timing in Cup-Stacking Method for Collective Communication
Takashi YOKOTA Kanemitsu OOTSU

Pubricized:
2024/10/28
PAPER
- Summary
- Free PDF (1.5MB)
An anchor-free Siamese tracker with multi-attention and corner detection mechanism
Xiaokang Jin Benben Huang Hao Sheng Yao Wu

Pubricized:
2024/10/28
PAPER
- Summary
- Free PDF (3MB)
Effect of Politeness on Trust in Re-enter Requests to User by Smart Speaker -Pilot Study-
Tomoki MIYAMOTO

Pubricized:
2024/10/23
LETTER
- Summary
- Free PDF (2.2MB)
Fine-tuning Models for Final Disagreement Anticipation in Negotiation Mid-Dialogues
Ken WATANABE Katsuhide FUJITA

Pubricized:
2024/10/10
PAPER
- Summary
- Free PDF (3.8MB)
Deepfake speech detection: approaches from acoustic features related to auditory perception to deep neural networks
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN

Pubricized:
2024/10/07
INVITED PAPER
- Summary
- Free PDF (965KB)
Video Watermarking Method Based on 3D U-Net Robust Against Re-shooting
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE

Pubricized:
2024/10/07
PAPER
- Summary
- Free PDF (4MB)
UTStyleCap4K: Generating Image Captions with Sentimental Styles
Chi ZHANG Li TAO Toshihiko YAMASAKI

Pubricized:
2024/10/02
PAPER
- Summary
- Free PDF (2.5MB)
FP-GNN: A Graph Neural Network for Hardware Trojan Detection in Gate-Level Netlist
Ann Jelyn TIEMPO Yong-Jin JEONG

Pubricized:
2024/10/01
LETTER
- Summary
- Free PDF (532.3KB)
Adaptive Merge Candidate Selection based on Geometric Partitioning Mode beyond Versatile Video Coding
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA

Pubricized:
2024/09/24
PAPER
- Summary
- Free PDF (4MB)
A Multi-Agent Deep Reinforcement Learning Algorithm for Task offloading in future 6G V2X Network
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU

Pubricized:
2024/09/24
PAPER
- Summary
- Free PDF (1.2MB)
Dalio: In-Kernel Centralized Replication for Key-Value Stores
Gyuyeong KIM

Pubricized:
2024/09/20
LETTER
- Summary
- Free PDF (138.7KB)
Detecting Textual Backdoor Attacks via Class Difference for Text Classification System
Hyun KWON Jun LEE

Pubricized:
2024/09/19
PAPER
- Summary
- Free PDF (680.5KB)
D2PT: Density to Point Transformer with Knowledge Distillation for Crowd Counting and Localization
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG

Pubricized:
2024/09/17
LETTER
- Summary
- Free PDF (1.8MB)
Incremental learning for network traffic classification using generative adversarial networks
Guangjin Ouyang Yong Guo Yu Lu Fang He

Pubricized:
2024/09/13
PAPER
- Summary
- Free PDF (1.3MB)
Multi-Scale Rail Surface Anomaly Detection Based on Weighted Multivariate Gaussian Distribution
Yuyao LIU Qingyong LI Shi BAO Wen WANG

Pubricized:
2024/09/12
PAPER
- Summary
- Free PDF (7.8MB)
BP-CRN: A Lightweight Two-Stage Convolutional Recurrent Network For Multi-channel Speech Enhancement
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO

Pubricized:
2024/09/10
LETTER
- Summary
- Free PDF (2.6MB)
Building Defect Prediction Models by Online Learning Considering Defect Overlooking
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI

Pubricized:
2024/09/09
LETTER
- Summary
- Free PDF (96.5KB)
The Impact of Defect (Re) Prediction on Software Testing
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI

Pubricized:
2024/09/09
LETTER
- Summary
- Free PDF (148.4KB)
Deterministic and Probabilistic Certified Defenses for Content-Based Image Retrieval
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA

Pubricized:
2024/09/05
PAPER
- Summary
- Free PDF (3.4MB)
Fault-tolerant Routing in Bicubes
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO

Pubricized:
2024/09/05
PAPER
- Summary
- Free PDF (759.2KB)
Integrating Cyber-Physical Modeling for Pandemic Surveillance: A Graph-Based Approach for Disease Hotspot Prediction and Public Awareness
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA

Pubricized:
2024/08/29
PAPER
- Summary
- Free PDF (2.2MB)
Real-time Interactions with Photos and Texts in Large Classrooms
Haeyoung Lee

Pubricized:
2024/08/28
LETTER
- Summary
- Free PDF (372.3KB)
CNN-based feature integration network for speech enhancement in microphone arrays
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING

Pubricized:
2024/08/26
LETTER
- Summary
- Free PDF (2MB)
Partial Enhancement and Channel Aggregation for Visible-Infrared Person Re-Identification
Weiwei JING Zhonghua LI

Pubricized:
2024/08/26
PAPER
- Summary
- Free PDF (1.4MB)
Practical APT Group Hash Unit Profiling Framework Using TTPs
Sena LEE Chaeyoung KIM Hoorin PARK

Pubricized:
2024/08/20
LETTER
- Summary
- Free PDF (716.5KB)
Bilaterally Colored Finite Automata and Bilaterally Colored Regular Expressions
Akira ITO Yoshiaki TAKAHASHI

Pubricized:
2024/08/20
PAPER
- Summary
- Free PDF (652.7KB)
Strategies and Equilibria on Indistinguishability of Winning Objectives and Related Decision Problems
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI

Pubricized:
2024/08/20
PAPER
- Summary
- Free PDF (1.5MB)
Computational Complexity of Yajisan-Kazusan and Stained Glass
Chuzo IWAMOTO Ryo TAKAISHI

Pubricized:
2024/08/16
PAPER
- Summary
- Free PDF (682.2KB)
A clustering-based deep learning method for water level prediction
Chih-Ping Wang Duen-Ren Liu

Pubricized:
2024/08/14
LETTER
- Summary
- Free PDF (778.9KB)
Stochastic Dual Coordinate Ascent for Learning Sign Constrained Linear Predictors
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO

Pubricized:
2024/08/08
PAPER
- Summary
- Free PDF (483.5KB)
Multi-dimensional and Multi-task Facial Expression Recognition for Academic Outcomes Prediction
Yi Huo Yun Ge

Pubricized:
2024/08/08
LETTER
- Summary
- Free PDF (458.2KB)
Mixup SVM Learning for Compound Toxicity Prediction Using Human Pluripotent Stem Cells
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO

Pubricized:
2024/08/08
LETTER
- Summary
- Free PDF (170.9KB)
A Bigram Based ILP Formulation for Break Minimization in Sports Scheduling Problems
Koichi FUJII Tomomi MATSUI

Pubricized:
2024/08/08
PAPER
- Summary
- Free PDF (1MB)
Dendritic Learning-based Feature Fusion for Deep Networks
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO

Pubricized:
2024/08/07
LETTER
- Summary
- Free PDF (289.3KB)
Applying Run-Length Compression to the Configuration Data of SLM Fine-Grained Reconfigurable Logic
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA

Pubricized:
2024/08/07
PAPER
- Summary
- Free PDF (2.6MB)
Imperceptible Trojan Attacks to the Graph-based Big Data Processing in Smart Society
Jun ZHOU Masaaki KONDO

Pubricized:
2024/08/07
PAPER
- Summary
- Free PDF (825.5KB)
Feasibility Study of Applying Spatial Crowd Smoothing Without Economic Incentives on Ticket Reservation System that Applies Nudges
Tetsuya MANABE Wataru UNUMA

Pubricized:
2024/08/05
PAPER
- Summary
- Free PDF (1.3MB)
(15/14)n Flips are (almost) Sufficient to Sort Heydari and Sudborough's Pancake Stack
Kazuyuki AMANO

Pubricized:
2024/08/05
LETTER
- Summary
- Free PDF (288.7KB)
Overlapping of Lattice Unfolding for Cuboids
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA

Pubricized:
2024/08/05
PAPER
- Summary
- Free PDF (722.9KB)
An FPT Algorithm for the Exact Matching Problem and NP-hardness of Related Problems
Hitoshi MURAKAMI Yutaro YAMAGUCHI

Pubricized:
2024/08/01
PAPER
- Summary
- Free PDF (704.5KB)
Recognition of Vibration Dampers Based on Deep Learning Method in UAV Images
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun

Pubricized:
2024/07/30
PAPER
- Summary
- Free PDF (3.2MB)
Temporal correlation-based end-to-end rate control in DCVC
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO

Pubricized:
2024/07/29
LETTER
- Summary
- Free PDF (574.4KB)
A Subclass of Mu-Calculus with the Freeze Quantifier Equivalent to Büchi Register Automata
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI

Pubricized:
2024/07/26
LETTER
- Summary
- Free PDF (137.6KB)
Degraded image classification using knowledge distillation and robust data augmentations
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO

Pubricized:
2024/07/26
PAPER
- Summary
- Free PDF (4.4MB)
Escape from the Room
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO

Pubricized:
2024/07/11
PAPER
- Summary
- Free PDF (1.1MB)
Online combinatorial linear optimization via a Frank-Wolfe-based metarounding algorithm
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO

Pubricized:
2024/07/11
PAPER
- Summary
- Free PDF (1.1MB)
A Flip-count-based Dynamic Temperature Control Method for Constrained Combinatorial Optimization by Parallel Annealing Algorithms
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA

Pubricized:
2024/07/11
PAPER
- Summary
- Free PDF (1.7MB)
Performance evaluation of CAIN model frame interpolation using training data limited by fixed camera scene detection
Hikaru USAMI Yusuke KAMEDA

Pubricized:
2024/07/11
LETTER
- Summary
- Free PDF (708.9KB)
Towards Superior Pruning Performance in Federated Learning with Discriminative Data
Yinan YANG

Pubricized:
2024/06/27
- Summary
- Free PDF (7.9MB)
Design and implementation of opto-electrical hybrid floating-point multipliers
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI

Pubricized:
2024/06/26
- Summary
- Free PDF (2.5MB)
HDR-VDA: A Full Stage Data Augmentation Method for HDR Video Reconstruction
Fengshan ZHAO Qin LIU Takeshi IKENAGA

Pubricized:
2024/06/17
- Summary
- Free PDF (1.2MB)
Space-efficient FPT Algorithms for Degeneracy
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI

Pubricized:
2024/05/31
- Summary
- Free PDF (101.3KB)
The Least Core of Routing Game Without Triangle Inequality
Tomohiro KOBAYASHI Tomomi MATSUI

Pubricized:
2024/05/30
- Summary
- Free PDF (232.9KB)
Enumerating floorplans with Aligned Columns
Shin-ichi NAKANO

Pubricized:
2024/05/30
- Summary
- Free PDF (365.4KB)
An IP Core Protection Scheme Based on Hybrid Lightweight Encryption for Neuromorphic Computing System
Ming PAN

The aritcle processing charge of this paper has not been paid.

Pubricized:
2022/09/14
- Summary

Whole issue

Volume E80-D No.4 (Publication Date:1997/04/25)

Special Issue on Parallel and Distributed Supercomputing

FOREWORD
Masaaki SHIMASAKI

FOREWORD

Page(s):
401-402
- HTML
- PDF(91.9KB) >> Buy this Article
The Effect of Optimizing Compilers on Architecture and Programs
Michael WOLFE

INVITED PAPER

Page(s):
403-408
The first optimizing compiler was developed at IBM in order to prove that high level language programming could be as efficient as hand-coded machine language. Computer architecture and compiler optimization interacted through a feedback loop, from the high-level language computer architectures of the 1970s to the RISC machines of the 1980s. In the supercomputing community, the availability of effective vectorizing compilers delivered easy-to-use performance in the 1980s to the present. These compilers were successful at least in part because they could predict poor performance spots in the program and report these to users. This fostered a feedback loop between programmers and compilers to develop high performance programs. Future optimizing compilers for high performance computers and supercomputers will have to take advantage of both feedback loops.
Vienna Fortran and the Path Towards a Standard Parallel Language
Barbara M. CHAPMAN Piyush MEHROTRA Hans P. ZIMA

INVITED PAPER

Page(s):
409-416
Highly parallel scalable multiprocessing systems (HMPs) are powerful tools for solving large-scale scientific and engineering problems. However, these machines are difficult to program since algorithms must exploit locality in order to achieve high performance. Vienna Fortran was the first fully specified data-parallel language for HMPs that provided features for the specification of data distribution and alignment at a high level of abstraction. In this paper we outline the major elements of Vienna Fortran and compare it to High Performance Fortran (HPF), a de-facto standard in this area. A significant weakness of HPF is its lack of support for many advanced applications, which require irregular data distributions and dynamic load balancing. We introduce HPF +, an extension of HPF based on Vienna Fortran, that provides the required functionality.
Nonuniform Output Traffic Distributions in the Multipath Crossbar Network
Byungho KIM Boseob KWON Hyunsoo YOON Jung Wan CHO

PAPER

Page(s):
417-424
Multipath interconnection networks can support higher bandwidth than those of nonblocking networks by passing multiple packets to the same output simultaneously and these packets are buffered in the output buffer. The delay-throughput performance of the output buffer in multipath networks is closely related to output traffic distribution, packet arrival process at each output link connected to a given output buffer. The output traffic distributions are different according to the various input traffic patterns. Focusing on nonuniform output traffic distributions, this paper develops a new, general analytic model of the output buffer in multipath networks, which enables us to investigate the delay-throughput performance of the output buffer under various input traffic patterns. This paper also introduces Multipath Crossbar network as a representative multipath network which is the base architecture of our analysis. It is shown that the output buffer performances such as packet loss probability and delay improve as nonuniformity of the output traffic distribution becomes larger.
Node-to-Set Disjoint Paths with Optimal Length in Star Graphs
Qian-Ping GU Shietung PENG

PAPER

Page(s):
425-433
In this paper, we consider the following node-to-set disjoint paths problem: given a node s and a set T = {t₁,...,t_k} of k nodes in a k-connected graph G, find k node-disjoint paths s t_i, 1 i k. We give an O(n²) time algorithm for the node-to-set disjoint paths problem in n-dimensional star graphs G_n which are (n - 1)-connected. The algorithm finds the n - 1 node-disjoint paths of length at most d(G_n) + 1 for n 4,6 and at most d(G_n) + 2 for n = 4,6, where d(G_n) = 3(n-1)/2 is the diameter of G_n. d(G_n) + 1 and d(G_n) + 2 are also the lower bounds on the length of the paths for the above problem in G_n for n 4,6 and n = 4,6, respectively.
Extending SCI on Hierarchical Directory Trees for Large-Scale Multiprocessors
Ing-Zong LU Tien-Fu CHEN

PAPER

Page(s):
434-440
SCI (Scalable Coherent Interface) is pointerbased coherent directory scheme for massively parallel multiprocessors. Large message latency is one of the problems with SCI because of its linked list structure: the searching latency of messages could grow as a linear order of the number of processors. In this paper, we focus on a hierarchical architecture to propose a new schemeEST(Extending SCI-Tree), which may reduce the message traffic and also take the advantages of the topology property. Simulation results show that the EST scheme is effective in reducing message latency and communication cost when compared with other schemes.
Intelligent Memory: An Architecture for Lock-Free Synchronization
Nakun SEONG Naihoon JUNG Byungho KIM Hyunsoo YOON

PAPER

Page(s):
441-447
This paper presents intelligent memory, a new memory architecture capable of providing efficient lock-free synchronization. In the intelligent memory, a sequence of operations on a shared object associated with that memory module can be processed without any intervention so that an environment for the synchronization can be provided by executing a critical section itself in that memory module. For this, we present a memory architecture for the intelligent memory having minimal instruction set and develop a progtramming model, called Critical Section Procedure (CSP), which consists of shared data structures and operations on them. Intelligent memory is intended to eliminate waste of processing time such as busy waiting in spin lock and the retry due to process contentions in existing lock-free synchronization schemes. Simulation results show that the intelligent memory provides better throughput compared with the spin lock and the existing lock-free synchronization schemes.
A High-Performance Cluster Computing Environment Based on Hybrid Shared Memory/Message Passing Model
Yoshimasa OHNISHI Yoshinari SUGIMOTO Toshinori SUEYOSHI

PAPER

Page(s):
448-454
We conducted research and development of Distributed Supercomputing Environment (DSE) based on distributed shared memory model to serve as a cluster computing environment to provide parallel processing facilities. Shared memory model and message passing model are well-known typical models of parallel processing. It is desired that hybrid programming environment will make the best use of the prominent features of both models. Consequently, we add a new message passing mechanism to present DSE, and create a prototype called Hybrid DSE as a hybrid model based cluster computing environment. In this paper, we describe the implementation of a message passing mechanism on DSE and performance evaluation of Hybrid DSE.
Design of Array Processors for 2-D Discrete Fourier Transform
Shietung PENG Igor SEDUKHIN Stanislav SEDUKHIN

PAPER

Page(s):
455-465
In this paper the design of systolic array processors for computing 2-dimensional Discrete Fourier Transform (2-D DFT) is considered. We investigated three different computational schemes for designing systolic array processors using systematic approach. The systematic approach guarantees to find optimal systolic array processors from a large solution space in terms of the number of processing elements and I/O channels, the processing time, topology, pipeline period, etc. The optimal systolic array processors are scalable, modular and suitable for VLSI implementation. An application of the designed systolic array processors to the prime-factor DFT is also presented.
Parallel File Access for Implementing Dynamic Load Balancing on a Massively Parallel Computer
Masahisa SHIMIZU Yasuhiro OUE Kazumasa OHNISHI Toru KITAMURA

PAPER

Page(s):
466-472
Because a massively parallel computer processes vast amounts of data and generates many access requests from multiple processors simultaneously, parallel secondary storage requires large capacity and high concurrency. One effective method of implementation of such secondary storage is to use disk arrays which have multiple disks connected in parallel. In this paper, we propose a parallel file access method named DECODE (dynamic express changing of data entry) in which load balancing of each disk is achieved by dynamic determination of the write data position. For resolution of the problem of data fragmentation which is caused by the relocation of data during a write process, the concept of "Equivalent Area" is introduced. We have performed a preliminary performance evaluation using software simulation under various access statuses by changing the access pattern, access size and stripe size and confirmed the effectiveness of load balancing with this method.
Data-Localization Scheduling inside Processor-Cluster for Multigrain Parallel Processing
Akimasa YOSHIDA Ken'ichi KOSHIZUKA Wataru OGATA Hironori KASAHARA

PAPER

Page(s):
473-479
This paper proposes a data-localization scheduling scheme inside a processor-cluster for multigrain parallel processing, which hierarchically exploits parallelism among coarsegrain tasks like loops, medium-grain tasks like loop iterations and near-fine-grain tasks like statements. The proposed scheme assigns near-fine-grain or medium-grain tasks inside coarse-grain tasks onto processors inside a processor-cluster so that maximum parallelism can be exploited and inter-processor data transfer can be minimum after data-localization for coarse-grain tasks across processor-clusters. Performance evaluation on a multiprocessor system OSCAR shows that multigrain parallel processing with the proposed data-localization scheduling can reduce execution time for application programs by 10% compared with multigrain parallel processing without data-localization.
Non-Graph Based Approach on the Analysis of Pointers and Structures
Dong-Soo HAN Takao TSUDA

PAPER

Page(s):
480-488
In high performance compilers to process pointer-handling programs, precise pointer alias analysis is useful for the compilers to generate efficient object code. It is well known that most compiler techniques such as data flow analysis, dependence analysis, side effect analysis and optimizations are related to the alias problem. However, without data structure information, there is a limit on the precision of the alias analysis. Even though the automatic data structure detection problem is complex, when pointer manipulation satisfies some restrictions, some data structures can be detected automatically by compilers with some knowledge of aliases. In this paper, we propose an automatic data structure detection method for Pascal and Fortran 90. Linear list, tree and dag data structures are detected. Detected data structure information can be used not only for raising the precision of alias analysis but also for some optimizing techniques for pointer handling programs directly.
A Lookahead Heuristic for Heterogeneous Multiprocessor Scheduling with Communication Costs
Dingchao LI Akira MIZUNO Yuji IWAHORI Naohiro ISHII

PAPER

Page(s):
489-494
This paper describes a new approach to the scheduling problem that assigns tasks of a parallel program described as a task graph onto parallel machines. The approach handles interprocessor communication and heterogeneity, based on using both the theoretical results developed so far and a lookahead scheduling strategy. The experimental results on randomly generated task graphs demonstrate the effectiveness of this scheduling heuristic.
Reproducing the Behavior of a Parallel Program by Using Dataflow Execution Models
Naohisa TAKAHASHI Takeshi MIEI

PAPER

Page(s):
495-503
We present a general framework with which we can evaluate the flexibility and efficiency of various replay systems for parallel programs. In our approach, program monitoring is modeled by making a virtual dataflow program graph, referred to as a VDG, that includes all the instructions executed by the program. The behavior of the program replay is modeled on the parallel interpretation of a VDG based on two basic parallel execution models for dataflow program graphs: a data-driven model and a demand-driven model. Previous attempts to replay parallel programs, known as Instant Replay and P-Sequence, are also modeled as variations of the data-driven replay, i.e. the datadriven interpretation of a VDG. We show that the demand-driven replay, i.e. the demand-driven interpretation of a VDG, is more flexible in program replay than the data-driven replay since it allows better control of parallelism and a more selective replay. We also show that we can implement a demand-driven replay that requires almost the same amount of data to be saved during program monitoring as does the data-driven replay, and which eliminates any centralized bottleneck during program monitoring by optimizing the demand propagation and using an effective data structure.
Implementation of a Parallel Prolog System on a Distributed Memory Parallel Computer
Hideo MATSUDA Toru KAWABATA Yukio KANEDA

PAPER

Page(s):
504-509
In this paper we propose a new method for parallel execution of Prolog programs and present its implementation on a distributed memory parallel computer, Fujitsu AP1000. In our method a number of processes (named Prolog engines) explore different branches of a search tree (named tasks) in parallel, which is the same as OR-parallelism. Unlike OR-parallelism, the mapping between Prolog engines and tasks is statically determined like data parallelism. Each Prolog engine can decide which task is executed by the engine without communicating with the other engines. In many search problems, however, such static task mapping may cause imbalance on the processing time of each engine since the computational costs to explore branches may vary substantially. To cope with this issue, we devise a method to adjust the task imbalance by periodical exchanging how many tasks were processed for each engine. Also for reducing communication overhead in load balancing, we limit the scope of engines that exchange the load information each other. The effectiveness of our method is evaluated by measuring execution times for N Queens and the Traveling Salesman Problem on the AP1000. Using 512 processors, we obtained 355-fold speedup for N Queens and 343-fold speedup on the Traveling Salesman Problem.
An Efficient Implementation of Term Rewriting System on a Distributed Memory Architecture
Yoshinari HACHISU Shinichirou YAMAMOTO Takeshi HAMAGUCHI Kiyoshi AGUSA

PAPER

Page(s):
510-517
Term Rewriting System (TRS) is a model of computation and it is used in various application such as algebraic specification. TRS has an inherent concurrency and it is suitable for parallel computing. We have already proposed BOB (Bundle Of Branches), which is a mechanism of data management for parallel rewriting. We have proposed a model of parallel rewriting using BOB and implemented a TRS simulator based on this model on a shared memory parallel computer. Because it fully depends on the feature of a shared memory architecture, that is, a process can access any memory element, it is hard to transport it on a distributed memory parallel computer. In this paper, we propose autonomous BOB model. This model is suitable for a distributed memory architecture since a process uses message passing protocol and the method of load balancing is provided. We implement a TRS simulator using this model on a distributed memory architecture and it runs about 30 times faster on 64 processors than on a single processor.
Implementation of the Multicolored SOR Method on a Vector Supercomputer
Seiji FUJINO Ryutaro HIMENO Akira KOJIMA Kazuo TERADA

PAPER

Page(s):
518-523
We describe the implementation of an iterative method with the goal of gaining a long vector length. The strategy for vectorization by means of multipoint stencils used for discretization of the partial differential equations is discussed. Numerical experiments show that the strategy that requires certain restrictions on the number of grid points in the x and y directions improves the performance on the vector supercomputer.
High-Performance Parallel Computation of Flows Past a Space Plane Using NWT
Kisa MATSUSHIMA Susumu TAKANASHI

PAPER

Page(s):
524-530
Compressible viscous flows past a space plane have been elucidated by parallel computation on the NWT. The NWT is a vector-parallel architecture computer system which achieves remarkably high performance in processing speed and memory storage. We have examined the advantages of the NWT in order to simulate realistic flow problems in engineering, such as the investigation of global and local aerodynamic characteristics of a space plane. The accuracy of the computational results has been verified by comparison with experimental data. The simplified domain-decomposition technique introduced here is easy to apply for parallel implementation to significantly improve the acceleration rate of computations. The larger available memory storage enables us to conduct a grid refinement study through which several points concerning CFD simulation of a space plane are obtained.
Parallelized Simulation of Complicated Polymer Structures and lts Efficiency
Kazuhito SHIDA Kaoru OHNO Masayuki KIMURA Yoshiyuki KAWAZOE

PAPER

Page(s):
531-537
A large scale simulation for polymer chains in good solvent is performed. The implementation technique for efficient parallel execution, optimization, and load-balancing are discussed on this practical application. Finally, a simple performance model is proposed.

IEICE TRANSACTIONS on Information

Advance publication (published online immediately after acceptance)

Volume E80-D No.4 (Publication Date:1997/04/25)

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles