The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] OMP(3945hit)

2281-2300hit(3945hit)

  • Essential Cycle Calculation Method for Irregular Array Redistribution

    Sheng-Wen BAI  Chu-Sing YANG  

     
    PAPER-Computation and Computational Models

      Vol:
    E89-D No:2
      Page(s):
    789-797

    In many parallel programs, run-time array redistribution is usually required to enhance data locality and reduce remote memory access on the distributed memory multicomputers. In general, array distribution can be classified into regular distribution and irregular distribution according to the distribution fashion. Many methods for performing regular array redistribution have been presented in the literature. However, for the heterogeneous computation environment, irregular array redistributions can be used to adjust data assignment at run-time. In this paper, an Essential Cycle Calculation method for unequal block sizes array redistribution is presented. In the ECC method, a processor first computes the source/destination processor/data sets of array elements in the first essential cycle of the local array it owns. From the source/destination processor/data sets of array elements in the first essential cycle, we can construct packing/unpacking pattern tables. Since each essential cycle has the same communication pattern, based on the packing/unpacking pattern tables, a processor can pack/unpack array elements efficiently. To evaluate the performance of the ECC method, we have implemented this method on an IBM SP2 parallel machine and compare it with the Sequence method. The cost models for these methods are also presented. The experimental results show that the ECC method greatly outperforms the Sequence method for all test samples.

  • An Algorithm for Node-to-Set Disjoint Paths Problem in Bi-Rotator Graphs

    Keiichi KANEKO  

     
    PAPER-Parallel/Distributed Algorithms

      Vol:
    E89-D No:2
      Page(s):
    647-653

    An algorithm is described for solving the node-to-set disjoint paths problem in bi-rotator graphs, which are obtained by making each edge of a rotator graph bi-directional. The algorithm is of polynomial order of n for an n-bi-rotator graph. It is based on recursion and divided into three cases according to the distribution of destination nodes in the classes into which the nodes in a bi-rotator graph are categorized. We estimated that it obtains 2n-3 disjoint paths with a time complexity of O(n5), that the sum of the path lengths is O(n3), and that the length of the maximum path is O(n2). Computer experiment showed that the average execution time was O(n3.9) and, the average sum of the path lengths was O(n3.0).

  • A Coarse-Grain Hierarchical Technique for 2-Dimensional FFT on Configurable Parallel Computers

    Xizhen XU  Sotirios G. ZIAVRAS  

     
    PAPER-Parallel/Distributed Algorithms

      Vol:
    E89-D No:2
      Page(s):
    639-646

    FPGAs (Field-Programmable Gate Arrays) have been widely used as coprocessors to boost the performance of data-intensive applications [1],[2]. However, there are several challenges to further boost FPGA performance: the communication overhead between the host workstation and the FPGAs can be substantial; large-scale applications cannot fit in a single FPGA because of its limited capacity; mapping an application algorithm to FPGAs still remains a daunting job in configurable system design. To circumvent these problems, we propose in this paper the FPGA-based Hierarchical-SIMD (H-SIMD) machine with its codesign of the Pyramidal Instruction Set Architecture (PISA). PISA comprises high-level instructions implemented as FPGA functions of coarse-grain SIMD (Single-Instruction, Multiple-Data) tasks to facilitate ease of program development, code portability across different H-SIMD implementations and high performance. We assume a multi-FPGA board where each FPGA is configured as a separate SIMD machine. Multiple FPGA chips can work in unison at a higher SIMD level, if needed, controlled by the host. Additionally, by using a memory switching scheme and the high-level PISA to partition applications into coarse-grain tasks, host-FPGA communication overheads can be hidden. We enlist the two-dimensional Fast Fourier Transform (2D FFT) to test the effectiveness of H-SIMD. The test results show sustained high performance for this problem. The H-SIMD machine even outperforms a Xeon processor for this problem.

  • An Approximation Algorithm for Minimum Certificate Dispersal Problems

    Hua ZHENG  Shingo OMURA  Koichi WADA  

     
    PAPER-Graphs and Networks

      Vol:
    E89-A No:2
      Page(s):
    551-558

    We consider a network, where a special data called certificate is issued between two users, and all certificates issued by the users in the network can be represented by a directed graph. For any two users u and v, when u needs to send a message to v securely, v's public-key is needed. The user u can obtain v's public-key using the certificates stored in u and v. We need to disperse the certificates to the users such that when a user wants to send a message to the other user securely, there are enough certificates in them to get the reliable public-key. In this paper, when a certificate graph and a set of communication requests are given, we consider the problem to disperse the certificates among the nodes in the network, such that the communication requests are satisfied and the total number of certificates stored in the nodes is minimized. We formulate this problem as MINIMUM CERTIFICATE DISPERSAL (MCD for short). We show that MCD is NP-Complete, even if its input graph is restricted to a strongly connected graph. We also present a polynomial-time 2-approximation algorithm MinPivot for strongly connected graphs, when the communication requests satisfy some restrictions. We introduce some graph classes for which MinPivot can compute optimal dispersals, such as trees, rings, and some Cartesian products of graphs.

  • Development and Implementation of an Interactive Parallelization Assistance Tool for OpenMP: iPat/OMP

    Makoto ISHIHARA  Hiroki HONDA  Mitsuhisa SATO  

     
    PAPER-Parallel/Distributed Programming Models, Paradigms and Tools

      Vol:
    E89-D No:2
      Page(s):
    399-407

    iPat/OMP is an interactive parallelization assistance tool for OpenMP. In the present paper, we describe the design concept of iPat/OMP, the parallelization sequence achieved by the tool and its current implementation status. In addition, we present an evaluation of the performance of the implemented functionalities. The experimental results show that iPat/OMP can detect parallelism and create an appropriate OpenMP directive for several for-loops.

  • Mapping of Hierarchical Parallel Genetic Algorithms for Protein Folding onto Computational Grids

    Weiguo LIU  Bertil SCHMIDT  

     
    PAPER-Grid Computing

      Vol:
    E89-D No:2
      Page(s):
    589-596

    Genetic algorithms are a general problem-solving technique that has been widely used in computational biology. In this paper, we present a framework to map hierarchical parallel genetic algorithms for protein folding problems onto computational grids. By using this framework, the two level communication parts of hierarchical parallel genetic algorithms are separated. Thus both parts of the algorithm can evolve independently. This permits users to experiment with alternative communication models on different levels conveniently. The underlying programming techniques are based on generic programming, a programming technique suited for the generic representation of abstract concepts. This allows the framework to be built in a generic way at application level and thus provides good extensibility and flexibility. Experiments show that it can lead to significant runtime savings on PC clusters and computational grids.

  • High-Speed MT Connector Assembly Method

    Koji SHIBATA  Masaaki TAKAYA  Kazuo HOGARI  Izumi SANKAWA  Tadashi HAIBARA  

     
    PAPER-Optical Fiber for Communications

      Vol:
    E89-B No:2
      Page(s):
    413-418

    This paper describes a high-speed MT connector assembly method. This technique uses adhesive with a short hardening time, is highly reliable and does not require a polishing process, thus reducing the connector assembly time. First, we investigated an alpha-cyanoacrylate adhesive that hardens quickly and whose adhesive strength does not decrease under high humidity and high temperature conditions, thus ensuring its excellent reliability for outside use. In addition, we investigated variations in the position of the fiber endface on the ferrule endface with a view to obtaining a low insertion loss. Based on the results, we assembled an MT connector using our proposed high-speed assembly method. We confirmed that the assembly time could be reduced to less than 70% of the time required with the conventional method. MT connectors assembled using this technique have a low insertion loss and stable environmental characteristics.

  • Analytic Performance Evaluation of OTIS-Hypercubes

    Hashem Hashemi NAJAF-ABADI  Hamid SARBAZI-AZAD  

     
    PAPER-Performance Evaluation

      Vol:
    E89-D No:2
      Page(s):
    441-451

    In this paper, routing properties of cube-based optoelectronic OTIS networks are explored. We show emulations of various cubical network topologies on their OTIS augmented variants, including the n-D grid networks, shuffle-exchange, and de Brujin networks. An analytical performance model for OTIS-cube networks is proposed. The model is validated by means of comparison with rigorously obtained simulation results. Using this model, the performance characteristics of the OTIS-hypercube network are evaluated in view of a number of different constraints. Moreover, we compare the performance characteristics of the OTIS-hypercube with that of equivalent fully-electronic networks under various implementation constraints.

  • Distributed Hierarchical Multicast Tree Algorithms for Application Layer Mesh Networks

    Weijia JIA  Wanqing TU  Jie WU  

     
    PAPER-Parallel/Distributed Algorithms

      Vol:
    E89-D No:2
      Page(s):
    654-662

    This paper proposes a set of novel distributed algorithms on m-D mesh overlay configurations for short delay and low resource consumption application layer multicast. In contrast to previous approaches, our application layer multicast adopts two-layer tree architecture and the novelty and contribution are: (1) cluster formation algorithm assigns the closest group members into the same cluster that greatly decreases the multicast delay and resource consumption caused by the message transmission among the members with long distances; (2) optimal core selection algorithm seeks the cluster member who has the minimum sum of static delay distances to other cluster members as the optimal cores (i.e. cluster cores) that guarantees the short multicast delay; (3) weighted shortest path tree generation algorithm constructs a shortest path tree rooted at the optimal core for each cluster. The shortest path tree utilizes the minimum sum of links that are on the shortest paths among the cluster members; and (4) distributed multicast routing algorithm directs the multicast messages to be efficiently distributed along the two-layer multicast architecture in parallel without a global control. The extended simulation results indicate that the application layer multicast constructed by our algorithms is efficient in terms of short multicast delay and low network resource consumption as compared with other well-known existing multicast solutions.

  • Concurrent Error Detection in Montgomery Multiplication over GF(2m)

    Che-Wun CHIOU  Chiou-Yng LEE  An-Wen DENG  Jim-Min LIN  

     
    PAPER-Information Security

      Vol:
    E89-A No:2
      Page(s):
    566-574

    Because fault-based attacks on cryptosystems have been proven effective, fault diagnosis and tolerance in cryptography have started a new surge of research and development activity in the field of applied cryptography. Without magnitude comparisons, the Montgomery multiplication algorithm is very attractive and popular for Elliptic Curve Cryptosystems. This paper will design a Montgomery multiplier array with a bit-parallel architecture in GF(2m) with concurrent error detection capability to protect it against fault-based attacks. The robust Montgomery multiplier array with concurrent error detection requires only about 0.2% extra space overhead (if m=512 is as an example) and requires four extra clock cycles compared to the original Montgomery multiplier array without concurrent error detection.

  • Transient Analysis of Complex-Domain Adaptive Threshold Nonlinear Algorithm (c-ATNA) for Adaptive Filters in Applications to Digital QAM Systems

    Shin'ichi KOIKE  

     
    PAPER-Digital Signal Processing

      Vol:
    E89-A No:2
      Page(s):
    469-478

    The paper presents an adaptive algorithm named adaptive threshold nonlinear algorithm for use in adaptive filters in the complex-number domain (c-ATNA) in applications to digital QAM systems. Although the c-ATNA is very simple to implement, it makes adaptive filters highly robust against impulse noise and at the same time it ensures filter convergence as fast as that of the well-known LMS algorithm. Analysis is developed to derive a set of difference equations for calculating transient behavior as well as steady-state performance. Experiment with simulations and theoretical calculations for some examples of filter convergence in the presence of Contaminated Gaussian Noise demonstrates that the c-ATNA is effective in combating impulse noise. Good agreement between simulated and theoretical convergence proves the validity of the analysis.

  • Audio Narrowcasting and Privacy for Multipresent Avatars on Workstations and Mobile Phones

    Owen Noel Newton FERNANDO  Kazuya ADACHI  Uresh DUMINDUWARDENA  Makoto KAWAGUCHI  Michael COHEN  

     
    PAPER

      Vol:
    E89-D No:1
      Page(s):
    73-87

    Our group is exploring interactive multi- and hypermedia, especially applied to virtual and mixed reality multimodal groupware systems. We are researching user interfaces to control source→sink transmissions in synchronous groupware (like teleconferences, chatspaces, virtual concerts, etc.). We have developed two interfaces for privacy visualization of narrowcasting (selection) functions in collaborative virtual environments (CVES): for a workstation WIMP (windows/icon/menu/pointer) GUI (graphical user interface), and for networked mobile devices, 2.5- and 3rd-generation mobile phones. The interfaces are integrated with other CVE clients, interoperating with a heterogeneous multimodal groupware suite, including stereographic panoramic browsers and spatial audio backends & speaker arrays. The narrowcasting operations comprise an idiom for selective attention, presence, and privacy-- an infrastructure for rich conferencing capability.

  • Clustering-Based Probabilistic Model Fitting in Estimation of Distribution Algorithms

    Chang Wook AHN  Rudrapatna S. RAMAKRISHNA  

     
    LETTER-Biocybernetics, Neurocomputing

      Vol:
    E89-D No:1
      Page(s):
    381-383

    An efficient clustering strategy for estimation of distribution algorithms (EDAs) is presented. It is used for properly fitting probabilistic models that play an important role in guiding search direction. To this end, a fitness-aided ordering scheme is devised for deciding the input sequence of samples (i.e., individuals) for clustering. It can effectively categorise the individuals by using the (available) information about fitness landscape. Moreover, a virtual leader is introduced for providing a reliable reference for measuring the distance from samples to its own cluster. The proposed algorithm incorporates them within the framework of random the leader algorithm (RLA). Experimental results demonstrate that the proposed approach is more effective than the existing ones with regard to probabilistic model fitting.

  • Non-Audible Murmur (NAM) Recognition

    Yoshitaka NAKAJIMA  Hideki KASHIOKA  Nick CAMPBELL  Kiyohiro SHIKANO  

     
    PAPER

      Vol:
    E89-D No:1
      Page(s):
    1-8

    We propose a new practical input interface for the recognition of Non-Audible Murmur (NAM), which is defined as articulated respiratory sound without vocal-fold vibration transmitted through the soft tissues of the head. We developed a microphone attachment, which adheres to the skin, by applying the principle of a medical stethoscope, found the ideal position for sampling flesh-conducted NAM sound vibration and retrained an acoustic model with NAM samples. Then using the Julius Japanese Dictation Toolkit, we tested the feasibility of using this method in place of an external microphone for analyzing air-conducted voice sound.

  • uT-RBAC: Ubiquitous Role-Based Access Control Model

    Song-hwa CHAE  Wonil KIM  Dong-Kyoo KIM  

     
    LETTER-Access Control

      Vol:
    E89-A No:1
      Page(s):
    238-239

    In ubiquitous environment that users access resource anytime and anywhere, access control model should consider user's location information. The proposed uT-RBAC includes the location information for user's least privilege. It also supports time related information, which enables the access control model to accommodate various ubiquitous environments. The proposed uT-RBAC can be dynamically applied to various ubiquitous computing envrionment.

  • A Universally Composable Secure Channel Based on the KEM-DEM Framework

    Waka NAGAO  Yoshifumi MANABE  Tatsuaki OKAMOTO  

     
    PAPER-Public Key Cryptography

      Vol:
    E89-A No:1
      Page(s):
    28-38

    As part of ISO standards on public-key encryption, Shoup introduced the framework of KEM (Key Encapsulation Mechanism), and DEM (Data Encapsulation Mechanism), for formalizing and realizing one-directional hybrid encryption; KEM is a formalization of asymmetric encryption specified for key distribution, which DEM is a formalization of symmetric encryption. This paper investigates a more general hybrid protocol, secure channel, that uses KEM and DEM, while KEM supports distribution of a session key and DEM, along with the session key, is used for multiple bi-directional encrypted transactions in a session. This paper shows that KEM, which is semantically secure against adaptively chosen ciphertext attacks (IND-CCA2), and DEM, which is semantically secure against adaptively chosen plaintext/ciphertext attacks (IND-P2-C2), along with secure signatures and ideal certification authority are sufficient to realize a universally composable (UC) secure channel. To obtain the main result, this paper also shows several equivalence results: UC KEM, IND-CCA2 KEM and NM-CCA2 (non-malleable against CCA2) KEM are equivalent, and UC DEM, IND-P2-C2 DEM and NM-P2-C2 DEM are equivalent.

  • Low Encoding Complexity Video Compression Based on Low-Density Parity Check Codes

    Haruhiko KANEKO  

     
    LETTER-Information Theory

      Vol:
    E89-A No:1
      Page(s):
    340-347

    Conventional video compression methods generally require a large amount of computation in the encoding process because they perform motion estimations. In order to reduce the encoding complexity for video compression, this paper proposes a new video compression method based on low-density parity check codes. The proposed method is suitable for resource-constrained devices such as mobile phones and satellite cameras.

  • Efficient Algorithms for Tate Pairing

    Tetsutaro KOBAYASHI  Kazumaro AOKI  Hideki IMAI  

     
    PAPER-Elliptic Curve Cryptography

      Vol:
    E89-A No:1
      Page(s):
    134-143

    This paper presents new algorithms for the Tate pairing on a prime field. Recently, many pairing-based cryptographic schemes have been proposed. However, computing pairings incurs a high computational cost and represents the bottleneck to using pairings in actual protocols. This paper shows that the proposed algorithms reduce the cost of multiplication and inversion on an extension field, and reduce the number of calculations of the extended finite field. This paper also discusses the optimal algorithm to be used for each pairing parameter and shows that the total computational cost is reduced by 50% if k = 6 and 57% if k = 8.

  • Wearable Telepresence System Based on Multimodal Communication for Effective Teleoperation with a Humanoid

    Yong-Ho SEO  Hun-Young PARK  Taewoo HAN  Hyun Seung YANG  

     
    PAPER

      Vol:
    E89-D No:1
      Page(s):
    11-19

    This paper presents a new type of wearable teleoperation system that can be applied to the control of a humanoid robot. The proposed system has self-contained computing hardware with a stereo head-mounted display, a microphone, a set of headphones, and a wireless LAN. It also has a mechanism that tracks arm and head motion by using several types of sensors that detect the motion data of an operator, along with a simple force reflection mechanism that uses vibration motors at appropriate joints. For remote tasks, we use intelligent self-sensory feedback and autonomous behavior, such as automatic grasping and obstacle avoidance in a slave robot, and we feed the information back to an operator through a multimodal communication channel. Through this teleoperation system, we successfully demonstrate several teleoperative tasks, including object manipulation and mobile platform control of a humanoid robot.

  • A Hill-Shift Learning Algorithm of Hopfield Network for Bipartite Subgraph Problem

    Rong-Long WANG  Kozo OKAZAKI  

     
    LETTER-Neural Networks and Bioengineering

      Vol:
    E89-A No:1
      Page(s):
    354-358

    In this paper, we present a hill-shift learning method of the Hopfield neural network for bipartite subgraph problem. The method uses the Hopfield neural network to get a near-maximum bipartite subgraph, and shifts the local minimum of energy function by adjusts the balance between two terms in the energy function to help the network escape from the state of the near-maximum bipartite subgraph to the state of the maximum bipartite subgraph or better one. A large number of instances are simulated to verify the proposed method with the simulation results showing that the solution quality is superior to that of best existing parallel algorithm.

2281-2300hit(3945hit)