
Keyword Search Result

[Keyword] OMP (3945 hits)

3421-3440 hits (3945 hits)

  • Syntactic Unification Problems under Constrained Substitutions

    Kazuhiro TAKADA  Yuichi KAJI  Tadao KASAMI  

     
    PAPER-Automata, Languages and Theory of Computing
    Vol: E80-D No:5  Page(s): 553-561

    Some kinds of practical problems, such as security verification of cryptographic protocols, can be described as the problem of accomplishing a given purpose using only limited operations and limited materials. To model such problems in a natural way, unification problems under constrained substitutions have been proposed. This paper is a collection of results on the decidability and the computational complexity of syntactic unification problems under constrained substitutions. A number of decidability, undecidability, tractability and intractability results for the problem are presented. Since a unification problem under constrained substitutions can be regarded as an order-sorted unification problem with term declarations in which the number of sorts is only one, the results presented in this paper also indicate how the intractability of order-sorted unification problems is reduced by restricting the number of sorts to one.

  • Wide-Angle Coupling to Multi-Mode Interference Devices: A Novel Concept for Compacting Photonic Integrated Circuits

    Martin BOUDA  Yoshiaki NAKANO  Kunio TADA  

     
    PAPER
    Vol: E80-C No:5  Page(s): 640-645

    Extremely compact multi-mode interference (MMI) devices using central wide-angle coupling of input and output waveguides are proposed. It is shown that MMI devices can be used to change the propagation direction of light without the need for corner mirrors or bent waveguides. The concept can also be used for very compact power splitters, which are even smaller than conventional MMI power splitters. Coupling between waveguides at wide angles is discussed and a number of regularities are found. The results can be useful for designing more compact integrated circuits by reducing the number of bent waveguides, which usually take up the largest part of the area of a photonic integrated circuit.

  • Value Distribution of Linear Complexity for p-Ary Periodic Sequences with Period p^n, p a Prime

    Satoshi UEHARA  Kyoki IMAMURA  Takayasu KAIDA  

     
    LETTER-Information Theory and Coding Theory
    Vol: E80-A No:5  Page(s): 920-921

    First, we show a useful property of the fast algorithm for computing linear complexities of p-ary periodic sequences with period p^n (p: a prime). Second, this property is successfully applied to obtain the value distribution of the linear complexity for p-ary periodic sequences with period p^n.
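    As a minimal illustration of this kind of fast algorithm, the sketch below implements the classical Games-Chan procedure for the binary special case (p = 2, period 2^n); the paper's algorithm and its value-distribution results concern the general p-ary case and are not reproduced here.

        def games_chan(seq):
            """Linear complexity of a binary sequence whose period is a power of two.

            Classical Games-Chan algorithm; illustrative only, the paper treats
            p-ary sequences with period p^n.
            """
            s = list(seq)
            n = len(s)
            assert n > 0 and n & (n - 1) == 0, "period must be a power of two"
            complexity = 0
            while n > 1:
                half = n // 2
                left, right = s[:half], s[half:]
                diff = [a ^ b for a, b in zip(left, right)]
                if any(diff):
                    complexity += half   # halves differ: gain half the length
                    s = diff
                else:
                    s = left             # halves are equal: recurse on one half
                n = half
            return complexity + (1 if s[0] else 0)

        # one period (length 8) of a binary sequence
        print(games_chan([1, 0, 0, 1, 1, 0, 1, 0]))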

  • Surface Defect Inspection of Cold Rolled Strips with Features Based on Adaptive Wavelet Packets

    Chang Su LEE  Chong-Ho CHOI  Young CHOI  Se Ho CHOI  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition
    Vol: E80-D No:5  Page(s): 594-604

    The defects in cold rolled strips have textural characteristics, which are nonuniform due to their irregularities and deformities in geometrical appearance. In order to handle the textural characteristics of images with defects, this paper proposes a surface inspection method based on textural feature extraction using the wavelet transform. The wavelet transform is employed to extract local features from textural images with defects both in the frequency and in the spatial domain. To extract features effectively, an adaptive wavelet packet scheme is developed, in which the optimum number of features is produced automatically through the subband coding gain. The energies of all subbands of the optimal quadtree of the adaptive wavelet packet algorithm and four entropy features in the level-one LL subband, which correspond to the local features in the spatial domain, are extracted. A neural network is used to classify the defects based on these features. Experiments with real image data show good training and generalization performance of the proposed method.
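    As a rough, simplified stand-in for subband-energy feature extraction (not the paper's adaptive 2-D wavelet packet scheme with subband coding gain and entropy features), the sketch below builds a full Haar wavelet-packet tree on a 1-D signal and returns the energy of every subband.

        import numpy as np

        def haar_split(x):
            """One analysis step: Haar low-pass (approximation) and high-pass (detail)."""
            x = np.asarray(x, dtype=float)
            return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

        def packet_energies(signal, levels):
            """Energies of all subbands of a full wavelet-packet tree (1-D Haar)."""
            nodes = [np.asarray(signal, dtype=float)]
            for _ in range(levels):
                nodes = [band for node in nodes for band in haar_split(node)]
            return [float(np.sum(node ** 2)) for node in nodes]

        # toy signal of length 64 decomposed into 2^3 = 8 subbands
        rng = np.random.default_rng(0)
        features = packet_energies(rng.normal(size=64), levels=3)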

  • High-Performance Parallel Computation of Flows Past a Space Plane Using NWT

    Kisa MATSUSHIMA  Susumu TAKANASHI  

     
    PAPER
    Vol: E80-D No:4  Page(s): 524-530

    Compressible viscous flows past a space plane have been elucidated by parallel computation on the NWT. The NWT is a vector-parallel computer system which achieves remarkably high performance in processing speed and memory capacity. We have examined the advantages of the NWT for simulating realistic flow problems in engineering, such as the investigation of global and local aerodynamic characteristics of a space plane. The accuracy of the computational results has been verified by comparison with experimental data. The simplified domain-decomposition technique introduced here is easy to apply in a parallel implementation and significantly improves the acceleration rate of the computations. The larger available memory storage enables us to conduct a grid refinement study, through which several findings concerning CFD simulation of a space plane are obtained.

  • Parallel File Access for Implementing Dynamic Load Balancing on a Massively Parallel Computer

    Masahisa SHIMIZU  Yasuhiro OUE  Kazumasa OHNISHI  Toru KITAMURA  

     
    PAPER
    Vol: E80-D No:4  Page(s): 466-472

    Because a massively parallel computer processes vast amounts of data and generates many access requests from multiple processors simultaneously, parallel secondary storage requires large capacity and high concurrency. One effective way to implement such secondary storage is to use disk arrays, which have multiple disks connected in parallel. In this paper, we propose a parallel file access method named DECODE (dynamic express changing of data entry), in which load balancing across the disks is achieved by dynamically determining the position of written data. To resolve the data fragmentation caused by the relocation of data during a write process, the concept of an "Equivalent Area" is introduced. We have performed a preliminary performance evaluation using software simulation under various access conditions, varying the access pattern, access size and stripe size, and confirmed the effectiveness of load balancing with this method.
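    The following is a minimal sketch of the underlying idea of dynamic write placement for load balancing, not of DECODE itself or its Equivalent Area mechanism: each incoming write request is simply directed to the disk with the smallest pending load.

        import heapq

        class LeastLoadedWriter:
            """Toy dynamic placement: send each write to the least-loaded disk."""

            def __init__(self, n_disks):
                # min-heap of (pending_bytes, disk_id)
                self.loads = [(0, d) for d in range(n_disks)]
                heapq.heapify(self.loads)

            def write(self, nbytes):
                pending, disk = heapq.heappop(self.loads)
                heapq.heappush(self.loads, (pending + nbytes, disk))
                return disk   # disk chosen for this request

        placer = LeastLoadedWriter(n_disks=4)
        placements = [placer.write(n) for n in (64, 16, 16, 32, 64, 8)]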

  • The Effect of Optimizing Compilers on Architecture and Programs

    Michael WOLFE  

     
    INVITED PAPER
    Vol: E80-D No:4  Page(s): 403-408

    The first optimizing compiler was developed at IBM in order to prove that high-level language programming could be as efficient as hand-coded machine language. Computer architecture and compiler optimization have interacted through a feedback loop, from the high-level language computer architectures of the 1970s to the RISC machines of the 1980s. In the supercomputing community, the availability of effective vectorizing compilers has delivered easy-to-use performance from the 1980s to the present. These compilers were successful at least in part because they could predict poor performance spots in a program and report them to users. This fostered a feedback loop between programmers and compilers to develop high-performance programs. Future optimizing compilers for high-performance computers and supercomputers will have to take advantage of both feedback loops.

  • Reproducing the Behavior of a Parallel Program by Using Dataflow Execution Models

    Naohisa TAKAHASHI  Takeshi MIEI  

     
    PAPER
    Vol: E80-D No:4  Page(s): 495-503

    We present a general framework with which we can evaluate the flexibility and efficiency of various replay systems for parallel programs. In our approach, program monitoring is modeled by making a virtual dataflow program graph, referred to as a VDG, that includes all the instructions executed by the program. The behavior of the program replay is modeled on the parallel interpretation of a VDG based on two basic parallel execution models for dataflow program graphs: a data-driven model and a demand-driven model. Previous attempts to replay parallel programs, known as Instant Replay and P-Sequence, are also modeled as variations of the data-driven replay, i.e., the data-driven interpretation of a VDG. We show that the demand-driven replay, i.e., the demand-driven interpretation of a VDG, is more flexible than the data-driven replay since it allows better control of parallelism and a more selective replay. We also show that we can implement a demand-driven replay that requires almost the same amount of data to be saved during program monitoring as the data-driven replay does, and which eliminates any centralized bottleneck during program monitoring by optimizing the demand propagation and using an effective data structure.
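    To make the two interpretation styles concrete, the sketch below evaluates a toy dataflow graph both ways (the node names and operations are assumptions for illustration, not the paper's VDG machinery): the data-driven pass fires every node whose inputs are ready, while the demand-driven pass evaluates only the nodes the requested output actually needs.

        # toy dataflow graph: name -> ("const", value) or (op, [input names])
        GRAPH = {
            "a": ("const", 2),
            "b": ("const", 3),
            "sum": ("add", ["a", "b"]),
            "prod": ("mul", ["a", "b"]),
            "out": ("add", ["sum", "prod"]),
        }

        def apply_op(op, x, y):
            return x + y if op == "add" else x * y

        def data_driven(graph):
            """Fire every node as soon as all of its inputs are available."""
            values = {}
            while len(values) < len(graph):
                for name, (op, args) in graph.items():
                    if name in values:
                        continue
                    if op == "const":
                        values[name] = args
                    elif all(a in values for a in args):
                        values[name] = apply_op(op, values[args[0]], values[args[1]])
            return values                       # every node gets evaluated

        def demand_driven(graph, goal, values=None):
            """Evaluate only the nodes the requested output transitively needs."""
            values = {} if values is None else values
            if goal not in values:
                op, args = graph[goal]
                if op == "const":
                    values[goal] = args
                else:
                    values[goal] = apply_op(op,
                                            demand_driven(graph, args[0], values),
                                            demand_driven(graph, args[1], values))
            return values[goal]

        print(data_driven(GRAPH)["out"])    # 11, after evaluating all five nodes
        print(demand_driven(GRAPH, "sum"))  # 5, touching only "a", "b" and "sum"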

  • A High-Performance Cluster Computing Environment Based on Hybrid Shared Memory/Message Passing Model

    Yoshimasa OHNISHI  Yoshinari SUGIMOTO  Toshinori SUEYOSHI  

     
    PAPER
    Vol: E80-D No:4  Page(s): 448-454

    We conducted research and development of the Distributed Supercomputing Environment (DSE), based on a distributed shared memory model, to serve as a cluster computing environment that provides parallel processing facilities. The shared memory model and the message passing model are two well-known, typical models of parallel processing, and a hybrid programming environment is desirable because it can make the best use of the prominent features of both models. Consequently, we add a new message passing mechanism to the present DSE and create a prototype called Hybrid DSE, a cluster computing environment based on the hybrid model. In this paper, we describe the implementation of the message passing mechanism on DSE and a performance evaluation of Hybrid DSE.

  • Computational Power of Nondeterministic Ordered Binary Decision Diagrams and Their Subclasses

    Kazuyoshi TAKAGI  Koyo NITTA  Hironori BOUNO  Yasuhiko TAKENAGA  Shuzo YAJIMA  

     
    PAPER
    Vol: E80-A No:4  Page(s): 663-669

    Ordered Binary Decision Diagrams (OBDDs) are graph-based representations of Boolean functions which are widely used because of their good properties. In this paper, we introduce nondeterministic OBDDs (NOBDDs) and their restricted forms, and evaluate their expressive power. In some applications of OBDDs, canonicity, which is one of the good properties of OBDDs, is not necessary. In such cases, we can reduce the required amount of storage by using OBDDs in some non-canonical form. A class of NOBDDs can be used as such a non-canonical form of OBDDs. In this paper, we focus on two particular methods which can be regarded as using restricted forms of NOBDDs. Our aim is to show, from a theoretical point of view, how the size of OBDDs can be reduced in such forms. First, we consider a method to solve the satisfiability problem of combinational circuits that uses the structure of the circuits as a key to reducing the NOBDD size. We show that the NOBDD size is related to the cutwidth of the circuits. Second, we analyze methods that use OBDDs to represent Boolean functions as sets of product terms. We show that the class of functions treated feasibly in this representation strictly contains that of OBDDs and is contained by that of NOBDDs.

  • Vienna Fortran and the Path Towards a Standard Parallel Language

    Barbara M. CHAPMAN  Piyush MEHROTRA  Hans P. ZIMA  

     
    INVITED PAPER
    Vol: E80-D No:4  Page(s): 409-416

    Highly parallel scalable multiprocessing systems (HMPs) are powerful tools for solving large-scale scientific and engineering problems. However, these machines are difficult to program since algorithms must exploit locality in order to achieve high performance. Vienna Fortran was the first fully specified data-parallel language for HMPs that provided features for the specification of data distribution and alignment at a high level of abstraction. In this paper we outline the major elements of Vienna Fortran and compare it to High Performance Fortran (HPF), a de facto standard in this area. A significant weakness of HPF is its lack of support for many advanced applications, which require irregular data distributions and dynamic load balancing. We introduce HPF+, an extension of HPF based on Vienna Fortran, that provides the required functionality.

  • Parallelized Simulation of Complicated Polymer Structures and Its Efficiency

    Kazuhito SHIDA  Kaoru OHNO  Masayuki KIMURA  Yoshiyuki KAWAZOE  

     
    PAPER
    Vol: E80-D No:4  Page(s): 531-537

    A large-scale simulation of polymer chains in a good solvent is performed. The implementation techniques for efficient parallel execution, optimization, and load balancing are discussed for this practical application. Finally, a simple performance model is proposed.

  • Multi-Phase Tree Transformations

    Akio FUJIYOSHI  Takumi KASAI  

     
    LETTER-Thought and Language
    Vol: E80-A No:4  Page(s): 761-768

    In this paper, we introduce a computational model of a tree transducer called a bi-stage transducer and study its properties. We consider the mappings on trees realized by the composition of any sequence of top-down transducers and bottom-up transducers, and call such a mapping a multi-phase tree transformation. We consider multi-phase tree transformations to be sufficiently powerful. It is shown that, in the case of rank-preserving transducers, a multi-phase tree transformation can be realized by a bi-stage transducer.

  • Counting the Number of Paths in a Graph via BDDs

    Kyoko SEKINE  Hiroshi IMAI  

     
    PAPER
    Vol: E80-A No:4  Page(s): 682-688

    This paper proposes a unified approach, by means of the binary decision diagram (BDD for short), to solving the #P-hard problems of counting the number of paths between two terminals in undirected and directed graphs. Our approach provides algorithms running in O(2^{O(√n)}) time for typical planar graphs such as grid graphs. In fact, for any class of graphs having a good elimination ordering, this paradigm provides efficient solutions.
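    For contrast with the BDD-based approach, here is the brute-force baseline (a sketch of the problem only, not of the paper's method): counting simple s-t paths by backtracking is exponential in general, which is why structure-exploiting techniques matter for grid-like graphs.

        def count_st_paths(adj, s, t):
            """Count simple s-t paths in an undirected graph by backtracking."""
            visited = {s}

            def extend(u):
                if u == t:
                    return 1
                total = 0
                for v in adj[u]:
                    if v not in visited:
                        visited.add(v)
                        total += extend(v)
                        visited.remove(v)
                return total

            return extend(s)

        # 2 x 2 grid graph (a 4-cycle): two simple paths from corner 0 to corner 3
        grid = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
        print(count_st_paths(grid, 0, 3))   # 2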

  • The Largest Common Similar Substructure Problem

    Shaoming LIU  Eiichi TANAKA  

     
    PAPER
    Vol: E80-A No:4  Page(s): 643-650

    This paper discusses the largest common similar substructure (LCSS for short) problem for trees. The problem is, for two trees A and B, to find a substructure A' of A and a substructure B' of B such that A' is most similar to B' and the sum of the number of vertices of A' and that of B' is largest. An algorithm for the LCSS problem for unrooted and unordered trees (trees for short) and one for trees embedded in a plane (CO-trees for short) are proposed. The time complexity of the algorithm for trees is O(max(m_a, m_b)^2 N_a N_b) and that for CO-trees is O(m_a m_b N_a N_b), where m_a (m_b) and N_a (N_b) are the largest degree of a vertex of tree T_a (T_b) and the number of vertices of T_a (T_b), respectively. It is easy to modify the algorithms to enumerate all of the LCSSs for trees and CO-trees. The algorithms can be applied to structure-activity studies in chemistry and various structure comparison problems.

  • The p-Collection Problem in a Flow Network with Lower Bounds

    Kaoru WATANABE  Hiroshi TAMURA  Keisuke NAKANO  Masakazu SENGOKU  

     
    PAPER
    Vol: E80-A No:4  Page(s): 651-657

    In this paper we extend the p-collection problem to a flow network with lower bounds, and call the extended problem the lower-bounded p-collection problem. First we discuss the complexity of this problem, showing NP-hardness for a network with path structure. Next we present a linear-time algorithm for the lower-bounded 1-collection problem in a network with tree structure, and a pseudo-polynomial-time algorithm of dynamic programming type for the lower-bounded p-collection problem in a network with tree structure. Using the pseudo-polynomial-time algorithm, we give an exponential-time algorithm, which is efficient in a connected network with few cycles, for the lower-bounded p-collection problem.

  • Interval Finding and Its Application to Data Mining

    Takeshi FUKUDA  Yasuhiko MORIMOTO  Shinichi MORISHITA  Takeshi TOKUYAMA  

     
    PAPER
    Vol: E80-A No:4  Page(s): 620-626

    In this paper, we investigate inverse problems of the interval query problem with application to data mining. Let I be the set of all intervals on U = {1, 2, ..., n}. Consider an objective function f(I), conditional functions u_i(I) on I, and define the optimization problem of finding the interval I maximizing f(I) subject to u_i(I) > K_i for given real numbers K_i (i = 1, 2, ..., h). We propose efficient algorithms to solve this optimization problem when the objective function is either additive or quotient and the conditional functions are additive, where a function f is additive if f(I) = Σ_{i∈I} f̂(i) for a function f̂ on U, and quotient if it is represented as a quotient of two additive functions. We use computational-geometric methods such as convex hulls, range searching, and multidimensional divide-and-conquer.
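    A brute-force O(n^2) baseline for the additive case is sketched below (for illustration only; the paper's computational-geometric algorithms are asymptotically faster and also handle quotient objectives): scan all intervals and keep the one with the largest objective sum among those whose conditional sum exceeds the threshold.

        def best_interval(f_hat, u_hat, K):
            """Interval [i, j] maximizing sum(f_hat) subject to sum(u_hat) > K.

            Brute force over all O(n^2) intervals; both functions are additive.
            """
            n = len(f_hat)
            best_val, best_iv = None, None
            for i in range(n):
                f_sum = u_sum = 0.0
                for j in range(i, n):
                    f_sum += f_hat[j]
                    u_sum += u_hat[j]
                    if u_sum > K and (best_val is None or f_sum > best_val):
                        best_val, best_iv = f_sum, (i, j)
            return best_iv, best_val

        # toy data: objective and conditional values on U = {1, ..., 8}
        interval, value = best_interval([3, -1, 4, -2, 5, -3, 1, 2],
                                        [1, 1, 2, 1, 2, 1, 1, 1], K=4)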

  • Polynomials Approximating Complex Functions

    Masao KODAMA  Kengo TAIRA  

     
    LETTER-Numerical Analysis and Optimization
    Vol: E80-A No:4  Page(s): 778-781

    We frequently use a polynomial to approximate a complex function. This study presents a method that determines the optimum coefficients and the number of terms of the polynomial, and estimates the error of the polynomial.
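    As a generic point of comparison (not the paper's method for choosing the coefficients or the number of terms), the sketch below fits a polynomial to samples of a complex function by linear least squares and reports the maximum residual as a crude error estimate.

        import numpy as np

        def complex_poly_fit(f, z, degree):
            """Least-squares polynomial fit to a complex-valued function on points z."""
            z = np.asarray(z, dtype=complex)
            y = np.array([f(zi) for zi in z], dtype=complex)
            V = np.vander(z, degree + 1, increasing=True)      # columns 1, z, z^2, ...
            coeffs, *_ = np.linalg.lstsq(V, y, rcond=None)
            max_error = float(np.max(np.abs(V @ coeffs - y)))  # crude error estimate
            return coeffs, max_error

        # approximate exp(z) on 64 points of the unit circle with a degree-8 polynomial
        z = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False))
        coeffs, err = complex_poly_fit(np.exp, z, degree=8)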

  • Data-Localization Scheduling inside Processor-Cluster for Multigrain Parallel Processing

    Akimasa YOSHIDA  Ken'ichi KOSHIZUKA  Wataru OGATA  Hironori KASAHARA  

     
    PAPER
    Vol: E80-D No:4  Page(s): 473-479

    This paper proposes a data-localization scheduling scheme inside a processor-cluster for multigrain parallel processing, which hierarchically exploits parallelism among coarse-grain tasks like loops, medium-grain tasks like loop iterations, and near-fine-grain tasks like statements. The proposed scheme assigns near-fine-grain or medium-grain tasks inside coarse-grain tasks onto processors inside a processor-cluster so that maximum parallelism can be exploited and inter-processor data transfer can be minimized after data-localization for coarse-grain tasks across processor-clusters. Performance evaluation on the multiprocessor system OSCAR shows that multigrain parallel processing with the proposed data-localization scheduling can reduce the execution time of application programs by 10% compared with multigrain parallel processing without data-localization.

  • High Performance Two-Phase Asynchronous Pipelines

    Sam APPLETON  Shannon MORTON  Michael LIEBELT  

     
    PAPER-Design
    Vol: E80-D No:3  Page(s): 287-295

    In this paper we describe the implementation of complex architectures using a general design approach for two-phase asynchronous systems. This fundamental approach, called Event Controlled Systems (ECS), can be used to widely extend the utility of two-phase systems. We describe solutions that we have developed that dramatically improve the performance of static- and dynamic-logic asynchronous pipelines, and briefly describe a complex microprocessor designed using ECS.
