The search functionality is under construction.

Keyword Search Result

[Keyword] Ti(30728hit)

20001-20020hit(30728hit)

  • VLaTTe: A Java Just-in-Time Compiler for VLIW with Fast Scheduling and Register Allocation

    Suhyun KIM  Soo-Mook MOON  Kemal EBCIOĞLU  Erik ALTMAN  

     
    PAPER-Software Support and Optimization Techniques

      Vol:
    E87-D No:7
      Page(s):
    1712-1720

    For network computing on desktop machines, fast execution of Java bytecode programs is essential because these machines are expected to run substantial application programs written in Java. We believe higher Java performance can be achieved by exploiting instruction-level parallelism (ILP) in the context of Java JIT compilation. This paper introduces VLaTTe, a Java JIT compiler for VLIW machines that performs efficient scheduling while doing fast register allocation. It is an extended version of LaTTe, our previous JIT compiler for RISC machines, whose translation overhead is low (i.e., consistently taking one or two seconds for SPECJVM98 benchmarks) due to its fast register allocation. VLaTTe adds the scheduling capability onto the same framework of register allocation, with a constraint for precise in-order exception handling which guarantees the same Java exception behavior as the original bytecode program. Our experimental results on the SPECJVM98 benchmarks show that VLaTTe achieves a geometric mean useful IPC of 1.7 (2-ALU), 2.1 (4-ALU), and 2.3 (8-ALU), while the scheduling/allocation overhead is 3.6 times longer than LaTTe's on average, which appears to be reasonable.

  • An Acceleration Processor for Data Intensive Scientific Computing

    Cheong Ghil KIM  Hong-Sik KIM  Sungho KANG  Shin Dug KIM  Gunhee HAN  

     
    PAPER-Scientific and Engineering Computing with Applications

      Vol:
    E87-D No:7
      Page(s):
    1766-1773

    Scientific computations for diffusion equations and ANNs (Artificial Neural Networks) are data-intensive tasks accompanied by heavy memory access; on the other hand, their computational complexities are relatively low. Thus, this type of task naturally maps onto SIMD (Single Instruction Multiple Data stream) parallel processing with distributed memory. This paper proposes a high-performance acceleration processor whose architecture is optimized for scientific computing using diffusion equations and ANNs. The proposed architecture includes a customized instruction set and specific hardware resources which consist of a control unit (CU), 16 processing units (PUs), and a non-linear function unit (NFU) on chip. They are connected by a dedicated ring and global bus structure. Each PU is equipped with an address modifier (AM) and a 16-bit 1.5 k-word local memory (LM). The proposed processor can be easily expanded through a multi-chip expansion mode to accommodate large-scale parallel computation. The prototype chip is implemented on an FPGA. The total gate count is about 1 million with 530,432-bit embedded memory cells, and it operates at 15 MHz. The functionality and performance of the proposed processor are verified by simulating an oil reservoir problem using diffusion equations and a character recognition application using ANNs. The execution times of the two applications are compared with software realizations on a 1.7 GHz Pentium IV personal computer. Though the proposed processor architecture and instruction set are optimized for diffusion equations and ANNs, they provide the flexibility to program many other scientific computation algorithms.

  • Algorithmic Concept Recognition to Support High Performance Code Reengineering

    Beniamino DI MARTINO  

     
    PAPER-Software Support and Optimization Techniques

      Vol:
    E87-D No:7
      Page(s):
    1743-1750

    Techniques for automatic program recognition at the algorithmic level could be of high interest for the area of Software Maintenance, in particular for knowledge-based reengineering, because the selection of suitable restructuring strategies is mainly driven by algorithmic features of the code. In this paper, an automated hierarchical concept parsing recognition technique and a formalism for the specification of algorithmic concepts are presented. Based on this technique, the design and development of ALCOR, a production-rule-based system for automatic recognition of algorithmic concepts within programs, aimed at supporting knowledge-based reengineering for high performance, is presented.

  • Proposal of a Tree Load Balancing Algorithm to Grid Computing Environments

    Rodrigo Fernandes de MELLO  Erico C. T. de MATTOS  Luis Carlos TREVELIN  Maria Stela Veludo de PAIVA  Laurence T. YANG  

     
    PAPER-Software Support and Optimization Techniques

      Vol:
    E87-D No:7
      Page(s):
    1729-1736

    The availability of low-cost hardware has increased the development of distributed systems by making them more and more accessible. In order to optimize resource allocation on distributed systems, some load balancing algorithms have been proposed. These algorithms distribute the application loads over the computers in the environment, even out the occupation of the whole environment, and increase application performance. This equal distribution prevents certain computers from becoming overloaded while others sit idle. This article proposes and analyzes TLBAGrid, a load balancing algorithm for Grid computing environments.

  • Traditional File Systems versus DualFS: A Performance Comparison Approach

    Juan PIERNAS  Toni CORTES  Jose M. GARCIA  

     
    PAPER-Software Support and Optimization Techniques

      Vol:
    E87-D No:7
      Page(s):
    1703-1711

    DualFS is a next-generation journaling file system which has the same consistency guarantees as traditional journaling file systems but better performance. This paper introduces three new enhancements which significantly improve DualFS performance during normal operation, and presents different experimental results which compare DualFS and other traditional file systems, namely, Ext2, Ext3, XFS, JFS, and ReiserFS. The experiments carried out prove, for the first time, that a new file system design based on separation of data and metadata can significantly improve file systems' performance without requiring several storage devices.

  • A Method to Preserve Layered Architectural Style in Development Phases

    Chanjin PARK  Euyseok HONG  Chisu WU  

     
    LETTER-Software Engineering

      Vol:
    E87-D No:7
      Page(s):
    1965-1970

    This paper proposes a new type of relationship between layers in layered architecture and shows how to concretize the relationship between layers into design constraints. The meaning of layer relationship is explained with examples from design patterns and Microsoft COM. In addition, a prototype tool to check conformance is implemented and the architecture document of an open-source software project is checked against the actual architecture extracted from source code developed by many international developers. As a result of checking, parts that do not conform to the architecture document are investigated and it is pointed out that their modifications should be controlled with caution.

  • Reduced-State Sequence Estimation for Coded Modulation in CPSC on Frequency-Selective Fading Channels

    Jeong-Woo JWA  

     
    LETTER-Wireless Communication Technology

      Vol:
    E87-B No:7
      Page(s):
    2040-2044

    Reduced-state sequence estimation (RSSE) for trellis-coded modulation (TCM) in cyclic prefixed single carrier (CPSC) with minimum mean-square error-linear equalization (MMSE-LE) on frequency-selective Rayleigh fading channels is proposed. The Viterbi algorithm (VA) is used to search for the best path through the reduced-state trellis, which combines equalization and TCM decoding. Computer simulations confirm the symbol error probability of the proposed scheme.

  • A Distributed 3D Rendering Application for Massive Data Sets

    Huabing ZHU  Tony K.Y. CHAN  Lizhe WANG  Reginald C. JEGATHESE  

     
    PAPER-Distributed, Grid and P2P Computing

      Vol:
    E87-D No:7
      Page(s):
    1805-1812

    This paper presents a prototype of a distributed 3D rendering system in a hierarchical Grid environment. 3D rendering with massive data sets is a computationally intensive task. In order to make full use of computational resources on Grids, a hierarchical system architecture is designed to run over multiple clusters. This architecture involves both sort-first and sort-last parallel rendering algorithms to achieve excellent scalability, rendering performance and load balance.

  • Simulation of Simultaneous Multi-Wavelength Conversion in GaN/AlN Intersubband Optical Amplifiers

    Nobuo SUZUKI  

     
    PAPER

      Vol:
    E87-C No:7
      Page(s):
    1155-1160

    Simultaneous wavelength conversion utilizing four-wave mixing in optically-pumped GaN/AlN intersubband optical amplifiers has been investigated by means of a finite-difference time-domain (FDTD) model. The conversion efficiencies at a pump power of +7 to +10 dBm were predicted to be -9 to +6 dB depending on the frequency detuning (0.3 to 10.9 THz). The difference in efficiency among 18 channels of WDM signals with 100-GHz spacing was within about 3 dB.

  • Enhancing ICP with P2P Technology: Cost, Availability, and Reconfiguration

    Ping-Jer YEH  Yu-Chen CHUANG  Shyan-Ming YUAN  

     
    PAPER-Networking and System Architectures

      Vol:
    E87-D No:7
      Page(s):
    1641-1648

    Traditional Web cache servers based on HTTP and ICP infrastructure tend to have higher hardware and management costs, have difficulty with availability and with automatic and dynamic reconfiguration, and may have slow links to some users. We find that peer-to-peer technology can help solve these problems. The peer cache service (PCS) we propose here leverages each peer's local cache, similar access patterns, fully distributed coordination, and fast communication channels to enhance response time, scale of cacheable objects, and availability. Moreover, incorporating goals and strategies such as making the protocol lightweight and mutually compatible with existing cache infrastructure, supporting mobile devices, undertaking dynamic three-level caching, and exchanging cache meta-information further improves the effectiveness and differentiates our work from other similar-at-first-glance P2P Web cache systems.

  • Utilization of the On-Chip L2 Cache Area in CC-NUMA Multiprocessors for Applications with a Small Working Set

    Sung Woo CHUNG  Hyong-Shik KIM  Chu Shik JHON  

     
    PAPER-Networking and System Architectures

      Vol:
    E87-D No:7
      Page(s):
    1617-1624

    In CC-NUMA multiprocessor systems, it is important to reduce the remote memory access time. Based upon the fact that increasing the size of the LRU second-level (L2) cache beyond a certain value does not reduce the cache miss rate significantly, we propose two split L2 caches to utilize the surplus of the L2 cache. The split L2 caches are composed of a traditional LRU cache and another cache to reduce the remote memory access time. Both work together to reduce total L2 cache miss time by keeping remote (or long-distance) blocks as well as recently used blocks. For the other cache, we propose two alternatives: an L2-RVC (Level 2 - Remote Victim Cache) and an L2-DAVC (Level 2 - Distance-Aware Victim Cache). The proposed split L2 caches reduce total execution time by up to 27%. It is also found that the proposed split L2 caches outperform the traditional single LRU cache of double size.

  • Multiple Access Systems with QPSK Modulation

    Ha H. NGUYEN  Huy G. VU  David E. DODDS  

     
    LETTER-Spread Spectrum Technologies and Applications

      Vol:
    E87-A No:7
      Page(s):
    1833-1835

    This letter considers multiple access systems without bandwidth expansion. To improve the spectral efficiency, each user employs QPSK modulation. The orientation of the QPSK constellations is designed to maximize the minimum distance of the superimposed symbol constellation. The upper and lower bounds for the error performance of the proposed design demonstrate its advantage.
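The idea of orienting QPSK constellations to maximize the minimum distance of the superimposed constellation can be illustrated with a toy case. The sketch below is not the letter's design: it assumes only two equal-power users whose symbols simply add, and it finds the rotation of the second user's constellation by a plain grid search. The names `min_distance` and `best_rotation` are hypothetical.

```python
import cmath
import math
from itertools import combinations, product

# Unit-energy QPSK symbols at 45, 135, 225 and 315 degrees.
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]


def min_distance(theta):
    """Minimum distance of the 16-point sum constellation s1 + e^{j*theta} * s2.

    Assumption: two equal-power users whose symbols add directly.
    """
    rot = cmath.exp(1j * theta)
    points = [s1 + rot * s2 for s1, s2 in product(QPSK, QPSK)]
    return min(abs(p - q) for p, q in combinations(points, 2))


def best_rotation(steps=180):
    """Grid-search the rotation in [0, pi/2] that maximizes the minimum distance."""
    thetas = [k * (math.pi / 2) / steps for k in range(steps + 1)]
    return max(thetas, key=min_distance)


theta_star = best_rotation()
d_star = min_distance(theta_star)
```

Without rotation (`theta = 0`) distinct symbol pairs collide and the minimum distance is zero, which is why the relative orientation matters in this toy setting.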

  • Ultrafast All Optical Switching Using Pulse Trapping by Ultrashort Soliton Pulse

    Norihiko NISHIZAWA  Toshio GOTO  

     
    INVITED PAPER

      Vol:
    E87-C No:7
      Page(s):
    1148-1154

    Ultrafast all optical switching using pulse trapping by a 100 fs ultrashort soliton pulse across the zero dispersion wavelength is investigated. The characteristics of pulse trapping are analyzed both experimentally and numerically. Using pulse trapping, 1 THz ultrafast all optical switching is demonstrated experimentally. An arbitrary single pulse is picked off from a pulse train. Pulse trapping for a CW signal is also demonstrated, and an ultrashort pulse is generated by pulse trapping. From these investigations, it is shown that ultrafast all optical switching up to 2 THz can be demonstrated using pulse trapping.

  • The Role of Fast Carrier Dynamics in SOA Based Devices

    Jesper MØRK  Tommy W. BERG  Mads L. NIELSEN  Alexander V. USKOV  

     
    INVITED PAPER

      Vol:
    E87-C No:7
      Page(s):
    1126-1133

    We describe the characteristics of all-optical switching schemes based on semiconductor optical amplifiers (SOAs), with particular emphasis on the role of the fast carrier dynamics. The SOA response to a single short pulse as well as to a data-modulated pulse train is investigated and the properties of schemes relying on cross-gain as well as cross-phase modulation are discussed. The possible benefits of using SOAs with quantum dot active regions are theoretically analyzed. The bandfilling characteristics and the presence of fast capture processes may allow bitrates in excess of 100 Gb/s to be reached even for simple cross-gain modulation schemes.

  • Optical Packet Switching Network Based on Ultra-Fast Optical Code Label Processing

    Naoya WADA  Hiroaki HARAI  Fumito KUBOTA  

     
    INVITED PAPER

      Vol:
    E87-C No:7
      Page(s):
    1090-1096

    An ultrahigh-speed all-optical label processing method is proposed and experimentally demonstrated. This processing method dramatically increases the label processing capability. Optical packet switch (OPS) systems and networks based on OPS nodes are applications of optical processing technologies. For the experiment, we constructed the world's first 40 Gbit/s/port OPS prototype with an all-optical label processor, optical switch, optical buffer, and electronic scheduler. Three-hop optical packet routing using OPS nodes was experimentally demonstrated with it, verifying the feasibility of OPS networks.

  • Auto Focusing Algorithm for Iris Recognition Camera Using Corneal Specular Reflection

    Kang Ryoung PARK  

    This paper was deleted on March 10, 2006 because it was found to be a duplicate submission (see details in the pdf file).
     
    PAPER-Image Processing and Video Processing

      Vol:
    E87-D No:7
      Page(s):
    1923-1934

    Iris recognition is used to identify a user based on the iris texture information which exists between the white sclera and the black pupil. For fast iris recognition, it is very important to capture the user's focused eye image quickly. Otherwise, the total recognition time increases and the user is inconvenienced. Previous research and systems use focusing methods designed for general landscape scenes, without considering the characteristics of iris images. As a result, focusing sometimes takes a long time, especially for users with glasses. To overcome such problems, we propose a new iris image acquisition method that captures the user's focused eye image very quickly based on the corneal specular reflection. Experimental results show that the focusing time for users both with and without glasses averages 480 ms, and we conclude that our method can be used for a real-time iris recognition camera.

  • Ultrafast All-Optical Switching and Modulation Using Intersubband Transitions in Coupled Quantum Well Structures

    Haruhiko YOSHIDA  Takasi SIMOYAMA  Achanta Venu GOPAL  Jun-ichi KASAI  Teruo MOZUME  Hiroshi ISHIKAWA  

     
    INVITED PAPER

      Vol:
    E87-C No:7
      Page(s):
    1134-1141

    In this report we present all-optical switches and modulators based on the intersubband transition in semiconductor quantum wells. The use of InGaAs/AlAsSb coupled double quantum well structures is proposed to facilitate intersubband transitions in the optical-communication band, and to reduce the intersubband absorption recovery time from several picoseconds to a few hundred femtoseconds by utilizing enhanced electron-phonon scattering. Subpicosecond all-optical gating and modulation in coupled double quantum wells are observed using pump-probe experiments at optical-communication wavelengths. The results indicate that the intersubband transition in this structure is very useful for ultrafast all-optical switching devices.

  • Packing/Unpacking Using MPI User-Defined Datatypes for Efficient Data Redistribution

    Sheng-Wen BAI  Chu-Sing YANG  Tsung-Chuan HUANG  

     
    PAPER-Software Support and Optimization Techniques

      Vol:
    E87-D No:7
      Page(s):
    1721-1728

    In many parallel programs, run-time data redistribution is usually required to enhance data locality and reduce remote memory access on distributed memory multicomputers. Research on data redistribution algorithms has recently matured. The time required to generate data sets and processor sets is much less than before. Therefore, packing/unpacking has become a relatively high cost in redistribution. In this paper, we present methods to perform BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution using MPI user-defined datatypes. These methods reduce the required memory buffers and avoid unnecessary movement of data. Theoretical models are presented to determine the best method for redistribution. The methods were implemented on an IBM SP2 parallel machine to evaluate their performance. The experimental results indicate that this approach can clearly improve the redistribution in most cases.
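To make the BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution concrete, here is a minimal sketch of the index bookkeeping only. It does not use MPI and is not the paper's datatype-based method; it assumes a one-dimensional array, zero-based indices, and the same number of processors before and after. The function names are hypothetical.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Processor that owns global index i under BLOCK-CYCLIC(block)."""
    return (i // block) % nprocs


def local_index(i, block, nprocs):
    """Position of global index i inside its owner's local array."""
    return (i // (block * nprocs)) * block + i % block


def redistribution_plan(n, nprocs, s, t):
    """Map (source, destination) pairs to lists of (source_local, dest_local) indices.

    Each list is what a source would have to pack, or describe with a
    derived datatype, for one destination processor.
    """
    plan = defaultdict(list)
    for i in range(n):
        src = owner(i, s, nprocs)
        dst = owner(i, t, nprocs)
        plan[(src, dst)].append(
            (local_index(i, s, nprocs), local_index(i, t, nprocs))
        )
    return dict(plan)


# 12 elements on 2 processors, going from BLOCK-CYCLIC(2) to BLOCK-CYCLIC(3).
plan = redistribution_plan(12, 2, 2, 3)
```

The local indices in each list are generally non-contiguous, which is the situation where describing the layout to the communication library can replace an explicit copy into a packing buffer.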

  • Visual Customer Relationship Management System that Supports Broadband Network E-Commerce

    Tetsushi MORITA  Tetsuo HIDAKA  Tomohiko NAKAMURA  Morihide OINUMA  Yutaka HIRAKAWA  

     
    PAPER-Network Application

      Vol:
    E87-B No:7
      Page(s):
    1789-1796

    Broadband access has recently become widespread, and many broadband network E-commerce services are being planned and developed. This article proposes a broadband online shop where a videoconferencing system is used to enable direct, face-to-face communication. It is important for a broadband online shop to understand its customers' preferences in order to provide them with more appropriate information. By using customer preferences, a salesclerk can have a useful conversation with online customers while asking few questions. We are therefore developing a visual Customer Relationship Management system (v-CRM system) that offers customer preferences to broadband network services such as a broadband online shop. In this paper, we classify customer preferences and describe three visualization methods that enable customer preferences to be understood quickly and intuitively. We outline the v-CRM evaluation system and describe an experiment where we evaluated how accurately customer preferences can be recognized using these methods. The results show that the v-CRM system is effective for understanding customer preferences.

  • Node Mobility Aware Routing for Mobile Ad Hoc Network

    Shinichi FURUSHO  Teruaki KITASUKA  Tsuneo NAKANISHI  Akira FUKUDA  

     
    LETTER

      Vol:
    E87-B No:7
      Page(s):
    1926-1930

    In ad-hoc on-demand routing algorithms, when a route is broken, a relay node must perform an error transaction and the source node must reroute to discover an alternate route. It is therefore important to construct a stable route when route discovery occurs. In this paper, we use relative speeds among nodes as a measure of node mobility. Our routing algorithm chooses nodes of lower relative speed as relay nodes. In our simulation, when there is one session in the network, our proposed algorithm reduces the number of route breaks to about one third of that of DSR and delivers more packets than DSR, with an 18% higher rate. However, our algorithm needs improvement in congested traffic situations.
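As a rough illustration of using relative speed as a mobility measure, the sketch below scores each candidate route by the largest relative speed between neighboring nodes and picks the lowest-scoring route. The abstract does not say how relative speeds are measured or combined along a route, so the velocity table, the max-over-links score, and all function names here are assumptions.

```python
import math


def relative_speed(v_a, v_b):
    """Magnitude of the velocity difference between two nodes (2-D vectors)."""
    return math.hypot(v_a[0] - v_b[0], v_a[1] - v_b[1])


def route_mobility(route, velocity):
    """Score a route by its worst link (assumed aggregation: max over hops)."""
    return max(
        relative_speed(velocity[a], velocity[b])
        for a, b in zip(route, route[1:])
    )


def pick_stable_route(candidates, velocity):
    """Choose the candidate route whose worst link has the lowest relative speed."""
    return min(candidates, key=lambda r: route_mobility(r, velocity))


# Hypothetical velocities in m/s; S is the source and D the destination.
velocity = {"S": (0.0, 0.0), "A": (5.0, 0.0), "B": (1.0, 0.0), "D": (0.0, 0.0)}
candidates = [["S", "A", "D"], ["S", "B", "D"]]
chosen = pick_stable_route(candidates, velocity)
```

Here the route through B is preferred because B moves slowly relative to both S and D, so its links are less likely to break soon.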
