IEICE global.ieice.org Site

Keyword Search Result

[Keyword] Ti(30728hit)

20001-20020hit(30728hit)

VLaTTe: A Java Just-in-Time Compiler for VLIW with Fast Scheduling and Register Allocation
Suhyun KIM Soo-Mook MOON Kemal EBCIOĞLU Erik ALTMAN

PAPER-Software Support and Optimization Techniques

Vol:
E87-D No:7
Page(s):
1712-1720
For network computing on desktop machines, fast execution of Java bytecode programs is essential because these machines are expected to run substantial application programs written in Java. We believe higher Java performance can be achieved by exploiting instruction-level parallelism (ILP) in the context of Java JIT compilation. This paper introduces VLaTTe, a Java JIT compiler for VLIW machines that performs efficient scheduling while doing fast register allocation. It is an extended version of LaTTe, our previous JIT compiler for RISC machines, whose translation overhead is low (i.e., consistently taking one or two seconds for the SPECJVM98 benchmarks) due to its fast register allocation. VLaTTe adds the scheduling capability onto the same register allocation framework, with a constraint for precise in-order exception handling which guarantees the same Java exception behavior as the original bytecode program. Our experimental results on the SPECJVM98 benchmarks show that VLaTTe achieves a geometric mean useful IPC of 1.7 (2-ALU), 2.1 (4-ALU), and 2.3 (8-ALU), while the scheduling/allocation overhead is 3.6 times longer than LaTTe's on average, which appears to be reasonable.
An Acceleration Processor for Data Intensive Scientific Computing
Cheong Ghil KIM Hong-Sik KIM Sungho KANG Shin Dug KIM Gunhee HAN

PAPER-Scientific and Engineering Computing with Applications

Vol:
E87-D No:7
Page(s):
1766-1773
Scientific computations for diffusion equations and ANNs (Artificial Neural Networks) are data intensive tasks accompanied by heavy memory access; on the other hand, their computational complexities are relatively low. Thus, this type of task naturally maps onto SIMD (Single Instruction Multiple Data stream) parallel processing with distributed memory. This paper proposes a high performance acceleration processor whose architecture is optimized for scientific computing using diffusion equations and ANNs. The proposed architecture includes a customized instruction set and specific on-chip hardware resources consisting of a control unit (CU), 16 processing units (PUs), and a non-linear function unit (NFU). They are connected by a dedicated ring and a global bus structure. Each PU is equipped with an address modifier (AM) and a 16-bit 1.5 k-word local memory (LM). The proposed processor can be easily expanded through a multi-chip expansion mode to accommodate large-scale parallel computation. The prototype chip is implemented on an FPGA. The total gate count is about 1 million with 530,432-bit embedded memory cells, and it operates at 15 MHz. The functionality and performance of the proposed processor are verified by simulating an oil reservoir problem using diffusion equations and a character recognition application using ANNs. The execution times of the two applications are compared with software realizations on a 1.7 GHz Pentium IV personal computer. Though the proposed processor architecture and instruction set are optimized for diffusion equations and ANNs, they provide the flexibility to program many other scientific computation algorithms.
Algorithmic Concept Recognition to Support High Performance Code Reengineering
Beniamino DI MARTINO

PAPER-Software Support and Optimization Techniques

Vol:
E87-D No:7
Page(s):
1743-1750
Techniques for automatic program recognition at the algorithmic level could be of high interest for the area of Software Maintenance, in particular for knowledge based reengineering, because the selection of suitable restructuring strategies is mainly driven by algorithmic features of the code. In this paper, an automated hierarchical concept parsing recognition technique and a formalism for the specification of algorithmic concepts are presented. Based on this technique, the design and development of ALCOR, a production rule based system for automatic recognition of algorithmic concepts within programs, aimed at supporting knowledge based reengineering for high performance, is presented.
Proposal of a Tree Load Balancing Algorithm to Grid Computing Environments
Rodrigo Fernandes de MELLO Erico C. T. de MATTOS Luis Carlos TREVELIN Maria Stela Veludo de PAIVA Laurence T. YANG

PAPER-Software Support and Optimization Techniques

Vol:
E87-D No:7
Page(s):
1729-1736
The availability of low-cost hardware has increased the development of distributed systems by making them more and more accessible. To optimize resource allocation on distributed systems, several load balancing algorithms have been proposed. These algorithms distribute the application loads over the computers in the environment, even out the occupation of the whole environment, and increase application performance. This even distribution prevents some computers from becoming overloaded while others remain idle. This article proposes and analyzes TLBAGrid, a load balancing algorithm for Grid computing environments.
Traditional File Systems versus DualFS: A Performance Comparison Approach
Juan PIERNAS Toni CORTES Jose M. GARCIA

PAPER-Software Support and Optimization Techniques

Vol:
E87-D No:7
Page(s):
1703-1711
DualFS is a next-generation journaling file system which has the same consistency guarantees as traditional journaling file systems but better performance. This paper introduces three new enhancements which significantly improve DualFS performance during normal operation, and presents experimental results which compare DualFS with other, traditional file systems, namely Ext2, Ext3, XFS, JFS, and ReiserFS. The experiments carried out prove, for the first time, that a new file system design based on separation of data and metadata can significantly improve file system performance without requiring several storage devices.
A Method to Preserve Layered Architectural Style in Development Phases
Chanjin PARK Euyseok HONG Chisu WU

LETTER-Software Engineering

Vol:
E87-D No:7
Page(s):
1965-1970
This paper proposes a new type of relationship between layers in layered architecture and shows how to concretize the relationship between layers into design constraints. The meaning of layer relationship is explained with examples from design patterns and Microsoft COM. In addition, a prototype tool to check conformance is implemented and the architecture document of an open-source software project is checked against the actual architecture extracted from source code developed by many international developers. As a result of checking, parts that do not conform to the architecture document are investigated and it is pointed out that their modifications should be controlled with caution.
Reduced-State Sequence Estimation for Coded Modulation in CPSC on Frequency-Selective Fading Channels
Jeong-Woo JWA

LETTER-Wireless Communication Technology

Vol:
E87-B No:7
Page(s):
2040-2044
Reduced-state sequence estimation (RSSE) for trellis-coded modulation (TCM) in cyclic prefixed single carrier (CPSC) with minimum mean-square error-linear equalization (MMSE-LE) on frequency-selective Rayleigh fading channels is proposed. The Viterbi algorithm (VA) is used to search for the best path through the reduced-state trellis for combined equalization and TCM decoding. Computer simulations confirm the symbol error probability of the proposed scheme.
A Distributed 3D Rendering Application for Massive Data Sets
Huabing ZHU Tony K.Y. CHAN Lizhe WANG Reginald C. JEGATHESE

PAPER-Distributed, Grid and P2P Computing

Vol:
E87-D No:7
Page(s):
1805-1812
This paper presents a prototype of a distributed 3D rendering system in a hierarchical Grid environment. 3D rendering with massive data sets is a computationally intensive task. In order to make full use of computational resources on Grids, a hierarchical system architecture is designed to run over multiple clusters. This architecture involves both sort-first and sort-last parallel rendering algorithms to achieve excellent scalability, rendering performance and load balance.
Simulation of Simultaneous Multi-Wavelength Conversion in GaN/AlN Intersubband Optical Amplifiers
Nobuo SUZUKI

PAPER

Vol:
E87-C No:7
Page(s):
1155-1160
Simultaneous wavelength conversion utilizing four-wave mixing in optically-pumped GaN/AlN intersubband optical amplifiers has been investigated by means of a finite-difference time-domain (FDTD) model. The conversion efficiencies at a pump power of +7 to +10 dBm were predicted to be -9 to +6 dB, depending on the frequency detuning (0.3 to 10.9 THz). The difference in efficiency among 18 channels of WDM signals with 100-GHz spacing was within about 3 dB.
Enhancing ICP with P2P Technology: Cost, Availability, and Reconfiguration
Ping-Jer YEH Yu-Chen CHUANG Shyan-Ming YUAN

PAPER-Networking and System Architectures

Vol:
E87-D No:7
Page(s):
1641-1648
Traditional Web cache servers based on HTTP and ICP infrastructure tend to have high hardware and management costs, have difficulty with availability and with automatic and dynamic reconfiguration, and may have slow links to some users. We find that peer-to-peer technology can help solve these problems. The peer cache service (PCS) we propose here leverages each peer's local cache, similar access patterns, fully distributed coordination, and fast communication channels to enhance response time, scale of cacheable objects, and availability. Moreover, incorporating goals and strategies such as making the protocol lightweight and mutually compatible with existing cache infrastructure, supporting mobile devices, undertaking dynamic three-level caching, and exchanging cache meta-information further improves the effectiveness and differentiates our work from other similar-at-first-glance P2P Web cache systems.
Utilization of the On-Chip L2 Cache Area in CC-NUMA Multiprocessors for Applications with a Small Working Set
Sung Woo CHUNG Hyong-Shik KIM Chu Shik JHON

PAPER-Networking and System Architectures

Vol:
E87-D No:7
Page(s):
1617-1624
In CC-NUMA multiprocessor systems, it is important to reduce the remote memory access time. Based upon the fact that increasing the size of the LRU second-level (L2) cache beyond a certain value does not reduce the cache miss rate significantly, in this paper we propose two split L2 caches to utilize the surplus of the L2 cache. The split L2 caches are composed of a traditional LRU cache and a second cache intended to reduce the remote memory access time. Both work together to reduce the total L2 cache miss time by keeping remote (or long-distance) blocks as well as recently used blocks. For the second cache, we propose two alternatives: an L2-RVC (Level 2 - Remote Victim Cache) and an L2-DAVC (Level 2 - Distance-Aware Victim Cache). The proposed split L2 caches reduce total execution time by up to 27%. It is also found that the proposed split L2 caches outperform a traditional single LRU cache of double size.
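The split-cache idea in this abstract can be illustrated with a toy model. The sketch below is only an assumption-laden illustration, not the authors' design: it assumes the second cache receives remote blocks evicted from the LRU part (the usual meaning of a victim cache), that local victims are simply dropped, and that a victim hit moves the block back into the LRU part. The class name `SplitL2` and the return labels are hypothetical.

```python
from collections import OrderedDict


class SplitL2:
    """Toy split L2: an LRU part plus a small victim part for remote blocks.

    Hypothetical model for illustration only; the paper's actual
    replacement and placement policies are not given in the abstract.
    """

    def __init__(self, lru_size, victim_size):
        self.lru = OrderedDict()     # block id -> is_remote flag
        self.victim = OrderedDict()  # remote blocks evicted from the LRU part
        self.lru_size = lru_size
        self.victim_size = victim_size

    def _insert_lru(self, block, is_remote):
        self.lru[block] = is_remote
        self.lru.move_to_end(block)
        if len(self.lru) > self.lru_size:
            old, old_remote = self.lru.popitem(last=False)  # evict LRU block
            if old_remote:
                # Assumption: only remote victims are worth keeping,
                # because refetching them is the expensive case.
                self.victim[old] = True
                self.victim.move_to_end(old)
                if len(self.victim) > self.victim_size:
                    self.victim.popitem(last=False)

    def access(self, block, is_remote):
        if block in self.lru:
            self.lru.move_to_end(block)
            return "lru_hit"
        if block in self.victim:
            del self.victim[block]
            self._insert_lru(block, True)
            return "victim_hit"
        self._insert_lru(block, is_remote)
        return "miss"
```

In this model a remote block that falls out of the LRU part can still be served locally from the victim part, which is the effect the abstract attributes to keeping remote blocks alongside recently used ones.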
Multiple Access Systems with QPSK Modulation
Ha H. NGUYEN Huy G. VU David E. DODDS

LETTER-Spread Spectrum Technologies and Applications

Vol:
E87-A No:7
Page(s):
1833-1835
This letter considers multiple access systems without bandwidth expansion. To improve the spectral efficiency, each user employs QPSK modulation. The orientation of the QPSK constellations is designed to maximize the minimum distance of the superimposed symbol constellation. The upper and lower bounds for the error performance of the proposed design demonstrate its advantage.
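The design criterion in this abstract, rotating the QPSK constellations so that the superimposed constellation has the largest minimum distance, can be explored numerically. The sketch below is a minimal illustration under assumptions not stated in the abstract: exactly two users, equal unit power, an ideal additive channel, and a brute-force grid search over the rotation angle. The function names are hypothetical and the search is not the letter's design procedure.

```python
import cmath
import itertools
import math

# Unit-energy QPSK points at 45, 135, 225 and 315 degrees.
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]


def superimposed(theta):
    """All 16 sums s1 + exp(j*theta) * s2 for two equal-power QPSK users."""
    rot = cmath.exp(1j * theta)
    return [a + rot * b for a in QPSK for b in QPSK]


def min_distance(theta):
    """Smallest Euclidean distance between any two superimposed points."""
    pts = superimposed(theta)
    return min(abs(p - q) for p, q in itertools.combinations(pts, 2))


def best_rotation(steps=900):
    """Grid search over [0, pi/2); QPSK repeats every 90 degrees."""
    thetas = [k * (math.pi / 2) / steps for k in range(steps)]
    return max(thetas, key=min_distance)


if __name__ == "__main__":
    theta = best_rotation()
    print(math.degrees(theta), min_distance(theta))
```

With no rotation the two users' symbols can cancel or coincide, so the minimum distance collapses to zero; a suitable rotation separates all 16 superimposed points.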
Ultrafast All Optical Switching Using Pulse Trapping by Ultrashort Soliton Pulse
Norihiko NISHIZAWA Toshio GOTO

INVITED PAPER

Vol:
E87-C No:7
Page(s):
1148-1154
Ultrafast all-optical switching using pulse trapping by a 100 fs ultrashort soliton pulse across the zero-dispersion wavelength is investigated. The characteristics of pulse trapping are analyzed both experimentally and numerically. Using pulse trapping, 1 THz ultrafast all-optical switching is demonstrated experimentally. An arbitrary single pulse is picked off from a pulse train. Pulse trapping for a CW signal is also demonstrated, and an ultrashort pulse is generated by pulse trapping. These investigations show that ultrafast all-optical switching up to 2 THz can be demonstrated using pulse trapping.
The Role of Fast Carrier Dynamics in SOA Based Devices
Jesper MØRK Tommy W. BERG Mads L. NIELSEN Alexander V. USKOV

INVITED PAPER

Vol:
E87-C No:7
Page(s):
1126-1133
We describe the characteristics of all-optical switching schemes based on semiconductor optical amplifiers (SOAs), with particular emphasis on the role of the fast carrier dynamics. The SOA response to a single short pulse as well as to a data-modulated pulse train is investigated, and the properties of schemes relying on cross-gain as well as cross-phase modulation are discussed. The possible benefits of using SOAs with quantum dot active regions are theoretically analyzed. The bandfilling characteristics and the presence of fast capture processes may allow bitrates in excess of 100 Gb/s to be reached even for simple cross-gain modulation schemes.
Optical Packet Switching Network Based on Ultra-Fast Optical Code Label Processing
Naoya WADA Hiroaki HARAI Fumito KUBOTA

INVITED PAPER

Vol:
E87-C No:7
Page(s):
1090-1096
An ultrahigh-speed all-optical label processing method is proposed and experimentally demonstrated. This processing method dramatically increases the label processing capability. Optical packet switch (OPS) systems and networks based on OPS nodes are applications of optical processing technologies. For the experiment, we constructed the world's first 40 Gbit/s/port OPS prototype with an all-optical label processor, optical switch, optical buffer, and electronic scheduler. Three-hop optical packet routing using OPS nodes was experimentally demonstrated with it, verifying the feasibility of OPS networks.
Auto Focusing Algorithm for Iris Recognition Camera Using Corneal Specular Reflection
Kang Ryoung PARK

This paper was deleted on March 10, 2006 because it was found to be a duplicate submission (see details in the pdf file).

PAPER-Image Processing and Video Processing

Vol:
E87-D No:7
Page(s):
1923-1934
Iris recognition identifies a user based on the iris texture information that lies between the white sclera and the black pupil. For fast iris recognition, it is very important to capture a focused image of the user's eye quickly. Otherwise, the total recognition time increases, which inconveniences the user. Previous studies and systems use focusing methods designed for general landscape scenes, without considering the characteristics of iris images. As a result, focusing sometimes takes a long time, especially for users wearing glasses. To overcome these problems, we propose a new iris image acquisition method that captures a focused image of the user's eye very quickly, based on the corneal specular reflection. Experimental results show that the average focusing time for users both with and without glasses is 480 ms, and we conclude that our method can be used for a real-time iris recognition camera.
Ultrafast All-Optical Switching and Modulation Using Intersubband Transitions in Coupled Quantum Well Structures
Haruhiko YOSHIDA Takasi SIMOYAMA Achanta Venu GOPAL Jun-ichi KASAI Teruo MOZUME Hiroshi ISHIKAWA

INVITED PAPER

Vol:
E87-C No:7
Page(s):
1134-1141
In this report we present all-optical switches and modulators based on the intersubband transition in semiconductor quantum wells. The use of InGaAs/AlAsSb coupled double quantum well structures is proposed to facilitate intersubband transitions in the optical-communication band, and to reduce the intersubband absorption recovery time from several picoseconds to a few hundred femtoseconds by utilizing enhanced electron-phonon scattering. Subpicosecond all-optical gating and modulation in coupled double quantum wells are observed using pump-probe experiments at optical-communication wavelengths. The results indicate that the intersubband transition in this structure is very useful for ultrafast all-optical switching devices.
Packing/Unpacking Using MPI User-Defined Datatypes for Efficient Data Redistribution
Sheng-Wen BAI Chu-Sing YANG Tsung-Chuan HUANG

PAPER-Software Support and Optimization Techniques

Vol:
E87-D No:7
Page(s):
1721-1728
In many parallel programs, run-time data redistribution is required to enhance data locality and reduce remote memory access on distributed memory multicomputers. Research on data redistribution algorithms has recently matured, and the time required to generate data sets and processor sets is much less than before. Packing/unpacking has therefore become a relatively high cost in redistribution. In this paper, we present methods to perform BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution using MPI user-defined datatypes. These methods reduce the required memory buffers and avoid unnecessary movement of data. Theoretical models are presented to determine the best method for redistribution. The methods were implemented on an IBM SP2 parallel machine to evaluate their performance. The experimental results indicate that this approach can clearly improve redistribution in most cases.
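To make the BLOCK-CYCLIC(s) to BLOCK-CYCLIC(t) redistribution concrete, the sketch below computes which global array indices each processor must send to each other processor. It is a plain illustration of the standard block-cyclic ownership rule, not the paper's method: it involves no MPI calls, and the function names `owner` and `send_sets` are hypothetical.

```python
from collections import defaultdict


def owner(i, block, nprocs):
    """Processor that owns global index i under BLOCK-CYCLIC(block)."""
    return (i // block) % nprocs


def send_sets(n, s, t, nprocs):
    """Map (source, destination) -> global indices that must move.

    Element i lives on owner(i, s, nprocs) before the redistribution
    and on owner(i, t, nprocs) afterwards.
    """
    plan = defaultdict(list)
    for i in range(n):
        plan[(owner(i, s, nprocs), owner(i, t, nprocs))].append(i)
    return dict(plan)


if __name__ == "__main__":
    # 12 elements, BLOCK-CYCLIC(2) -> BLOCK-CYCLIC(3) on 2 processors.
    for pair, indices in sorted(send_sets(12, 2, 3, 2).items()):
        print(pair, indices)
```

Each index list here is non-contiguous in general, which is why packing and unpacking become costly. In MPI, such an index pattern could be described by a derived datatype (for example one built with `MPI_Type_indexed`) so that the library gathers the elements directly; whether the paper uses that particular constructor is not stated in the abstract.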
Visual Customer Relationship Management System that Supports Broadband Network E-Commerce
Tetsushi MORITA Tetsuo HIDAKA Tomohiko NAKAMURA Morihide OINUMA Yutaka HIRAKAWA

PAPER-Network Application

Vol:
E87-B No:7
Page(s):
1789-1796
Broadband access has recently become widespread, and many broadband network E-commerce services are being planned and developed. This article proposes a broadband online shop where a videoconferencing system is used to enable direct, face-to-face communication. It is important for a broadband online shop to understand its customers' preferences in order to provide them with more appropriate information. By using customer preferences, a salesclerk can have a useful conversation with online customers while asking few questions. We are therefore developing a visual Customer Relationship Management system (v-CRM system) that offers customer preferences to broadband network services such as a broadband online shop. In this paper, we classify customer preferences and describe three visualization methods that enable customer preferences to be understood quickly and intuitively. We outline the v-CRM evaluation system and describe an experiment in which we evaluated how accurately customer preferences can be recognized using these methods. The results show that the v-CRM system is effective for understanding customer preferences.
Node Mobility Aware Routing for Mobile Ad Hoc Network
Shinichi FURUSHO Teruaki KITASUKA Tsuneo NAKANISHI Akira FUKUDA

LETTER

Vol:
E87-B No:7
Page(s):
1926-1930
In ad hoc on-demand routing algorithms, when a route is broken, a relay node must perform an error transaction and the source node must reroute to discover an alternate route. It is therefore important to construct a stable route when route discovery occurs. In this paper, we use the relative speeds among nodes as a measure of node mobility. Our routing algorithm chooses nodes of lower relative speed as relay nodes. In our simulation, when there is one session in the network, the proposed algorithm reduces the number of route breaks to about one third of that of DSR and delivers packets at an 18% higher rate than DSR. However, the algorithm needs improvement in congested traffic situations.
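The relay-selection rule described in this abstract can be sketched in a few lines. The example below assumes each node knows 2D velocity vectors for itself and its neighbors and simply picks the neighbor with the lowest relative speed; how the paper actually measures or exchanges relative speed is not stated in the abstract, and the names `relative_speed` and `pick_relay` are hypothetical.

```python
import math


def relative_speed(v_a, v_b):
    """Magnitude of the velocity difference between two nodes (2D vectors)."""
    return math.hypot(v_a[0] - v_b[0], v_a[1] - v_b[1])


def pick_relay(current_velocity, candidates):
    """Return the candidate name with the lowest speed relative to us.

    candidates maps a node name to its (vx, vy) velocity. A low relative
    speed suggests the link will stay within radio range for longer.
    """
    return min(
        candidates,
        key=lambda name: relative_speed(current_velocity, candidates[name]),
    )


if __name__ == "__main__":
    neighbors = {"A": (1.0, 0.5), "B": (-1.0, 0.0), "C": (3.0, 4.0)}
    print(pick_relay((1.0, 0.0), neighbors))
```

A node moving in nearly the same direction at nearly the same speed is preferred even if it moves fast in absolute terms, since only the relative motion affects how long the link survives.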
