Register-Based Process Virtual Machine Acceleration Using Hardware Extension with Hybrid Execution

Surachai THONGKAEW; Tsuyoshi ISSHIKI; Dongju LI; Hiroaki KUNIEDA

doi:10.1587/transfun.E98.A.2505

IEICE TRANSACTIONS on Fundamentals

Register-Based Process Virtual Machine Acceleration Using Hardware Extension with Hybrid Execution

Surachai THONGKAEW, Tsuyoshi ISSHIKI, Dongju LI, Hiroaki KUNIEDA

Full Text Views

0

Cite this

Summary :

The Process Virtual Machine (VM) is typical software that runs applications inside operating systems. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware, operating system and allows bytecodes (portable code) to be executed in the same way on any other platforms. The Process VMs are implemented using an interpreter to interpret bytecode instead of direct execution of host machine codes. Thus, the bytecode execution is slower than those of the compiled programming language execution. Several techniques including our previous paper, the “Fetch/Decode Hardware Extension”, have been proposed to speed up the interpretation of Process VMs. In this paper, we propose an additional methodology, the “Hardware Extension with Hybrid Execution” to further enhance the performance of Process VMs interpretation and focus on Register-based model. This new technique provides an additional decoder which can classify bytecodes into either simple or complex instructions. With “Hybrid Execution”, the simple instruction will be directly executed on hardware of native processor. The complex instruction will be emulated by the “extra optimized bytecode software handler” of native processor. In order to eliminate the overheads of retrieving and storing operand on memory, we utilize the physical registers instead of (low address) virtual registers. Moreover, the combination of 3 techniques: Delay scheduling, Mode predictor HW and Branch/goto controller can eliminate all of the switching mode overheads between native mode and bytecode mode. The experimental results show the improvements of execution speed on the Arithmetic instructions, loop & conditional instructions and method invocation & return instructions can be achieved up to 16.9x, 16.1x and 3.1x respectively. The approximate size of the proposed hardware extension is 0.04mm² (or equivalent to 14.81k gates) and consumes an additional power of only 0.24mW. The stated results are obtained from logic synthesis using the TSMC 90nm technology @ 200MHz.

Publication: IEICE TRANSACTIONS on Fundamentals Vol.E98-A No.12 pp.2505-2518

Publication Date: 2015/12/01

Publicized

Online ISSN: 1745-1337

DOI: 10.1587/transfun.E98.A.2505

Type of Manuscript: Special Section PAPER (Special Section on VLSI Design and CAD Algorithms)

Category: High-Level Synthesis and System-Level Design

Authors

Surachai THONGKAEW
  Tokyo Institute of Technology
Tsuyoshi ISSHIKI
  Tokyo Institute of Technology
Dongju LI
  Tokyo Institute of Technology
Hiroaki KUNIEDA
  Tokyo Institute of Technology

Keyword

Cite this

Copy

Surachai THONGKAEW, Tsuyoshi ISSHIKI, Dongju LI, Hiroaki KUNIEDA, "Register-Based Process Virtual Machine Acceleration Using Hardware Extension with Hybrid Execution" in IEICE TRANSACTIONS on Fundamentals, vol. E98-A, no. 12, pp. 2505-2518, December 2015, doi: 10.1587/transfun.E98.A.2505.
Abstract: The Process Virtual Machine (VM) is typical software that runs applications inside operating systems. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware, operating system and allows bytecodes (portable code) to be executed in the same way on any other platforms. The Process VMs are implemented using an interpreter to interpret bytecode instead of direct execution of host machine codes. Thus, the bytecode execution is slower than those of the compiled programming language execution. Several techniques including our previous paper, the “Fetch/Decode Hardware Extension”, have been proposed to speed up the interpretation of Process VMs. In this paper, we propose an additional methodology, the “Hardware Extension with Hybrid Execution” to further enhance the performance of Process VMs interpretation and focus on Register-based model. This new technique provides an additional decoder which can classify bytecodes into either simple or complex instructions. With “Hybrid Execution”, the simple instruction will be directly executed on hardware of native processor. The complex instruction will be emulated by the “extra optimized bytecode software handler” of native processor. In order to eliminate the overheads of retrieving and storing operand on memory, we utilize the physical registers instead of (low address) virtual registers. Moreover, the combination of 3 techniques: Delay scheduling, Mode predictor HW and Branch/goto controller can eliminate all of the switching mode overheads between native mode and bytecode mode. The experimental results show the improvements of execution speed on the Arithmetic instructions, loop & conditional instructions and method invocation & return instructions can be achieved up to 16.9x, 16.1x and 3.1x respectively. The approximate size of the proposed hardware extension is 0.04mm² (or equivalent to 14.81k gates) and consumes an additional power of only 0.24mW. The stated results are obtained from logic synthesis using the TSMC 90nm technology @ 200MHz.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E98.A.2505/_p

Copy

@ARTICLE{e98-a_12_2505,
author={Surachai THONGKAEW, Tsuyoshi ISSHIKI, Dongju LI, Hiroaki KUNIEDA, },
journal={IEICE TRANSACTIONS on Fundamentals},
title={Register-Based Process Virtual Machine Acceleration Using Hardware Extension with Hybrid Execution},
year={2015},
volume={E98-A},
number={12},
pages={2505-2518},
abstract={The Process Virtual Machine (VM) is typical software that runs applications inside operating systems. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware, operating system and allows bytecodes (portable code) to be executed in the same way on any other platforms. The Process VMs are implemented using an interpreter to interpret bytecode instead of direct execution of host machine codes. Thus, the bytecode execution is slower than those of the compiled programming language execution. Several techniques including our previous paper, the “Fetch/Decode Hardware Extension”, have been proposed to speed up the interpretation of Process VMs. In this paper, we propose an additional methodology, the “Hardware Extension with Hybrid Execution” to further enhance the performance of Process VMs interpretation and focus on Register-based model. This new technique provides an additional decoder which can classify bytecodes into either simple or complex instructions. With “Hybrid Execution”, the simple instruction will be directly executed on hardware of native processor. The complex instruction will be emulated by the “extra optimized bytecode software handler” of native processor. In order to eliminate the overheads of retrieving and storing operand on memory, we utilize the physical registers instead of (low address) virtual registers. Moreover, the combination of 3 techniques: Delay scheduling, Mode predictor HW and Branch/goto controller can eliminate all of the switching mode overheads between native mode and bytecode mode. The experimental results show the improvements of execution speed on the Arithmetic instructions, loop & conditional instructions and method invocation & return instructions can be achieved up to 16.9x, 16.1x and 3.1x respectively. The approximate size of the proposed hardware extension is 0.04mm² (or equivalent to 14.81k gates) and consumes an additional power of only 0.24mW. The stated results are obtained from logic synthesis using the TSMC 90nm technology @ 200MHz.},
keywords={},
doi={10.1587/transfun.E98.A.2505},
ISSN={1745-1337},
month={December},}

Copy

TY - JOUR
TI - Register-Based Process Virtual Machine Acceleration Using Hardware Extension with Hybrid Execution
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 2505
EP - 2518
AU - Surachai THONGKAEW
AU - Tsuyoshi ISSHIKI
AU - Dongju LI
AU - Hiroaki KUNIEDA
PY - 2015
DO - 10.1587/transfun.E98.A.2505
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E98-A
IS - 12
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - December 2015
AB - The Process Virtual Machine (VM) is typical software that runs applications inside operating systems. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware, operating system and allows bytecodes (portable code) to be executed in the same way on any other platforms. The Process VMs are implemented using an interpreter to interpret bytecode instead of direct execution of host machine codes. Thus, the bytecode execution is slower than those of the compiled programming language execution. Several techniques including our previous paper, the “Fetch/Decode Hardware Extension”, have been proposed to speed up the interpretation of Process VMs. In this paper, we propose an additional methodology, the “Hardware Extension with Hybrid Execution” to further enhance the performance of Process VMs interpretation and focus on Register-based model. This new technique provides an additional decoder which can classify bytecodes into either simple or complex instructions. With “Hybrid Execution”, the simple instruction will be directly executed on hardware of native processor. The complex instruction will be emulated by the “extra optimized bytecode software handler” of native processor. In order to eliminate the overheads of retrieving and storing operand on memory, we utilize the physical registers instead of (low address) virtual registers. Moreover, the combination of 3 techniques: Delay scheduling, Mode predictor HW and Branch/goto controller can eliminate all of the switching mode overheads between native mode and bytecode mode. The experimental results show the improvements of execution speed on the Arithmetic instructions, loop & conditional instructions and method invocation & return instructions can be achieved up to 16.9x, 16.1x and 3.1x respectively. The approximate size of the proposed hardware extension is 0.04mm² (or equivalent to 14.81k gates) and consumes an additional power of only 0.24mW. The stated results are obtained from logic synthesis using the TSMC 90nm technology @ 200MHz.
ER -

IEICE TRANSACTIONS on Fundamentals