Takuya ASAKA Hiroyoshi MIWA Yoshiaki TANAKA
Distributed Web caching allows multiple clients to quickly access a pool of popular Web pages. Conventional distributed Web caching schemes, e.g., the Internet cache protocol and hash routing, require many query messages to be sent among cache servers and/or impose a large load on the cache servers when they are widely dispersed. To overcome these problems, we propose a hash-based query caching method using both a hash function and a query caching method. This method can find cached objects among several cache servers by using only one query message, enabling the construction of an efficient large-scale distributed Web cache server. Compared to conventional methods, this method reduces cache server overhead and object retrieval latency.
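The single-query property rests on every server computing the same mapping from a URL to a server. The following is a minimal sketch of that hash-routing idea only, not the authors' query caching method; the function name `home_server`, the server names, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib

def home_server(url: str, servers: list[str]) -> str:
    """Map a URL to exactly one cache server with a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and reduce it modulo the pool size.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Hypothetical server pool and URL, used only to exercise the function.
servers = ["cache-a", "cache-b", "cache-c"]
url = "http://example.com/index.html"
target = home_server(url, servers)
```

Because the mapping is deterministic, any server that receives a request can send one query to `target` instead of polling every peer.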
The intermediate language (IL) modularizes a compiler into target-processor-independent and target-processor-dependent parts, called the front-end and the back-end. By adding a new back-end, it is possible to port existing software from one processor to another. This paper presents a new, efficient approach to targeting multiple, quite different architectures and processors by translating from one IL into other existing ILs. This approach makes it possible to reuse existing back-ends. It has been successfully applied to a commercial-scale project for porting public switching system software. Since the target ILs were not predictable in advance, we provided an abstract syntax tree (AST) with attributes accessible through an abstract data type (ADT) interface to convey the source language information from our front-end to the back-ends. It was translated into several ILs that had been developed independently. These translations made the compiler available in a very short time for different cross-target platforms and on the several workstations we needed. The structure of this AST and the mapping to these ILs are presented, and the retargeting cost is evaluated.
Toshifumi MORIYAMA Masafumi NAKAMURA Yoshio YAMAGUCHI Hiroyoshi YAMADA Wolfgang-M. BOERNER
This paper discusses the classification of targets buried underground by radar polarimetry. Subsurface radar is used for the detection of objects buried beneath the ground surface, such as gas pipes, cables and cavities, or in archeological exploration. In addition to the target echo, the subsurface radar receives various other echoes, because the underground is an inhomogeneous medium. Therefore, the subsurface radar needs to distinguish these echoes. In order to enhance the discrimination capability, we first applied the polarization anisotropy coefficient to distinguish echoes from isotropic targets (plate, sphere) versus anisotropic targets (wire, pipe). It is straightforward to find a man-made target buried underground using the polarization anisotropy coefficient. Second, we tried to classify targets using the polarimetric signature approach, in which the characteristic polarization state provides the orientation angle of an anisotropic target. All of these values contribute to the classification of a target. Field experiments using an ultra-wideband (250 MHz to 1 GHz) FM-CW polarimetric radar system were carried out to show the usefulness of radar polarimetry. In this paper, several detection and classification results are demonstrated. It is shown that these techniques improve the detection capability for buried targets considerably.
Future cellular systems are envisioned to support mixed traffic, and ultimately multimedia services. However, a mixture of voice and data requires novel service mechanisms that can guarantee quality of service. In order to transfer high-speed data, multislot channel allocation is seen as a favoured solution for present systems with the least compromise to circuit-switched services. This paper evaluates the performance of narrowband voice calls and multislot data packet transmission in such integrated systems by using a matrix-analytic approach. This method achieves quadratic convergence compared to the conventional spectral methods. Mobility is also considered in a prioritized cellular environment where frequent handoff has the potential of degrading data performance. The voice call distribution, data packet throughput, delay and waiting time distribution are derived. Moreover, a new multiple priority-based distributed control algorithm and a voice rate control scheme are enforced to mitigate the queuing congestion of data packets. The numerical results derived from this study show that larger data packets incur longer latency and that the use of these flexible schemes can improve the overall performance.
This paper outlines the modeling requirements of integrated circuit (IC) fabrication processes that have led to and sustained the development of computer-aided design of technology (i.e., TCAD). Over a period spanning more than two decades, the importance of TCAD modeling and the complexity of the required models have grown steadily. The paper also illustrates typical applications where TCAD has been powerful and strategic to the scaling of IC processes. Finally, the future issues of atomic-scale modeling and the need for a hierarchical approach to capture and use such detailed information at higher levels of simulation are discussed.
From our previous studies, we derived the worst case cell delay within an ATM switch and thus can find the worst case end-to-end delay for a set of real-time connections. We observed that these delays are sensitive to the priority assignment of the connections. With a better priority assignment scheme within the switch, the worst case delay can be reduced, providing better network performance. We extend our previous work on the closed form analysis to conduct a more extensive experimental study of how different priority assignments and system parameters may affect the performance. Furthermore, from our worst case delay analysis on a regulated ATM switch, network traffic can be smoothed by a leaky bucket at the output controller for each connection. With an appropriate setting of the leaky bucket parameter, the burstiness of the network traffic can be reduced without increasing the delay in the switch. Therefore, fewer buffers will be required for each active connection within the switch. In this paper, our experimental results show that the buffer requirement can be reduced by up to 5.75% for each connection, which could be significant when hundreds of connections are passing through the switches within a regulated ATM network.
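The smoothing effect of a leaky bucket can be illustrated with a minimal sketch. It assumes the simplest form of regulator, one that enforces a minimum spacing between consecutive cells of a connection; the function name `shape`, the parameter `interval`, and the arrival times are hypothetical, and the sketch does not reproduce the worst case delay analysis of the paper.

```python
def shape(arrivals, interval):
    """Return departure times when consecutive cells must be at least `interval` apart."""
    departures = []
    next_free = float("-inf")  # earliest time the regulator may release the next cell
    for t in arrivals:
        d = max(t, next_free)  # a cell leaves on arrival unless it must wait for its slot
        departures.append(d)
        next_free = d + interval
    return departures

# Hypothetical burst: three cells arrive together, a fourth arrives later.
burst = [0.0, 0.0, 0.0, 5.0]
smoothed = shape(burst, 1.0)  # [0.0, 1.0, 2.0, 5.0]
```

The three simultaneous cells leave one time unit apart, so a downstream buffer never sees more than one cell of this connection per interval.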
Choong Ho LEE Masayuki KAWAMATA Tatsuo HIGUCHI
This paper proposes an analysis method of the roundoff error due to finite-wordlength decoding in fractal image coding. The proposed method can be applied to large images such as 256 × 256 or 512 × 512 images because it needs no complex matrix computation. The simplified model used here ignores the effect of decimation ratio on the roundoff error because it is negligible. As an analysis result, the proposed method gives the output error variance which consists of grey-tone scaling coefficients and an iteration number. This method is tested on various types of 12 standard images which have 256 × 256 size or 512 × 512 size with 256 grey levels. Comparisons of simulation results with analysis results are given. The results show that our analysis method is valid for the fractal image coding.
Hitoshi KIYA Jun FURUKAWA Yoshihiro NOGUCHI
We propose a motion estimation algorithm that uses images with fewer gray levels, whose pixels are represented with fewer than 8 bits. The threshold values for generating low-bit pixels from 8-bit pixels are simply taken as the median values of the pixels in a macroblock. The proposed algorithm reduces the computational complexity of motion estimation with little loss of video quality. Moreover, median cut quantization can be applied to multilevel images and combined with many fast algorithms to obtain more effective algorithms.
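A minimal sketch of the one-bit case may make the median threshold concrete. The block contents are hypothetical, and the matching criterion (counting differing bits with XOR) is an assumption commonly used with binary images, not necessarily the criterion used by the authors.

```python
from statistics import median

def binarize_block(block):
    """Reduce a block of 8-bit pixels to 1-bit pixels, thresholding at the block median."""
    pixels = [p for row in block for p in row]
    threshold = median(pixels)
    return [[1 if p >= threshold else 0 for p in row] for row in block]

def bit_mismatch(a, b):
    """Count positions where two binary blocks differ (assumed matching cost)."""
    return sum(x ^ y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

# Hypothetical 2 x 2 blocks from the current and reference frames.
current = binarize_block([[10, 200], [30, 180]])     # [[0, 1], [0, 1]]
candidate = binarize_block([[12, 190], [170, 40]])   # [[0, 1], [1, 0]]
cost = bit_mismatch(current, candidate)              # 2
```

The cost replaces a sum of 8-bit absolute differences with a count of bit mismatches, which is where the reduction in computation comes from.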
Yoon-Jong KIM Dong-Hoon LEE Seung-Hong HONG
In this paper, a near-real-time digital radiography system was implemented for the automatic verification of local errors between the simulation plan and radiation therapy. The portal image was acquired through a video camera, an image board and a PC after the therapy radiation was converted into light by a metal/fluorescent screen. Considering the divergence according to the distance between the source and the plate, we made a 340 × 340 × 12 cm³ basis point plate on which five lead (Pb) rods of 4 cm height and 8 mm diameter were mounted to display reference points on the simulator and the portal image. We converted the portal image into a binary image using the optimal threshold value obtained through histogram analysis of the portal image acquired with the basis point plate. We obtained the location information of the iso-center and basis points from the binary image, and removed the systematic errors arising from the differences between the simulation plan and the portal image. The field size, measured automatically from the optimally thresholded portal image, was verified against the simulation plan. Anatomic errors were automatically detected and verified with the normalized simulation and portal images by a pattern matching method after irradiating a part of the radiation. These techniques improved therapy efficiency and reduced radiation side effects, so exact radiation treatment is expected.
In this paper, an analysis of the oversampling data recovery circuit is presented. The input waveform is assumed to be a non-return-to-zero (NRZ) binary signal. A finite Markov chain model is used to evaluate the steady-state phase jitter performance. Theoretical analysis enables us to predict the input signal-to-noise ratio (SNR) versus bit error rate (BER) of the oversampling data recovery circuit for various oversampling ratios. A larger number of samples per bit results in a better BER at the same input SNR. To achieve a BER of 10⁻¹¹, 8 times oversampling has about a 2 dB input signal penalty compared to 16 times oversampling. As an architectural choice in the oversampling data recovery circuit, the recovered clock can be updated in each data bit or in every multiple bits, depending on the input data rate and input noise. Two different clock update schemes were analyzed and compared. The scheme updating the clock in every data bit has about a 1.5 dB penalty against the multiple-bit (4 bits) clock updating scheme with 16 times oversampling for input data dominated by white noise. The results were applied to the fabricated circuits to validate the analysis.
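The steady-state evaluation of a finite Markov chain can be sketched as follows. The three-state transition matrix is a hypothetical stand-in for a phase-error chain (early, aligned, late) and is not taken from the paper; the function name `steady_state` and the use of power iteration are likewise assumptions for illustration.

```python
def steady_state(P, iterations=1000):
    """Approximate the stationary distribution of a row-stochastic matrix by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # One step of pi <- pi * P.
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical chain: states are "early", "aligned", "late" sampling phase.
P = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
pi = steady_state(P)  # converges to [0.25, 0.5, 0.25]
```

Once the stationary probabilities are known, a jitter or error measure can be obtained by weighting a per-state quantity with `pi`.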
Kei EGUCHI Takahiro INOUE Akio TSUNEDA
In this paper, a new digital chaos circuit which can generate multiple-scroll strange attractors is proposed. Because it is based on a piecewise-linear function determined by on-chip supervised learning, the proposed circuit can generate multiple-scroll strange attractors and hence exhibit various bifurcation phenomena. By numerical simulations, the learning dynamics and the quasi-chaos generation of the proposed digital chaos circuit are analyzed in detail. Furthermore, as a design example of the integrated digital chaos circuit, the proposed circuit realizing the nonlinear function with five breakpoints is implemented on an FPGA (field-programmable gate array). The synthesized FPGA circuit, which can generate n-scroll strange attractors (n = 1, 2, 4), showed that the proposed circuit is implementable on a single FPGA except for the SRAM.
Byung-Chul KIM Dong-Ho KIM You-Ze CHO Yoon-Young AN Yul KWON
This letter proposes an efficient implementation method for a binary feedback switch, called EFCI/RELAY, which can reduce the feedback delay of the congestion status of a switch in multiple-hop network environments. At each transit switch, this method relays the EFCI-bit contained in an incoming data cell to the head-of-line cell with a corresponding VC which is waiting for transmission in the output buffer. Simulation results show that the proposed method can achieve a lower queue length while maintaining a higher link utilization.
Taewhan KIM Ki-Seok CHUNG C. L. LIU
This paper presents a new data path synthesis algorithm which simultaneously takes into account three important design criteria: testability, design area, and total execution time. We define a goodness measure on the testability of a circuit based on three rules of thumb introduced in prior work on synthesis for testability. We then develop a stepwise refinement synthesis algorithm which carries out the scheduling and allocation tasks in an integrated fashion. Experimental results for benchmark and other circuit examples show that we were able to enhance the testability of circuits significantly with very little overhead in design area and execution time.
Chi-Sung LAIH Fu-Kuan TU Yung-Cheng LEE
Secret information stored in a tamper-free device can be revealed during the decryption or signature generation processes by a fault-based attack. In this paper, based on the coding approach, we propose a new fault-resistant system which enables any fault existing in modular multiplication and exponentiation computations to be detected with a very high probability. The proposed method can be used to implement all crypto-schemes whose basic operations are modular multiplications, so as to resist both memory and computational fault-based attacks with a very low computational overhead.
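As a rough illustration of detecting a fault in a modular multiplication, the sketch below uses a generic residue check: the product is computed modulo n·r for a small check modulus r and verified modulo r before being reduced modulo n. This is a swapped-in textbook technique, not the coding scheme proposed in the paper; the function name `checked_modmul`, the moduli, and the `fault` parameter (a simulated bit flip) are all assumptions.

```python
def checked_modmul(a, b, n, r, fault=0):
    """Compute (a * b) mod n with a residue check modulo r (generic sketch)."""
    z = (a * b) % (n * r)      # extended result carries information modulo both n and r
    z ^= fault                 # simulated fault injected into the intermediate result
    expected = ((a % r) * (b % r)) % r
    if z % r != expected:      # an undetected fault must preserve the residue modulo r
        raise ArithmeticError("fault detected in modular multiplication")
    return z % n

# Hypothetical operands and moduli.
result = checked_modmul(123, 456, 1009, 251)
```

A random fault escapes this check only if it leaves the residue modulo r unchanged, which for a random error happens with probability of about 1/r.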
Tomoharu SHIBUYA Ryo HASEGAWA Kohichi SAKANIWA
In this paper, we introduce a lower bound for the generalized Hamming weights, which is applicable to an arbitrary linear code, in terms of the notion of well-behaving. We also show that any [n,k] linear code C over a finite field F is the t-th rank MDS for t such that g(C)+1 ≤ t ≤ k, where g(C) is easily calculated from the basis of F^n so chosen that whose first n-k elements generate C. Finally, we apply our result to Reed-Solomon, Reed-Muller and algebraic geometry codes on C_ab, and determine g(C) for each code.
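For readers unfamiliar with generalized Hamming weights, the r-th weight of a code is the smallest support size of any r-dimensional subcode. The sketch below computes the hierarchy of a small binary code by exhaustive search; it illustrates the definition only and is not the lower bound introduced in the paper. The function names and the choice of the binary [7,4] Hamming code as the example are assumptions.

```python
from itertools import combinations

def span(gens):
    """Return all binary linear combinations of the generators (words stored as integers)."""
    words = {0}
    for g in gens:
        words |= {w ^ g for w in words}
    return words

def generalized_hamming_weights(gens):
    """Brute-force the weight hierarchy d_1..d_k of the binary code spanned by `gens`."""
    k = len(gens)
    nonzero = sorted(span(gens) - {0})
    weights = []
    for r in range(1, k + 1):
        best = None
        for combo in combinations(nonzero, r):
            if len(span(combo)) != 2 ** r:
                continue  # the chosen words are linearly dependent
            support = 0
            for w in combo:
                support |= w  # support of a subcode is the union of its basis supports
            size = bin(support).count("1")
            if best is None or size < best:
                best = size
        weights.append(best)
    return weights

# Systematic generator rows of a binary [7,4] Hamming code.
hamming = [0b1000110, 0b0100101, 0b0010011, 0b0001111]
hierarchy = generalized_hamming_weights(hamming)  # [3, 5, 6, 7]
```

The exhaustive search is exponential in the code dimension, which is why bounds that can be read off from a suitably chosen basis are valuable for codes of practical size.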
In this paper, a novel variable-rate vector quantizer (VQ) design algorithm using a fuzzy clustering technique is presented. The algorithm, termed the fuzzy entropy-constrained VQ (FECVQ) design algorithm, has a better rate-distortion performance than that of the usual entropy-constrained VQ (ECVQ) algorithm for variable-rate VQ design. When performing the fuzzy clustering, the FECVQ algorithm considers both the usual squared-distance measure and the length of the channel index associated with each codeword, so that the average rate of the VQ can be controlled. In addition, the membership function for achieving the optimal clustering for the design of FECVQ is derived. Simulation results demonstrate that the FECVQ can be an effective alternative for the design of variable-rate VQs.
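The trade-off between squared distance and channel index length can be sketched with the conventional hard-decision ECVQ encoding rule, which selects the codeword minimizing distortion plus a multiplier times the index length. This is the baseline rule, not the fuzzy membership function derived in the paper; the codebook, index lengths, and the multiplier name `lam` are hypothetical.

```python
def ecvq_encode(x, codebook, lengths, lam):
    """Return the index minimizing squared distance + lam * index length."""
    def cost(i):
        distortion = sum((xi - ci) ** 2 for xi, ci in zip(x, codebook[i]))
        return distortion + lam * lengths[i]
    return min(range(len(codebook)), key=cost)

# Hypothetical two-codeword codebook with index lengths in bits.
codebook = [(0.0, 0.0), (1.0, 1.0)]
lengths = [1, 3]
x = (0.6, 0.6)

nearest = ecvq_encode(x, codebook, lengths, lam=0.0)       # 1: pure nearest neighbour
rate_aware = ecvq_encode(x, codebook, lengths, lam=0.5)    # 0: shorter index wins
```

Raising `lam` shifts vectors toward codewords with short indices, lowering the average rate at the cost of higher distortion.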
The Hawkins noise model is modified for HBT application. The non-ideal ideality factor of the HBT is included in both the dynamic resistance and the noise figure equations. Emitter resistance is also included. An extraction method for the noise resistance Rn is developed. Based on the method, a simple analytic equation for Rn is derived and experimentally verified. The effects of noise sources on the minimum noise figure are analyzed. The dominant noise sources are the shot noises of the emitter and collector currents. Generally, when the minimum noise figure is measured at various current levels, there exists a current level at which the slope of the minimum noise figure curve is zero. This zero-slope current level coincides with the current level at which the noise contribution of the emitter and collector shot noises, including the cancellation by correlation of the two sources, is minimum. Parasitic resistance degrades output noise through shot noise amplification, with a minor effect from its own thermal noise.
Andrzej J. STROJWAS Xiaolei LI Kevin D. LUCAS
In this paper, we present a rigorous vector 3D lithography simulator, METROPOLE-3D, which is designed to run moderately fast on conventional engineering workstations. METROPOLE-3D solves Maxwell's equations rigorously in three dimensions to model how non-vertically incident light is scattered and transmitted in non-planar structures. METROPOLE-3D consists of several simulation modules: a photomask simulator, an exposure simulator, a post-exposure baking module and a 3D development module. This simulator has been applied to a wide range of pressing engineering problems encountered in state-of-the-art VLSI fabrication processes, such as layout printability/manufacturability analysis including reflective notching problems and optimization of an anti-reflective coating (ARC) layer. Finally, a 3D contamination-to-defect transformation study was successfully performed using our rigorous simulator.
Yuji TAKAHASHI Kazuaki KUNIHIRO Yasuo OHNO
A device simulator that simulates device performance in the cyclic bias steady state was developed and applied to the pulse pattern effect in GaAs hetero-junction FETs (HJFETs). Although there is a large time-constant difference between the pulse signals and deep trap reactions, the simulator finds the cyclic bias steady states in about 30 iterations. A non-linear shift in the drain current level with the mark ratio was confirmed, which has been estimated from the rate equation of electron capture and emission based on Shockley-Read-Hall statistics for deep traps.
Christoph JUNGEMANN Stefan KEITH Martin BARTELS Bernd MEINERZHAGEN
The full-band Monte Carlo technique is currently the most accurate device simulation method, but its usefulness is limited because it is very CPU intensive. This work describes in detail efficient algorithms which raise the efficiency of the full-band Monte Carlo method to a level where it becomes applicable in the device design process beyond exemplary simulations. The k-space is discretized with a nonuniform tetrahedral grid, which minimizes the discretization error of the linear energy interpolation and the memory requirements. A consistent discretization of the inverse mass tensor is utilized to formulate efficient transport parameter estimators. Particle scattering is modeled in such a way that a very fast rejection technique can be used for the generation of the final state, eliminating the main cause of the inefficiency of full-band Monte Carlo simulations. The developed full-band Monte Carlo simulator is highly efficient. For example, in conjunction with the non-self-consistent simulation technique, CPU times of a few CPU minutes per bias point are achieved for substrate current calculations. Self-consistent calculations of the drain current of a 60 nm NMOSFET take a few CPU hours, demonstrating the feasibility of full-band Monte Carlo simulations.
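The rejection technique mentioned for final-state generation can be illustrated generically. The sketch below draws a candidate state uniformly and accepts it with probability proportional to its rate relative to a known upper bound; the discrete list of rates, the function name `sample_final_state`, and the bound `rate_max` are assumptions, and the sketch does not reproduce the scattering model or the tetrahedral k-space grid of the paper.

```python
import random

def sample_final_state(rates, rate_max, rng):
    """Pick index i with probability proportional to rates[i] by rejection sampling."""
    while True:
        i = rng.randrange(len(rates))          # uniform candidate state
        if rng.random() * rate_max < rates[i]:  # accept with probability rates[i] / rate_max
            return i

# Hypothetical rates for three candidate final states; rate_max must bound them all.
rates = [0.0, 1.0, 3.0]
rng = random.Random(0)
samples = [sample_final_state(rates, 3.0, rng) for _ in range(2000)]
```

The method needs only an upper bound on the rates rather than a normalized cumulative table, and it is fast when the bound is tight because few candidates are rejected.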