IEICE global.ieice.org Site

Keyword Search Result

[Keyword] ATI(18690hit)

5861-5880hit(18690hit)

Comparing Operating Systems Scalability on Multicore Processors by Microbenchmarking
Yan CUI Yu CHEN Yuanchun SHI

PAPER-Computer System and Services

Vol:
E95-D No:12
Page(s):
2810-2820
Multicore processor architectures have become ubiquitous in today's computing platforms, especially in parallel computing installations, with their power and cost advantages. While the technology trend continues towards having hundreds of cores on a chip in the foreseeable future, an urgent question posed to system designers as well as application users is whether applications can receive sufficient support on today's operating systems for them to scale to many cores. To this end, people need to understand the strengths and weaknesses of the operating systems' support for scalability and to identify major bottlenecks limiting the scalability, if any. As open-source operating systems are of particular interest in the research and industry communities, in this paper we choose three operating systems (Linux, Solaris and FreeBSD) to systematically evaluate and compare their scalability by using a set of highly-focused microbenchmarks for a broad and detailed understanding of their scalability on an AMD 32-core system. We use system profiling tools and analyze kernel source code to find out the root cause of each observed scalability bottleneck. Our results reveal that there is no single operating system among the three standing out on all system aspects, though some system(s) can prevail on some of the system aspects. For example, Linux outperforms Solaris and FreeBSD significantly for file-descriptor- and process-intensive operations. For applications with intensive socket creation and deletion operations, Solaris leads FreeBSD, which scales better than Linux. With the help of performance tools and source code instrumentation and analysis, we find that synchronization primitives protecting shared data structures in the kernels are the major bottleneck limiting system scalability.
TSV Geometrical Variations and Optimization Metric with Repeaters for 3D IC
Hung Viet NGUYEN Myunghwan RYU Youngmin KIM

PAPER-Integrated Electronics

Vol:
E95-C No:12
Page(s):
1864-1871
This paper evaluates the impact of Through-Silicon Vias (TSVs) on the performance and power consumption of 3D circuitry. A physical and electrical model of the TSV that considers the coupling effects with adjacent TSVs is exploited in our investigation. Simulation results show that the overall performance of a 3D IC with TSVs can be improved noticeably. The frequency of the ring oscillator in a 4-tier stacking layout reaches up to two times that of one in a 2D planar layout. Furthermore, TSV process variations are examined by Monte Carlo simulations to identify the geometrical factor having more impact in manufacturing. An in-depth study of repeaters associated with TSVs offers a metric to compute the optimization of 3D systems integration in terms of performance and energy dissipation. By such an optimization metric, with 45 nm MOSFETs used in our circuit layout, it is found that the optimal number of tiers in both performance and power consumption approaches 4, since the substantial TSV-TSV coupling effect in the worst case of interference is expected in 3D ICs.
Permutation Polynomials of Higher Degrees for Turbo Code Interleavers
Jonghoon RYU

LETTER

Vol:
E95-B No:12
Page(s):
3760-3762
Permutation polynomial based interleavers over integer rings, in particular quadratic permutation polynomials, have been widely studied. In this letter, higher degree permutation polynomials are considered for interleavers, and permutation polynomials superior to quadratic permutation polynomials are found for some lengths.
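As a rough illustration of the idea in this abstract, a permutation polynomial interleaver over the integer ring Z_N maps position x to pi(x) = (f1*x + f2*x^2 + ... + fd*x^d) mod N, and is usable as an interleaver only if pi is a bijection on {0, ..., N-1}. The sketch below is not taken from the letter: the function names are hypothetical, and the coefficients (f1 = 3, f2 = 10, N = 40) are illustrative values chosen only because they satisfy the known conditions for a quadratic permutation polynomial.

```python
def poly_interleaver(coeffs, n):
    """Return [pi(0), ..., pi(n-1)] for pi(x) = sum(coeffs[k] * x**(k+1)) mod n.

    coeffs[0] is the linear coefficient f1, coeffs[1] the quadratic f2, etc.
    (Hypothetical helper for illustration; no constant term is used.)
    """
    return [
        sum(c * pow(x, k + 1, n) for k, c in enumerate(coeffs)) % n
        for x in range(n)
    ]


def is_permutation(mapping):
    """Brute-force check that the mapping hits every index exactly once."""
    return sorted(mapping) == list(range(len(mapping)))


def interleave(symbols, mapping):
    """Output position i carries the input symbol at position mapping[i]."""
    return [symbols[src] for src in mapping]


def deinterleave(symbols, mapping):
    """Invert interleave(): put each symbol back at its source position."""
    out = [None] * len(mapping)
    for pos, src in enumerate(mapping):
        out[src] = symbols[pos]
    return out


# Illustrative quadratic example (assumed values, not from the letter).
pi = poly_interleaver([3, 10], 40)
data = list(range(40))
restored = deinterleave(interleave(data, pi), pi)
```

A higher degree candidate is tried the same way by passing a longer coefficient list, e.g. `poly_interleaver([f1, f2, f3], n)`, and keeping it only if `is_permutation` returns `True`.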
Face Representation and Recognition with Local Curvelet Patterns
Wei ZHOU Alireza AHRARY Sei-ichiro KAMATA

PAPER-Image Recognition, Computer Vision

Vol:
E95-D No:12
Page(s):
3078-3087
In this paper, we propose Local Curvelet Binary Patterns (LCBP) and Learned Local Curvelet Patterns (LLCP) for representing the local features of facial images. The proposed methods are based on the Curvelet transform, which can overcome the weakness of traditional Gabor wavelets in higher dimensions and better capture the curve singularities and hyperplane singularities of facial images. LCBP can be regarded as a combination of Curvelet features and the LBP operator, while LLCP designs several learned codebooks from patch sets, which are constructed by sampling patches from Curvelet filtered facial images. Each facial image can be encoded into multiple pattern maps, and block-based histograms of these patterns are concatenated into a histogram sequence to be used as a face descriptor. During the face representation phase, one input patch is encoded by one pattern in LCBP but by multiple patterns in LLCP. Finally, an effective classifier called Weighted Histogram Spatially constrained Earth Mover's Distance (WHSEMD), which utilizes the discriminative powers of different facial parts, the different patterns and the spatial information of the face, is proposed. Performance assessment in face recognition and gender estimation under different challenges shows that the proposed approaches are superior to traditional ones.
Non-orthogonal Access Scheme over Multiple Channels with Iterative Interference Cancellation and Fractional Sampling in OFDM Receiver
Hiroyuki OSADA Mamiko INAMORI Yukitoshi SANADA

PAPER-Wireless Communication Technologies

Vol:
E95-B No:12
Page(s):
3837-3844
A diversity scheme with Fractional Sampling (FS) in OFDM receivers has been investigated recently. FS path diversity makes use of the imaging components of the desired signal transmitted on the adjacent channel. To increase the diversity gain with FS, the bandwidth of the transmit signal has to be enlarged. This leads to a reduction of spectrum efficiency. In this paper, non-orthogonal access over multiple channels in the frequency domain with iterative interference cancellation (IIC) and FS is proposed. The proposed scheme transmits the imaging component non-orthogonally on the adjacent channel. In order to accommodate the imaging component, it is underlaid on the other desired signal. Through diversity with FS and IIC, non-orthogonal access on multiple channels is realized. Our proposed scheme can obtain diversity gains for non-orthogonal signals modulated with QPSK.
Effect of Discharge Gap Shape on High-Speed Electrostatic Discharge Events
Masao MASUGI Norihito HIRASAWA Yoshiharu AKIYAMA Kazuo MURAKAWA

LETTER-Electromagnetic Compatibility(EMC)

Vol:
E95-B No:12
Page(s):
3898-3901
To clarify the characteristics of high-speed electrostatic discharge (ESD) events, we use two kinds of discharge electrodes: sphere- and cylinder-shaped ones. We measure the energy level of ESD waveforms with charging voltages of 0.25, 0.5, and 1.0 kV. We find that the cylindrical electrode yields higher high-speed ESD energies, especially when the charging voltage is high; this indicates that the discharge gap shape is an important factor in ESD events.
Lossless Compression of Double-Precision Floating-Point Data for Numerical Simulations: Highly Parallelizable Algorithms for GPU Computing
Mamoru OHARA Takashi YAMAGUCHI

PAPER-Parallel and Distributed Computing

Vol:
E95-D No:12
Page(s):
2778-2786
In numerical simulations using massively parallel computers like GPGPU (General-Purpose computing on Graphics Processing Units), we often need to transfer computational results from external devices such as GPUs to the main memory or secondary storage of the host machine. Since the computational results are sometimes too large to hold there, it is desirable that the data be compressed before being stored. In addition, considering the overheads of transferring data between the devices and host memories, it is preferable that the data be compressed as part of the parallel computation performed on the devices. Traditional compression methods for floating-point numbers do not always show good parallelism. In this paper, we propose a new compression method for massively-parallel simulations running on GPUs, in which we combine a few successive floating-point numbers and interleave them to improve compression efficiency. We also present numerical examples of the compression ratio and throughput obtained from experimental implementations of the proposed method running on CPUs and GPUs.
A Variability-Aware Energy-Minimization Strategy for Subthreshold Circuits
Junya KAWASHIMA Hiroshi TSUTSUI Hiroyuki OCHI Takashi SATO

PAPER-Device and Circuit Modeling and Analysis

Vol:
E95-A No:12
Page(s):
2242-2250
We investigate a design strategy for subthreshold circuits focusing on energy-consumption minimization and yield maximization under process variations. The design strategy is based on the following findings related to the operation of low-power CMOS circuits: (1) The minimum operation voltage (VDDmin) of a circuit is dominated by flip-flops (FFs), and VDDmin of an FF can be improved by upsizing a few key transistors, (2) VDDmin of an FF is stochastically modeled by a log-normal distribution, (3) VDDmin of a large circuit can be efficiently estimated by using the above model, which eliminates extensive Monte Carlo simulations, and (4) improving VDDmin may substantially contribute to decreasing energy consumption. The effectiveness of the proposed design strategy has been verified through circuit simulations on various circuits, which clearly show the design tradeoff between voltage scaling and transistor sizing.
Statistical Learning Theory of Quasi-Regular Cases
Koshi YAMADA Sumio WATANABE

PAPER-General Fundamentals and Boundaries

Vol:
E95-A No:12
Page(s):
2479-2487
Many learning machines such as normal mixtures and layered neural networks are not regular but singular statistical models, because the map from a parameter to a probability distribution is not one-to-one. The conventional statistical asymptotic theory cannot be applied to such learning machines because the likelihood function cannot be approximated by any normal distribution. Recently, a new statistical theory has been established based on algebraic geometry, and it was clarified that the generalization and training errors are determined by two birational invariants, the real log canonical threshold and the singular fluctuation. However, their concrete values are left unknown. In the present paper, we propose a new concept, a quasi-regular case in statistical learning theory. A quasi-regular case is not a regular case but a singular case; however, it has the same property as a regular case. In fact, we prove that, in a quasi-regular case, the two birational invariants are equal to each other, so that the symmetry of the generalization and training errors holds. Moreover, the concrete values of the two birational invariants are explicitly obtained; hence the quasi-regular case is useful for studying statistical learning theory.
Parameterization of Perfect Sequences over a Composition Algebra
Takao MAEDA Takafumi HAYASHI

PAPER-Sequence

Vol:
E95-A No:12
Page(s):
2139-2147
A parameterization of perfect sequences over composition algebras over the real number field is presented. According to the proposed parameterization theorem, a perfect sequence can be represented as a sum of trigonometric functions and points on a unit sphere of the algebra. Because of the non-commutativity of the multiplication, there are two definitions of perfect sequences, but the equivalence of the definitions is easily shown using the theorem. A composition sequence of sequences is introduced. Despite the non-associativity, the proposed theorem reveals that the composition sequence from perfect sequences is perfect.
GREAT-CEO: larGe scale distRibuted dEcision mAking Techniques for Wireless Chief Executive Officer Problems (Open Access)
Xiaobo ZHOU Xin HE Khoirul ANWAR Tad MATSUMOTO

INVITED PAPER

Vol:
E95-B No:12
Page(s):
3654-3662
In this paper, we reformulate the issue related to wireless mesh networks (WMNs) from the Chief Executive Officer (CEO) problem viewpoint, and provide a practical solution to a simple case of the problem. It is well known that the CEO problem is a theoretical basis for sensor networks. The problem investigated in this paper is described as follows: an originator broadcasts its binary information sequence to several forwarding nodes (relays) over Binary Symmetric Channels (BSC); the originator's information sequence suffers from independent random binary errors; the forwarding nodes simply interleave and encode the received bit sequence, and then forward it to the final destination (FD) over Additive White Gaussian Noise (AWGN) channels, without making heavy efforts to correct errors that may occur in the originator-relay links. Hence, this strategy reduces the complexity of the relay significantly. A joint iterative decoding technique at the FD is proposed by utilizing the knowledge of the correlation due to the errors occurring in the link between the originator and forwarding nodes (referred to as intra-link). The bit-error-rate (BER) performances show that the originator's information can be reconstructed at the FD even by using a very simple coding scheme. We provide a BER performance comparison between joint decoding and separate decoding strategies. The simulation results show that excellent performance can be achieved by the proposed system. Furthermore, extrinsic information transfer (EXIT) chart analysis is performed to investigate the convergence property of the proposed technique, with the aim of, in part, optimizing the code rate at the originator.
An Enhanced Doppler Spread Estimation Method for OFDM Systems
Bin SHENG Pengcheng ZHU Xiaohu YOU

LETTER-Wireless Communication Technologies

Vol:
E95-B No:12
Page(s):
3911-3914
In OFDM systems, the correlation of cyclic prefix (CP) with its corresponding part at the end of the symbol can be used to estimate the maximum Doppler spread. However, the estimation accuracy of this CP based method is seriously affected by the inter-symbol interference (ISI) generated in the multipath channel. In this letter, we propose an enhanced CP based method which is immune to the ISI and can hence obtain an unbiased estimate of the auto-correlation function in multipath channels.
Anonymous Authentication Scheme without Verification Table for Wireless Environments
Ryoichi ISAWA Masakatu MORII

LETTER-Cryptography and Information Security

Vol:
E95-A No:12
Page(s):
2488-2492
Lee and Kwon proposed an anonymous authentication scheme based on Zhu et al.'s scheme. However, Lee et al.'s scheme has two disadvantages. Firstly, their scheme is vulnerable to off-line dictionary attacks. An adversary can guess a user password from the user's login messages eavesdropped by the adversary. Secondly, an authentication server called a home agent requires a verification table, which violates the original advantage of Zhu et al.'s scheme. That is, it increases the key management costs of the home agent. In this letter, we show the weaknesses of Lee et al.'s scheme and another three existing schemes. Then, we propose a new secure scheme without the verification table, while providing security against off-line dictionary attacks and other attacks except for a certain type of combined attacks.
Throughput Comparisons of 32/64APSK Schemes Based on Mutual Information Considering Cubic Metric
Reo KOBAYASHI Teruo KAWAMURA Nobuhiko MIKI Mamoru SAWAHASHI

PAPER

Vol:
E95-B No:12
Page(s):
3719-3727
This paper presents comprehensive comparisons of the achievable throughput between the 32-/64-ary amplitude and phase shift keying (APSK) and cross 32QAM/square 64QAM schemes based on mutual information (MI) considering the peak-to-average power ratio (PAPR) of the modulated signal. As a PAPR criterion, we use a cubic metric (CM) that directly corresponds to the transmission back-off of a power amplifier. In the analysis, we present the best ring ratio for the 32 or 64APSK scheme from the viewpoint of minimizing the required received signal-to-noise power ratio (SNR) considering the CM that achieves the peak throughput, i.e., maximum error-free transmission rate. We show that the required received SNR considering the CM at the peak throughput is minimized with the number of rings of M = 3 and 4 for 32-ary APSK and 64-ary APSK, respectively. Then, we show with the best ring ratios that the (4, 12, 16) 32APSK scheme with M = 3 achieves a lower required received SNR considering the CM compared to that for the cross 32QAM scheme. Similarly, we show that the (4, 12, 20, 28) 64APSK scheme with M = 4 achieves almost the same required received SNR considering the CM as that for the square 64QAM scheme.
An Efficient OFDM Timing Synchronization for CMMB System
Yong WANG Jian-hua GE Jun HU Bo AI

PAPER-Transmission Systems and Transmission Equipment for Communications

Vol:
E95-B No:12
Page(s):
3786-3792
An accurate and rapid synchronization scheme is a prerequisite for achieving high-quality multimedia transmission for wireless handheld terminals, e.g., in the China multimedia mobile broadcasting (CMMB) system. In this paper, an efficient orthogonal frequency division multiplexing (OFDM) timing synchronization scheme, which is robust to the doubly selective fading channel, is proposed for the CMMB system. TS timing is derived by performing an inverse sliding correlation (ISC) between the segmented Sync sequences in the Beacon, which possesses the inverse conjugate symmetry (ICS) characteristic. The ISC can provide sufficient correlative gain even in ultra-low signal-to-noise ratio (SNR) scenarios. Moreover, a fast fine symbol timing method based on the auto-correlation property of the Sync sequence is also presented. According to the detection strategy for the significant channel taps, the specific information about the channel profile can be obtained. The advantages of the proposed timing scheme over the traditional ones have been demonstrated through both theoretical analysis and numerical simulations.
Impact of Elastic Optical Paths That Adopt Distance Adaptive Modulation to Create Efficient Networks
Tatsumi TAKAGI Hiroshi HASEGAWA Ken-ichi SATO Yoshiaki SONE Akira HIRANO Masahiko JINNO

PAPER-Fiber-Optic Transmission for Communications

Vol:
E95-B No:12
Page(s):
3793-3801
We propose optical path routing and frequency slot assignment algorithms that can make the best use of elastic optical paths and the capabilities of distance adaptive modulation. Due to the computational difficulty of the assignment problem, we develop algorithms for 1+1 dedicated/1:1 shared protected ring networks and unprotected mesh networks that fully utilize the characteristics of the topologies. Numerical experiments elucidate that the introduction of path elasticity and distance adaptive modulation significantly reduces the occupied bandwidth.
Fault-Injection Analysis to Estimate SEU Failure in Time by Using Frame-Based Partial Reconfiguration
Yoshihiro ICHINOMIYA Tsuyoshi KIMURA Motoki AMAGASAKI Morihiro KUGA Masahiro IIDA Toshinori SUEYOSHI

PAPER-High-Level Synthesis and System-Level Design

Vol:
E95-A No:12
Page(s):
2347-2356
SRAM-based field programmable gate arrays (FPGAs) are vulnerable to soft errors induced by radiation. Techniques for designing dependable circuits, such as triple modular redundancy (TMR) with scrubbing, have been studied extensively. However, currently available evaluation techniques that can be used to check the dependability of these circuits are inadequate. Further, their results are restrictive because they are not expressed in terms of a general reliability indicator that can be used to decide whether the circuit is dependable. In this paper, we propose an evaluation method that provides results in terms of the realistic failure in time (FIT) by using reconfiguration-based fault-injection analysis. Current fault-injection analyses do not consider fault accumulation, and hence, they are not suitable for evaluating the dependability of a circuit such as a TMR circuit. Therefore, we configure an evaluation system that can handle fault accumulation by using frame-based partial reconfiguration and the bootstrap method. By using the proposed method, we successfully evaluated a TMR circuit and could discuss the result in terms of realistic FIT data. Our method can evaluate the dependability of an actual system, and help with the tuning and selection in dependable system design.
Power Distribution Network Optimization for Timing Improvement with Statistical Noise Model and Timing Analysis
Takashi ENAMI Takashi SATO Masanori HASHIMOTO

PAPER-Device and Circuit Modeling and Analysis

Vol:
E95-A No:12
Page(s):
2261-2271
We propose an optimization method for power distribution networks that explicitly deals with timing. We have found and focused on the facts that decoupling capacitance (decap) does not necessarily improve gate delay, depending on the switching timing within a cycle, and that power wire expansion may locally degrade the voltage. To address these issues, we devised an efficient sensitivity calculation of timing to decap size and power wire width for guiding optimization. The proposed method, which is based on statistical noise modeling and timing analysis, accelerates sensitivity calculation with an approximation and adjoint sensitivity analysis. Experimental results show that decap allocation based on the sensitivity analysis efficiently minimizes the worst-case circuit delay within a given decap budget. Compared to the maximum decap placement, the delay improvement due to decap increases by 3.13% even while the total amount of decaps is reduced to 40%. The wire sizing with the proposed method also efficiently reduces the wire resource necessary to attain the same circuit delay by 11.5%.
SLA_Driven Adaptive Resource Allocation for Virtualized Servers
Wei ZHANG Li RUAN Mingfa ZHU Limin XIAO Jiajun LIU Xiaolan TANG Yiduo MEI Ying SONG Yuzhong SUN

PAPER-Computer System and Services

Vol:
E95-D No:12
Page(s):
2833-2843
In order to reduce cost and improve efficiency, many data centers adopt virtualization solutions. The advent of virtualization allows multiple virtual machines to be hosted on a single physical server. However, this poses new challenges for resource management. Web workloads, which are dominant in data centers, are known to vary dynamically with time. In order to meet an application's service level agreement (SLA), how to allocate resources for virtual machines has become an important challenge in virtualized server environments, especially when dealing with fluctuating workloads and complex server applications. User experience is an important manifestation of the SLA and is attracting more attention. In this paper, the SLA is defined by server-side response time. Traditional resource allocation based on resource utilization has some drawbacks. We argue that dynamic resource allocation directly based on real-time user experience is more reasonable and also has practical significance. To address the problem, we propose a system architecture that combines response time measurements and analysis of user experience for resource allocation. An optimization model is introduced to dynamically allocate the resources among virtual machines. When resources are insufficient, we provide service differentiation and first guarantee the resource requirements of applications that have higher priorities. We evaluate our proposal using TPC-W and Webbench. The experimental results show that our system can judiciously allocate system resources. The system helps stabilize applications' user experience. It can reduce the mean deviation of user experience from desired targets.
Analytical Modeling of Network Throughput Prediction on the Internet
Chunghan LEE Hirotake ABE Toshio HIROTSU Kyoji UMEMURA

PAPER-Network and Communication

Vol:
E95-D No:12
Page(s):
2870-2878
Predicting network throughput is important for network-aware applications. Network throughput depends on a number of factors, and many throughput prediction methods have been proposed. However, many of these methods suffer from the fact that the distribution of traffic fluctuation is unclear and the scale and the bandwidth of networks are rapidly increasing. Furthermore, virtual machines are used as platforms in many network research and services fields, and they can affect network measurement. A prediction method that uses pairs of differently sized connections has been proposed. This method, which we call connection pair, features a small probe transfer using TCP that can be used to predict the throughput of a large data transfer. We focus on measurements, analyses, and modeling for precise prediction results. We first clarified that the actual throughput for the connection pair changes non-linearly and monotonically with noise. Second, we built a previously proposed predictor using the same training data sets as for our proposed method, and found that it was unsuitable for handling the above characteristics. We propose a throughput prediction method based on the connection pair that uses ν-support vector regression and the polynomial kernel to deal with prediction models represented as a non-linear and continuous monotonic function. The prediction results of our method are more accurate than those of the previous predictor. Moreover, under an unstable network state, the drop in accuracy is also smaller than that of the previous predictor.
