The search functionality is under construction.

Keyword Search Result

[Keyword] ATI(18690hit)

5861-5880hit(18690hit)

  • Comparing Operating Systems Scalability on Multicore Processors by Microbenchmarking

    Yan CUI  Yu CHEN  Yuanchun SHI  

     
    PAPER-Computer System and Services

      Vol:
    E95-D No:12
      Page(s):
    2810-2820

    Multicore processor architectures have become ubiquitous in today's computing platforms, especially in parallel computing installations, owing to their power and cost advantages. While the technology trend continues towards hundreds of cores on a chip in the foreseeable future, an urgent question for system designers and application users is whether today's operating systems give applications sufficient support to scale to many cores. To answer it, one needs to understand the strengths and weaknesses of these operating systems' scalability support and to identify the major bottlenecks limiting scalability, if any. As open-source operating systems are of particular interest to the research and industry communities, in this paper we choose three operating systems (Linux, Solaris and FreeBSD) and systematically evaluate and compare their scalability on an AMD 32-core system, using a set of highly focused microbenchmarks to gain a broad and detailed understanding. We use system profiling tools and analyze kernel source code to find the root cause of each observed scalability bottleneck. Our results reveal that no single operating system among the three stands out on all system aspects, though some prevail on particular aspects. For example, Linux outperforms Solaris and FreeBSD significantly for file-descriptor- and process-intensive operations. For applications with intensive socket creation and deletion operations, Solaris leads FreeBSD, which scales better than Linux. With the help of performance tools and source code instrumentation and analysis, we find that synchronization primitives protecting shared data structures in the kernels are the major bottleneck limiting system scalability.

  • TSV Geometrical Variations and Optimization Metric with Repeaters for 3D IC

    Hung Viet NGUYEN  Myunghwan RYU  Youngmin KIM  

     
    PAPER-Integrated Electronics

      Vol:
    E95-C No:12
      Page(s):
    1864-1871

    This paper evaluates the impact of Through-Silicon Vias (TSVs) on the performance and power consumption of 3D circuitry. Our investigation uses a physical and electrical TSV model that accounts for coupling effects with adjacent TSVs. Simulation results show that the overall performance of a 3D IC using TSVs can be improved noticeably. The frequency of the ring oscillator in a 4-tier stacking layout rises to as much as twice that of its 2D planar counterpart. Furthermore, TSV process variations are examined by Monte Carlo simulations to identify the geometrical factor that has greater impact in manufacturing. An in-depth study of repeaters associated with TSVs yields a metric for optimizing 3D system integration in terms of performance and energy dissipation. Using this optimization metric with the 45 nm MOSFETs in our circuit layout, we find that the optimal number of tiers for both performance and power consumption approaches 4, since substantial TSV-TSV coupling is expected in a 3D IC in the worst case of interference.

  • Permutation Polynomials of Higher Degrees for Turbo Code Interleavers

    Jonghoon RYU  

     
    LETTER

      Vol:
    E95-B No:12
      Page(s):
    3760-3762

    Permutation polynomial based interleavers over integer rings, in particular quadratic permutation polynomials, have been widely studied. In this letter, higher degree permutation polynomials are considered for interleavers, and permutation polynomials superior to quadratic permutation polynomials are found for some lengths.
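
    As a rough illustration of the idea in the abstract above, the following Python sketch checks by brute force whether a polynomial over the integer ring Z_N permutes {0, ..., N-1} and, if so, uses it as an interleaver. The function names, the coefficient layout, and the example coefficients (3x + 10x^2 for N = 40) are assumptions chosen for illustration; they are not taken from the letter, and the letter's search procedure and quality criteria are not reproduced here.

```python
def eval_poly_mod(coeffs, x, n):
    """Evaluate sum(coeffs[k] * x**k) modulo n using Horner's rule.

    coeffs[k] is the coefficient of x**k (hypothetical layout for this sketch).
    """
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % n
    return acc


def is_permutation_polynomial(coeffs, n):
    """Return True if x -> f(x) mod n maps {0..n-1} onto itself (brute force)."""
    images = {eval_poly_mod(coeffs, x, n) for x in range(n)}
    return len(images) == n


def interleave(block, coeffs):
    """Reorder `block` so that output position i takes input position f(i) mod N."""
    n = len(block)
    if not is_permutation_polynomial(coeffs, n):
        raise ValueError("polynomial is not a permutation over Z_N")
    return [block[eval_poly_mod(coeffs, i, n)] for i in range(n)]


if __name__ == "__main__":
    quadratic = [0, 3, 10]   # f(x) = 3x + 10x^2, illustrative quadratic for N = 40
    print(is_permutation_polynomial(quadratic, 40))
    print(interleave(list(range(40)), quadratic)[:8])
```

    The brute-force check costs O(N) evaluations per candidate, which is acceptable for a sketch; a search over higher degree polynomials would normally prune candidates with number-theoretic conditions on the coefficients first.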

  • Face Representation and Recognition with Local Curvelet Patterns

    Wei ZHOU  Alireza AHRARY  Sei-ichiro KAMATA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E95-D No:12
      Page(s):
    3078-3087

    In this paper, we propose Local Curvelet Binary Patterns (LCBP) and Learned Local Curvelet Patterns (LLCP) for representing the local features of facial images. The proposed methods are based on the Curvelet transform, which can overcome the weakness of traditional Gabor wavelets in higher dimensions and better capture the curve singularities and hyperplane singularities of facial images. LCBP can be regarded as a combination of Curvelet features and the LBP operator, while LLCP designs several learned codebooks from patch sets, which are constructed by sampling patches from Curvelet-filtered facial images. Each facial image can be encoded into multiple pattern maps, and block-based histograms of these patterns are concatenated into a histogram sequence to be used as a face descriptor. During the face representation phase, one input patch is encoded by one pattern in LCBP but by multiple patterns in LLCP. Finally, we propose an effective classifier called Weighted Histogram Spatially constrained Earth Mover's Distance (WHSEMD), which utilizes the discriminative power of different facial parts, the different patterns and the spatial information of the face. Performance assessment in face recognition and gender estimation under different challenges shows that the proposed approaches are superior to traditional ones.

  • Non-orthogonal Access Scheme over Multiple Channels with Iterative Interference Cancellation and Fractional Sampling in OFDM Receiver

    Hiroyuki OSADA  Mamiko INAMORI  Yukitoshi SANADA  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E95-B No:12
      Page(s):
    3837-3844

    A diversity scheme with Fractional Sampling (FS) in OFDM receivers has been investigated recently. FS path diversity makes use of the imaging components of the desired signal transmitted on the adjacent channel. To increase the diversity gain with FS the bandwidth of the transmit signal has to be enlarged. This leads to the reduction of spectrum efficiency. In this paper non-orthogonal access over multiple channels in the frequency domain with iterative interference cancellation (IIC) and FS is proposed. The proposed scheme transmits the imaging component non-orthogonally on the adjacent channel. In order to accommodate the imaging component, it is underlaid on the other desired signal. Through diversity with FS and IIC, non-orthogonal access on multiple channels is realized. Our proposed scheme can obtain diversity gains for non-orthogonal signals modulated with QPSK.

  • Effect of Discharge Gap Shape on High-Speed Electrostatic Discharge Events

    Masao MASUGI  Norihito HIRASAWA  Yoshiharu AKIYAMA  Kazuo MURAKAWA  

     
    LETTER-Electromagnetic Compatibility(EMC)

      Vol:
    E95-B No:12
      Page(s):
    3898-3901

    To clarify the characteristics of high-speed electrostatic discharge (ESD) events, we use two kinds of discharge electrodes: sphere- and cylinder-shape ones. We measure the energy level of ESD waveforms with charging voltages of 0.25, 0.5, and 1.0 kV. We find that the cylindrical electrode yields higher high-speed ESD energies, especially when the charging voltage is high; this indicates that the discharge gap shape is an important factor in ESD events.

  • Lossless Compression of Double-Precision Floating-Point Data for Numerical Simulations: Highly Parallelizable Algorithms for GPU Computing

    Mamoru OHARA  Takashi YAMAGUCHI  

     
    PAPER-Parallel and Distributed Computing

      Vol:
    E95-D No:12
      Page(s):
    2778-2786

    In numerical simulations using massively parallel computers such as GPGPU (General-Purpose computing on Graphics Processing Units) systems, we often need to transfer computational results from external devices such as GPUs to the main memory or secondary storage of the host machine. Since the computation results are sometimes too large to hold there, it is desirable to compress the data before storing it. In addition, considering the overhead of transferring data between the devices and host memory, it is preferable that the data be compressed as part of the parallel computation performed on the devices. Traditional compression methods for floating-point numbers do not always show good parallelism. In this paper, we propose a new compression method for massively parallel simulations running on GPUs, in which we combine a few successive floating-point numbers and interleave them to improve compression efficiency. We also present numerical examples of the compression ratio and throughput obtained from experimental implementations of the proposed method running on CPUs and GPUs.

  • A Variability-Aware Energy-Minimization Strategy for Subthreshold Circuits

    Junya KAWASHIMA  Hiroshi TSUTSUI  Hiroyuki OCHI  Takashi SATO  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E95-A No:12
      Page(s):
    2242-2250

    We investigate a design strategy for subthreshold circuits focusing on energy-consumption minimization and yield maximization under process variations. The design strategy is based on the following findings related to the operation of low-power CMOS circuits: (1) The minimum operation voltage (VDDmin) of a circuit is dominated by flip-flops (FFs), and VDDmin of an FF can be improved by upsizing a few key transistors, (2) VDDmin of an FF is stochastically modeled by a log-normal distribution, (3) VDDmin of a large circuit can be efficiently estimated by using the above model, which eliminates extensive Monte Carlo simulations, and (4) improving VDDmin may substantially contribute to decreasing energy consumption. The effectiveness of the proposed design strategy has been verified through circuit simulations on various circuits, which clearly show the design tradeoff between voltage scaling and transistor sizing.
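
    Findings (2) and (3) in the abstract above suggest a simple closed-form estimate, sketched below in Python. The sketch assumes (these are assumptions of this example, not statements from the paper) that the circuit works at a supply voltage only if every flip-flop does, that flip-flop VDDmin values are independent and identically log-normally distributed, and that `mu` and `sigma` are the mean and standard deviation of ln(VDDmin). All function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_cdf(v, mu, sigma):
    """P(VDDmin <= v) for a log-normal VDDmin with log-mean mu and log-std sigma."""
    if v <= 0:
        return 0.0
    z = (math.log(v) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def circuit_yield(vdd, n_ffs, mu, sigma):
    """Probability that all n_ffs flip-flops operate at vdd.

    Assumes independent, identically distributed flip-flops (illustrative only).
    """
    return lognormal_cdf(vdd, mu, sigma) ** n_ffs


def vdd_for_yield(target_yield, n_ffs, mu, sigma):
    """Smallest supply voltage reaching target_yield under the same assumptions."""
    per_ff = target_yield ** (1.0 / n_ffs)        # required per-flip-flop probability
    z = NormalDist().inv_cdf(per_ff)              # standard normal quantile
    return math.exp(mu + sigma * z)


if __name__ == "__main__":
    mu, sigma = math.log(0.25), 0.08              # hypothetical model parameters
    for n in (100, 10_000, 1_000_000):
        print(n, round(vdd_for_yield(0.99, n, mu, sigma), 4))
```

    Because the estimate only needs the two fitted parameters and the flip-flop count, it avoids repeating Monte Carlo runs for each circuit size, which is the kind of saving the abstract describes.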

  • Statistical Learning Theory of Quasi-Regular Cases

    Koshi YAMADA  Sumio WATANABE  

     
    PAPER-General Fundamentals and Boundaries

      Vol:
    E95-A No:12
      Page(s):
    2479-2487

    Many learning machines such as normal mixtures and layered neural networks are not regular but singular statistical models, because the map from a parameter to a probability distribution is not one-to-one. The conventional statistical asymptotic theory cannot be applied to such learning machines because the likelihood function cannot be approximated by any normal distribution. Recently, a new statistical theory has been established based on algebraic geometry, and it was clarified that the generalization and training errors are determined by two birational invariants, the real log canonical threshold and the singular fluctuation. However, their concrete values are left unknown. In the present paper, we propose a new concept, a quasi-regular case in statistical learning theory. A quasi-regular case is not a regular case but a singular case; however, it has the same property as a regular case. In fact, we prove that, in a quasi-regular case, the two birational invariants are equal to each other, so that the symmetry of the generalization and training errors holds. Moreover, the concrete values of the two birational invariants are explicitly obtained; hence the quasi-regular case is useful for studying statistical learning theory.

  • Parameterization of Perfect Sequences over a Composition Algebra

    Takao MAEDA  Takafumi HAYASHI  

     
    PAPER-Sequence

      Vol:
    E95-A No:12
      Page(s):
    2139-2147

    A parameterization of perfect sequences over composition algebras over the real number field is presented. According to the proposed parameterization theorem, a perfect sequence can be represented as a sum of trigonometric functions and points on a unit sphere of the algebra. Because of the non-commutativity of the multiplication, there are two definitions of perfect sequences, but the equivalence of the definitions is easily shown using the theorem. A composition sequence of sequences is introduced. Despite the non-associativity, the proposed theorem reveals that the composition sequence from perfect sequences is perfect.

  • GREAT-CEO: larGe scale distRibuted dEcision mAking Techniques for Wireless Chief Executive Officer Problems Open Access

    Xiaobo ZHOU  Xin HE  Khoirul ANWAR  Tad MATSUMOTO  

     
    INVITED PAPER

      Vol:
    E95-B No:12
      Page(s):
    3654-3662

    In this paper, we reformulate the issue related to wireless mesh networks (WMNs) from the Chief Executive Officer (CEO) problem viewpoint, and provide a practical solution to a simple case of the problem. It is well known that the CEO problem is a theoretical basis for sensor networks. The problem investigated in this paper is described as follows: an originator broadcasts its binary information sequence to several forwarding nodes (relays) over Binary Symmetric Channels (BSC); the originator's information sequence suffers from independent random binary errors; the forwarding nodes simply interleave and encode the received bit sequence and then forward it to the final destination (FD) over Additive White Gaussian Noise (AWGN) channels, without making heavy efforts to correct errors that may occur in the originator-relay links. Hence, this strategy reduces the complexity of the relay significantly. A joint iterative decoding technique at the FD is proposed by utilizing the knowledge of the correlation due to the errors occurring in the link between the originator and forwarding nodes (referred to as intra-link). The bit-error-rate (BER) performances show that the originator's information can be reconstructed at the FD even by using a very simple coding scheme. We provide a BER performance comparison between joint decoding and separate decoding strategies. The simulation results show that excellent performance can be achieved by the proposed system. Furthermore, extrinsic information transfer (EXIT) chart analysis is performed to investigate the convergence property of the proposed technique, with the aim of, in part, optimizing the code rate at the originator.

  • An Enhanced Doppler Spread Estimation Method for OFDM Systems

    Bin SHENG  Pengcheng ZHU  Xiaohu YOU  

     
    LETTER-Wireless Communication Technologies

      Vol:
    E95-B No:12
      Page(s):
    3911-3914

    In OFDM systems, the correlation of cyclic prefix (CP) with its corresponding part at the end of the symbol can be used to estimate the maximum Doppler spread. However, the estimation accuracy of this CP based method is seriously affected by the inter-symbol interference (ISI) generated in the multipath channel. In this letter, we propose an enhanced CP based method which is immune to the ISI and can hence obtain an unbiased estimate of the auto-correlation function in multipath channels.

  • Anonymous Authentication Scheme without Verification Table for Wireless Environments

    Ryoichi ISAWA  Masakatu MORII  

     
    LETTER-Cryptography and Information Security

      Vol:
    E95-A No:12
      Page(s):
    2488-2492

    Lee and Kwon proposed an anonymous authentication scheme based on Zhu et al.'s scheme. However, Lee et al.'s scheme has two disadvantages. Firstly, their scheme is vulnerable to off-line dictionary attacks. An adversary can guess a user password from the user's login messages eavesdropped by the adversary. Secondly, an authentication server called a home agent requires a verification table, which violates the original advantage of Zhu et al.'s scheme. That is, it increases the key management costs of the home agent. In this letter, we show the weaknesses of Lee et al.'s scheme and another three existing schemes. Then, we propose a new secure scheme without the verification table, which provides security against off-line dictionary attacks and other attacks except for a certain type of combined attack.

  • Throughput Comparisons of 32/64APSK Schemes Based on Mutual Information Considering Cubic Metric

    Reo KOBAYASHI  Teruo KAWAMURA  Nobuhiko MIKI  Mamoru SAWAHASHI  

     
    PAPER

      Vol:
    E95-B No:12
      Page(s):
    3719-3727

    This paper presents comprehensive comparisons of the achievable throughput between the 32-/64-ary amplitude and phase shift keying (APSK) and cross 32QAM/square 64QAM schemes based on mutual information (MI) considering the peak-to-average power ratio (PAPR) of the modulated signal. As a PAPR criterion, we use a cubic metric (CM) that directly corresponds to the transmission back-off of a power amplifier. In the analysis, we present the best ring ratio for the 32 or 64APSK scheme from the viewpoint of minimizing the required received signal-to-noise power ratio (SNR) considering the CM that achieves the peak throughput, i.e., maximum error-free transmission rate. We show that the required received SNR considering the CM at the peak throughput is minimized with the number of rings of M = 3 and 4 for 32-ary APSK and 64-ary APSK, respectively. Then, we show with the best ring ratios that the (4, 12, 16) 32APSK scheme with M = 3 achieves a lower required received SNR considering the CM compared to that for the cross 32QAM scheme. Similarly, we show that the (4, 12, 20, 28) 64APSK scheme with M = 4 achieves almost the same required received SNR considering the CM as that for the square 64QAM scheme.

  • An Efficient OFDM Timing Synchronization for CMMB System

    Yong WANG  Jian-hua GE  Jun HU  Bo AI  

     
    PAPER-Transmission Systems and Transmission Equipment for Communications

      Vol:
    E95-B No:12
      Page(s):
    3786-3792

    An accurate and rapid synchronization scheme is a prerequisite for achieving high-quality multimedia transmission for wireless handheld terminals, e.g. in the China multimedia mobile broadcasting (CMMB) system. In this paper, an efficient orthogonal frequency division multiplexing (OFDM) timing synchronization scheme, which is robust to the doubly selective fading channel, is proposed for the CMMB system. TS timing is derived by performing an inverse sliding correlation (ISC) between the segmented Sync sequences in the Beacon, which possesses the inverse conjugate symmetry (ICS) characteristic. The ISC can provide sufficient correlative gain even in ultra-low signal-to-noise ratio (SNR) scenarios. Moreover, a fast fine symbol timing method based on the auto-correlation property of the Sync sequence is also presented. According to the detection strategy for the significant channel taps, specific information about the channel profile can be obtained. The advantages of the proposed timing scheme over the traditional ones have been demonstrated through both theoretical analysis and numerical simulations.

  • Impact of Elastic Optical Paths That Adopt Distance Adaptive Modulation to Create Efficient Networks

    Tatsumi TAKAGI  Hiroshi HASEGAWA  Ken-ichi SATO  Yoshiaki SONE  Akira HIRANO  Masahiko JINNO  

     
    PAPER-Fiber-Optic Transmission for Communications

      Vol:
    E95-B No:12
      Page(s):
    3793-3801

    We propose optical path routing and frequency slot assignment algorithms that can make the best use of elastic optical paths and the capabilities of distance adaptive modulation. Due to the computational difficulty of the assignment problem, we develop algorithms for 1+1 dedicated/1:1 shared protected ring networks and unprotected mesh networks that fully utilize the characteristics of the topologies. Numerical experiments show that the introduction of path elasticity and distance adaptive modulation significantly reduces the occupied bandwidth.

  • Fault-Injection Analysis to Estimate SEU Failure in Time by Using Frame-Based Partial Reconfiguration

    Yoshihiro ICHINOMIYA  Tsuyoshi KIMURA  Motoki AMAGASAKI  Morihiro KUGA  Masahiro IIDA  Toshinori SUEYOSHI  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E95-A No:12
      Page(s):
    2347-2356

    SRAM-based field programmable gate arrays (FPGAs) are vulnerable to soft errors induced by radiation. Techniques for designing dependable circuits, such as triple modular redundancy (TMR) with scrubbing, have been studied extensively. However, currently available evaluation techniques that can be used to check the dependability of these circuits are inadequate. Further, their results are restrictive because they are not expressed in terms of a general reliability indicator that can be used to decide whether the circuit is dependable. In this paper, we propose an evaluation method that provides results in terms of the realistic failure in time (FIT) by using reconfiguration-based fault-injection analysis. Current fault-injection analyses do not consider fault accumulation, and hence, they are not suitable for evaluating the dependability of a circuit such as a TMR circuit. Therefore, we configure an evaluation system that can handle fault accumulation by using frame-based partial reconfiguration and the bootstrap method. By using the proposed method, we successfully evaluated a TMR circuit and could discuss the result in terms of realistic FIT data. Our method can evaluate the dependability of an actual system, and help with the tuning and selection in dependable system design.
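
    For readers unfamiliar with the FIT unit used in the abstract above, the short Python sketch below shows the standard conversion: one FIT is one failure per 10^9 device-hours. The function names and the sample numbers are hypothetical, and the sketch only covers the unit arithmetic; it does not model the paper's fault-injection, fault-accumulation, or bootstrap procedure.

```python
HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def fit_rate(failures, device_hours):
    """Failure rate in FIT from an observed failure count and total device-hours."""
    if device_hours <= 0:
        raise ValueError("device_hours must be positive")
    return failures * HOURS_PER_FIT_UNIT / device_hours


def mttf_hours(fit):
    """Mean time to failure in hours, assuming a constant failure rate."""
    if fit <= 0:
        raise ValueError("fit must be positive")
    return HOURS_PER_FIT_UNIT / fit


if __name__ == "__main__":
    fit = fit_rate(failures=2, device_hours=1e6)   # made-up observation
    print(fit, mttf_hours(fit))
```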

  • Power Distribution Network Optimization for Timing Improvement with Statistical Noise Model and Timing Analysis

    Takashi ENAMI  Takashi SATO  Masanori HASHIMOTO  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E95-A No:12
      Page(s):
    2261-2271

    We propose an optimization method for power distribution networks that explicitly deals with timing. We have found, and focus on, two facts: decoupling capacitance (decap) does not necessarily improve gate delay, depending on the switching timing within a cycle, and power wire expansion may locally degrade the voltage. To address these issues, we devised an efficient calculation of the sensitivity of timing to decap size and power wire width for guiding optimization. The proposed method, which is based on statistical noise modeling and timing analysis, accelerates sensitivity calculation with an approximation and adjoint sensitivity analysis. Experimental results show that decap allocation based on the sensitivity analysis efficiently minimizes the worst-case circuit delay within a given decap budget. Compared to the maximum decap placement, the delay improvement due to decap increases by 3.13% even while the total amount of decaps is reduced to 40%. Wire sizing with the proposed method also efficiently reduces the wire resource required to attain the same circuit delay by 11.5%.

  • SLA_Driven Adaptive Resource Allocation for Virtualized Servers

    Wei ZHANG  Li RUAN  Mingfa ZHU  Limin XIAO  Jiajun LIU  Xiaolan TANG  Yiduo MEI  Ying SONG  Yuzhong SUN  

     
    PAPER-Computer System and Services

      Vol:
    E95-D No:12
      Page(s):
    2833-2843

    In order to reduce cost and improve efficiency, many data centers adopt virtualization solutions. Virtualization allows multiple virtual machines to be hosted on a single physical server. However, this poses new challenges for resource management. Web workloads, which are dominant in data centers, are known to vary dynamically with time. How to allocate resources to virtual machines so as to meet each application's service level agreement (SLA) has become an important challenge in virtualized server environments, especially when dealing with fluctuating workloads and complex server applications. User experience is an important manifestation of the SLA and is attracting more attention. In this paper, the SLA is defined by server-side response time. Traditional resource allocation based on resource utilization has some drawbacks. We argue that dynamic resource allocation based directly on real-time user experience is more reasonable and also has practical significance. To address the problem, we propose a system architecture that combines response time measurements and analysis of user experience for resource allocation. An optimization model is introduced to dynamically allocate the resources among virtual machines. When resources are insufficient, we provide service differentiation and first guarantee the resource requirements of applications that have higher priorities. We evaluate our proposal using TPC-W and Webbench. The experimental results show that our system can judiciously allocate system resources. The system helps stabilize applications' user experience. It can reduce the mean deviation of user experience from desired targets.

  • Analytical Modeling of Network Throughput Prediction on the Internet

    Chunghan LEE  Hirotake ABE  Toshio HIROTSU  Kyoji UMEMURA  

     
    PAPER-Network and Communication

      Vol:
    E95-D No:12
      Page(s):
    2870-2878

    Predicting network throughput is important for network-aware applications. Network throughput depends on a number of factors, and many throughput prediction methods have been proposed. However, many of these methods suffer from the facts that the distribution of traffic fluctuation is unclear and that the scale and bandwidth of networks are rapidly increasing. Furthermore, virtual machines are used as platforms in many network research and service fields, and they can affect network measurement. A prediction method that uses pairs of differently sized connections has been proposed. This method, which we call connection pair, features a small probe transfer using TCP that can be used to predict the throughput of a large data transfer. We focus on measurements, analyses, and modeling for precise prediction results. We first clarified that the actual throughput for the connection pair changes non-linearly and monotonically, with noise. Second, we built a previously proposed predictor using the same training data sets as for our proposed method, and it proved unsuitable for handling the above characteristics. We propose a throughput prediction method based on the connection pair that uses ν-support vector regression and the polynomial kernel to deal with prediction models represented as a non-linear and continuous monotonic function. The prediction results of our method are more accurate than those of the previous predictor. Moreover, under an unstable network state, the drop in accuracy is also smaller than that of the previous predictor.
