
Keyword Search Result

[Keyword] ATI(18690hit)

5241-5260hit(18690hit)

  • A Rectangular Weighting Function Approximating Local Phase Error for Designing Equiripple All-Pass IIR Filters

    Taisaku ISHIWATA  Yoshinao SHIRAKI  

     
    PAPER-Signal Processing

    Vol: E96-A  No: 12  Page(s): 2398-2404

    In this paper, we propose a rectangular weighting function that can be used in the method of iteratively reweighted least squares (IRWLS) for designing equiripple all-pass IIR filters. The purpose of introducing this weighting function is to improve the convergence performance in the solution of the IRWLS. The height of each rectangle is designed to be equal to the local maximum of each ripple, and the width of each rectangle is designed so that the area of each rectangle becomes equal to the area of each ripple. Here, the ripple is the absolute value of the phase error. We show experimentally that the convergence performance in the solution of the IRWLS can be improved by using the proposed weighting function.

  • A 5.83pJ/bit/iteration High-Parallel Performance-Aware LDPC Decoder IP Core Design for WiMAX in 65nm CMOS

    Xiongxin ZHAO  Zhixiang CHEN  Xiao PENG  Dajiang ZHOU  Satoshi GOTO  

     
    PAPER-High-Level Synthesis and System-Level Design

    Vol: E96-A  No: 12  Page(s): 2623-2632

    In this paper, we propose a synthesizable LDPC decoder IP core for the WiMAX system with high parallelism and enhanced error-correcting performance. By taking advantage of both layered scheduling and a fully-parallel architecture, the decoder can fully support the multi-mode decoding specified in WiMAX with parallelism much higher than that of the commonly used partial-parallel layered LDPC decoder architecture. 6-bit quantized messages are split in bit-serial style, and 2-bit-width serial processing lines work concurrently so that only 3 cycles are required to decode one layer. As a result, 12∼24 cycles are enough to process one iteration for all the code rates specified in WiMAX. Compared to our previous bit-serial decoder, it doubles the parallelism and solves the message saturation problem of bit-serial arithmetic, with a minor gate count increase. Power synthesis results show that the proposed decoder achieves an energy efficiency of 5.83pJ/bit/iteration, which is a 46.8% improvement compared to state-of-the-art work. Furthermore, an advanced dynamic quantization (ADQ) technique is proposed to enhance the error-correcting performance in the layered decoder architecture. With about 2% area overhead, 6-bit ADQ can achieve error-correcting performance close to that of 7-bit fixed quantization, with improved error floor performance.

  • High Performance NAND Flash Memory System with a Data Buffer

    Jung-Hoon LEE  Bo-Sung JUNG  

     
    PAPER-High-Level Synthesis and System-Level Design

    Vol: E96-A  No: 12  Page(s): 2645-2651

    The objective of this research is to design a high-performance NAND flash memory system with a data buffer. The proposed buffer system in the NAND flash memory consists of two parts: a fully associative temporal buffer for temporal locality and a fully associative spatial buffer for spatial locality. We propose a new operating mechanism for reducing the overhead of flash memory, that is, erase and write operations. According to our simulation results, the proposed buffer system can reduce the write and erase operations by about 73% and 79%, respectively, for spec application, compared with a fully associative buffer with two times more space. Furthermore, the average memory access time can be improved by about 60% compared with other large buffer systems.

  • Duopoly Competition in Time-Dependent Pricing for Improving Revenue of Network Service Providers

    Cheng ZHANG  Bo GU  Kyoko YAMORI  Sugang XU  Yoshiaki TANAKA  

     
    PAPER

    Vol: E96-B  No: 12  Page(s): 2964-2975

    Because network users have different time preferences, network traffic load usually differs significantly at different times. At peak traffic times, network congestion may occur, which degrades the quality of service for network users. There are essentially two ways to improve the quality of service in this case: (1) network service providers (NSPs) over-provision network capacity through investment; (2) NSPs use time-dependent pricing (TDP) to reduce the traffic at peak times. However, over-provisioning network capacity can be costly. Therefore, some researchers have proposed TDP to control congestion as well as to improve the revenue of NSPs. To the best of our knowledge, however, all of the literature related to time-dependent pricing schemes considers only the monopoly NSP case. In this paper, a duopoly NSP case is studied. The NSPs try to maximize their overall revenue by setting time-dependent prices, while users choose an NSP by considering their own preferences, the congestion status of the networks, and the prices set by the NSPs. Analytical and experimental results show that TDP benefits the NSPs, but the revenue improvement is limited due to the competition effect.

  • A Low Complexity Heterodyne Multiband MIMO Receiver with Baseband Automatic Gain Control

    Tomoya OHTA  Satoshi DENNO  Masahiro MORIKURA  

     
    PAPER-Wireless Communication Technologies

    Vol: E96-B  No: 12  Page(s): 3124-3134

    This paper proposes a novel heterodyne multiband multiple-input multiple-output (MIMO) receiver with baseband automatic gain control (AGC) for cognitive radios. The proposed receiver uses heterodyne reception implemented with a wide-passband band-pass filter in the radio frequency (RF) stage so that it can receive signals in arbitrary frequency bands. Even when an RF Hilbert transformer is utilized in the receiver, image-band interference occurs due to the imperfection of the Hilbert transformer. In the receiver, analog baseband AGC is introduced to prevent the baseband signals from exceeding the voltage reference of the analog-to-digital converters (ADCs). This paper proposes a novel technique to estimate the imperfection of the Hilbert transformer in the heterodyne multiband MIMO receiver with baseband AGC. The proposed technique estimates not only the imperfection of the Hilbert transformer but also the AGC gain ratio and the analog device imperfection in the feedback loop, which makes it possible to offset the imperfection of the Hilbert transformer. The performance of the proposed receiver is verified by computer simulations. The results show that the required resolution of the ADC is 9 bits in the proposed receiver. Moreover, the proposed receiver has less computational complexity than a receiver with baseband interference cancellation unless the frequency band is changed every 9 packets or less.

  • Lower-Energy Structure Optimization of (C60)N Clusters Using an Improved Genetic Algorithm

    Guifang SHAO  Wupeng HONG  Tingna WANG  Yuhua WEN  

     
    PAPER-Fundamentals of Information Systems

    Vol: E96-D  No: 12  Page(s): 2726-2732

    An improved genetic algorithm is employed to optimize the structure of (C60)N (N≤25) fullerene clusters with the lowest energy. First, crossover with variable precision, realized by introducing the Hamming distance, is developed to provide a faster search mechanism. Second, bit string mutation and feedback mutation are incorporated to maintain diversity in the population. The interaction between C60 molecules is described by the Pacheco and Ramalho potential derived from first-principles calculations. We compare the performance of the Improved GA (IGA) with that of the Standard GA (SGA). The numerical and graphical results verify that the proposed approach is faster and more robust than the SGA. The second finite difference of the total energy shows that the (C60)N clusters with N=7, 13, 22 are particularly stable. Performance with the lowest energy is achieved in this work.

  • Network Interface Architecture with Scalable Low-Latency Message Receiving Mechanism

    Noboru TANABE  Atsushi OHTA  

     
    PAPER

    Vol: E96-D  No: 12  Page(s): 2536-2544

    Most scientists other than computer scientists do not want to spend effort on performance tuning by rewriting their MPI applications. In addition, the number of processing elements available to them is increasing year by year. On large-scale parallel systems, the number of messages accumulated in a message buffer tends to increase in some of their applications. Since searching the message queue in MPI is time-consuming, scalable system-side acceleration is needed for those systems. In this paper, a support function named LHS (Limited-length Head Separation) is proposed. Its performance in searching the message buffer and its hardware cost are evaluated. LHS accelerates message buffer searching by switching the location in which the limited-length heads of messages are stored. It exploits effects such as an increased cache hit rate on the host, with partial off-loading to hardware. With LHS on an FPGA-based network interface card (NIC) named DIMMnet-2, message buffer searching when the order of message reception differs from the receiver's expectation is accelerated 14.3 times. This absolute performance is 38.5 times higher than that of IBM BlueGene/P, although the frequency is 8.5 times lower than that of BlueGene/P. LHS has higher scalability than ALPU in performance per frequency. Since these results are obtained with partially onloaded linear searching on an old Pentium® 4, the performance gap will increase with a state-of-the-art CPU. Therefore, LHS is more suitable for larger parallel systems. Discussions on adopting the proposed method in state-of-the-art processors and systems are also presented.

  • A Fully Optical Ring Network-on-Chip with Static and Dynamic Wavelength Allocation

    Ahmadou Dit Adi CISSE  Michihiro KOIBUCHI  Masato YOSHIMI  Hidetsugu IRIE  Tsutomu YOSHINAGA  

     
    PAPER

    Vol: E96-D  No: 12  Page(s): 2545-2554

    Silicon photonics Networks-on-Chip (NoCs) have emerged as an attractive solution to alleviate the high power consumption of traditional electronic interconnects. In this paper, we propose a fully optical ring NoC that combines static and dynamic wavelength allocation communication mechanisms. A different wavelength-channel is statically allocated to each destination node for lightweight communication. Contention among simultaneous communication requests from multiple source nodes to the same destination is resolved by token-based arbitration for the particular wavelength-channel. For heavy-load communication, a multiwavelength-channel is made available when a source node requests it at execution time from a special node that manages dynamic allocation of the shared multiwavelength-channel among all nodes. We combine these static and dynamic communication mechanisms in the same network, which introduces selection techniques based on message size and congestion information. Using a photonic NoC simulator based on PhoenixSim, we evaluate our architecture under uniform random, neighbor, and hotspot traffic patterns. Simulation results show that our proposed fully optical ring NoC performs well by using adequate static and dynamic channels based on the selection techniques. We also show that our architecture can reduce the energy consumption necessary for arbitration by more than half compared to hybrid photonic ring and mesh NoCs. A comparison with several previous works in terms of architecture hardware cost shows that our architecture can be an attractive cost-performance efficient interconnection infrastructure for future SoCs and CMPs.

  • Improving Text Categorization with Semantic Knowledge in Wikipedia

    Xiang WANG  Yan JIA  Ruhua CHEN  Hua FAN  Bin ZHOU  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E96-D  No: 12  Page(s): 2786-2794

    Text categorization, especially short text categorization, is a difficult and challenging task since the text data is sparse and multidimensional. In traditional text classification methods, document texts are represented with the “Bag of Words (BOW)” text representation schema, which is based on word co-occurrence and has many limitations. In this paper, we map document texts to Wikipedia concepts and use the Wikipedia-concept-based document representation method in place of the traditional BOW model for text classification. To overcome the weakness of ignoring the semantic relationships among terms in the document representation model, and to utilize the rich semantic knowledge in Wikipedia, we construct a semantic matrix to enrich the Wikipedia-concept-based document representation. Experimental evaluation on five real datasets of long and short text shows that our approach outperforms the traditional BOW method.

  • An Auction Based Distribute Mechanism for P2P Adaptive Bandwidth Allocation

    Fang ZUO  Wei ZHANG  

     
    PAPER

    Vol: E96-D  No: 12  Page(s): 2704-2712

    In P2P applications, networks are formed by devices belonging to independent users. Routing hotspots or routing congestion are therefore typically created by an unanticipated new event that triggers an unanticipated surge of users requesting streaming service from some particular nodes. A challenging problem is how to provide incentive mechanisms that allocate bandwidth more fairly in order to avoid congestion and other drawbacks for P2P QoS. In this paper, we study the P2P bandwidth game, that is, bandwidth allocation in P2P networks. Unlike previous works, which focus either on routing or on forwarding, this paper investigates a game-theoretic mechanism that incentivizes nodes to reveal their real bandwidth demands and proposes a novel method that avoids congestion proactively, that is, prior to a congestion event. More specifically, we define an incentive-compatible pricing vector explicitly and give theoretical proofs to demonstrate that our mechanism can provide incentives for nodes to report their true bandwidth demand. In order to apply this mechanism to P2P distribution applications, we evaluate our mechanism by NS-2 simulations. The simulation results show that the incentive pricing mechanism can distribute bandwidth fairly and effectively and can also effectively avoid routing hotspots and congestion.

  • Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models

    Koji NAKANO  

     
    PAPER

    Vol: E96-D  No: 12  Page(s): 2626-2634

    The main contribution of this paper is to show optimal parallel algorithms to compute the sum, the prefix-sums, and the summed area table on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and the UMM are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. These models have three parameters: the number p of threads, the width w of the memory, and the memory access latency l. We first show that the sum of n numbers can be computed in $O(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units on the DMM and the UMM. We then go on to show that $\Omega(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units are necessary to compute the sum. We also present a parallel algorithm that computes the prefix-sums of n numbers in $O(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units on the DMM and the UMM. Finally, we show that the summed area table of size $\sqrt{n}\times\sqrt{n}$ can be computed in $O(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units on the DMM and the UMM. Since the computation of the prefix-sums and the summed area table is at least as hard as the sum computation, these parallel algorithms are also optimal.

  • A Robust Signal Recognition Method for Communication System under Time-Varying SNR Environment

    Jing-Chao LI  Yi-Bing LI  Shouhei KIDERA  Tetsuo KIRIMOTO  

     
    PAPER-Pattern Recognition

    Vol: E96-D  No: 12  Page(s): 2814-2819

    As a consequence of recent developments in communications, the parameters of communication signals, such as the modulation parameter values, are becoming unstable because of time-varying SNR under electromagnetic conditions. In general, it is difficult to classify target signals that have time-varying parameters using traditional signal recognition methods. To overcome this problem, this study proposes a novel recognition method that works well even for such time-dependent communication signals. This method is mainly composed of feature extraction and classification processes. In the feature extraction stage, we adopt Shannon entropy and index entropy to obtain the stable features of modulated signals. In the classification stage, the interval gray relation theory is employed as suitable for signals with time-varying parameter spaces. The advantage of our method is that it can deal with time-varying SNR situations, which cannot be handled by existing methods. The results from numerical simulation show that the proposed feature extraction algorithm, based on entropy characteristics in time-varying SNR situations, offers accurate clustering performance, and the classifier, based on interval gray relation theory, can achieve a recognition rate of up to 82.9%, even when the SNR varies from -10 to -6 dB.

  • An Efficiency-Aware Scheduling for Data-Intensive Computations on MapReduce Clusters

    Hui ZHAO  Shuqiang YANG  Hua FAN  Zhikun CHEN  Jinghu XU  

     
    PAPER

    Vol: E96-D  No: 12  Page(s): 2654-2662

    Scheduling plays a key role in MapReduce systems. In this paper, we explore the efficiency of a MapReduce cluster running many independent and continuously arriving MapReduce jobs. Data locality and load balancing are two important factors for improving computation efficiency in MapReduce systems for data-intensive computations. Traditional cluster scheduling technologies are not well suited to the MapReduce environment. Several schedulers are in use for the popular open-source Hadoop MapReduce implementation; however, they cannot optimize both factors well. Our main objective is to minimize the total flowtime of all jobs. Because this is a strongly NP-hard problem, we adopt some effective heuristics to seek satisfactory solutions. We formalize the scheduling problem as a job selection problem and propose a load-balance-aware job selection algorithm. At the task level, we design a strict data-locality task scheduling algorithm for map tasks on map machines and a load-balance-aware scheduling algorithm for reduce tasks on reduce machines. Comprehensive experiments have been conducted to compare our scheduling strategy with well-known Hadoop scheduling strategies. The experimental results validate the efficiency of our proposed scheduling strategy.

  • Standard Cell Structure with Flexible P/N Well Boundaries for Near-Threshold Voltage Operation

    Shinichi NISHIZAWA  Tohru ISHIHARA  Hidetoshi ONODERA  

     
    PAPER-Physical Level Design

    Vol: E96-A  No: 12  Page(s): 2499-2507

    This paper proposes a standard cell structure in which the P/N boundary ratio of each cell can be independently customized for near-threshold operation. Lowering the supply voltage is one of the most promising approaches for reducing the power consumption of VLSI circuits; however, it increases the imbalance between rise and fall delays for cells having transistor stacks. A conventional cell library with a fixed P/N boundary cannot efficiently compensate for this delay imbalance. The proposed structure allows the P/N boundary ratio to be optimized individually for each standard cell, and therefore cancels the imbalance between rise and fall delays at the expense of cell area. The proposed structure is verified using measurement results of ring oscillator circuits and simulation results of benchmark circuits in 65nm CMOS. The experiments with ISCAS'85 benchmark circuits demonstrate that a standard cell library consisting of the proposed cells reduces the power consumption of the benchmark circuits by 16% on average without increasing the circuit area, compared to that of the same circuits synthesized with a library that is not optimized for near-threshold operation.

  • A Trusted Network Access Protocol for WLAN Mesh Networks

    Yuelei XIAO  Yumin WANG  Liaojun PANG  Shichong TAN  

     
    LETTER-Information Network

    Vol: E96-D  No: 12  Page(s): 2865-2869

    To solve the problems of the existing trusted network access protocols for Wireless Local Area Network (WLAN) mesh networks, we propose a new trusted network access protocol for WLAN mesh networks, which is abbreviated as WMN-TNAP. This protocol implements mutual user authentication and Platform-Authentication between the supplicant and Mesh Authenticator (MA), and between the supplicant and Authentication Server (AS) of a WLAN mesh network, establishes the key management system for the WLAN mesh network, and effectively prevents the platform configuration information of the supplicant, MA and AS from leaking out. Moreover, this protocol is proved secure based on the extended Strand Space Model (SSM) for trusted network access protocols.

  • Floorplan Driven Architecture and High-Level Synthesis Algorithm for Dynamic Multiple Supply Voltages

    Shin-ya ABE  Youhua SHI  Kimiyoshi USAMI  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-High-Level Synthesis and System-Level Design

    Vol: E96-A  No: 12  Page(s): 2597-2611

    In this paper, we propose an adaptive voltage huddle-based distributed-register architecture (AVHDR architecture), which integrates dynamic multiple supply voltages and interconnection delay into high-level synthesis. In the AVHDR architecture, voltages can be dynamically assigned for energy reduction. In other words, low supply voltages are assigned to non-critical operations, and leakage power is cut off by turning off the power supply to the sleeping functional units. Next, an AVHDR-based high-level synthesis algorithm is proposed. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, the modules in each huddle can be placed close to each other, and the corresponding AVHDR architecture can be generated and optimized with floorplanning information. Experimental results show that, on average, our algorithm achieves 43.9% energy savings compared with conventional algorithms.

  • Nonlinear Metric Learning with Deep Independent Subspace Analysis Network for Face Verification

    Xinyuan CAI  Chunheng WANG  Baihua XIAO  Yunxue SHAO  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E96-D  No: 12  Page(s): 2830-2838

    Face verification is the task of determining whether two given face images represent the same person or not. It is a very challenging task, as face images captured in uncontrolled environments may have large variations in illumination, expression, pose, background, etc. The crucial problem is how to compute the similarity of two face images. Metric learning has provided a viable solution to this problem. Until now, many metric learning algorithms have been proposed, but they are usually limited to learning a linear transformation. In this paper, we propose a nonlinear metric learning method, which learns an explicit mapping from the original space to an optimal subspace using a deep Independent Subspace Analysis (ISA) network. Compared to linear or kernel-based metric learning methods, the proposed deep ISA network is a deep and local learning architecture, and therefore exhibits a more powerful ability to learn the nature of highly variable datasets. We evaluate our method on the Labeled Faces in the Wild dataset, and the results show superior performance over some state-of-the-art methods.

  • A Cost-Effective Buffer Map Notification Scheme for P2P VoDs Supporting VCR Operations

    Ryusuke UEDERA  Satoshi FUJITA  

     
    PAPER

    Vol: E96-D  No: 12  Page(s): 2713-2719

    In this paper, we propose a new buffer map notification scheme for Peer-to-Peer Video-on-Demand systems (P2P VoDs) which support VCR operations such as fast-forward, fast-backward, and seek. To enhance the fluidity of such VCR operations, we need to make each piece as small as possible. However, such a refinement significantly degrades the performance of buffer map notification schemes with respect to overhead, piece availability, and the efficiency of resource utilization. The basic idea behind our proposed scheme is to use a piece-based buffer map and a segment-based buffer map in a complementary manner. Simulation results indicate that the proposed scheme increases the accuracy of the information on piece availability in the neighborhood at a sufficiently low cost, which reduces the intermittent waiting time of each peer by more than 40% even when 50% of peers conduct the fast-forward operation over a range of 30% of the entire video.

  • Cooperative VM Migration: A Symbiotic Virtualization Mechanism by Leveraging the Guest OS Knowledge

    Ryousei TAKANO  Hidemoto NAKADA  Takahiro HIROFUCHI  Yoshio TANAKA  Tomohiro KUDOH  

     
    PAPER

    Vol: E96-D  No: 12  Page(s): 2675-2683

    Virtual machine (VM) migration is useful for improving flexibility and maintainability in cloud computing environments. However, VM monitor (VMM)-bypass I/O technologies, including PCI passthrough and SR-IOV, which can significantly reduce the overhead of I/O virtualization, make VM migration impossible. This paper proposes a novel and practical mechanism, called Symbiotic Virtualization (SymVirt), for enabling migration and checkpoint/restart on a virtualized cluster with VMM-bypass I/O devices, without virtualization overhead during normal operations. SymVirt allows a VMM to cooperate with a message passing layer on the guest OS, and it realizes VM-level migration and checkpoint/restart by combining user-level dynamic device configuration with coordination of distributed VMMs. We have implemented the proposed mechanism on top of QEMU/KVM and the Open MPI system. All PCI devices, including InfiniBand, Ethernet, and Myrinet, are supported without implementing specific para-virtualized drivers, and it is not necessary to modify either the MPI runtime or the applications. Using the proposed mechanism, we demonstrate reactive and proactive fault tolerance (FT) mechanisms on a virtualized InfiniBand cluster. We have confirmed the effectiveness using both a memory-intensive micro benchmark and the NAS parallel benchmark.

  • An Access-Point Aggregation Approach for Energy-Saving Wireless Local Area Networks

    Md. Ezharul ISLAM  Nobuo FUNABIKI  Toru NAKANISHI  Kan WATANABE  

     
    PAPER

    Vol: E96-B  No: 12  Page(s): 2986-2997

    Nowadays, with the spread of inexpensive small communication devices, a number of wireless local area networks (WLANs) have been deployed, even in the same building, for Internet access services. Their wireless access-points (APs) are often independently installed and managed by different groups such as departments or laboratories in a university or a company. As a result, a user host can access multiple WLANs by detecting signals from their APs, which increases the energy consumption and the operational cost. It may also degrade the communication performance by increasing interference. In this paper, we present an AP aggregation approach to solve these problems in multiple WLAN environments by aggregating the deployed APs of different groups into a limited number of APs using virtual APs. First, we formulate the AP aggregation problem as a combinatorial optimization problem and prove the NP-completeness of its decision problem. Then, we propose a heuristic algorithm for it composed of five phases. We verify the effectiveness through extensive simulations using the WIMNET simulator.
