The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] TE(21534hit)

5701-5720hit(21534hit)

  • Window Memory Layout Scheme for Alternate Row-Wise/Column-Wise Matrix Access

    Lei GUO  Yuhua TANG  Yong DOU  Yuanwu LEI  Meng MA  Jie ZHOU  

     
    PAPER-Computer System

      Vol:
    E96-D No:12
      Page(s):
    2765-2775

    The effective bandwidth of the dynamic random-access memory (DRAM) for the alternate row-wise/column-wise matrix access (AR/CMA) mode, which is a basic characteristic in scientific and engineering applications, is very low. Therefore, we propose the window memory layout scheme (WMLS), which is a matrix layout scheme that does not require transposition, for AR/CMA applications. This scheme maps one row of a logical matrix into a rectangular memory window of the DRAM to balance the bandwidth of the row- and column-wise matrix access and to increase the DRAM IO bandwidth. The optimal window configuration is theoretically analyzed to minimize the total number of no-data-visit operations of the DRAM. Different WMLS implementationsare presented according to the memory structure of field-programmable gata array (FPGA), CPU, and GPU platforms. Experimental results show that the proposed WMLS can significantly improve DRAM bandwidth for AR/CMA applications. achieved speedup factors of 1.6× and 2.0× are achieved for the general-purpose CPU and GPU platforms, respectively. For the FPGA platform, the WMLS DRAM controller is custom. The maximum bandwidth for the AR/CMA mode reaches 5.94 GB/s, which is a 73.6% improvement compared with that of the traditional row-wise access mode. Finally, we apply WMLS scheme for Chirp Scaling SAR application, comparing with the traditional access approach, the maximum speedup factors of 4.73X, 1.33X and 1.56X can be achieved for FPGA, CPU and GPU platform, respectively.

  • An Efficiency-Aware Scheduling for Data-Intensive Computations on MapReduce Clusters

    Hui ZHAO  Shuqiang YANG  Hua FAN  Zhikun CHEN  Jinghu XU  

     
    PAPER

      Vol:
    E96-D No:12
      Page(s):
    2654-2662

    Scheduling plays a key role in MapReduce systems. In this paper, we explore the efficiency of an MapReduce cluster running lots of independent and continuously arriving MapReduce jobs. Data locality and load balancing are two important factors to improve computation efficiency in MapReduce systems for data-intensive computations. Traditional cluster scheduling technologies are not well suitable for MapReduce environment, there are some in-used schedulers for the popular open-source Hadoop MapReduce implementation, however, they can not well optimize both factors. Our main objective is to minimize total flowtime of all jobs, given it's a strong NP-hard problem, we adopt some effective heuristics to seek satisfied solution. In this paper, we formalize the scheduling problem as job selection problem, a load balance aware job selection algorithm is proposed, in task level we design a strict data locality tasks scheduling algorithm for map tasks on map machines and a load balance aware scheduling algorithm for reduce tasks on reduce machines. Comprehensive experiments have been conducted to compare our scheduling strategy with well-known Hadoop scheduling strategies. The experimental results validate the efficiency of our proposed scheduling strategy.

  • A Fully Optical Ring Network-on-Chip with Static and Dynamic Wavelength Allocation

    Ahmadou Dit Adi CISSE  Michihiro KOIBUCHI  Masato YOSHIMI  Hidetsugu IRIE  Tsutomu YOSHINAGA  

     
    PAPER

      Vol:
    E96-D No:12
      Page(s):
    2545-2554

    Silicon photonics Network-on-Chips (NoCs) have emerged as an attractive solution to alleviate the high power consumption of traditional electronic interconnects. In this paper, we propose a fully optical ring NoC that combines static and dynamic wavelength allocation communication mechanisms. A different wavelength-channel is statically allocated to each destination node for light weight communication. Contention of simultaneous communication requests from multiple source nodes to the destination is solved by a token based arbitration for the particular wavelength-channel. For heavy load communication, a multiwavelength-channel is available by requesting it in execution time from source node to a special node that manages dynamic allocation of the shared multiwavelength-channel among all nodes. We combine these static and dynamic communication mechanisms in a same network that introduces selection techniques based on message size and congestion information. Using a photonic NoC simulator based on Phoenixsim, we evaluate our architecture under uniform random, neighbor, and hotspot traffic patterns. Simulation results show that our proposed fully optical ring NoC presents a good performance by utilizing adequate static and dynamic channels based on the selection techniques. We also show that our architecture can reduce by more than half, the energy consumption necessary for arbitration compared to hybrid photonic ring and mesh NoCs. A comparison with several previous works in term of architecture hardware cost shows that our architecture can be an attractive cost-performance efficient interconnection infrastructure for future SoCs and CMPs.

  • Teachability of a Subclass of Simple Deterministic Languages

    Yasuhiro TAJIMA  

     
    PAPER-Fundamentals of Information Systems

      Vol:
    E96-D No:12
      Page(s):
    2733-2742

    We show teachability of a subclass of simple deterministic languages. The subclass we define is called stack uniform simple deterministic languages. Teachability is derived by showing the query learning algorithm for this language class. Our learning algorithm uses membership, equivalence and superset queries. Then, it terminates in polynomial time. It is already known that simple deterministic languages are polynomial time query learnable by context-free grammars. In contrast, our algorithm guesses a hypothesis by a stack uniform simple deterministic grammar, thus our result is strict teachability of the subclass of simple deterministic languages. In addition, we discuss parameters of the polynomial for teachability. The “thickness” is an important parameter for parsing and it should be one of parameters to evaluate the time complexity.

  • A Power Adaptation Method for Finite Length Block Fading Channel with Multiple Antennas

    Chen JI  Jiang WU  Dongming WANG  Xiaohu YOU  

     
    PAPER-Fundamental Theories for Communications

      Vol:
    E96-B No:12
      Page(s):
    3041-3049

    We analyze a power adaptation method to maximize the achievable rate under the finite block length regime, for MIMO block fading channel with channel state information available at both the transmitter and receiver side. We find a convex approximation to the lower bound of the achievable rate, and it leads to a simple power and rate adaptation method. We show that the method achieves near optimal channel rate under the finite block length regime. Compared to the classical waterfilling method, the proposed method can further improve achievable rate especially for short block lengths.

  • Real-Time and Memory-Efficient Arrhythmia Detection in ECG Monitors Using Antidictionary Coding

    Takahiro OTA  Hiroyoshi MORITA  Adriaan J. de Lind van WIJNGAARDEN  

     
    PAPER-Source Coding

      Vol:
    E96-A No:12
      Page(s):
    2343-2350

    This paper presents a real-time and memory-efficient arrhythmia detection system with binary classification that uses antidictionary coding for the analysis and classification of electrocardiograms (ECGs). The measured ECG signals are encoded using a lossless antidictionary encoder, and the system subsequently uses the compression rate to distinguish between normal beats and arrhythmia. An automated training data procedure is used to construct the automatons, which are probabilistic models used to compress the ECG signals, and to determine the threshold value for detecting the arrhythmia. Real-time computer simulations with samples from the MIT-BIH arrhythmia database show that the averages of sensitivity and specificity of the proposed system are 97.8% and 96.4% for premature ventricular contraction detection, respectively. The automatons are constructed using training data and comprise only 11 kilobytes on average. The low complexity and low memory requirements make the system particularly suitable for implementation in portable ECG monitors.

  • An Access-Point Aggregation Approach for Energy-Saving Wireless Local Area Networks

    Md. Ezharul ISLAM  Nobuo FUNABIKI  Toru NAKANISHI  Kan WATANABE  

     
    PAPER

      Vol:
    E96-B No:12
      Page(s):
    2986-2997

    Nowadays, with spreads of inexpensive small communication devices, a number of wireless local area networks (WLANs) have been deployed even in the same building for the Internet access services. Their wireless access-points (APs) are often independently installed and managed by different groups such as departments or laboratories in a university or a company. Then, a user host can access to multiple WLANs by detecting signals from their APs, which increases the energy consumption and the operational cost. It may also degrade the communication performance by increasing interferences. In this paper, we present an AP aggregation approach to solve these problems in multiple WLAN environments by aggregating deployed APs of different groups into limited ones using virtual APs. First, we formulate the AP aggregation problem as a combinatorial optimization problem and prove the NP-completeness of its decision problem. Then, we propose its heuristic algorithm composed of five phases. We verify the effectiveness through extensive simulations using the WIMNET simulator.

  • Deployment of OpenFlow/SDN Technologies to Carrier Services Open Access

    Yoichi SATO  Ichiro FUKUDA  Tomonori FUJITA  

     
    INVITED PAPER

      Vol:
    E96-B No:12
      Page(s):
    2946-2952

    The use of computing resources on network is becoming active in the Internet and private networks. OpenFlow/Software-Defined Networking (SDN) is drawing attention as a method to control network virtualization for the cloud computing services and other carrier services. This paper introduces examples of OpenFlow/SDN technologies applied to commercial cloud services. Various activities to expand coverage over commercial carrier networks are also mentioned.

  • Effect of Applied Magnetic Field Angle and Intensity on Magnetic Cluster State of Stacked Perpendicular Recording Media

    Shohei SATO  Yoshiaki YAMAGUCHI  Ryuji SUGITA  

     
    PAPER

      Vol:
    E96-C No:12
      Page(s):
    1479-1483

    The uniform magnetic field of various strength was applied to the perpendicularly and in-plane demagnetized media, and the change in each magnetic cluster state was investigated as the fundamental investigation of the influence of demagnetization method on noise during signal recording on the stacked perpendicular recording media. The results showed that the in-plane demagnetization can achieve lower noise level if the recording field is not very high. In other words, the in-plane demagnetization is an effective way to achieve lower noise in transition area, near track edge of recorded bit, and in high-density bit. In addition, the simulation clarified that this noise reduction can be explained using the idea of sub-domain structure in the in-plane demagnetized media.

  • Deposition of Inclined Magnetic Anisotropy Film by Oblique Incidence Collimated Sputtering

    Naoki HONDA  Akito HONDA  

     
    PAPER

      Vol:
    E96-C No:12
      Page(s):
    1469-1473

    Deposition of inclined anisotropy film for bit-patterned media was studied using an oblique incidence collimated sputtering. Pt underlayer increased the inclination angle of magnetic layer more than 5° on a Ta seed layer. Further increase of the angle was obtained by annealing Pt/Ru underlayer resulting an inclination angle of 9.4° for a Co-Cr15.5 film on the underlayer. The magnetic properties of the Co-Cr15.5 film with an inclined orientation was estimated comparing measured hysteresis loops with simulated ones, which indicated to have inclined magnetic anisotropy with an anisotropy field of about 4.5kOe and a deflection angle of the anisotropy about the same as that of the crystalline orientation.

  • A Recorded-Bit Patterning Scheme with Accumulated Weight Decision for Bit-Patterned Media Recording

    Autthasith ARRAYANGKOOL  Chanon WARISARN  Piya KOVINTAVEWAT  

     
    PAPER

      Vol:
    E96-C No:12
      Page(s):
    1490-1496

    To achieve high recording density in a bit-patterned media recording system, the spacing between data bit islands in both the along-track and the across-track directions must be decreased, thus leading to the increase of two-dimensional (2D) interference. One way to reduce the 2D interference is to apply a 2D coding scheme on a data sequence before recording; however, this method usually requires many redundant bits, thus lowering a code rate. Therefore, we propose a novel 2D coding scheme referred to as a recorded-bit patterning (RBP) scheme to mitigate the 2D interference, which requires no redundant bits at the expense of using more buffer memory. Specifically, an input data sequence is first split into three tracks in which will then be rotated to find the best 3-track data pattern based on a look-up table before recording, such that the shifted data tracks yield the least effect of 2D interference in the readback signal. Numerical results indicate that the proposed RBP scheme provides a significant performance improvement if compared to a conventional system (without 2D coding), especially when the recording density is high and/or the position jitter noise is large.

  • Hybrid Message-Passing Algorithm and Architecture for Decoding Cyclic Non-binary LDPC Codes

    Yichao LU  Gang HE  Guifen TIAN  Satoshi GOTO  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E96-A No:12
      Page(s):
    2652-2659

    Recently, non-binary low-density parity-check (NB-LDPC) codes starts to show their superiority in achieving significant coding gains when moderate codeword lengths are adopted. However, the overwhelming decoding complexity keeps NB-LDPC codes from being widely employed in modern communication devices. This paper proposes a hybrid message-passing decoding algorithm which consumes very low computational complexity. It achieves competitive error performance compared with conventional Min-max algorithm. Simulation result on a (255,174) cyclic code shows that this algorithm obtains at least 0.5dB coding gain over other state-of-the-art low-complexity NB-LDPC decoding algorithms. A partial-parallel NB-LDPC decoder architecture for cyclic NB-LDPC codes is also developed based on this algorithm. Optimization schemes are employed to cut off hard decision symbols in RAMs and also to store only part of the reliability messages. In addition, the variable node units are redesigned especially for the proposed algorithm. Synthesis results demonstrate that about 24.3% gates and 12% memories can be saved over previous works.

  • A Novel Pedestrian Detector on Low-Resolution Images: Gradient LBP Using Patterns of Oriented Edges

    Ahmed BOUDISSA  Joo Kooi TAN  Hyoungseop KIM  Takashi SHINOMIYA  Seiji ISHIKAWA  

     
    LETTER-Pattern Recognition

      Vol:
    E96-D No:12
      Page(s):
    2882-2887

    This paper introduces a simple algorithm for pedestrian detection on low resolution images. The main objective is to create a successful means for real-time pedestrian detection. While the framework of the system consists of edge orientations combined with the local binary patterns (LBP) feature extractor, a novel way of selecting the threshold is introduced. Using the mean-variance of the background examples this threshold improves significantly the detection rate as well as the processing time. Furthermore, it makes the system robust to uniformly cluttered backgrounds, noise and light variations. The test data is the INRIA pedestrian dataset and for the classification, a support vector machine with a radial basis function (RBF) kernel is used. The system performs at state-of-the-art detection rates while being intuitive as well as very fast which leaves sufficient processing time for further operations such as tracking and danger estimation.

  • A New Face Relighting Method Based on Edge-Preserving Filter

    Lingyu LIANG  Lianwen JIN  

     
    LETTER-Computer Graphics

      Vol:
    E96-D No:12
      Page(s):
    2904-2907

    We propose a new face relighting method using an illuminance template generated from a single reference portrait. First, the reference is wrapped according to the shape of the target. Second, we employ a new spatially variant edge-preserving smoothing filter to remove the facial identity and texture details of the wrapped reference, and obtain the illumination template. Finally, we relight the target with the template in CIELAB color space. Experiments show the effectiveness of our method for both grayscale and color faces taken from different databases, and the comparisons with previous works demonstrate a better relighting effect produced by our method.

  • Local Information, Observable Parameters, and Global View Open Access

    Hiroshi SAITO  

     
    INVITED PAPER

      Vol:
    E96-B No:12
      Page(s):
    3017-3027

    The “Blind Men and an Elephant” is an old Indian story about a group of blind men who encounter an elephant and do not know what it is. This story describes the difficulties of understanding a large concept or global view based on only local information. Modern technologies enable us to easily obtain and retain local information. However, simply collecting local information does not give us a global view, as evident in this old story. This paper gives a concrete model of this story on the plane to theoretically and mathematically discuss it. It analyzes what information we can obtain from collected local information. For a convex target object modeling the elephant and a convex sensing area, it is proven that the size and perimeter length of the target object are the only parameters that can be observed by randomly deployed sensors modeling the blind men. To increase the number of observable parameters, this paper argues that non-convex sensing areas are important and introduces composite sensor nodes as an approach to implement non-convex sensing areas. The paper also derives a model on the discrete space and analyzes it. The analysis results on the discrete space are applicable to some network related issues such as link quality estimation in a part of a network based on end-to-end probing.

  • Performance Evaluation of Probing Front-End Circuits for On-Chip Noise Monitoring

    Yuuki ARAGA  Nao UEDA  Yasumasa TAKAGI  Makoto NAGATA  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E96-A No:12
      Page(s):
    2516-2523

    A probing front end circuit (PFE) senses and digitizes voltage noises at in-circuit locations on such as power supply wiring and substrate taps in a chip, with the simplest circuit construction only with a source follower or a unity gain buffer, followed by a latch comparator. The PFE with 2.5V I/O transistors in a 65nm CMOS technology node demonstrates 9.0 ENOB and 60.7dB SFDR in equivalent sampling at 1.0Gs/s, for a sinusoidal waveform of 10MHz with 200mV peak-to-peak amplitude. Behavioral modeling of an entire waveform acquisition system using PFEs includes the statistical variations of reference voltage and sampling timing. The simulation quantitatively explains the measured dynamic properties of on-chip noise monitoring, such as the AC response in SNDR and digitizing throughputs, with the clear dependency on the frequency and amplitude of input waveforms.

  • A New 8-Bit AES Design for Wireless Network Applications

    Ming-Chih CHEN  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E96-A No:12
      Page(s):
    2587-2596

    In this paper, we present a pure hardware implementation of the advanced encryption standard (AES) with 8-bit data path with both encryption/decryption abilities for applications of wireless network. To achieve the requirements of low area resource and high throughput performance, the 8-bit AES design overlaps the MixColumns (MC) and ShiftRows (SR), Inverse MixColumns (IMC) and Inverse ShiftRows (ISR) operations in order to reduce the required clock cycles and critical path delay of transformations involved. The combinations of SB with ISB, MC with IMC, and SR with ISR can effectively reduce the area cost of the AES realization. We implement the AES processor in an ASIC chip. The design has the area cost of 4.3 k-gates with throughput of 72Mbps which can meet the throughput requirement of IEEE 802.11g wireless network standard. From the experimental results, we observe that our AES design has better performance compared with other previous designs.

  • Improvement of Steiner Tree Algorithm: Branch-Based Multi-Cast

    Hiroshi MATSUURA  

     
    PAPER-Fundamentals of Information Systems

      Vol:
    E96-D No:12
      Page(s):
    2743-2752

    There is a well known Steiner tree algorithm called minimum-cost paths heuristic (MPH), which is used for many multicast network operations and is considered a benchmark for other Steiner tree algorithms. MPH's average case time complexity is O(m(l+nlog n)), where m is the number of end nodes, n is the number of nodes, and l is the number of links in the network, because MPH has to run Dijkstra's algorithm as many times as the number of end nodes. The author recently proposed a Steiner tree algorithm called branch-based multi-cast (BBMC), which produces exactly the same multicast tree as MPH in a constant processing time irrespective of the number of multicast end nodes. However, the theoretical result for the average case time complexity of BBMC was expressed as O(log m(l+nlog n)) and could not accurately reflect the above experimental result. This paper proves that the average case time complexity of BBMC can be shortened to O(l+nlog n), which is independent of the number of end nodes, when there is an upper limit of the node degree, which is the number of links connected to a node. In addition, a new parameter β is applied to BBMC, so that the multicast tree created by BBMC has less links on it. Even though the tree costs increase due to this parameter, the tree cost increase rates are much smaller than the link decrease rates.

  • Weight Distribution for Non-binary Cluster LDPC Code Ensemble

    Takayuki NOZAKI  Masaki MAEHARA  Kenta KASAI  Kohichi SAKANIWA  

     
    PAPER-Coding Theory

      Vol:
    E96-A No:12
      Page(s):
    2382-2390

    This paper derives the average symbol and bit weight distributions for the irregular non-binary cluster low-density parity-check (LDPC) code ensembles. Moreover, we give the exponential growth rates of the average weight distributions in the limit of large code length. We show the condition that the typical minimum distances linearly grow with the code length.

  • An Exact Approach for GPC-Based Compressor Tree Synthesis

    Taeko MATSUNAGA  Shinji KIMURA  Yusuke MATSUNAGA  

     
    PAPER-Logic Synthesis, Test and Verification

      Vol:
    E96-A No:12
      Page(s):
    2553-2560

    Multi-operand adders that calculate the summation of more than two operands usually consist of compressor trees, which reduce the number of operands to two without any carry propagation, and carry-propagate adders for the two operands in the ASIC implementation. Compressor trees that consist of full adders and half adders cannot be implemented efficiently on LUT-based FPGAs, and carry-chains or dedicated structures have been utilized to produce multi-operand adders on FPGAs. Recent studies indicate that compressor trees can be implemented efficiently on LUTs using Generalized Parallel Counters (GPCs) as the building blocks of compressor trees. This paper addresses the problem of synthesizing compressor trees based on GPCs. Based on the observation that characteristics such as the area, power, and delay correlate roughly to the total number and the maximum level of GPCs, the target problem can be regarded as a minimization problem for the total number of GPCs and the maximum levels of the GPCs, for which an ILP-based approach is proposed. The key point of our formulation is not to model the problem based on the structures of compressor trees like the existing approach, but instead the compression process itself is used to reduce the number of variables and constraints in the ILP formulation. The experimental results demonstrate the advantage of our formulation in terms of the quality and runtime.

5701-5720hit(21534hit)