The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] ALG(2355hit)

1081-1100hit(2355hit)

  • Efficient and Tailored Resource Management for the P2P Web Caching

    Kyungbaek KIM  Daeyeon PARK  

     
    PAPER-Network System

      Vol:
    E90-D No:1
      Page(s):
    48-57

    While web proxy caching is a widely deployed technique, the performance of a proxy cache is limited by the local storage. Some studies have addressed this limitation by using the residual resources of clients via a p2p method and have achieved a very high hit rate. However, these approaches treat web objects as homogeneous objects and there is no consideration of various web characteristics. Consequently, the byte hit rate of the system is limited, external bandwidth is wasted, and perceived user latency is increased. The present paper suggests an efficient p2p based web caching technique that manages objects with different policies so as to exploit the characteristics of web objects, such as size and temporal locality. Small objects are stored alone whereas large objects are stored by dividing them into numerous small blocks, which are distributed in clients. On a proxy cache, header blocks of large objects take the place of objects themselves and smaller objects are cached. This technique increases the hit rate. Unlike a web cache, which evicts large objects as soon as possible in the case where clients fulfill the role of backup storage, large objects are given higher priority than small objects in the proposed approach. This maximizes the effect of hits for large objects and thereby increases the byte hit rate. Furthermore, we construct simple latency models for various p2p based web caching systems and analyze the effects of the proposed policies on these systems. We then examine the performances of the efficient policies via a trace driven simulation. The results demonstrate that the proposed techniques effectively enhance web cache performance, including hit rate, byte hit rate, and response time.

  • Formal Design of Arithmetic Circuits Based on Arithmetic Description Language

    Naofumi HOMMA  Yuki WATANABE  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER-Circuit Synthesis

      Vol:
    E89-A No:12
      Page(s):
    3500-3509

    This paper presents a formal design of arithmetic circuits using an arithmetic description language called ARITH. The key idea in ARITH is to describe arithmetic algorithms directly with high-level mathematical objects (i.e., number representation systems and arithmetic operations/formulae). Using ARITH, we can provide formal description of arithmetic algorithms including those using unconventional number systems. In addition, the described arithmetic algorithms can be formally verified by equivalence checking with formula manipulations. The verified ARITH descriptions are easily translated into the equivalent HDL descriptions. In this paper, we also present an application of ARITH to an arithmetic module generator, which supports a variety of hardware algorithms for 2-operand adders, multi-operand adders, multipliers, constant-coefficient multipliers and multiply accumulators. The language processing system of ARITH incorporated in the generator verifies the correctness of ARITH descriptions in a formal method. As a result, we can obtain highly-reliable arithmetic modules whose functions are completely verified at the algorithm level.

  • Population Fitness Probability for Effectively Terminating Evolution Operations of a Genetic Algorithm

    Heng-Chou CHEN  Oscal T.-C. CHEN  

     
    LETTER-Biocybernetics, Neurocomputing

      Vol:
    E89-D No:12
      Page(s):
    3012-3014

    The probability associated with population fitness in a Genetic Algorithm (GA) is studied using the concept of average Euclidean distance. Based on the probability derived from population fitness, the GA can effectively terminate its evolution operations to mitigate the total computational load. Simulation results verify the feasibility of the derived probability used for the GA's termination strategy.

  • Evolutional Algorithm Based Learning of Time Varying Multipath Fading Channels for Software Defined Radio

    Gagik MKRTCHYAN  Katsuhiro NAITO  Kazuo MORI  Hideo KOBAYASHI  

     
    LETTER

      Vol:
    E89-B No:12
      Page(s):
    3269-3273

    Software defined radio, which uses reconfigurable signal processing devices, requires the determination of multiple unknown parameters to realize the potential capabilities of adaptive communication. Evolutional algorithms are optimal multi dimensional search techniques, and are well known to be effective for parameter determination. This letter proposes an evolutional algorithm for learning the mobile time-varying channel parameters without any specific assumption of scattering distribution. The proposed method is very simple to realize, but can provide precise channel estimation results. Simulations of an OFDM system show that for an example of OFDM communication under the time-varying fading channel, the proposed learning method can achieve the better BER performance.

  • Power-Efficient LDPC Decoder Architecture Based on Accelerated Message-Passing Schedule

    Kazunori SHIMIZU  Tatsuyuki ISHIKAWA  Nozomu TOGAWA  Takeshi IKENAGA  Satoshi GOTO  

     
    PAPER-VLSI Architecture

      Vol:
    E89-A No:12
      Page(s):
    3602-3612

    In this paper, we propose a power-efficient LDPC decoder architecture based on an accelerated message-passing schedule. The proposed decoder architecture is characterized as follows: (i) Partitioning a pipelined operation not to read and write intermediate messages simultaneously enables the accelerated message-passing schedule to be implemented with single-port SRAMs. (ii) FIFO-based buffering reduces the number of SRAM banks and words of the LDPC decoder based on the accelerated message-passing schedule. The proposed LDPC decoder keeps a single message for each non-zero bit in a parity check matrix as well as a classical schedule while achieving the accelerated message-passing schedule. Implementation results in 0.18 [µm] CMOS technology show that the proposed decoder architecture reduces an area of the LDPC decoder by 43% and a power dissipation by 29% compared to the conventional architecture based on the accelerated message-passing schedule.

  • Fast 2-Dimensional 88 Integer Transform Algorithm Design for H.264/AVC Fidelity Range Extensions

    Chih-Peng FAN  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E89-D No:12
      Page(s):
    3006-3011

    In this letter, efficient two-dimensional (2-D) fast algorithms for realizations of 88 forward and inverse integer transforms in H.264/AVC fidelity range extensions (FRExt) are proposed. Based on matrix factorizations with Kronecker product and direct sum operations, efficient fast 2-D 88 forward and inverse integer transforms can be derived from the one-dimensional (1-D) fast 88 forward and inverse integer transforms through matrix operations. The proposed fast 2-D 88 forward and inverse integer transform designs don't require transpose memory in hardware realizations. The fast 2-D 88 integer transforms require fewer latency delays and provide a larger throughput rate than the row-column based method. With regular modularity, the proposed fast algorithms are suitable for VLSI implementations to achieve H.264/AVC FRExt high-profile signal processing.

  • A Structural Approach for Transistor Circuit Synthesis

    Hiroaki YOSHIDA  Makoto IKEDA  Kunihiro ASADA  

     
    PAPER-Circuit Synthesis

      Vol:
    E89-A No:12
      Page(s):
    3529-3537

    This paper presents a structural approach for synthesizing arbitrary multi-output multi-stage static CMOS circuits at the transistor level, targeting the reduction of transistor counts. To make the problem tractable, the solution space is restricted to the circuit structures which can be obtained by performing algebraic transformations on an arbitrary prime-and-irredundant two-level circuit. The proposed algorithm is guaranteed to find the optimal solution within the solution space. The circuit structures are implicitly enumerated via structural transformations on a single graph structure, then a dynamic-programming based algorithm efficiently finds the minimum solution among them. Experimental results on a benchmark suite targeting standard cell implementations demonstrate the feasibility and effectiveness of the proposed approach. We also demonstrated the efficiency of the proposed algorithm by a numerical analysis on randomly-generated problems.

  • An Efficient and Effective Algorithm for Online Task Placement with I/O Communications in Partially Reconfigurable FPGAs

    Mitsuru TOMONO  Masaki NAKANISHI  Shigeru YAMASHITA  Kazuo NAKAJIMA  Katsumasa WATANABE  

     
    PAPER-System Level Design

      Vol:
    E89-A No:12
      Page(s):
    3416-3426

    In a partially reconfigurable FPGA of the future, arbitrary portions of its logic resources and interconnection networks will be reconfigured without affecting the other parts. Multiple tasks will be mapped and executed concurrently in such an FPGA. Efficient execution of the tasks using the limited resources of the FPGA will necessitate effective resource management. A number of online FPGA placement methods have recently been proposed for such an FPGA. However, they cannot handle I/O communications of the tasks. Taking such I/O communications into consideration, we introduce a new approach to online FPGA placement. We present an algorithm for placing each arriving task in an empty area so as to complete all the tasks efficiently. We develop two fitting strategies to effectively handle I/O communications of the tasks. Our experimental results show that properly weighted combinations of these and two other previously proposed strategies enable this algorithm to run very fast and make an effective placement of the tasks. In fact, we show that the overhead associated with the use of this algorithm is negligible as compared to the total execution time of the tasks.

  • Signal Reconstruction with Boundary-Matching via Iterative Algorithm

    Chau-Yun HSU  Tsung-Ming LO  

     
    PAPER-Digital Signal Processing

      Vol:
    E89-A No:11
      Page(s):
    3283-3289

    In various applications of signals transmission and processing, there is always a possibility of loss of samples. The iterative algorithm of Papoulis-Gerchberg (PG) is famous for solving the band-limited lost samples recovery problem. Two problems are known in this domain: (1) a band-limited signal practically is difficult to be obtained and (2) the convergence rate is too slow. By inserting a subtraction by a polynomial in the PG algorithm, using boundary-matched concept, a significant increase in performance and speed of its convergence has been achieved. In this paper, we propose an efficient approach to restore lost samples by adding a preprocess which meets the periodic boundary conditions of Fast Fourier transform in the iteration method. The accuracy of lost samples reconstruction by using the PG algorithm can be greatly improved with the aid of mapping the input data sequence of satisfying the boundary conditions. Further, we also developed another approach that force the signal to meet a new critical boundary conditions in Fourier domain that make the parameters of the preprocessing can be easily obtained. The simulation indicates that the mean square error (MSE) of the recovery and the convergence rate with the preprocess concept is better and faster than the one without preprocess concept. Our both proposed approaches can also be applied to other cases of signal restoration, which allow Cadzow's iterative processing method, with more convenience and flexibility.

  • Minimum Variance Multi-User Detection with Optimum Subband Decomposition over Multipath Channels

    Wan-Shing YANG  Wen-Hsien FANG  Che-Yu LIN  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E89-B No:11
      Page(s):
    3075-3082

    This paper presents a linear constrained minimum variance multiuser detection (MUD) scheme for DS-CDMA systems, which makes full use of the available spreading sequences of the users as well as the relevant channel information of the incoming rays in the construction of the constraint matrix. To further enhance the performance, a statistical optimum filter bank in combination with the developed minimum variance MUD with the partitioned linear interference canceller (PLIC) as the underlying structure is also addressed. The determination of the filter bank coefficients, however, calls for computationally demanding nonlinear programming. To alleviate the computational overhead, an iterative procedure is also proposed, which solves the Rayleigh quotient in each iteration. Furthermore, the expressions of the output signal to interference plus noise ratio (SINR) are also determined to provide further insights into the proposed approach. Conducted simulations validate the new scheme.

  • Systematic Interpretation of Redundant Arithmetic Adders in Binary and Multiple-Valued Logic

    Naofumi HOMMA  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E89-C No:11
      Page(s):
    1645-1654

    This paper presents an algorithm-level interpretation of fast adder structures in binary/multiple-valued logic. The key idea is to employ a unified representation of addition algorithms called Counter Tree Diagrams (CTDs). The use of CTDs makes it possible to describe and analyze addition algorithms at various levels of abstraction. A high-level CTD represents a network of coarse-grained components associated with multiple-valued logic devices, while a low-level CTD represents a network of primitive components directly mapped onto binary logic devices. The level of abstraction in circuit representation can be changed by decomposition of CTDs. We can derive possible variations of adder structures by decomposing a high-level CTD into low-level CTDs. This paper demonstrates the interpretation of redundant arithmetic adders based on CTDs. We first introduce an extension of CTDs to represent possible redundant arithmetic adders with limited carry propagation. Using the extended version of CTDs, we can classify the conventional adder structures including those using emerging devices into three types in a systematic way.

  • Joint Estimation of Frequency Offset and Channel Frequency Response Using EM Algorithm for OFDM Systems

    Masahiro FUJII  Makoto ITAMI  Kohji ITOH  

     
    PAPER

      Vol:
    E89-A No:11
      Page(s):
    3123-3130

    Orthogonal Frequency Division Multiplexing (OFDM) systems are very sensitive to the frequency offset of the local oscillator at the receiver while the symbol timing offset can be absorbed in the guard interval. For the same reason, estimation of the frequency characteristics, needed for OFDM to be adapted to the frequency selective fading, can only be carried out conventionally after the frequency offset has been compensated. And accurate estimation of large frequency offset certainly requires high precision estimate of the frequency characteristics. In this paper, we propose a new joint estimation method of the frequency offset and the channel frequency response using an Expectation-Maximization (EM) algorithm for OFDM systems. The proposed algorithm overcomes the limitation of the thus far proposed algorithm. By computer simulations, we show the proposed algorithm provides estimation accuracy close to its lower bound in a wide range of the frequency offset.

  • Network Design Scheme for Virtual Private Network Services

    Tomonori TAKEDA  Ryuichi MATSUZAKI  Ichiro INOUE  Shigeo URUSHIDANI  

     
    PAPER-Network

      Vol:
    E89-B No:11
      Page(s):
    3046-3054

    This paper proposes a network design scheme for Virtual Private Network (VPN) services. Traditionally, network design to compute required amount of resource is based on static point-to-point resource demand. This scheme is effective for traditional private line services. However, since VPN services allow multi-site connectivity for customers, it may not be appropriate to design a network based on static point-to-point resource demand. In particular, this scheme is not effective when the traffic pattern changes over time. Therefore, network design for VPN services introduces a new challenge in order to comply with traffic flexibility. There are conventional studies tackling this issue. In those studies, by defining a resource demand model considering flexibility, and designing the network based on this model, amount of resource required can be computed. However, there are some deficiencies in those studies. This paper proposes a new network design scheme, consisting of two components. The first one is a new resource demand model, created by extending conventional resource demand models, that can specify resource demand more precisely. The second one is a new network design algorithm for this resource demand model. Simulations are conducted to evaluate the performance of the proposed network design scheme, and the results show significant performance improvement against conventional schemes. In addition, deployment considerations of the proposed scheme are analyzed.

  • Round-Robin with VirtualClock Scheduling Algorithm in Multiservice Packet Networks

    Lei WANG  Mike H. MACGREGOR  

     
    PAPER-Network

      Vol:
    E89-B No:11
      Page(s):
    3040-3045

    In this paper, we present a scheduler that incorporates round robin service within a VirtualClock discipline. Time-stamp based scheduling algorithms attain a low local delay bound and performance guarantee, but are computationally complex. On the other hand, round robin schemes are simple to implement and have computational complexity of O(1), but they are well known for their output burstiness and short-term unfairness. In order to overcome this problem, we combine round robin with VirtualClock in an algorithm we call VCRR. VCRR possesses better fairness than simple round robin, low jitter and a good scheduling delay bound. At the same time, VCRR preserves the O(1) time complexity of round robin. Simulation experiments show VCRR's efficiency in terms of delay performance, jitter and fairness.

  • On Step-by-Step Complete Decoding Triple-Error-Correcting Binary BCH Codes

    Shyue-Win WEI  

     
    LETTER-Coding Theory

      Vol:
    E89-A No:11
      Page(s):
    3360-3362

    According to the properties found in the algebraic complete decoding method for triple-error-correcting binary Bose-Chaudhuri-Hocquenghem (BCH) codes, a step-by-step complete decoding algorithm of this code is presented.

  • Time Complexity Analysis of the Legal Firing Sequence Problem of Petri Nets with Inhibitor Arcs

    Satoshi TAOKA  Toshimasa WATANABE  

     
    PAPER-Concurrent Systems

      Vol:
    E89-A No:11
      Page(s):
    3216-3226

    Petri nets with inhibitor arcs are referred to as inhibitor-arc Petri nets. It is shown that modeling capability of inhibitor-arc Petri nets is equivalent to that of Turing machines. The subject of this paper is the legal firing sequence problem (INLFS) for inhibitor-arc Petri nets: given an inhibitor-arc Petri net IN, an initial marking M0 and a firing count vector X, find a firing sequence δ such that its firing starts from M0 and each transition t appears in δ exactly X(t) times as prescribed by X. The paper is the first step of research for time complexity analysis and designing algorithms of INLFS, one of the most fundamental problems for inhibitor-arc Petri nets having more modeling capability than ordinary Peri nets. The recognition version of INLFS, denoted as RINLFS, means a decision problem, asking a "yes" or "no" answer on the existence of a solution δ to INLFS. The main results are the following (1) and (2). (1) Proving (1-1) and (1-2) when the underlying Petri net of IN is an unweighted state machine: (1-1) INLFS can be solved in pseudo-polynomial (O(|X|)) time for IN of non-adjacent type having only one special place called a rivet; (1-2) RINLFS is NP-hard for IN with at least three rivets; (2) Proving that RINLFS for IN whose underlying Petri net is unweighted and forward conflict-free is NP-hard. Heuristic algorithms for solving INLFS are going to be proposed in separate papers.

  • Automated Design of Analog Circuits Starting with Idealized Elements

    Naoyuki UNNO  Nobuo FUJII  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E89-A No:11
      Page(s):
    3313-3319

    This paper presents an automated design of analog circuits starting with idealized elements. Our system first synthesizes circuits using idealized elements by a genetic algorithm (GA). GA evolves circuit topologies and transconductances of idealized elements to achieve the given specifications. The use of idealized elements effectively reduces search space and make the synthesis efficient. Second, idealized elements in a generated circuit are replaced by MOSFETs. Through the two processes, a circuit satisfying the given specifications can be obtained. The capability of this method was demonstrated through experiments of synthesis of a trans-impedance amplifier and a cubing circuit and benchmark tests. The results of the benchmark tests show the proposed CAD is more than 10 times faster than the CAD which does not use idealized elements.

  • Error Analysis of the Multilevel Fast Multipole Algorithm

    Shinichiro OHNUKI  Weng Cho CHEW  

     
    PAPER-Electromagnetic Theory

      Vol:
    E89-C No:11
      Page(s):
    1676-1681

    The computational error of the multilevel fast multipole algorithm is studied. The error convergence rate, achievable minimum error, and error bound are investigated for various element distributions. We will discuss the boundary between the large and small buffer cases in terms of machine precision. The needed buffer size to reach double precision accuracy will be clarified.

  • Encoding LDPC Codes Using the Triangular Factorization

    Yuichi KAJI  

     
    PAPER-Coding Theory

      Vol:
    E89-A No:10
      Page(s):
    2510-2518

    An algorithm for encoding low-density parity check (LDPC) codes is investigated. The algorithm computes parity check symbols by solving a set of sparse equations, and the triangular factorization is employed to solve the equations efficiently. It is shown analytically and experimentally that the proposed algorithm is more efficient than the Richardson's encoding algorithm if the code has a small gap.

  • Geometric Properties of Quasi-Additive Learning Algorithms

    Kazushi IKEDA  

     
    PAPER-Control, Neural Networks and Learning

      Vol:
    E89-A No:10
      Page(s):
    2812-2817

    The family of Quasi-Additive (QA) algorithms is a natural generalization of the perceptron learning, which is a kind of on-line learning having two parameter vectors: One is an accumulation of input vectors and the other is a weight vector for prediction associated with the former by a nonlinear function. We show that the vectors have a dually-flat structure from the information-geometric point of view, and this representation makes it easier to discuss the convergence properties.

1081-1100hit(2355hit)