
Keyword Search Result

[Keyword] algorithm (2137 hits)

981-1000 hits (of 2137)

  • Fast 2-Dimensional 8×8 Integer Transform Algorithm Design for H.264/AVC Fidelity Range Extensions

    Chih-Peng FAN  

     
    LETTER-Image Processing and Video Processing

      Vol:
    E89-D No:12
      Page(s):
    3006-3011

    In this letter, efficient two-dimensional (2-D) fast algorithms for realizing the 8×8 forward and inverse integer transforms in the H.264/AVC fidelity range extensions (FRExt) are proposed. Based on matrix factorizations with Kronecker-product and direct-sum operations, efficient fast 2-D 8×8 forward and inverse integer transforms can be derived from the one-dimensional (1-D) fast 8×8 forward and inverse integer transforms through matrix operations. The proposed fast 2-D 8×8 forward and inverse integer transform designs do not require transpose memory in hardware realizations. The fast 2-D 8×8 integer transforms also have lower latency and a higher throughput rate than the row-column based method. With their regular modularity, the proposed fast algorithms are suitable for VLSI implementation of H.264/AVC FRExt high-profile signal processing.
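
    As a rough illustration of the separable-transform identity this letter builds on, the Python sketch below (using a placeholder 8×8 integer matrix, not the actual FRExt transform) checks that the row-column computation C·X·Cᵀ matches the Kronecker-product formulation (C⊗C)·vec(X).

```python
import numpy as np

# Placeholder 8x8 integer transform matrix (illustrative only,
# not the actual H.264/AVC FRExt 8x8 transform).
C = np.random.randint(-4, 5, size=(8, 8))

# Arbitrary 8x8 block of residual samples.
X = np.random.randint(-128, 128, size=(8, 8))

# Row-column (separable) evaluation of the 2-D transform.
Y_rowcol = C @ X @ C.T

# Kronecker-product evaluation: vec(C X C^T) = (C (x) C) vec(X)
# when vec() stacks rows (NumPy's default C-order flatten).
Y_kron = (np.kron(C, C) @ X.flatten()).reshape(8, 8)

assert np.array_equal(Y_rowcol, Y_kron)
print("Row-column and Kronecker-product results agree.")
```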

  • Formal Design of Arithmetic Circuits Based on Arithmetic Description Language

    Naofumi HOMMA  Yuki WATANABE  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER-Circuit Synthesis

      Vol:
    E89-A No:12
      Page(s):
    3500-3509

    This paper presents a formal design of arithmetic circuits using an arithmetic description language called ARITH. The key idea in ARITH is to describe arithmetic algorithms directly with high-level mathematical objects (i.e., number representation systems and arithmetic operations/formulae). Using ARITH, we can provide formal descriptions of arithmetic algorithms, including those using unconventional number systems. In addition, the described arithmetic algorithms can be formally verified by equivalence checking with formula manipulations. The verified ARITH descriptions are easily translated into equivalent HDL descriptions. In this paper, we also present an application of ARITH to an arithmetic module generator, which supports a variety of hardware algorithms for 2-operand adders, multi-operand adders, multipliers, constant-coefficient multipliers and multiply accumulators. The ARITH language processing system incorporated in the generator formally verifies the correctness of ARITH descriptions. As a result, we can obtain highly reliable arithmetic modules whose functions are completely verified at the algorithm level.

  • Power-Efficient LDPC Decoder Architecture Based on Accelerated Message-Passing Schedule

    Kazunori SHIMIZU  Tatsuyuki ISHIKAWA  Nozomu TOGAWA  Takeshi IKENAGA  Satoshi GOTO  

     
    PAPER-VLSI Architecture

      Vol:
    E89-A No:12
      Page(s):
    3602-3612

    In this paper, we propose a power-efficient LDPC decoder architecture based on an accelerated message-passing schedule. The proposed decoder architecture is characterized as follows: (i) the pipelined operation is partitioned so that intermediate messages are never read and written simultaneously, which allows the accelerated message-passing schedule to be implemented with single-port SRAMs; (ii) FIFO-based buffering reduces the number of SRAM banks and words required by the LDPC decoder based on the accelerated message-passing schedule. The proposed LDPC decoder keeps a single message for each non-zero bit in the parity check matrix, as in a classical schedule, while still achieving the accelerated message-passing schedule. Implementation results in 0.18 µm CMOS technology show that the proposed decoder architecture reduces the area of the LDPC decoder by 43% and its power dissipation by 29% compared to the conventional architecture based on the accelerated message-passing schedule.

  • Population Fitness Probability for Effectively Terminating Evolution Operations of a Genetic Algorithm

    Heng-Chou CHEN  Oscal T.-C. CHEN  

     
    LETTER-Biocybernetics, Neurocomputing

      Vol:
    E89-D No:12
      Page(s):
    3012-3014

    The probability associated with population fitness in a Genetic Algorithm (GA) is studied using the concept of average Euclidean distance. Based on the probability derived from population fitness, the GA can effectively terminate its evolution operations to mitigate the total computational load. Simulation results verify the feasibility of the derived probability used for the GA's termination strategy.
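
    As a hedged sketch of the general idea only (not the letter's specific probability derivation), the toy GA below terminates its evolution once the average Euclidean distance between population members drops below a threshold, i.e., once the population has effectively converged.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(x):
    # Toy objective: maximize the negative sphere function (optimum at 0).
    return -np.sum(x ** 2, axis=-1)

def avg_euclidean_distance(pop):
    # Mean pairwise Euclidean distance, used as a convergence measure.
    diff = pop[:, None, :] - pop[None, :, :]
    return float(np.mean(np.linalg.norm(diff, axis=-1)))

pop = rng.uniform(-5, 5, size=(40, 10))
for generation in range(1000):
    # Terminate the evolution once the population has effectively converged.
    if avg_euclidean_distance(pop) < 1e-2:
        break
    # Tournament selection of parents.
    idx = rng.integers(0, len(pop), size=(len(pop), 2))
    better = fitness(pop[idx[:, 0]]) > fitness(pop[idx[:, 1]])
    parents = np.where(better[:, None], pop[idx[:, 0]], pop[idx[:, 1]])
    # Uniform crossover with a shuffled copy, then decaying Gaussian mutation.
    mates = parents[rng.permutation(len(parents))]
    mask = rng.random(parents.shape) < 0.5
    pop = np.where(mask, parents, mates)
    pop = pop + rng.normal(0.0, 0.5 * 0.98 ** generation, size=pop.shape)

print(f"terminated after {generation} generations, "
      f"best fitness {fitness(pop).max():.4f}")
```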

  • An Efficient and Effective Algorithm for Online Task Placement with I/O Communications in Partially Reconfigurable FPGAs

    Mitsuru TOMONO  Masaki NAKANISHI  Shigeru YAMASHITA  Kazuo NAKAJIMA  Katsumasa WATANABE  

     
    PAPER-System Level Design

      Vol:
    E89-A No:12
      Page(s):
    3416-3426

    In a partially reconfigurable FPGA of the future, arbitrary portions of its logic resources and interconnection networks will be reconfigured without affecting the other parts. Multiple tasks will be mapped and executed concurrently in such an FPGA. Efficient execution of the tasks using the limited resources of the FPGA will necessitate effective resource management. A number of online FPGA placement methods have recently been proposed for such an FPGA. However, they cannot handle I/O communications of the tasks. Taking such I/O communications into consideration, we introduce a new approach to online FPGA placement. We present an algorithm for placing each arriving task in an empty area so as to complete all the tasks efficiently. We develop two fitting strategies to effectively handle I/O communications of the tasks. Our experimental results show that properly weighted combinations of these and two other previously proposed strategies enable this algorithm to run very fast and make an effective placement of the tasks. In fact, we show that the overhead associated with the use of this algorithm is negligible as compared to the total execution time of the tasks.

  • Evolutional Algorithm Based Learning of Time Varying Multipath Fading Channels for Software Defined Radio

    Gagik MKRTCHYAN  Katsuhiro NAITO  Kazuo MORI  Hideo KOBAYASHI  

     
    LETTER

      Vol:
    E89-B No:12
      Page(s):
    3269-3273

    Software defined radio, which uses reconfigurable signal processing devices, requires the determination of multiple unknown parameters to realize the potential capabilities of adaptive communication. Evolutional algorithms are powerful multi-dimensional search techniques and are well known to be effective for parameter determination. This letter proposes an evolutional algorithm for learning the parameters of mobile time-varying channels without any specific assumption about the scattering distribution. The proposed method is very simple to realize, yet provides precise channel estimation results. Simulations of OFDM communication under a time-varying fading channel show that the proposed learning method achieves improved BER performance.

  • Minimum Variance Multi-User Detection with Optimum Subband Decomposition over Multipath Channels

    Wan-Shing YANG  Wen-Hsien FANG  Che-Yu LIN  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E89-B No:11
      Page(s):
    3075-3082

    This paper presents a linear constrained minimum variance multiuser detection (MUD) scheme for DS-CDMA systems, which makes full use of the available spreading sequences of the users as well as the relevant channel information of the incoming rays in the construction of the constraint matrix. To further enhance the performance, a statistical optimum filter bank in combination with the developed minimum variance MUD with the partitioned linear interference canceller (PLIC) as the underlying structure is also addressed. The determination of the filter bank coefficients, however, calls for computationally demanding nonlinear programming. To alleviate the computational overhead, an iterative procedure is also proposed, which solves the Rayleigh quotient in each iteration. Furthermore, the expressions of the output signal to interference plus noise ratio (SINR) are also determined to provide further insights into the proposed approach. Conducted simulations validate the new scheme.

  • Joint Estimation of Frequency Offset and Channel Frequency Response Using EM Algorithm for OFDM Systems

    Masahiro FUJII  Makoto ITAMI  Kohji ITOH  

     
    PAPER

      Vol:
    E89-A No:11
      Page(s):
    3123-3130

    Orthogonal Frequency Division Multiplexing (OFDM) systems are very sensitive to the frequency offset of the local oscillator at the receiver, whereas the symbol timing offset can be absorbed in the guard interval. Consequently, estimation of the channel frequency characteristics, which is needed to adapt OFDM to frequency-selective fading, can conventionally be carried out only after the frequency offset has been compensated, while accurate estimation of a large frequency offset in turn requires a high-precision estimate of the frequency characteristics. In this paper, we propose a new method for jointly estimating the frequency offset and the channel frequency response using an Expectation-Maximization (EM) algorithm for OFDM systems. The proposed algorithm overcomes this limitation of previously proposed algorithms. Computer simulations show that the proposed algorithm provides estimation accuracy close to its lower bound over a wide range of frequency offsets.

  • Round-Robin with VirtualClock Scheduling Algorithm in Multiservice Packet Networks

    Lei WANG  Mike H. MACGREGOR  

     
    PAPER-Network

      Vol:
    E89-B No:11
      Page(s):
    3040-3045

    In this paper, we present a scheduler that incorporates round-robin service within a VirtualClock discipline. Time-stamp based scheduling algorithms attain a low local delay bound and strong performance guarantees, but are computationally complex. Round-robin schemes, on the other hand, are simple to implement and have a computational complexity of O(1), but they are well known for their output burstiness and short-term unfairness. To overcome these problems, we combine round robin with VirtualClock in an algorithm we call VCRR. VCRR provides better fairness than simple round robin, low jitter, and a good scheduling delay bound, while preserving the O(1) time complexity of round robin. Simulation experiments show VCRR's efficiency in terms of delay performance, jitter and fairness.
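
    For context, the classical VirtualClock discipline that VCRR builds on stamps each arriving packet with a virtual finish time and serves packets in timestamp order. A minimal sketch of that baseline follows (not the VCRR combination itself, whose details are in the paper); the class and parameter names are illustrative.

```python
import heapq

class VirtualClockScheduler:
    """Minimal VirtualClock: serve packets in virtual-finish-time order."""

    def __init__(self):
        self.vc = {}      # per-flow virtual clock
        self.queue = []   # (timestamp, seq, flow_id, packet_bits)
        self.seq = 0

    def enqueue(self, flow_id, packet_bits, reserved_rate_bps, now):
        # Advance the flow's virtual clock and stamp the packet.
        vc = max(self.vc.get(flow_id, 0.0), now) + packet_bits / reserved_rate_bps
        self.vc[flow_id] = vc
        heapq.heappush(self.queue, (vc, self.seq, flow_id, packet_bits))
        self.seq += 1

    def dequeue(self):
        # Transmit the packet with the smallest virtual finish time.
        if not self.queue:
            return None
        _, _, flow_id, packet_bits = heapq.heappop(self.queue)
        return flow_id, packet_bits

# Example: two flows with different reserved rates.
sched = VirtualClockScheduler()
sched.enqueue("A", 12000, 1_000_000, now=0.0)  # 12 kb at 1 Mb/s
sched.enqueue("B", 12000, 2_000_000, now=0.0)  # 12 kb at 2 Mb/s
print(sched.dequeue())  # flow B is served first (smaller timestamp)
```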

  • On Step-by-Step Complete Decoding Triple-Error-Correcting Binary BCH Codes

    Shyue-Win WEI  

     
    LETTER-Coding Theory

      Vol:
    E89-A No:11
      Page(s):
    3360-3362

    Based on properties found in the algebraic complete decoding method for triple-error-correcting binary Bose-Chaudhuri-Hocquenghem (BCH) codes, a step-by-step complete decoding algorithm for these codes is presented.

  • Signal Reconstruction with Boundary-Matching via Iterative Algorithm

    Chau-Yun HSU  Tsung-Ming LO  

     
    PAPER-Digital Signal Processing

      Vol:
    E89-A No:11
      Page(s):
    3283-3289

    In many signal transmission and processing applications, samples may be lost. The Papoulis-Gerchberg (PG) iterative algorithm is well known for recovering lost samples of band-limited signals, but it suffers from two problems: (1) a truly band-limited signal is difficult to obtain in practice, and (2) its convergence rate is slow. By inserting a polynomial subtraction into the PG algorithm based on a boundary-matching concept, a significant increase in reconstruction performance and convergence speed can be achieved. In this paper, we propose an efficient approach that restores lost samples by adding a preprocessing step which makes the data meet the periodic boundary conditions assumed by the Fast Fourier Transform used in the iteration. The accuracy of lost-sample reconstruction with the PG algorithm is greatly improved by mapping the input data sequence so that it satisfies these boundary conditions. We also develop a second approach that forces the signal to meet a new, more restrictive boundary condition in the Fourier domain, which allows the preprocessing parameters to be obtained easily. Simulations indicate that the mean square error (MSE) of the recovery is lower and the convergence is faster with the preprocessing than without it. Both proposed approaches can also be applied, with more convenience and flexibility, to other signal restoration problems that admit Cadzow's iterative processing method.
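
    As a hedged illustration of the baseline method being improved here (the plain PG iteration, not the authors' boundary-matching preprocess), the sketch below alternately enforces the band limit in the frequency domain and reinserts the known samples in the time domain.

```python
import numpy as np

def papoulis_gerchberg(samples, known_mask, band, iterations=200):
    """Plain PG iteration: recover missing samples of a band-limited signal.

    samples:    length-N array (values at unknown positions are ignored)
    known_mask: boolean array, True where the sample is known
    band:       number of low-frequency DFT bins (each side) kept as the band limit
    """
    x = np.where(known_mask, samples, 0.0)
    for _ in range(iterations):
        X = np.fft.fft(x)
        # Enforce the band limit: zero out all out-of-band frequency bins.
        X[band:len(X) - band] = 0.0
        x = np.fft.ifft(X).real
        # Re-impose the known samples in the time domain.
        x[known_mask] = samples[known_mask]
    return x

# Example: a band-limited test signal with roughly 20% of its samples lost.
rng = np.random.default_rng(1)
n = np.arange(256)
signal = np.cos(2 * np.pi * 3 * n / 256) + 0.5 * np.sin(2 * np.pi * 7 * n / 256)
mask = rng.random(256) > 0.2
recovered = papoulis_gerchberg(np.where(mask, signal, 0.0), mask, band=16)
print("MSE of lost samples:", np.mean((recovered[~mask] - signal[~mask]) ** 2))
```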

  • Time Complexity Analysis of the Legal Firing Sequence Problem of Petri Nets with Inhibitor Arcs

    Satoshi TAOKA  Toshimasa WATANABE  

     
    PAPER-Concurrent Systems

      Vol:
    E89-A No:11
      Page(s):
    3216-3226

    Petri nets with inhibitor arcs are referred to as inhibitor-arc Petri nets. It is known that the modeling capability of inhibitor-arc Petri nets is equivalent to that of Turing machines. The subject of this paper is the legal firing sequence problem (INLFS) for inhibitor-arc Petri nets: given an inhibitor-arc Petri net IN, an initial marking M0 and a firing count vector X, find a firing sequence δ such that its firing starts from M0 and each transition t appears in δ exactly X(t) times, as prescribed by X. This paper is a first step toward time complexity analysis and algorithm design for INLFS, one of the most fundamental problems for inhibitor-arc Petri nets, which have more modeling capability than ordinary Petri nets. The recognition version of INLFS, denoted RINLFS, is the decision problem asking for a "yes" or "no" answer on the existence of a solution δ to INLFS. The main results are the following (1) and (2). (1) When the underlying Petri net of IN is an unweighted state machine: (1-1) INLFS can be solved in pseudo-polynomial (O(|X|)) time for IN of non-adjacent type having only one special place called a rivet; (1-2) RINLFS is NP-hard for IN with at least three rivets. (2) RINLFS is NP-hard for IN whose underlying Petri net is unweighted and forward conflict-free. Heuristic algorithms for solving INLFS will be proposed in separate papers.
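
    To make the firing rule concrete, the hedged sketch below (using an ad-hoc dictionary representation of the net, not the paper's formalism or search algorithm) checks whether a given firing sequence is legal in an inhibitor-arc Petri net: a transition is enabled only if every ordinary input place holds enough tokens and every inhibitor place is empty.

```python
def is_legal_firing_sequence(m0, transitions, sequence):
    """Check legality of a firing sequence in an inhibitor-arc Petri net.

    m0:          dict place -> initial token count
    transitions: dict t -> {"in": {place: weight}, "out": {place: weight},
                            "inhibit": set_of_places}
    sequence:    list of transition names to fire in order
    """
    marking = dict(m0)
    for t in sequence:
        spec = transitions[t]
        # Enabled iff ordinary inputs have enough tokens...
        if any(marking.get(p, 0) < w for p, w in spec["in"].items()):
            return False
        # ...and every place connected by an inhibitor arc is empty.
        if any(marking.get(p, 0) > 0 for p in spec["inhibit"]):
            return False
        for p, w in spec["in"].items():
            marking[p] -= w
        for p, w in spec["out"].items():
            marking[p] = marking.get(p, 0) + w
    return True

# Tiny example: t2 is blocked by an inhibitor arc from p1 until t1 empties p1.
net = {
    "t1": {"in": {"p1": 1}, "out": {"p2": 1}, "inhibit": set()},
    "t2": {"in": {"p2": 1}, "out": {"p3": 1}, "inhibit": {"p1"}},
}
print(is_legal_firing_sequence({"p1": 1}, net, ["t2", "t1"]))  # False
print(is_legal_firing_sequence({"p1": 1}, net, ["t1", "t2"]))  # True
```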

  • Automated Design of Analog Circuits Starting with Idealized Elements

    Naoyuki UNNO  Nobuo FUJII  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E89-A No:11
      Page(s):
    3313-3319

    This paper presents an automated design method for analog circuits that starts with idealized elements. Our system first synthesizes circuits composed of idealized elements using a genetic algorithm (GA). The GA evolves circuit topologies and the transconductances of the idealized elements to achieve the given specifications. The use of idealized elements effectively reduces the search space and makes the synthesis efficient. Second, the idealized elements in a generated circuit are replaced by MOSFETs. Through these two processes, a circuit satisfying the given specifications can be obtained. The capability of this method was demonstrated through the synthesis of a trans-impedance amplifier and a cubing circuit, as well as through benchmark tests. The benchmark results show the proposed CAD is more than 10 times faster than a CAD that does not use idealized elements.

  • Systematic Interpretation of Redundant Arithmetic Adders in Binary and Multiple-Valued Logic

    Naofumi HOMMA  Takafumi AOKI  Tatsuo HIGUCHI  

     
    PAPER

      Vol:
    E89-C No:11
      Page(s):
    1645-1654

    This paper presents an algorithm-level interpretation of fast adder structures in binary/multiple-valued logic. The key idea is to employ a unified representation of addition algorithms called Counter Tree Diagrams (CTDs). The use of CTDs makes it possible to describe and analyze addition algorithms at various levels of abstraction. A high-level CTD represents a network of coarse-grained components associated with multiple-valued logic devices, while a low-level CTD represents a network of primitive components directly mapped onto binary logic devices. The level of abstraction in circuit representation can be changed by decomposition of CTDs. We can derive possible variations of adder structures by decomposing a high-level CTD into low-level CTDs. This paper demonstrates the interpretation of redundant arithmetic adders based on CTDs. We first introduce an extension of CTDs to represent possible redundant arithmetic adders with limited carry propagation. Using the extended version of CTDs, we can classify the conventional adder structures including those using emerging devices into three types in a systematic way.

  • Network Design Scheme for Virtual Private Network Services

    Tomonori TAKEDA  Ryuichi MATSUZAKI  Ichiro INOUE  Shigeo URUSHIDANI  

     
    PAPER-Network

      Vol:
    E89-B No:11
      Page(s):
    3046-3054

    This paper proposes a network design scheme for Virtual Private Network (VPN) services. Traditionally, the amount of resources a network requires is computed from static point-to-point resource demand. This approach is effective for traditional private line services. However, since VPN services allow multi-site connectivity for customers, it may not be appropriate to design a network based on static point-to-point resource demand, particularly when the traffic pattern changes over time. Network design for VPN services therefore introduces a new challenge: coping with traffic flexibility. Conventional studies tackle this issue by defining a resource demand model that accounts for flexibility and designing the network based on that model to compute the amount of resources required. However, these studies have some deficiencies. This paper proposes a new network design scheme consisting of two components. The first is a new resource demand model, created by extending conventional resource demand models, that can specify resource demand more precisely. The second is a new network design algorithm for this resource demand model. Simulations are conducted to evaluate the performance of the proposed network design scheme, and the results show significant performance improvement over conventional schemes. In addition, deployment considerations for the proposed scheme are analyzed.

  • Error Analysis of the Multilevel Fast Multipole Algorithm

    Shinichiro OHNUKI  Weng Cho CHEW  

     
    PAPER-Electromagnetic Theory

      Vol:
    E89-C No:11
      Page(s):
    1676-1681

    The computational error of the multilevel fast multipole algorithm is studied. The error convergence rate, achievable minimum error, and error bound are investigated for various element distributions. We will discuss the boundary between the large and small buffer cases in terms of machine precision. The needed buffer size to reach double precision accuracy will be clarified.

  • Geometric Properties of Quasi-Additive Learning Algorithms

    Kazushi IKEDA  

     
    PAPER-Control, Neural Networks and Learning

      Vol:
    E89-A No:10
      Page(s):
    2812-2817

    The family of Quasi-Additive (QA) algorithms is a natural generalization of perceptron learning. These are on-line learning algorithms with two parameter vectors: one is an accumulation of input vectors, and the other is a weight vector for prediction, related to the former by a nonlinear function. We show that the vectors have a dually flat structure from the information-geometric point of view, and that this representation makes it easier to discuss convergence properties.
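
    As a hedged sketch of the quasi-additive structure described above (using standard identity and exponential link functions as examples; the paper's information-geometric analysis is not reproduced), the update below accumulates mistaken inputs in one vector and predicts with a componentwise nonlinear transform of it.

```python
import numpy as np

def quasi_additive_train(X, y, link, epochs=20):
    """Mistake-driven quasi-additive learning.

    Keeps an accumulation vector `a` of signed input vectors and predicts
    with the weight vector w = link(a), applied componentwise.
    link = identity gives the perceptron; an exponential link gives a
    Winnow-type multiplicative update.
    """
    a = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, label in zip(X, y):
            w = link(a)                    # weight vector used for prediction
            if label * (w @ x) <= 0:       # mistake (or zero margin)
                a += label * x             # accumulate the input vector
    return link(a)

# Toy linearly separable data (bias folded in as a constant third feature).
rng = np.random.default_rng(0)
X = np.hstack([rng.uniform(-1, 1, size=(100, 2)), np.ones((100, 1))])
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1)

w_perc = quasi_additive_train(X, y, link=lambda a: a)          # perceptron
w_winn = quasi_additive_train(X, y, link=lambda a: np.exp(a))  # Winnow-type
print("perceptron training errors:", int(np.sum(np.sign(X @ w_perc) != y)))
print("winnow-type training errors:", int(np.sum(np.sign(X @ w_winn) != y)))
```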

  • Performance of Scheduling Algorithms under Mobility for Multimedia Services in OFDM Systems

    Haiying Julie ZHU  Roshdy H.M. HAFEZ  

     
    PAPER

      Vol:
    E89-B No:10
      Page(s):
    2670-2677

    Scheduling algorithms play a key role in the overall system performance of broadband wireless systems (BWS). Maximal SNR (MaxSNR) and Round Robin (RR) are two conventional scheduling strategies that emphasize efficiency and fairness, respectively. The Proportional Fair (PF) algorithm provides a tradeoff between efficiency and fairness. In this paper, we apply PF to IEEE 802.16a OFDM-based BWS and name it OPF. We also propose a new algorithm for multimedia services: Normalized Multimedia Adaptive OPF (NMAOPF). An adaptive modulation and coding scheme is applied in a time-varying, frequency-selective fading wireless channel. System performance is compared in terms of efficiency and fairness, with and without user mobility. Efficiency is measured by throughput, mean packet delay and packet drop ratio; fairness by user satisfaction rate and average user rate. Joint PHY and MAC layer simulation results show that, within the traffic range of 55 to 70 Mbps, the performance of OPF lies between that of RR and MaxSNR. The proposed NMAOPF outperforms all the others without user mobility; under mobility it is not as good as MaxSNR, but better than OPF and RR.
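
    For reference, the proportional fair rule that OPF and NMAOPF build on picks, in each slot, the user whose instantaneous achievable rate is largest relative to its own throughput history. A minimal single-channel sketch is given below (parameter names are illustrative; the paper's OFDM subchannel handling and NMAOPF normalization are not modeled).

```python
import numpy as np

def proportional_fair_schedule(inst_rates, n_slots=1000, tc=100):
    """Proportional fair scheduling over one shared channel.

    inst_rates: function (slot) -> array of instantaneous achievable
                rates, one per user, for that slot
    tc:         averaging window of the exponential throughput filter
    """
    n_users = len(inst_rates(0))
    avg = np.full(n_users, 1e-6)          # smoothed per-user throughput
    served = np.zeros(n_users, dtype=int)
    for t in range(n_slots):
        r = inst_rates(t)
        user = int(np.argmax(r / avg))    # PF metric: rate / average throughput
        served[user] += 1
        # Exponentially weighted moving average of delivered rates.
        delivered = np.zeros(n_users)
        delivered[user] = r[user]
        avg = (1 - 1 / tc) * avg + (1 / tc) * delivered
    return served

# Two users: user 0 has on average twice the channel quality of user 1.
rng = np.random.default_rng(0)
rates = lambda t: rng.exponential([2.0, 1.0])
print(proportional_fair_schedule(rates))  # both users are served regularly
```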

  • Encoding LDPC Codes Using the Triangular Factorization

    Yuichi KAJI  

     
    PAPER-Coding Theory

      Vol:
    E89-A No:10
      Page(s):
    2510-2518

    An algorithm for encoding low-density parity check (LDPC) codes is investigated. The algorithm computes parity check symbols by solving a set of sparse equations, and triangular factorization is employed to solve the equations efficiently. It is shown analytically and experimentally that the proposed algorithm is more efficient than Richardson's encoding algorithm if the code has a small gap.
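
    As a hedged sketch of the core idea only (solving sparse parity equations by substitution over GF(2); the paper's actual factorization and gap handling are not reproduced), suppose the parity check matrix is arranged as H = [A | T] with T lower triangular. The parity bits then follow by forward substitution.

```python
import numpy as np

def encode_with_triangular_parity(A, T, s):
    """Compute parity bits p solving A s + T p = 0 over GF(2),
    where T is lower triangular with ones on the diagonal."""
    m = T.shape[0]
    b = (A @ s) % 2                # right-hand side of T p = A s (mod 2)
    p = np.zeros(m, dtype=int)
    for i in range(m):             # forward substitution over GF(2)
        p[i] = (b[i] + T[i, :i] @ p[:i]) % 2
    return p

# Toy example: H = [A | T], codeword c = [s | p] satisfies H c = 0 (mod 2).
A = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])
T = np.array([[1, 0, 0],
              [1, 1, 0],
              [0, 1, 1]])
s = np.array([1, 0, 1])
p = encode_with_triangular_parity(A, T, s)
H = np.hstack([A, T])
c = np.concatenate([s, p])
print(p, (H @ c) % 2)              # parity bits and an all-zero syndrome
```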

  • Efficient DSP Architecture for Viterbi Decoding with Small Trace Back Latency

    Weon Heum PARK  Myung Hoon SUNWOO  Seong Keun OH  

     
    PAPER-Fundamental Theories for Communications

      Vol:
    E89-B No:10
      Page(s):
    2813-2818

    This paper proposes efficient DSP instructions and their hardware architecture for the Viterbi algorithm. Implementing the Viterbi algorithm on a DSP chip has been attracting interest because of its flexibility and programmability. The proposed architecture can reduce the Trace Back (TB) latency and can support various wireless communication standards. The proposed instructions perform the Add Compare Select (ACS) and TB operations in parallel, and the architecture has special hardware, called the Offset Calculation Unit (OCU), which automatically calculates data addresses to accelerate the trellis butterfly computations. When the constraint length K is 5, the proposed architecture reduces the decoding cycles by about 17% compared with the Carmel DSP and by about 45% compared with the TMS320C55x.
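
    To make the ACS and trace-back steps concrete, here is a hedged, software-only sketch of hard-decision Viterbi decoding for a small rate-1/2, constraint-length-3 convolutional code; the paper's DSP architecture (K = 5, OCU-based address generation, parallel ACS/TB execution) is not modeled.

```python
# Rate-1/2, constraint-length-3 convolutional code (generators 7 and 5,
# octal), chosen only to keep the example small.
G = [(1, 1, 1), (1, 0, 1)]
N_STATES = 4  # state = two most recent input bits

def conv_encode(bits):
    state, out = 0, []
    for b in bits:
        reg = (b, (state >> 1) & 1, state & 1)
        out += [sum(x * g for x, g in zip(reg, gen)) % 2 for gen in G]
        state = ((b << 1) | (state >> 1)) & 3
    return out

def viterbi_decode(received, n_bits):
    INF = 10 ** 9
    metric = [0] + [INF] * (N_STATES - 1)    # start in the all-zero state
    history = []
    for t in range(n_bits):
        r = received[2 * t: 2 * t + 2]
        new_metric = [INF] * N_STATES
        prev = [(0, 0)] * N_STATES           # (previous state, input bit)
        for s in range(N_STATES):
            if metric[s] == INF:
                continue
            for b in (0, 1):
                reg = (b, (s >> 1) & 1, s & 1)
                expected = [sum(x * g for x, g in zip(reg, gen)) % 2 for gen in G]
                branch = sum(e != o for e, o in zip(expected, r))
                ns = ((b << 1) | (s >> 1)) & 3
                # Add-Compare-Select: keep the best path into each next state.
                if metric[s] + branch < new_metric[ns]:
                    new_metric[ns] = metric[s] + branch
                    prev[ns] = (s, b)
        metric = new_metric
        history.append(prev)
    # Trace back from the best final state to recover the input bits.
    s = metric.index(min(metric))
    decoded = []
    for step in reversed(history):
        s, b = step[s]
        decoded.append(b)
    return decoded[::-1]

bits = [1, 0, 1, 1, 0]
print(viterbi_decode(conv_encode(bits), len(bits)) == bits)  # True
```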
