
Keyword Search Results

[Keyword] PAR (2741 hits)

481-500 of 2741 hits

  • Sparse Representation for Color Image Super-Resolution with Image Quality Difference Evaluation

    Zi-wen WANG  Guo-rui FENG  Ling-yan FAN  Jin-wei WANG  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2016/10/19
      Vol:
    E100-D No:1
      Page(s):
    150-159

    Sparse representation models have been widely applied to image super-resolution: the task is posed as an optimization problem that can be solved by the iterative shrinkage algorithm. During iteration, the dictionaries and similar patches must be updated to obtain the prior knowledge needed to solve an ill-conditioned problem such as image super-resolution. However, both the iteration and the updates often take a long time, which becomes a bottleneck in practice. To address this, we introduce the concept of image quality difference, based on a generalized Gaussian distribution feature that follows the same trend as the Peak Signal-to-Noise Ratio (PSNR), and we terminate the updates of dictionaries or similar patches according to an adaptive threshold on the image quality difference. On this basis, we present two sparse representation algorithms for image super-resolution: one further improves image quality, while the other reduces running time without sacrificing image quality. Experimental results on several test datasets show that our quantitative results are in line with expectations.
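
    For orientation, the termination idea can be pictured with a plain iterative shrinkage (ISTA) loop that stops once the change in a quality measure falls below a threshold; a minimal sketch in which a PSNR-like proxy and a fixed tolerance stand in for the paper's generalized-Gaussian feature and adaptive threshold:

      import numpy as np

      def soft_threshold(x, t):
          # Element-wise soft-thresholding operator used by iterative shrinkage.
          return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

      def ista_with_quality_stop(D, y, lam=0.1, max_iter=200, tol=1e-3):
          """ISTA for min ||y - D a||^2 + lam * ||a||_1, stopped early when
          the change in a quality proxy drops below `tol`."""
          step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1/L, L the Lipschitz constant
          a = np.zeros(D.shape[1])
          prev_quality = -np.inf
          for _ in range(max_iter):
              a = soft_threshold(a - step * D.T @ (D @ a - y), lam * step)
              mse = np.mean((y - D @ a) ** 2)
              quality = -10.0 * np.log10(mse + 1e-12)  # PSNR-like proxy
              if abs(quality - prev_quality) < tol:
                  break     # quality difference below threshold: stop updating
              prev_quality = quality
          return a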

  • Optimal Construction of Frequency-Hopping Sequence Sets with Low-Hit-Zone under Periodic Partial Hamming Correlation

    Changyuan WANG  Daiyuan PENG  Xianhua NIU  Hongyu HAN  

     
    LETTER-Cryptography and Information Security

      Vol:
    E100-A No:1
      Page(s):
    304-307

    In this paper, a new class of low-hit-zone frequency-hopping sequence sets (LHZ FHS sets) is constructed based on the Cartesian product, and the periodic partial Hamming correlation within the LHZ is studied. We show that the new LHZ FHS sets are optimal with respect to the periodic partial Hamming correlation bounds for FHS sets, and that some known FHS sets are special cases of the new construction.
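
    For reference, the periodic partial Hamming correlation of two frequency-hopping sequences counts frequency hits over a correlation window rather than over the whole period; a minimal sketch of this definition (the window convention is an assumption, and the letter's exact notation may differ):

      def partial_hamming_correlation(u, v, shift, start, window):
          """Periodic partial Hamming correlation of sequences u and v of
          equal period: the number of positions k in [start, start + window)
          where u and the shift of v (both read cyclically) coincide."""
          n = len(u)
          assert len(v) == n
          return sum(1 for k in range(start, start + window)
                     if u[k % n] == v[(k + shift) % n])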

  • Light Space Partitioned Shadow Maps

    Bin TANG  Jianxin LUO  Guiqiang NI  Weiwei DUAN  Yi GAO  

     
    LETTER-Computer Graphics

      Publicized:
    2016/10/04
      Vol:
    E100-D No:1
      Page(s):
    234-237

    This letter proposes a Light Space Partitioned Shadow Maps (LSPSM) algorithm that renders shadows using a novel partitioning scheme in light space. Instead of splitting the view frustum as traditional Z-partitioning methods do, we derive the partitions from the projection of the refined view frustum in light space. The partitioning is performed in both directions while limiting wasted space, and the number of partitions adapts dynamically to the light and view directions. Experiments demonstrate that our algorithm renders high-quality shadows with high efficiency.

  • Initial Value Problem Formulation TDBEM with 4-D Domain Decomposition Method and Application to Wake Fields Analysis

    Hideki KAWAGUCHI  Thomas WEILAND  

     
    PAPER

      Vol:
    E100-C No:1
      Page(s):
    37-44

    The Time Domain Boundary Element Method (TDBEM) has advantages in the analysis of transient electromagnetic fields (wake fields) induced by a charged particle beam with a curved trajectory in a particle accelerator. On the other hand, the TDBEM requires far more memory and computation time than the Finite Difference Time Domain (FDTD) method or the Finite Integration Technique (FIT). This paper presents a comparison of the FDTD method with a 4-D domain decomposition TDBEM based on an initial value problem formulation for a curved-trajectory electron beam, and its application to a full-model simulation of the bunch compressor section of high-energy particle accelerators.

  • Insufficient Vectorization: A New Method to Exploit Superword Level Parallelism

    Wei GAO  Lin HAN  Rongcai ZHAO  Yingying LI  Jian LIU  

     
    PAPER-Software System

      Publicized:
    2016/09/29
      Vol:
    E100-D No:1
      Page(s):
    91-106

    Single-instruction multiple-data (SIMD) extensions provide an energy-efficient platform for scaling the performance of media and scientific applications while retaining post-programmability. The major challenge, however, is to translate the parallel resources of the SIMD hardware into real application performance. Currently, compilers use all the slots of the vector register when exploiting the SIMD parallelism of a program; we call this sufficient vectorization, meaning that all the data in the vector register is valid. Because every slot the vector register provides must be used, sufficient vectorization abandons the chance to vectorize programs with low SIMD parallelism. In addition, the speedup obtained by full use of the vector register is sometimes smaller than that obtained by partial use; in particular, as the vector registers provided by SIMD extensions grow longer, sufficient vectorization cannot fully exploit the SIMD parallelism of programs. We therefore propose insufficient vectorization, which uses the vector register only partially. First, the scenarios in which insufficient vectorization applies are analyzed. Second, methods for computing the inter-iteration and intra-iteration SIMD parallelism of loops are put forward. Furthermore, based on the relationship between this parallelism and the vector factor, a method is established for choosing between the two vectorization approaches (a toy version of this choice is sketched below), so that programs are vectorized as effectively as possible. Finally, a code generation strategy for insufficient vectorization is presented. Benchmark results show that insufficient vectorization vectorizes 107.5% more programs than sufficient vectorization and achieves 12.1% higher performance.
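
    As a rough illustration of that choice (not the paper's cost model; the legality rule below is a textbook simplification): a loop whose cross-iteration dependence has distance d can legally run at most d iterations in lockstep, so the usable vector factor is bounded by both d and the hardware vector length:

      def choose_vector_factor(hw_vector_length, dependence_distance):
          """Pick a vector factor for a loop. A dependence of distance d
          allows at most d consecutive iterations in lockstep; the factor
          is the largest power of two within that bound."""
          usable = min(hw_vector_length, dependence_distance)
          vf = 1
          while vf * 2 <= usable:
              vf *= 2
          return vf   # == hw length: sufficient; in between: insufficient; 1: scalar

      # Example: 8-lane hardware, dependence distance 3 -> vector factor 2.
      assert choose_vector_factor(8, 3) == 2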

  • Multi-Track Joint Decoding Schemes Using Two-Dimensional Run-Length Limited Codes for Bit-Patterned Media Magnetic Recording

    Hidetoshi SAITO  

     
    PAPER-Signal Processing for Storage

      Vol:
    E99-A No:12
      Page(s):
    2248-2255

    This paper proposes an effective signal processing scheme using a modulation code with two-dimensional (2D) run-length limited (RLL) constraints for bit-patterned media magnetic recording (BPMR). The scheme serves as a two-dimensional magnetic recording (TDMR) scheme for shingled magnetic recording on bit-patterned media (BPM); TDMR has been identified as a key technology for pushing areal density toward 10 Tb/in². From the viewpoint of 2D signal processing for TDMR, a multi-track joint decoding scheme is desirable because it increases the effective transfer rate: readback signals are obtained from several adjacent parallel tracks and the data recorded on those tracks are detected simultaneously. Concretely, the proposed scheme picks up mixed readback signal sequences from the parallel tracks with a single read head and equalizes them to the frequency response of a desired 2D generalized partial response system. In the decoding process, a single maximum likelihood (ML) sequence detector decodes the data recorded on the parallel tracks at each time slot, which raises the effective transfer rate. Furthermore, a new joint pattern-dependent noise-predictive (PDNP) sequence detection scheme is investigated for multi-track recording with media noise. The joint PDNP detector is embedded in the ML detector and helps eliminate media noise. Computer simulations show that the joint PDNP detection scheme can compensate for the correlated, data-dependent media noise in the equalizer output.

  • GPU-Accelerated Bulk Execution of Multiple-Length Multiplication with Warp-Synchronous Programming Technique

    Takumi HONDA  Yasuaki ITO  Koji NAKANO  

     
    PAPER-GPU computing

      Publicized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    3004-3012

    In this paper, we present a GPU implementation of bulk multiple-length multiplication. The idea is to adopt a warp-synchronous programming technique: we assign each multiple-length multiplication to one warp of 32 threads. In parallel processing with multiple threads, it is usually costly to synchronize thread execution and to communicate between threads. With the warp-synchronous programming technique, however, the threads in a warp can be synchronized instruction by instruction without any barrier synchronization, and inter-thread communication can be performed by warp shuffle functions without accessing shared memory. The experimental results show that our GPU implementation on an NVIDIA GeForce GTX 980 attains a speed-up factor of 52 over a sequential CPU implementation for 1024-bit multiple-length multiplication. Moreover, using this 1024-bit multiplication as a subroutine for larger bit lengths, the GPU implementation attains a speed-up factor of 21 for 65536-bit multiple-length multiplication.
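
    For orientation, the underlying operation splits each operand into 32-bit limbs and accumulates limb products; a minimal sequential sketch (the warp-synchronous kernel distributes this work across the 32 threads of a warp, which this illustration does not attempt to reproduce):

      def multiple_length_multiply(a_limbs, b_limbs, base_bits=32):
          """Schoolbook product of two multiple-length integers given as
          little-endian lists of base-2^32 limbs; returns the product limbs."""
          mask = (1 << base_bits) - 1
          out = [0] * (len(a_limbs) + len(b_limbs))
          for i, a in enumerate(a_limbs):
              carry = 0
              for j, b in enumerate(b_limbs):
                  t = out[i + j] + a * b + carry
                  out[i + j] = t & mask
                  carry = t >> base_bits
              out[i + len(b_limbs)] += carry
          return out

      # 1024-bit operands are 32 limbs of 32 bits each.
      x = [0xFFFFFFFF] * 32                       # 2^1024 - 1
      p = multiple_length_multiply(x, x)
      assert sum(v << (32 * k) for k, v in enumerate(p)) == (2**1024 - 1) ** 2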

  • Comparing Performance of Hierarchical Identity-Based Signature Schemes

    Peixin CHEN  Yilun WU  Jinshu SU  Xiaofeng WANG  

     
    LETTER-Information Network

      Publicized:
    2016/09/01
      Vol:
    E99-D No:12
      Page(s):
    3181-3184

    The key escrow problem and high computational cost are the two major problems hindering wider adoption of hierarchical identity-based signature (HIBS) schemes. HIBS schemes with either an escrow-free (EF) or an online/offline (OO) model were proved secure in our previous work, but few EF or OO schemes have been evaluated experimentally. In this letter, several EF/OO HIBS schemes are considered. We study the algorithmic complexity of the schemes both theoretically and experimentally, and discuss scheme performance and the practicability of the EF and OO models.

  • Adaptive Sidelobe Cancellation Technique for Atmospheric Radars Containing Arrays with Nonuniform Gain

    Taishi HASHIMOTO  Koji NISHIMURA  Toru SATO  

     
    PAPER-Antennas and Propagation

      Publicized:
    2016/06/21
      Vol:
    E99-B No:12
      Page(s):
    2583-2591

    We present the design and performance evaluation of a partially adaptive array that suppresses clutter from low elevation angles in atmospheric radar observations. The norm-constrained and directionally constrained minimization of power (NC-DCMP) algorithm has been widely used to suppress clutter in atmospheric radars because it can limit the signal-to-noise ratio (SNR) loss to a designated amount, which is the most important design factor for atmospheric radars. To suppress clutter from low elevation angles, adding supplemental antennas with high response in the incoming directions of the clutter is considered more efficient than uniformly dividing the high-gain main array. However, the proper handling of the gain difference between the main array and the sub-arrays has not been well studied. Numerical simulations show that, with proper gain weighting, the sub-array configuration has better clutter suppression capability per unit SNR loss than uniformly divided arrays of the same size. The method is also applied to an actual observation dataset from the MU radar at Shigaraki, Japan: the properly gain-weighted NC-DCMP algorithm suppresses the ground clutter sufficiently, with an average SNR loss about 1 dB smaller than that of the uniform-gain configuration.
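
    As background, the DCMP core of such a beamformer minimizes output power subject to unity response in the look direction, w = R⁻¹c / (cᴴR⁻¹c); the norm constraint that bounds SNR loss is often approximated in practice by diagonal loading, which is what this minimal numpy sketch does (the loading level and the array model are illustrative assumptions, not the paper's design):

      import numpy as np

      def dcmp_weights(R, c, loading=0.01):
          """Minimize w^H R w subject to c^H w = 1 (DCMP). Diagonal
          loading bounds the weight norm, mimicking the norm constraint."""
          n = R.shape[0]
          Rl = R + loading * (np.trace(R).real / n) * np.eye(n)
          ric = np.linalg.solve(Rl, c)
          return ric / (c.conj() @ ric)

      # Example: 8-element array, broadside look direction, sample covariance.
      m, snaps = 8, 256
      rng = np.random.default_rng(0)
      X = rng.standard_normal((m, snaps)) + 1j * rng.standard_normal((m, snaps))
      R = X @ X.conj().T / snaps
      c = np.ones(m, dtype=complex)
      w = dcmp_weights(R, c)
      assert np.isclose(c.conj() @ w, 1.0)   # unit gain toward look direction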

  • Blind Identification of Multichannel Systems Based on Sparse Bayesian Learning

    Kai ZHANG  Hongyi YU  Yunpeng HU  Zhixiang SHEN  Siyu TAO  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2016/06/28
      Vol:
    E99-B No:12
      Page(s):
    2614-2622

    Reliable wireless communication often requires accurate knowledge of the underlying multipath channels. Numerous measurement campaigns have shown that physical multipath channels tend to exhibit a sparse structure. Conventional blind channel identification (BCI) strategies such as least squares, which are known to be optimal under the assumption of rich multipath, are ill-suited to exploiting the inherent sparse nature of multipath channels. Recently, l1-norm regularized least-squares-type approaches have been proposed to address this problem, with a single parameter governing all coefficients, which is equivalent to maximum a posteriori estimation with a Laplacian prior on the channel coefficients. Since the Laplacian prior is not conjugate to the Gaussian likelihood, no closed-form Bayesian inference is possible. Following a different approach, this paper deals with blind channel identification of a single-input multiple-output (SIMO) system based on sparse Bayesian learning (SBL). The inherent sparse nature of wireless multipath channels is exploited by incorporating a transformative cross relation formulation into a general Bayesian framework in which the filter coefficients are governed by independent scalar parameters. A fast iterative Bayesian inference method is then applied to the proposed model to obtain sparse solutions, completely eliminating the computationally costly parameter fine-tuning required by the l1-norm regularization method. Simulation results demonstrate that the proposed channel estimation algorithm outperforms both the conventional least squares (LS) scheme and the l1-norm regularization method.
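
    For context, a basic SBL iteration for y = Φw + n places an independent Gaussian prior w_i ~ N(0, γ_i) on each coefficient and re-estimates each γ_i from the posterior moments, driving most of them toward zero; a minimal sketch with a fixed noise variance and plain EM updates (generic SBL, not the paper's cross-relation formulation or its fast inference method):

      import numpy as np

      def sbl_estimate(Phi, y, noise_var=1e-2, n_iter=50):
          """Sparse Bayesian learning for y = Phi @ w + n. EM alternates
          between the posterior of w and updates of the prior variances."""
          n, m = Phi.shape
          gamma = np.ones(m)
          for _ in range(n_iter):
              G = np.diag(gamma)
              # Woodbury form avoids inverting gamma, which shrinks to zero.
              K = noise_var * np.eye(n) + Phi @ G @ Phi.T
              PG = Phi @ G
              Sigma = G - PG.T @ np.linalg.solve(K, PG)   # posterior covariance
              mu = PG.T @ np.linalg.solve(K, y)           # posterior mean
              gamma = mu ** 2 + np.diag(Sigma)            # EM hyperparameter update
          return mu

      # Toy demo: recover a 3-sparse vector from 40 noisy measurements.
      rng = np.random.default_rng(1)
      Phi = rng.standard_normal((40, 80))
      w = np.zeros(80); w[[3, 17, 60]] = [1.0, -2.0, 1.5]
      w_hat = sbl_estimate(Phi, y=Phi @ w + 0.05 * rng.standard_normal(40))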

  • Fully Parallelized LZW Decompression for CUDA-Enabled GPUs

    Shunji FUNASAKA  Koji NAKANO  Yasuaki ITO  

     
    PAPER-GPU computing

      Publicized:
    2016/08/25
      Vol:
    E99-D No:12
      Page(s):
    2986-2994

    The main contribution of this paper is to present a work-optimal parallel algorithm for LZW decompression and to implement it on a CUDA-enabled GPU. Since sequential LZW decompression creates a dictionary table by reading the codes in a compressed file one by one, it is not easy to parallelize. We first present a work-optimal parallel LZW decompression algorithm on the CREW-PRAM (Concurrent-Read Exclusive-Write Parallel Random Access Machine), a standard theoretical parallel computing model with a shared memory. We then present an efficient implementation of this parallel algorithm on a GPU. The experimental results show that our GPU implementation performs LZW decompression in 1.15 milliseconds for a grayscale TIFF image with 4096×3072 pixels stored in the global memory of a GeForce GTX 980, while sequential LZW decompression for the same image stored in the main memory of an Intel Core i7 CPU takes 50.1 milliseconds. Thus, our parallel LZW decompression on the global memory of the GPU is 43.6 times faster than sequential LZW decompression on the main memory of the CPU for this image. To show the applicability of our GPU implementation of LZW decompression, we evaluated the SSD-GPU data loading time for three scenarios. The experimental results show that the scenario using our LZW decompression on the GPU is faster than the others.
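
    To see why parallelization is non-trivial, here is the sequential baseline: the dictionary is rebuilt on the fly, so every code depends on all earlier ones. A minimal sketch of that baseline (the paper's parallel algorithm, which breaks this serial chain on a PRAM, is not reproduced here):

      def lzw_decompress(codes, alphabet_size=256):
          """Sequential LZW decompression over a byte alphabet."""
          table = {i: bytes([i]) for i in range(alphabet_size)}
          prev = table[codes[0]]
          out = [prev]
          for code in codes[1:]:
              if code in table:
                  entry = table[code]
              elif code == len(table):
                  entry = prev + prev[:1]   # the classic KwKwK special case
              else:
                  raise ValueError("invalid LZW code")
              out.append(entry)
              table[len(table)] = prev + entry[:1]   # dictionary grows per code
              prev = entry
          return b"".join(out)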

  • Linear Programming Decoding of Binary Linear Codes for Symbol-Pair Read Channel

    Shunsuke HORII  Toshiyasu MATSUSHIMA  Shigeichi HIRASAWA  

     
    PAPER-Coding Theory and Techniques

      Vol:
    E99-A No:12
      Page(s):
    2170-2178

    In this study, we develop a new algorithm for decoding binary linear codes over the symbol-pair read channel, which was recently introduced by Cassuto and Blaum to model channels whose write resolution is higher than their read resolution. The proposed decoding algorithm is based on linear programming (LP). For LDPC codes, the algorithm runs in time polynomial in the codeword length. It is proved that the proposed LP decoder has the maximum-likelihood (ML) certificate property, i.e., whenever the output of the decoder is integral, it is guaranteed to be the ML codeword. We also introduce the fractional pair distance d_fp of a code, which is a lower bound on the minimum pair distance, and prove that the proposed LP decoder corrects up to ⌈d_fp/2⌉-1 errors.
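
    For reference, the symbol-pair read channel outputs at each position the pair of adjacent symbols (read cyclically), and the pair distance is the Hamming distance between such pair-read vectors; a minimal sketch of these two definitions (the fractional pair distance itself comes from the LP relaxation and is not computed here):

      def pair_read(x):
          """Symbol-pair read vector: position i yields (x[i], x[i+1]),
          with indices cyclic, as in Cassuto and Blaum's channel model."""
          n = len(x)
          return [(x[i], x[(i + 1) % n]) for i in range(n)]

      def pair_distance(x, y):
          """Pair distance: Hamming distance of the pair-read vectors."""
          return sum(a != b for a, b in zip(pair_read(x), pair_read(y)))

      # A single symbol error corrupts two adjacent pair reads.
      assert pair_distance([0, 0, 0, 0], [0, 1, 0, 0]) == 2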

  • RFS: An LSM-Tree-Based File System for Enhanced Microdata Performance

    Lixin WANG  Yutong LU  Wei ZHANG  Yan LEI  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2016/09/06
      Vol:
    E99-D No:12
      Page(s):
    3035-3046

    File system workloads are increasingly write-heavy. The growing capacity of RAM in modern nodes allows many reads to be satisfied from memory, while writes must be persisted to disk. Today's sophisticated local file systems like Ext4, XFS and Btrfs optimize for reads but suffer under workloads dominated by microdata (including metadata and tiny files). In this paper we present an LSM-tree-based file system, RFS, which aims to take advantage of the write optimization of the LSM-tree to provide enhanced microdata performance while offering matching performance for large files. RFS incrementally partitions the namespace into several metadata columns on a per-directory basis, preserving disk locality for directories and reducing the write amplification of LSM-trees. A write-ordered log-structured layout is used to store small files efficiently, rather than embedding the contents of small files into inodes. We also propose an optimization of global bloom filters for efficient point lookups. Experiments show that our library version of RFS handles microwrite-intensive workloads 2-10 times faster than existing solutions such as Ext4, Btrfs and XFS.
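
    As an aside, the role of a bloom filter in LSM-tree point lookups is easy to picture: consulted before any on-disk run is searched, it answers "definitely absent" or "maybe present". A toy sketch (sizes and hashing are illustrative; this is not RFS's implementation):

      import hashlib

      class BloomFilter:
          """Toy bloom filter letting a point lookup skip runs that
          cannot contain the key."""
          def __init__(self, n_bits=1 << 16, n_hashes=4):
              self.n_bits, self.n_hashes = n_bits, n_hashes
              self.bits = bytearray(n_bits // 8)

          def _positions(self, key):
              for i in range(self.n_hashes):
                  h = hashlib.blake2b(key, digest_size=8, salt=bytes([i])).digest()
                  yield int.from_bytes(h, "little") % self.n_bits

          def add(self, key):
              for p in self._positions(key):
                  self.bits[p // 8] |= 1 << (p % 8)

          def may_contain(self, key):
              return all(self.bits[p // 8] >> (p % 8) & 1
                         for p in self._positions(key))

      bf = BloomFilter()
      bf.add(b"/home/user/notes.txt")
      assert bf.may_contain(b"/home/user/notes.txt")   # no false negatives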

  • An Efficient Algorithm of Discrete Particle Swarm Optimization for Multi-Objective Task Assignment

    Nannan QIAO  Jiali YOU  Yiqiang SHENG  Jinlin WANG  Haojiang DENG  

     
    PAPER-Distributed system

      Publicized:
    2016/08/24
      Vol:
    E99-D No:12
      Page(s):
    2968-2977

    In this paper, a discrete particle swarm optimization (PSO) method is proposed to solve the multi-objective task assignment problem in a distributed environment. The optimization objectives are the makespan of task execution and the budget incurred by resource occupation. A two-stage approach is designed. In the first stage, several artificial particles are added to the initial swarm to guide the search direction. In the second stage, we redefine the addition, subtraction and multiplication operators of discrete PSO (a generic sketch of such operators follows below). In addition, fuzzy-cost-based elite selection is used to improve computational efficiency. Evaluation shows that the proposed algorithm achieves a Pareto improvement over state-of-the-art algorithms.
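
    To give the flavor of redefined discrete-PSO operators (a common generic construction, not necessarily the paper's exact operator set): subtraction of two assignments yields the positions where they disagree, multiplication keeps each such move with some probability, and addition applies the surviving moves:

      import random

      def subtract(a, b):
          """Positions where assignment a differs from b, with a's value."""
          return [(i, ai) for i, (ai, bi) in enumerate(zip(a, b)) if ai != bi]

      def multiply(coeff, diff):
          """Keep each move with probability coeff."""
          return [mv for mv in diff if random.random() < coeff]

      def add(pos, diff):
          """Apply the moves to an assignment (task index -> node)."""
          new = list(pos)
          for i, v in diff:
              new[i] = v
          return new

      def step(pos, pbest, gbest, c1=0.5, c2=0.5):
          """One discrete-PSO update pulling pos toward both bests."""
          pos = add(pos, multiply(c1, subtract(pbest, pos)))
          return add(pos, multiply(c2, subtract(gbest, pos)))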

  • Enhancing Entropy Throttling: New Classes of Injection Control in Interconnection Networks

    Takashi YOKOTA  Kanemitsu OOTSU  Takeshi OHKAWA  

     
    PAPER-Interconnection network

      Publicized:
    2016/08/25
      Vol:
    E99-D No:12
      Page(s):
    2911-2922

    State-of-the-art parallel computers, which keep growing in parallelism, place heavy demands on their interconnection networks. Although a wide spectrum of research and development efforts toward effective and practical interconnection networks has been reported, the problem is still open. One of the largest issues is congestion control, which aims to maximize network performance in terms of throughput and latency. Throttling, or injection limitation, is one of the central ideas in congestion control. We have proposed a new class of throttling method, Entropy Throttling, whose foundation is an entropy concept for packets. The throttling method is successful in part, but its potential has not been sufficiently explored. This paper aims at exploiting the capabilities of the Entropy Throttling method via comprehensive evaluation. The major contributions of this paper are to introduce the two new ideas of a hysteresis function and a guard time, and to clarify wide performance characteristics in steady and unsteady communication situations. The extended methods improve communication performance by up to 3.17 times in the best case and by 1.47 times on average compared with non-throttling cases in collective communication, while sustaining steady communication performance.
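
    To illustrate the two mechanisms just named (a generic sketch; the entropy definition, its sign convention, and the thresholds are assumptions, not the paper's design): an entropy statistic over buffered packets serves as the congestion signal, and a hysteresis function uses two thresholds so the throttle does not oscillate around a single one:

      import math
      from collections import Counter

      def destination_entropy(dests):
          """Shannon entropy of the packet-destination distribution in a
          buffer; here we take skewed (low-entropy) traffic as congested."""
          total = len(dests)
          return -sum((c / total) * math.log2(c / total)
                      for c in Counter(dests).values())

      class HysteresisThrottle:
          """Start throttling when entropy falls below `low`; release only
          after it rises above `high` again."""
          def __init__(self, low, high):
              self.low, self.high = low, high
              self.throttling = False

          def update(self, entropy):
              if self.throttling and entropy > self.high:
                  self.throttling = False      # release injection limitation
              elif not self.throttling and entropy < self.low:
                  self.throttling = True       # limit packet injection
              return self.throttling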

  • A Bipartite Graph-Based Ranking Approach to Query Subtopics Diversification Focused on Word Embedding Features

    Md Zia ULLAH  Masaki AONO  

     
    PAPER-Data Engineering, Web Information Systems

      Publicized:
    2016/09/05
      Vol:
    E99-D No:12
      Page(s):
    3090-3100

    Web search queries are usually vague, ambiguous, or carry multiple intents: users issue the same query with different search intents. Understanding these intents by mining the subtopics underlying a query has gained much interest in recent years. Query suggestions provided by search engines hold some intents of the original query; however, suggested queries are often noisy and contain groups of alternative queries with similar meanings. Therefore, identifying the subtopics that cover the possible intents behind a query is a formidable task. Moreover, since both the query and the subtopics are short, it is challenging to estimate the similarity between a pair of short texts and rank them accordingly. In this paper, we propose a method for mining and ranking subtopics in which we introduce multiple semantic and content-aware features, a bipartite graph-based ranking (BGR) method, and a similarity function for short texts. Given a query, we aggregate the suggested queries from search engines as candidate subtopics and estimate their relevance to the given query based on word embedding and content-aware features by modeling a bipartite graph. To estimate the similarity between two short texts, we propose a Jensen-Shannon divergence based similarity function over the probability distributions of the terms in the top documents retrieved from a search engine. A diversified ranked list of subtopics covering the possible intents of a query is assembled by balancing relevance and novelty. We evaluated our method on the NTCIR-10 INTENT-2 and NTCIR-12 IMINE-2 subtopic mining test collections. Our proposed method outperforms the baselines, known related methods, and the official participants of the INTENT-2 and IMINE-2 competitions.
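
    The Jensen-Shannon part can be made concrete: given term distributions p and q (in the paper, estimated from the top documents retrieved for each short text), the similarity derives from the symmetric, bounded JS divergence. A minimal sketch (smoothing and distribution estimation are omitted as assumptions):

      import math

      def js_divergence(p, q):
          """Jensen-Shannon divergence of two term distributions given as
          {term: probability} dicts; symmetric, in [0, 1] with log base 2."""
          terms = set(p) | set(q)
          m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in terms}
          def kl(a):
              return sum(a[t] * math.log2(a[t] / m[t])
                         for t in terms if a.get(t, 0.0) > 0)
          return 0.5 * kl(p) + 0.5 * kl(q)

      def similarity(p, q):
          return 1.0 - js_divergence(p, q)   # 1.0 means identical distributions

      p = {"apple": 0.6, "fruit": 0.4}
      q = {"apple": 0.5, "pie": 0.5}
      assert 0.0 <= similarity(p, q) <= 1.0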

  • A Bayesian Approach to Image Recognition Based on Separable Lattice Hidden Markov Models

    Kei SAWADA  Akira TAMAMORI  Kei HASHIMOTO  Yoshihiko NANKAKU  Keiichi TOKUDA  

     
    PAPER-Pattern Recognition

      Publicized:
    2016/09/05
      Vol:
    E99-D No:12
      Page(s):
    3119-3131

    This paper proposes a Bayesian approach to image recognition based on separable lattice hidden Markov models (SL-HMMs). The geometric variations of the object to be recognized, e.g., size, location, and rotation, are an essential problem in image recognition. SL-HMMs, which have been proposed to reduce the effect of geometric variations, can perform elastic matching both horizontally and vertically. This makes it possible to model not only invariances to the size and location of the object but also nonlinear warping in both dimensions. The maximum likelihood (ML) method has been used in training SL-HMMs. However, in some image recognition tasks, it is difficult to acquire sufficient training data, and the ML method suffers from the over-fitting problem when there is insufficient training data. This study aims to accurately estimate SL-HMMs using the maximum a posteriori (MAP) and variational Bayesian (VB) methods. The MAP and VB methods can utilize prior distributions representing useful prior information, and the VB method is expected to obtain high generalization ability by marginalization of model parameters. Furthermore, to overcome the local maximum problem in the MAP and VB methods, the deterministic annealing expectation maximization algorithm is applied for training SL-HMMs. Face recognition experiments performed on the XM2VTS database indicated that the proposed method offers significantly improved image recognition performance. Additionally, comparative experiment results showed that the proposed method was more robust to geometric variations than convolutional neural networks.

  • Time Delay Estimation via Co-Prime Aliased Sparse FFT

    Bei ZHAO  Chen CHENG  Zhenguo MA  Feng YU  

     
    LETTER-Digital Signal Processing

      Vol:
    E99-A No:12
      Page(s):
    2566-2570

    Cross correlation is a general way to estimate the time delay of arrival (TDOA), with a computational complexity of O(n log n) using the fast Fourier transform. However, since only one spike is required for time delay estimation, the complexity can be reduced further. Guided by the Chinese Remainder Theorem (CRT), this paper presents a new approach called Co-prime Aliased Sparse FFT (CASFFT) requiring O(n^(1-1/d) log n) multiplications and O(mn) additions, where m is the smoothing factor and d is the number of stages. By adjusting these parameters, a balance can be struck between runtime and noise robustness. Furthermore, CASFFT has a clear advantage in parallelism and runtime over a wide range of signal-to-noise ratio (SNR) conditions. The accuracy and feasibility of the algorithm are analyzed in theory and verified by experiment.
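
    The baseline the letter improves on is simple to state: the delay is the argmax of the circular cross-correlation, computed in O(n log n) via the FFT. A minimal sketch of that baseline (not of CASFFT itself):

      import numpy as np

      def tdoa_xcorr(x, y):
          """Estimate the delay of y relative to x as the peak of the
          circular cross-correlation r[t] = sum_k x[k] * y[k + t]."""
          r = np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(y)).real
          return int(np.argmax(r))

      # y is x circularly delayed by 37 samples.
      rng = np.random.default_rng(2)
      x = rng.standard_normal(1024)
      assert tdoa_xcorr(x, np.roll(x, 37)) == 37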

  • An Index Based on Irregular Identifier Space Partition for Quick Multiple Data Access in Wireless Data Broadcasting

    SeokJin IM  HeeJoung HWANG  

     
    LETTER-Data Engineering, Web Information Systems

      Publicized:
    2016/07/20
      Vol:
    E99-D No:11
      Page(s):
    2809-2813

    This letter proposes an Index based on Irregular Partition of data identifiers (IIP), which enables clients to quickly access multiple data items on a wireless broadcast channel. IIP improves access time by reducing the index waiting time when clients access multiple data items, through irregular partitioning of the identifier space of the data items. Our performance evaluation shows that, with respect to access time, the proposed IIP outperforms existing index schemes that support multiple data access.

  • Set-to-Set Disjoint Paths Routing in Torus-Connected Cycles

    Antoine BOSSARD  Keiichi KANEKO  

     
    LETTER-Dependable Computing

      Publicized:
    2016/08/10
      Vol:
    E99-D No:11
      Page(s):
    2821-2823

    Extending the very popular torus interconnection networks [1]-[3], Torus-Connected Cycles (TCC) have been proposed as a novel network topology for massively parallel systems [5]. Here, the set-to-set disjoint paths routing problem in a TCC is solved. In a TCC(k,n), it is proved that paths of length at most kn²+2n can be selected in O(kn²) time.
