
Keyword Search Result

[Keyword] OMP (3945 hits)

Results 781-800 of 3945

  • On the Computational Complexity of the Linear Solvability of Information Flow Problems with Hierarchy Constraint

    Yuki TAKEDA  Yuichi KAJI  Minoru ITO  

     
    PAPER-Networks and Network Coding

    Vol: E99-A No:12, Page(s): 2211-2217

    An information flow problem is a graph-theoretical formalization of the transportation of information over a complicated network. It is known that linear network codes play an essential role in certain types of information flow problems, but it is not clearly understood how much linear network codes contribute to other types of information flow problems. One basic problem concerning this aspect is the linear solvability of information flow problems, that is, to decide whether there is a linear network code that is a solution to a given information flow problem. Lehman et al. characterized the linear solvability of information flow problems in terms of constraints on the sets of source and sink nodes. As an extension of Lehman's investigation, this study introduces a hierarchy constraint on messages and discusses the computational complexity of the linear solvability of information flow problems with hierarchy constraints. Nine classes of problems are newly defined, and each is classified into one of the three categories discovered by Lehman et al.

  • Enhancing Entropy Throttling: New Classes of Injection Control in Interconnection Networks

    Takashi YOKOTA  Kanemitsu OOTSU  Takeshi OHKAWA  

     
    PAPER-Interconnection network

    Publicized: 2016/08/25
    Vol: E99-D No:12, Page(s): 2911-2922

    State-of-the-art parallel computers, which continue to grow in parallelism, place heavy demands on their interconnection networks. Although a wide spectrum of research and development efforts toward effective and practical interconnection networks has been reported, the problem is still open. One of the largest issues is congestion control, which aims to maximize network performance in terms of throughput and latency. Throttling, or injection limitation, is one of the central ideas in congestion control. We have proposed a new class of throttling method, Entropy Throttling, founded on an entropy concept for packets. The throttling method is partly successful; however, its potential has not been sufficiently explored. This paper aims at exploiting the capabilities of the Entropy Throttling method via comprehensive evaluation. The major contributions of this paper are to introduce two new ideas, a hysteresis function and a guard time, and to clarify wide performance characteristics in steady and unsteady communication situations. By introducing these ideas, we extend the Entropy Throttling method. The extended methods improve communication performance by up to 3.17 times in the best case and by 1.47 times on average compared with non-throttling cases in collective communication, while sustaining steady communication performance.
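
    As a concrete illustration of the two new ideas, the sketch below implements a generic injection throttle with a hysteresis pair of thresholds and a guard time. The congestion metric, threshold values, and guard length are illustrative assumptions, not the paper's actual entropy definition or parameters.

```python
# Hypothetical sketch of an injection throttle with hysteresis and a guard
# time. All parameter values are illustrative assumptions.

class HysteresisThrottle:
    def __init__(self, on_threshold=0.8, off_threshold=0.6, guard_cycles=32):
        # Hysteresis: throttling engages above on_threshold but is released
        # only when the metric falls below the lower off_threshold,
        # preventing rapid on/off oscillation.
        self.on_threshold = on_threshold
        self.off_threshold = off_threshold
        self.guard_cycles = guard_cycles  # minimum cycles to keep throttling
        self.throttling = False
        self.guard_left = 0

    def step(self, congestion_metric):
        """Return True if the router should block new packet injection."""
        if self.throttling:
            self.guard_left = max(0, self.guard_left - 1)
            # Release only after the guard time expires AND the metric drops.
            if self.guard_left == 0 and congestion_metric < self.off_threshold:
                self.throttling = False
        elif congestion_metric > self.on_threshold:
            self.throttling = True
            self.guard_left = self.guard_cycles
        return self.throttling
```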

  • A New Algorithm for Reducing Components of a Gaussian Mixture Model

    Naoya YOKOYAMA  Daiki AZUMA  Shuji TSUKIYAMA  Masahiro FUKUI  

     
    PAPER

    Vol: E99-A No:12, Page(s): 2425-2434

    In statistical methods such as statistical static timing analysis, the Gaussian mixture model (GMM) is a useful tool for representing a non-Gaussian distribution and handling correlation easily. In order to efficiently repeat various statistical operations on GMMs, such as summation and maximum, the number of components should be restricted to around two. In this paper, we propose a method for reducing the number of components of a given GMM to two (2-GMM). Moreover, since the distribution of each component is often represented by a linear combination of explanatory variables, we propose a method to compute the covariance between each explanatory variable and the obtained 2-GMM, that is, the sensitivity of the 2-GMM to each explanatory variable. We show experimental results to evaluate the performance of the proposed methods. The proposed methods minimize the normalized integral square error of the probability density function of the 2-GMM at the cost of some accuracy in the sensitivities of the 2-GMM.
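
    For illustration, the following one-dimensional sketch reduces a GMM to two components by splitting the components into two groups and merging each group with moment matching. The grouping rule and merge formulas are a common textbook approach, shown only as an assumption of what such a reduction can look like; the paper's method instead minimizes the normalized integral square error of the pdf.

```python
import numpy as np

def merge_components(weights, means, variances):
    """Moment-matching merge: one Gaussian preserving the group's
    total weight, mean, and variance."""
    w = weights.sum()
    mu = np.dot(weights, means) / w
    var = np.dot(weights, variances + (means - mu) ** 2) / w
    return w, mu, var

def reduce_to_2gmm(weights, means, variances):
    """Split components at the overall mixture mean and merge each side."""
    weights, means, variances = map(np.asarray, (weights, means, variances))
    low = means <= np.dot(weights, means)
    return [merge_components(weights[g], means[g], variances[g])
            for g in (low, ~low)]

# Example: a 4-component GMM reduced to a 2-GMM.
print(reduce_to_2gmm([0.3, 0.2, 0.25, 0.25],
                     [-2.0, -1.0, 1.5, 3.0],
                     [0.5, 0.4, 0.6, 0.8]))
```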

  • Comparing Performance of Hierarchical Identity-Based Signature Schemes

    Peixin CHEN  Yilun WU  Jinshu SU  Xiaofeng WANG  

     
    LETTER-Information Network

    Publicized: 2016/09/01
    Vol: E99-D No:12, Page(s): 3181-3184

    The key escrow problem and high computational cost are the two major problems that hinder wider adoption of hierarchical identity-based signature (HIBS) schemes. HIBS schemes with either an escrow-free (EF) model or an online/offline (OO) model were proved secure in our previous work. However, few EF or OO schemes have been evaluated experimentally. In this letter, several EF/OO HIBS schemes are considered. We study the algorithmic complexity of the schemes both theoretically and experimentally. Scheme performance and the practicability of the EF and OO models are discussed.

  • A Fast Mask Manufacturability and Process Variation Aware OPC Algorithm with Exploiting a Novel Intensity Estimation Model

    Ahmed AWAD  Atsushi TAKAHASHI  Chikaaki KODAMA  

     
    PAPER

    Vol: E99-A No:12, Page(s): 2363-2374

    As technology is pushed into the sub-16 nm regime, printing advanced technology nodes in optical micro-lithography will rely heavily on aggressive Optical Proximity Correction (OPC) for the foreseeable future. Besides acceptable pattern fidelity under process variations, mask design time and mask manufacturability are crucial parameters that industry strongly demands be addressed in the OPC recipe. In this paper, we propose an intensity-based OPC algorithm that finds a highly manufacturable mask solution for a target pattern, with acceptable pattern fidelity under process variations, within a short computation time. This is achieved through a fast intensity estimation model in which intensity is numerically correlated with local mask density and kernel type, so that intensity can be estimated quickly and with acceptable accuracy. The estimated intensity is used to guide feature shifting, alignment, and concatenation, following a linearly interpolated variational intensity error model, to achieve high mask manufacturability while preserving acceptable pattern fidelity under process variations. Experimental results on public benchmarks show the effectiveness of the proposed algorithm.
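
    The following toy sketch caricatures density-based intensity estimation: aerial-image intensity is approximated by blurring the local mask density with a smoothing kernel. The Gaussian kernel and all values are illustrative assumptions standing in for the paper's numerically calibrated model.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_intensity(mask, sigma=4.0):
    """mask: 0/1 array of mask openings; returns a blurred intensity map.
    A Gaussian stands in for the real lithography kernels (an assumption)."""
    return gaussian_filter(mask.astype(float), sigma=sigma)

# A 64x64 mask with one rectangular opening.
mask = np.zeros((64, 64))
mask[24:40, 20:44] = 1.0
print(estimate_intensity(mask)[32, 32])  # intensity at the feature center
```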

  • Inter-Person Occlusion Handling with Social Interaction for Online Multi-Pedestrian Tracking

    Yuke LI  Weiming SHEN  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2016/09/15
    Vol: E99-D No:12, Page(s): 3165-3171

    Inter-person occlusion handling is a critical issue in the field of tracking, and it has been extensively researched. Several state-of-the-art methods have been proposed, such as focusing on the appearance of the targets or utilizing knowledge of the scene. In contrast with the approaches proposed in the literature, we propose to address this issue using a social interaction model, which allows us to explore spatio-temporal information pertaining to the targets involved in an occlusion situation. Our experiments show promising results compared with those obtained using other methods.

  • Fully Parallelized LZW Decompression for CUDA-Enabled GPUs

    Shunji FUNASAKA  Koji NAKANO  Yasuaki ITO  

     
    PAPER-GPU computing

    Publicized: 2016/08/25
    Vol: E99-D No:12, Page(s): 2986-2994

    The main contribution of this paper is to present a work-optimal parallel algorithm for LZW decompression and to implement it on a CUDA-enabled GPU. Since sequential LZW decompression creates a dictionary table by reading codes in a compressed file one by one, it is not easy to parallelize. We first present a work-optimal parallel LZW decompression algorithm on the CREW-PRAM (Concurrent-Read Exclusive-Write Parallel Random Access Machine), a standard theoretical parallel computing model with a shared memory. We then present an efficient implementation of this parallel algorithm on a GPU. The experimental results show that our GPU implementation performs LZW decompression in 1.15 milliseconds for a grayscale TIFF image with 4096×3072 pixels stored in the global memory of a GeForce GTX 980. In contrast, sequential LZW decompression of the same image stored in the main memory of an Intel Core i7 CPU takes 50.1 milliseconds. Thus, our parallel LZW decompression on the global memory of the GPU is 43.6 times faster than sequential LZW decompression on the main memory of the CPU for this image. To show the applicability of our GPU implementation of LZW decompression, we evaluated the SSD-to-GPU data loading time for three scenarios. The experimental results show that the scenario using our LZW decompression on the GPU is faster than the others.
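
    The serial dependence that makes parallelization hard is visible in a standard sequential decoder: each code may reference a dictionary entry created only a few steps earlier. The sketch below is the classic textbook decoder, shown as the baseline; the paper's CREW-PRAM algorithm breaks exactly this dependence.

```python
# Sequential LZW decompression (the standard textbook version). The
# dictionary entry added at step i is needed to interpret later codes,
# so codes cannot simply be decoded independently.

def lzw_decompress(codes, alphabet_size=256):
    table = {i: bytes([i]) for i in range(alphabet_size)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        # The fallback handles a code that refers to the entry being
        # created in this very step (the classic cScSc pattern).
        entry = table[code] if code in table else prev + prev[:1]
        out.append(entry)
        table[len(table)] = prev + entry[:1]  # serial dictionary growth
        prev = entry
    return b"".join(out)

# Example: the LZW code sequence for b"ABABABA".
print(lzw_decompress([65, 66, 256, 258]))  # b'ABABABA'
```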

  • Multiple Object Segmentation in Videos Using Max-Flow Decomposition

    Yihang BO  Hao JIANG  

     
    PAPER-Vision

    Vol: E99-A No:12, Page(s): 2547-2557

    In this paper, we propose a novel decomposition method to segment multiple object regions simultaneously in cluttered videos. This method formulates object region segmentation as a labeling problem in which we assign object IDs to the superpixels in a sequence of video frames so that the unary color matching cost is low, the assignment induces compact segments, and the superpixel labeling is consistent through time. Multi-object segmentation in a video is a combinatorial problem, for which we propose a binary linear formulation. Since the integer linear program is hard to solve directly, we relax it and further decompose the relaxation into a sequence of much simpler max-flow problems. The proposed method is guaranteed to converge in a finite number of steps to the global optimum of the relaxation. It also has a high chance of obtaining an all-integer solution and therefore achieving the global optimum. Rounding the relaxation result gives an N-approximation solution, where N is the number of objects. Compared to directly solving the integer program, the novel decomposition method speeds up the computation by orders of magnitude. Our experiments show that the proposed method is robust against object pose variation and occlusion, and is more accurate than competing methods while maintaining efficiency.
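
    Each subproblem in such a decomposition is an ordinary s-t min-cut. The toy sketch below solves one binary labeling of this kind with unary costs and pairwise smoothness, using the standard graph construction; it is an assumption-level illustration, not the paper's exact formulation.

```python
import networkx as nx

def binary_label(cost0, cost1, edges, smooth=0.1):
    """cost0[i]/cost1[i]: unary cost of giving node i label 0/1.
    edges: neighbor pairs penalized `smooth` when labeled differently."""
    g = nx.DiGraph()
    for i, (c0, c1) in enumerate(zip(cost0, cost1)):
        g.add_edge("s", i, capacity=c1)  # cut iff i takes label 1 -> pays c1
        g.add_edge(i, "t", capacity=c0)  # cut iff i takes label 0 -> pays c0
    for i, j in edges:
        g.add_edge(i, j, capacity=smooth)
        g.add_edge(j, i, capacity=smooth)
    _, (source_side, _) = nx.minimum_cut(g, "s", "t")
    return [0 if i in source_side else 1 for i in range(len(cost0))]

# Node 0 prefers label 0; nodes 1 and 2 prefer label 1.
print(binary_label([0.1, 0.9, 0.8], [0.9, 0.2, 0.1], [(0, 1), (1, 2)]))
```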

  • Average Coding Rate of a Multi-Shot Tunstall Code with an Arbitrary Parsing Tree Sequence

    Mitsuharu ARIMURA  

     
    LETTER-Source Coding and Data Compression

    Vol: E99-A No:12, Page(s): 2281-2285

    The average coding rate of a multi-shot Tunstall code, a variation of variable-to-fixed-length (VF) lossless source codes, for stationary memoryless sources is investigated. A multi-shot VF code parses a given source sequence into variable-length blocks and encodes them into fixed-length codewords. If the parsing count is fixed, the overall multi-shot VF code can be treated as a one-shot VF code. For this setting of the Tunstall code, the compression performance is evaluated using two criteria. The first is the average coding rate, defined as the codeword length divided by the average block length. The second is the expectation of the pointwise coding rate. It is proved that both of these average coding rates converge to the entropy of a stationary memoryless source under the assumption that the geometric mean of the leaf counts of the multi-shot Tunstall parsing trees goes to infinity.
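
    For reference, a one-shot Tunstall tree is grown by repeatedly expanding the most probable leaf, and the average coding rate above is then the codeword length divided by the expected block length. A minimal sketch, assuming a memoryless source with known symbol probabilities:

```python
import heapq
import math

def tunstall(probs, num_leaves):
    """probs: symbol probabilities. Returns (leaves, avg_rate); each leaf
    is (probability, word). Codeword length is taken as log2(#leaves),
    exact when the leaf count is a power of two."""
    # Max-heap on leaf probability (negated for heapq's min-heap).
    heap = [(-p, (i,)) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    # Expanding one leaf into |alphabet| children adds len(probs)-1 leaves.
    while len(heap) + len(probs) - 1 <= num_leaves:
        neg_p, word = heapq.heappop(heap)   # most probable leaf
        for i, p in enumerate(probs):
            heapq.heappush(heap, (neg_p * p, word + (i,)))
    leaves = [(-np, w) for np, w in heap]
    avg_len = sum(p * len(w) for p, w in leaves)
    return leaves, math.log2(len(leaves)) / avg_len

leaves, rate = tunstall([0.7, 0.3], num_leaves=8)
print(len(leaves), rate)  # rate approaches the entropy (~0.881) as the tree grows
```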

  • Efficient Multiplication Based on Dickson Bases over Any Finite Fields

    Sun-Mi PARK  Ku-Young CHANG  Dowon HONG  Changho SEO  

     
    PAPER-Algorithms and Data Structures

    Vol: E99-A No:11, Page(s): 2060-2074

    We propose subquadratic space complexity multipliers for any finite field $\mathbb{F}_{q^n}$ over the base field $\mathbb{F}_q$ using the Dickson basis, where q is a prime power. It is shown that a field multiplication in $\mathbb{F}_{q^n}$ based on the Dickson basis results in computations of Toeplitz matrix vector products (TMVPs). Therefore, an efficient computation of a TMVP yields an efficient multiplier. In order to derive efficient $\mathbb{F}_{q^n}$ multipliers, we develop computational schemes for a TMVP over $\mathbb{F}_q$. As a result, the $\mathbb{F}_{2^n}$ multipliers, as special cases of the proposed $\mathbb{F}_{q^n}$ multipliers, have lower time complexities as well as space complexities compared with existing results. For example, in the case that n is a power of 3, the proposed $\mathbb{F}_{2^n}$ multiplier for an irreducible Dickson trinomial has about 14% reduced space complexity and lower time complexity compared with the best known results.
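
    The subquadratic behavior of TMVP-based multipliers comes from splitting one TMVP into three half-size TMVPs. The standard two-way decomposition is shown below; whether the paper uses exactly this variant is an assumption.

```latex
% Two-way TMVP split: one size-n product becomes three size-n/2 products.
\begin{pmatrix} T_1 & T_0 \\ T_2 & T_1 \end{pmatrix}
\begin{pmatrix} V_0 \\ V_1 \end{pmatrix}
=
\begin{pmatrix} P_2 + P_0 \\ P_2 + P_1 \end{pmatrix},
\qquad
\begin{aligned}
P_0 &= (T_0 - T_1)\,V_1,\\
P_1 &= (T_2 - T_1)\,V_0,\\
P_2 &= T_1\,(V_0 + V_1).
\end{aligned}
```

    Applying the split recursively gives the recurrence $M(n) = 3M(n/2) + O(n)$, i.e. $O(n^{\log_2 3})$ multiplications, which is where the subquadratic complexity comes from.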

  • Set-to-Set Disjoint Paths Routing in Torus-Connected Cycles

    Antoine BOSSARD  Keiichi KANEKO  

     
    LETTER-Dependable Computing

    Publicized: 2016/08/10
    Vol: E99-D No:11, Page(s): 2821-2823

    Extending the very popular tori interconnection networks [1]-[3], Torus-Connected Cycles (TCC) have been proposed as a novel network topology for massively parallel systems [5]. Here, the set-to-set disjoint paths routing problem in a TCC is solved. In a TCC(k,n), it is proved that paths of length at most $kn^2+2n$ can be selected in $O(kn^2)$ time.

  • A Color Scheme Method by Interactive Evolutionary Computing Considering Contrast of Luminance and Design Property

    Keiko YAMASHITA  Kaoru ARAKAWA  

     
    PAPER-Image

    Vol: E99-A No:11, Page(s): 1981-1989

    A color scheme method is proposed that considers the contrast of luminance between adjacent regions and design quality. The method aims to set the contrast of luminance high in order to make the image understandable to visually handicapped people. It also realizes a color design preferable to people with normal vision by assigning color components from samples of good color combinations. Interactive evolutionary computing is adopted to design the luminance and the color, so that the luminance and color components are assigned to each region appropriately on the basis of human subjective criteria. Here, the luminance is designed first, and then color components are assigned while keeping the luminance unchanged. Since samples of fine color combinations are applied, the obtained color design is also fine and harmonious. Computer simulations verify the high performance of this system.
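
    A minimal sketch of the luminance-contrast criterion implied above, using the BT.601 luma approximation (an assumption; the paper may define luminance differently):

```python
def luma(rgb):
    """BT.601 luma approximation of perceived luminance for 8-bit RGB."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def contrast(rgb1, rgb2):
    """Absolute luminance difference between two adjacent regions."""
    return abs(luma(rgb1) - luma(rgb2))

print(contrast((255, 255, 0), (0, 0, 128)))  # yellow on navy: large contrast
```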

  • Reseeding-Oriented Test Power Reduction for Linear-Decompression-Based Test Compression Architectures

    Tian CHEN  Dandan SHEN  Xin YI  Huaguo LIANG  Xiaoqing WEN  Wei WANG  

     
    PAPER-Computer System

    Publicized: 2016/07/25
    Vol: E99-D No:11, Page(s): 2672-2681

    Linear feedback shift register (LFSR) reseeding is an effective method for test data reduction. However, the test patterns generated by LFSR reseeding generally have a high toggle rate and thus cause high test power. It is therefore attractive to properly fill the X bits in deterministic test cubes with 0 or 1 before encoding the seed, so as to reduce the toggle rate. However, X-filling increases the number of specified bits, which increases the difficulty of seed encoding; moreover, the size of the LFSR increases as well. This paper presents a test framework that considers compression ratio and power consumption simultaneously. In the first stage, the proposed reseeding-oriented X-filling is performed to reduce shift power (shift filling) and capture power (capture filling). In the second stage, the filled test cubes are encoded using the proposed Compatible Block Code (CBC). The CBC can X-ize specified bits, that is, turn specified bits back into X bits, and can thus resolve the conflict between low-power filling and seed encoding. Experiments performed on ISCAS'89 benchmark circuits show that our scheme attains a compression ratio of 94.1% and reduces capture power by at least 15% and scan-in power by more than 79.5%.
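
    The compression leverage of reseeding comes from the fact that a short seed deterministically expands into a much longer pattern. A minimal sketch of that expansion, with an illustrative 4-bit tap polynomial rather than anything from the paper:

```python
# Fibonacci-form LFSR: one short seed generates a long pseudo-random
# bit stream, which is why storing seeds compresses test data.

def lfsr_expand(seed_bits, taps, length):
    """seed_bits: initial register state (list of 0/1, oldest bit first).
    taps: state positions XORed to form the feedback bit."""
    state = list(seed_bits)
    out = []
    for _ in range(length):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]   # shift and insert feedback
    return out

# Example: taps (0, 3) realize x^4 + x^3 + 1, a maximal-period polynomial,
# so a 4-bit seed yields a period-15 sequence.
print(lfsr_expand([1, 0, 0, 1], taps=(0, 3), length=15))
```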

  • Fast Coding Unit Size Decision Based on Probabilistic Graphical Model in High Efficiency Video Coding Inter Prediction

    Xiantao JIANG  Tian SONG  Wen SHI  Takafumi KATAYAMA  Takashi SHIMAMOTO  Lisheng WANG  

     
    LETTER-Image Processing and Video Processing

    Publicized: 2016/08/08
    Vol: E99-D No:11, Page(s): 2836-2839

    In this work, a high-efficiency coding unit (CU) size decision algorithm is proposed for high efficiency video coding (HEVC) inter coding. The CU splitting or non-splitting decision is modeled as a binary classification problem based on a probabilistic graphical model (PGM). The method incorporates two sub-methods: CU size termination decision and CU size skip decision. It focuses on the trade-off between encoding efficiency and encoding complexity, and performs well. Particularly in high-resolution applications, simulation results demonstrate that the proposed algorithm can reduce encoding time by 53.62%-57.54%, while the BD-rate increase is only 1.27%-1.65%, compared to the HEVC software model.

  • Edge-Based Adaptive Sampling for Image Block Compressive Sensing

    Lijing MA  Huihui BAI  Mengmeng ZHANG  Yao ZHAO  

     
    LETTER-Image

    Vol: E99-A No:11, Page(s): 2095-2098

    In this paper, a novel adaptive sampling scheme for block compressive sensing of natural images is proposed. Considering the content of images, the edge proportion in a block can be used to represent its sparsity, and, according to the edge proportion, the sampling rate can be adaptively allocated for better compressive sensing recovery. Given that there are too many blocks in an image to record the measurement ratio of each block without overhead cost, the K-means method is applied to classify the blocks into clusters, and a measurement ratio is allocated to each cluster. In addition, we design an iterative termination condition to reduce the time consumed in the iterations of compressive sensing recovery. The experimental results show that, compared with the corresponding methods, the proposed scheme acquires a better reconstructed image at the same sampling rate.
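
    A sketch of the allocation pipeline described above: compute the edge proportion of each block, cluster the proportions with K-means, and map edge-rich clusters to higher sampling rates. The edge detector, cluster count, and rate values are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def allocate_rates(image, block=32, k=3, rates=(0.1, 0.3, 0.5)):
    # Crude edge map: gradient magnitude thresholded at its own mean.
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    edges = mag > mag.mean()
    h, w = image.shape
    # Edge proportion of each block, scanned in raster order.
    props = np.array([edges[i:i + block, j:j + block].mean()
                      for i in range(0, h, block)
                      for j in range(0, w, block)])
    # Cluster per-block edge proportions; edge-richer clusters get
    # higher sampling rates.
    centroids, labels = kmeans2(props.reshape(-1, 1), k, seed=0)
    rank = {c: r for r, c in enumerate(np.argsort(centroids.ravel()))}
    return np.array([rates[rank[l]] for l in labels])
```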

  • Design of a Compact Sound Localization Device on a Stand-Alone FPGA-Based Platform

    Mauricio KUGLER  Teemu TOSSAVAINEN  Susumu KUROYANAGI  Akira IWATA  

     
    PAPER-Computer System

    Publicized: 2016/07/26
    Vol: E99-D No:11, Page(s): 2682-2693

    Sound localization systems are widely studied and have several potential applications, including hearing aid devices, surveillance, and robotics. However, few proposed solutions target portable systems, such as wearable devices, which require a small, unnoticeable platform, or unmanned aerial vehicles, in which weight and low power consumption are critical aspects. The main objective of this research is to achieve real-time sound localization capability in a small, self-contained device, without having to rely on large shaped platforms or complex microphone arrays. The proposed device has two surface-mount microphones spaced only 20 mm apart. Such reduced dimensions present challenges for the implementation, as differences in level and spectra become negligible, and only the time difference of arrival (TDoA) can be used as a localization cue. Three main issues have to be addressed in order to accomplish these objectives. To achieve real-time processing, the TDoA is calculated using zero-crossing spikes applied to the hardware-friendly Jeffress model. In order to make up for the reduction in resolution due to the small dimensions, the signal is upsampled several-fold within the system. Finally, coherence-based spectral masking is used to select only frequency components with relevant TDoA information. The proposed system was implemented on a field-programmable gate array (FPGA) based platform, due to the large number of concurrent and independent tasks, which can be efficiently parallelized in reconfigurable hardware devices. Experimental results with white noise and environmental sounds show high accuracy in both anechoic and reverberant conditions.
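
    The need for upsampling follows from simple arithmetic on the 20 mm baseline. Assuming a speed of sound of 343 m/s and a 48 kHz converter (the sampling rate is an assumption, not a figure from the paper):

```python
# Back-of-the-envelope arithmetic for a 20 mm microphone baseline.
baseline_m = 0.02
speed_of_sound = 343.0      # m/s, assumed
fs = 48_000                 # Hz, assumed ADC rate

max_tdoa = baseline_m / speed_of_sound   # largest possible delay
print(max_tdoa * 1e6)        # ~58.3 microseconds end-to-end
print(max_tdoa * fs)         # ~2.8 samples at 48 kHz
# Only ~2.8 samples of delay span the whole angular range, so the signal
# is upsampled several-fold to recover usable angular resolution.
print(max_tdoa * fs * 16)    # ~44.8 samples after 16x upsampling
```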

  • An Algorithm of Connecting Broken Objects Based on the Skeletons

    Chao XU  Dongxiang ZHOU  Yunhui LIU  

     
    LETTER-Pattern Recognition

    Publicized: 2016/08/10
    Vol: E99-D No:11, Page(s): 2832-2835

    The segmentation of Mycobacterium tuberculosis images forms the basis for the computer-aided diagnosis of tuberculosis. The segmented objects are often broken due to low-contrast objects and the limitations of the segmentation method, which decreases the accuracy of segmentation and recognition. A simple and effective post-processing method is proposed to connect the broken objects: the broken objects in the segmented binary images are connected based on information obtained from their skeletons. Experimental results demonstrate the effectiveness of the proposed method.
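
    A sketch of the skeleton-based idea: skeletonize the binary segmentation, detect skeleton endpoints, and bridge endpoint pairs that lie within a small gap. The endpoint test and distance threshold are illustrative assumptions, not the paper's exact rules.

```python
import numpy as np
from skimage.morphology import skeletonize
from skimage.draw import line

def connect_broken(binary, max_gap=10):
    skel = skeletonize(binary.astype(bool))
    # Endpoint: a skeleton pixel with exactly one skeleton neighbor
    # (its 3x3 window then contains exactly two skeleton pixels).
    ys, xs = np.nonzero(skel)
    ends = [(y, x) for y, x in zip(ys, xs)
            if skel[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2].sum() == 2]
    out = binary.astype(bool).copy()
    for a, (y0, x0) in enumerate(ends):
        for y1, x1 in ends[a + 1:]:
            if np.hypot(y1 - y0, x1 - x0) <= max_gap:
                rr, cc = line(y0, x0, y1, x1)  # draw the bridging segment
                out[rr, cc] = True
    return out
```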

  • Job Mapping and Scheduling on Free-Space Optical Networks

    Yao HU  Ikki FUJIWARA  Michihiro KOIBUCHI  

     
    PAPER-Computer System

    Publicized: 2016/08/16
    Vol: E99-D No:11, Page(s): 2694-2704

    A number of parallel applications run on a high-performance computing (HPC) system simultaneously. Job mapping and scheduling become crucial to improving system utilization, because fragmentation can prevent an incoming job from being assigned even if there are enough unused compute nodes. Wireless supercomputers and datacenters with free-space optical (FSO) terminals have been proposed to replace conventional wired interconnects, so that a diverse application workload can be better supported by changing their network topologies. In this study we first present an efficient job mapping that swaps the endpoints of FSO links in a wireless HPC system. Our evaluation shows that an FSO-equipped wireless HPC system can achieve shorter average queuing length and queuing time for all dispatched user jobs. Second, we consider a more sophisticated scheduling algorithm, which can further improve system utilization over different host networks, as well as the average response time for all dispatched user jobs. Finally, we present the performance advantages of the proposed wireless HPC system under more practical assumptions, such as different cabinet capacities and diverse subtopology packings.

  • Measurement Matrices Construction for Compressed Sensing Based on Finite Field Quasi-Cyclic LDPC Codes

    Hua XU  Hao YANG  Wenjuan SHI  

     
    PAPER-Fundamental Theories for Communications

    Publicized: 2016/06/16
    Vol: E99-B No:11, Page(s): 2332-2339

    Measurement matrix construction is critically important to signal sampling and reconstruction for compressed sensing. From a practical point of view, deterministic construction of the measurement matrix is preferable to random construction. In this paper, we propose a novel deterministic method to construct a measurement matrix for compressed sensing, the CS-FF (compressed sensing-finite field) algorithm. The constructed measurement matrix is derived from a finite-field quasi-cyclic low-density parity-check (QC-LDPC) code and thus has a quasi-cyclic structure. Furthermore, we construct three groups of measurement matrices. The first group contains the proposed matrix together with other matrices, including deterministically and randomly constructed ones; the other two groups are both constructed by our method. We compare the recovery performance of these matrices. Simulation results demonstrate that the recovery performance of our matrix is superior to that of the other matrices. In addition, simulation results show that the compression ratio is an important parameter for analyzing and predicting the recovery performance of the proposed measurement matrix. Moreover, these matrices require less storage than random ones and achieve a better trade-off between complexity and performance. Therefore, from a practical perspective, the proposed scheme is hardware friendly and easily implemented, and its quasi-cyclic structure and good recovery performance make it well suited to compressed sensing.
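
    The quasi-cyclic structure that makes such matrices hardware friendly can be sketched directly: the matrix is a grid of circulant permutation blocks, so only one shift value per block needs to be stored. The shift table below is illustrative, not a matrix from the paper.

```python
import numpy as np

def circulant_permutation(size, shift):
    """size x size identity matrix cyclically right-shifted by `shift`."""
    return np.roll(np.eye(size, dtype=np.uint8), shift, axis=1)

def quasi_cyclic(shifts, block_size):
    """shifts: 2-D table of shift values (-1 denotes an all-zero block)."""
    rows = []
    for shift_row in shifts:
        blocks = [np.zeros((block_size, block_size), dtype=np.uint8)
                  if s < 0 else circulant_permutation(block_size, s)
                  for s in shift_row]
        rows.append(np.hstack(blocks))
    return np.vstack(rows)

# Example: a 10x15 binary matrix described by just six shift values.
print(quasi_cyclic([[0, 2, -1], [1, -1, 4]], block_size=5))
```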

  • Control of Morphology and Alignment of Liquid Crystal Droplets in Molecular-Aligned Polymer for Substrate-Free Displays Open Access

    Daisuke SASAKI  Yosei SHIBATA  Takahiro ISHINABE  Hideo FUJIKAKE  

     
    INVITED PAPER

    Vol: E99-C No:11, Page(s): 1234-1239

    We have proposed composite films consisting of a molecular-aligned polymer and liquid crystal (LC) for substrate-free liquid crystal displays with high-contrast images. We successfully controlled the molecular alignment of the LC and formed molecular-aligned LC droplets in the polymer by controlling the fluidity of the LC/monomer mixture and the curing rate of the monomer.
