The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] PA(8249hit)

641-660hit(8249hit)

  • Accelerating the Held-Karp Algorithm for the Symmetric Traveling Salesman Problem

    Kazuro KIMURA  Shinya HIGA  Masao OKITA  Fumihiko INO  

     
    PAPER-Fundamentals of Information System

      Pubricized:
    2019/08/23
      Vol:
    E102-D No:12
      Page(s):
    2329-2340

    In this paper, we propose an acceleration method for the Held-Karp algorithm that solves the symmetric traveling salesman problem by dynamic programming. The proposed method achieves acceleration with two techniques. First, we locate data-independent subproblems so that the subproblems can be solved in parallel. Second, we reduce the number of subproblems by a meet in the middle (MITM) technique, which computes the optimal path from both clockwise and counterclockwise directions. We show theoretical analysis on the impact of MITM in terms of the time and space complexities. In experiments, we compared the proposed method with a previous method running on a single-core CPU. Experimental results show that the proposed method on an 8-core CPU was 9.5-10.5 times faster than the previous method on a single-core CPU. Moreover, the proposed method on a graphics processing unit (GPU) was 30-40 times faster than that on an 8-core CPU. As a side effect, the proposed method reduced the memory usage by 48%.

  • Analysis and Investigation of Frame Invariance and Particle Behavior for Piecewise-Linear Particle Swarm Optimizer

    Tomoyuki SASAKI  Hidehiro NAKANO  

     
    PAPER-Nonlinear Problems

      Vol:
    E102-A No:12
      Page(s):
    1956-1967

    Particle swarm optimization (PSO) is a swarm intelligence algorithm and has good search performance and simplicity in implementation. Because of its properties, PSO has been applied to various optimization problems. However, the search performance of the classical PSO (CPSO) depends on reference frame of solution spaces for each objective function. CPSO is an invariant algorithm through translation and scale changes to reference frame of solution spaces but is a rotationally variant algorithm. As such, the search performance of CPSO is worse in solving rotated problems than in solving non-rotated problems. In the reference frame invariance, the search performance of an optimization algorithm is independent on rotation, translation, or scale changes to reference frame of solution spaces, which is a property of preferred optimization algorithms. In our previous study, piecewise-linear particle swarm optimizer (PPSO) has been proposed, which is effective in solving rotated problems. Because PPSO particles can move in solution spaces freely without depending on the coordinate systems, PPSO algorithm may have rotational invariance. However, theoretical analysis of reference frame invariance of PPSO has not been done. In addition, although behavior of each particle depends on PPSO parameters, good parameter conditions in solving various optimization problems have not been sufficiently clarified. In this paper, we analyze the reference frame invariance of PPSO theoretically, and investigated whether or not PPSO is invariant under reference frame alteration. We clarify that control parameters of PPSO which affect movement of each particle and performance of PPSO through numerical simulations.

  • Constructing Two Completely Independent Spanning Trees in Balanced Hypercubes

    Yi-Xian YANG  Kung-Jui PAI  Ruay-Shiung CHANG  Jou-Ming CHANG  

     
    LETTER-Fundamentals of Information Systems

      Pubricized:
    2019/06/17
      Vol:
    E102-D No:12
      Page(s):
    2409-2412

    A set of spanning trees of a graphs G are called completely independent spanning trees (CISTs for short) if for every pair of vertices x, y∈V(G), the paths joining x and y in any two trees have neither vertex nor edge in common, except x and y. Constructing CISTs has applications on interconnection networks such as fault-tolerant routing and secure message transmission. In this paper, we investigate the problem of constructing two CISTs in the balanced hypercube BHn, which is a hypercube-variant network and is superior to hypercube due to having a smaller diameter. As a result, the diameter of CISTs we constructed equals to 9 for BH2 and 6n-2 for BHn when n≥3.

  • Designing a High Performance SRAM-DRAM Hybrid Memory Architecture for Packet Buffers

    Yongwoon SONG  Dongkeon CHOI  Hyukjun LEE  

     
    BRIEF PAPER-Integrated Electronics

      Pubricized:
    2019/06/25
      Vol:
    E102-C No:12
      Page(s):
    849-852

    The performance of a network router/switch has improved significantly over past decades with explosively increasing internet and data center traffic. The performance of a router heavily depends on the memory system, e.g. DRAM based packet buffers, which often limits the scalability of a router. However, a widening gap between memory I/O bus and memory cell array speed and decreasing row buffer locality from increasing channels and banks severely reduce the performance gain from state-of-the-art memory technology such as DDR4 or HBM2 DRAM. Prior works improved memory bandwidth by maintaining SRAM-based per-queue or per-bank input/output buffers in the memory controller to support a DRAM-based packet buffer. The buffers temporarily store packets when bank conflicts occur but are unable to prevent interference-inducing traffic from thrashing DRAM's row buffers. In this study, we directly integrate SRAM into the DRAM-based packet buffer and map those packets degrading row buffer locality of DRAM into SRAM. This maximizes locality and parallelism of DRAM accesses. The proposed scheme can benefit any existing schemes. Experimental results show 22.41% improvement over the best existing scheme for a single channel in terms of the memory bandwidth utilization under harsh congested scenarios.

  • On the Bit Error Probability of OFDM and FBMC-OQAM Systems in Rayleigh and Rician Multipath Fading Channels Open Access

    Liming LI  Yang WANG  Liqin DING  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2019/06/17
      Vol:
    E102-B No:12
      Page(s):
    2276-2285

    Filter bank multicarrier with offset quadrature amplitude modulation (FBMC-OQAM) is considered an alternative to conventional orthogonal frequency division multiplexing (OFDM) to meet the various requirements proposed by future communication networks. Among the different perspectives on the merits of FBMC-OQAM and OFDM, a straightforward metric is the bit error probability (BEP). This paper presents a general analytical framework for BEP evaluation that is applicable to FBMC-OQAM and OFDM systems in both Rayleigh and Rician multipath fading channels. Explicit BEP expressions are derived for Gray-coded pulse amplitude modulation (PAM) and square quadrature amplitude modulation (QAM) signals with arbitrary constellation sizes. The theoretical analysis results show excellent agreement with the numerical simulation results in different channel scenarios.

  • SDChannelNets: Extremely Small and Efficient Convolutional Neural Networks

    JianNan ZHANG  JiJun ZHOU  JianFeng WU  ShengYing YANG  

     
    LETTER-Biocybernetics, Neurocomputing

      Pubricized:
    2019/09/10
      Vol:
    E102-D No:12
      Page(s):
    2646-2650

    Convolutional neural networks (CNNS) have a strong ability to understand and judge images. However, the enormous parameters and computation of CNNS have limited its application in resource-limited devices. In this letter, we used the idea of parameter sharing and dense connection to compress the parameters in the convolution kernel channel direction, thus greatly reducing the number of model parameters. On this basis, we designed Shared and Dense Channel-wise Convolutional Networks (SDChannelNets), mainly composed of Depth-wise Separable SD-Channel-wise Convolution layer. The advantage of SDChannelNets is that the number of model parameters is greatly reduced without or with little loss of accuracy. We also introduced a hyperparameter that can effectively balance the number of parameters and the accuracy of a model. We evaluated the model proposed by us through two popular image recognition tasks (CIFAR-10 and CIFAR-100). The results showed that SDChannelNets had similar accuracy to other CNNs, but the number of parameters was greatly reduced.

  • Hybrid QAM-Based Labels Generated by Two Multi-Level PSK Codes

    Takahiro KODAMA  Gabriella CINCOTTI  

     
    PAPER-Fiber-Optic Transmission for Communications

      Pubricized:
    2019/05/31
      Vol:
    E102-B No:12
      Page(s):
    2199-2204

    Hybrid 200Gchip/s QAM-based opto-electrical labels with high orthogonality are generated using the convolution of optical 16-level and electrical 4-level PSK codes. The combined simultaneous use of optical and electrical encoding increases system flexibility and code orthogonality, as well as code recognition performance. By performing 50 G-class low-speed LN-PM-based electrical processing on the 200 Gchip/s PSK-based optical code labels generated by a multiport optical encoder, the value of PCR indicating the code orthogonality is increased significantly, and the receiver sensitivity is improved by 0.5dB to achieve LER =10-9 in the next-generation optical packet switching networks.

  • Distributed Transmission for Secure Wireless Links Based on a Secret-Sharing Method

    Masaaki YAMANAKA  ShenCong WEI  Jingbo ZOU  Shuichi OHNO  Shinichi MIYAMOTO  Seiichi SAMPEI  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2019/06/17
      Vol:
    E102-B No:12
      Page(s):
    2286-2296

    This paper proposes a secure distributed transmission method that establishes multiple transmission routes in space to a destination. In the method, the transmitted information is divided into pieces of information by a secret-sharing method, and the generated pieces are separately transmitted to the destination through different transmission routes using individually-controlled antenna directivities. As the secret-sharing method can divide the transmitted information into pieces in such a manner that nothing about the original information is revealed unless all the divided pieces are obtained, the secrecy of the transmitted information is greatly improved from an information-theoretic basis. However, one problem is that it does not perform well in the vicinity around the receiver. This is due to the characteristics of distributed transmission that all distributed pieces of information must eventually gather at the destination; an eavesdropper can obtain the necessary pieces to reconstruct the original information. Then, this paper expands the distributed transmission method into a two-way communication scheme. By adopting the distributed transmission in both communication directions, a secure link can be provided as a feedback channel to enhance the secrecy of the transmitted information. The generation of the shared pieces of information is given with signal forms, and the secrecy of the proposed method is evaluated based on the signal transmission error rates as determined by computer simulation.

  • Sparse Time-Varying Complex AR (TV-CAR) Speech Analysis Based on Adaptive LASSO

    Keiichi FUNAKI  

     
    LETTER-Speech and Hearing

      Vol:
    E102-A No:12
      Page(s):
    1910-1914

    Linear Prediction (LP) analysis is commonly used in speech processing. LP is based on Auto-Regressive (AR) model and it estimates the AR model parameter from signals with l2-norm optimization. Recently, sparse estimation is paid attention since it can extract significant features from big data. The sparse estimation is realized by l1 or l0-norm optimization or regularization. Sparse LP analysis methods based on l1-norm optimization have been proposed. Since excitation of speech is not white Gaussian, a sparse LP estimation can estimate more accurate parameter than the conventional l2-norm based LP. These are time-invariant and real-valued analysis. We have been studied Time-Varying Complex AR (TV-CAR) analysis for an analytic signal and have evaluated the performance on speech processing. The TV-CAR methods are l2-norm methods. In this paper, we propose the sparse TV-CAR analysis based on adaptive LASSO (Least absolute shrinkage and selection operator) that is l1-norm regularization and evaluate the performance on F0 estimation of speech using IRAPT (Instantaneous RAPT). The experimental results show that the sparse TV-CAR methods perform better for a high level of additive Pink noise.

  • A Topology Control Strategy with Efficient Path for Predictable Delay-Tolerant Networks

    Dawei YAN  Cong LIU  Peng YOU  Shaowei YONG  Dongfang GUAN  Yu XING  

     
    PAPER-Fundamental Theories for Communications

      Pubricized:
    2019/06/25
      Vol:
    E102-B No:12
      Page(s):
    2183-2198

    In wireless networks, efficient topology improves the performance of network protocols. The previous research mainly focuses on how to construct a cost-efficient network structure from a static and connected topology. Due to lack of continuous connectivity in the underlying topology, most traditional topology control methods are not applicable to the delay or disruption tolerant networks (DTNs). In this paper, we consider the topology control problem in a predictable DTN where the dynamic topology is known a priori or can be predicted over time. First, this dynamic topology is modeled by a directed space-time graph that includes spatial and temporal information. Second, the topology control problem of the predictable DTN is formulated as building a sparse structure. For any pair devices, there is an efficient path connecting them to improve the efficiency of the generated structure. Then, a topology control strategy is proposed for this optimization problem by using a kth shortest paths algorithm. Finally, simulations are conducted on random networks and a real-world DTN tracing date. The results demonstrate that the proposed method can significantly improve the efficiency of the generated structure and reduce the total cost.

  • Adaptive-Partial Template Update with Center-Shifting Recovery for High Frame Rate and Ultra-Low Delay Deformation Matching

    Songlin DU  Yuhao XU  Tingting HU  Takeshi IKENAGA  

     
    PAPER-Image

      Vol:
    E102-A No:12
      Page(s):
    1872-1881

    High frame rate and ultra-low delay matching system plays an important role in various human-machine interactive applications, which demands better performance in matching deformable and out-of-plane rotating objects. Although many algorithms have been proposed for deformation tracking and matching, few of them are suitable for hardware implementation due to complicated operations and large time consumption. This paper proposes a hardware-oriented template update and recovery method for high frame rate and ultra-low delay deformation matching system. In the proposed method, the new template is generated in real time by partially updating the template descriptor and adding new keypoints simultaneously with the matching process in pixels (proposal #1), which avoids the large inter-frame delay. The size and shape of region of interest (ROI) are made flexible and the Hamming threshold used for brute-force matching is adjusted according to pixel position and the flexible ROI (proposal #2), which solves the problem of template drift. The template is recovered by the previous one with a relative center-shifting vector when it is judged as lost via region-wise difference check (proposal #3). Evaluation results indicate that the proposed method successfully achieves the real-time processing of 784fps at the resolution of 640×480 on field-programmable gate array (FPGA), with a delay of 0.808ms/frame, as well as achieves satisfactory deformation matching results in comparison with other general methods.

  • Representative Spatial Selection and Temporal Combination for 60fps Real-Time 3D Tracking of Twelve Volleyball Players on GPU

    Xina CHENG  Yiming ZHAO  Takeshi IKENAGA  

     
    PAPER-Image

      Vol:
    E102-A No:12
      Page(s):
    1882-1890

    Real-time 3D players tracking plays an important role in sports analysis, especially for the live services of sports broadcasting, which have a strict limitation on processing time. For these kinds of applications, 3D trajectories of players contribute to high-level game analysis such as tactic analysis and commercial applications such as TV contents. Thus real-time implementation for 3D players tracking is expected. In order to achieve real-time for 60fps videos with high accuracy, (that means the processing time should be less than 16.67ms per frame), the factors that limit the processing time of target algorithm include: 1) Large image area of each player. 2) Repeated processing of multiple players in multiple views. 3) Complex calculation of observation algorithm. To deal with the above challenges, this paper proposes a representative spatial selection and temporal combination based real-time implementation for multi-view volleyball players tracking on the GPU device. First, the representative spatial pixel selection, which detects the pixels that mostly represent one image region to scale down the image spatially, reduces the number of processing pixels. Second, the representative temporal likelihood combination shares observation calculation by using the temporal correlation between images so that the times of complex calculation is reduced. The experiments are based on videos of the Final and Semi-Final Game of 2014 Japan Inter High School Games of Men's Volleyball in Tokyo Metropolitan Gymnasium. On the GPU device GeForce GTX 1080Ti, the tracking system achieves real-time on 60fps videos and keeps the tracking accuracy higher than 97%.

  • Acoustic Design Support System of Compact Enclosure for Smartphone Using Deep Neural Network

    Kai NAKAMURA  Kenta IWAI  Yoshinobu KAJIKAWA  

     
    PAPER-Engineering Acoustics

      Vol:
    E102-A No:12
      Page(s):
    1932-1939

    In this paper, we propose an automatic design support system for compact acoustic devices such as microspeakers inside smartphones. The proposed design support system outputs the dimensions of compact acoustic devices with the desired acoustic characteristic. This system uses a deep neural network (DNN) to obtain the relationship between the frequency characteristic of the compact acoustic device and its dimensions. The training data are generated by the acoustic finite-difference time-domain (FDTD) method so that many training data can be easily obtained. We demonstrate the effectiveness of the proposed system through some comparisons between desired and designed frequency characteristics.

  • An Image Fusion Scheme for Single-Shot High Dynamic Range Imaging with Spatially Varying Exposures

    Chihiro GO  Yuma KINOSHITA  Sayaka SHIOTA  Hitoshi KIYA  

     
    PAPER-Image

      Vol:
    E102-A No:12
      Page(s):
    1856-1864

    This paper proposes a novel multi-exposure image fusion (MEF) scheme for single-shot high dynamic range imaging with spatially varying exposures (SVE). Single-shot imaging with SVE enables us not only to produce images without color saturation regions from a single-shot image, but also to avoid ghost artifacts in the producing ones. However, the number of exposures is generally limited to two, and moreover it is difficult to decide the optimum exposure values before the photographing. In the proposed scheme, a scene segmentation method is applied to input multi-exposure images, and then the luminance of the input images is adjusted according to both of the number of scenes and the relationship between exposure values and pixel values. The proposed method with the luminance adjustment allows us to improve the above two issues. In this paper, we focus on dual-ISO imaging as one of single-shot imaging. In an experiment, the proposed scheme is demonstrated to be effective for single-shot high dynamic range imaging with SVE, compared with conventional MEF schemes with exposure compensation.

  • Energy Minimization over m-Branched Enumeration for Generalized Linear Subspace Clustering Open Access

    Chao ZHANG  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2019/09/19
      Vol:
    E102-D No:12
      Page(s):
    2485-2492

    In this paper, we consider the clustering problem of independent general subspaces. That is, with given data points lay near or on the union of independent low-dimensional linear subspaces, we aim to recover the subspaces and assign the corresponding label to each data point. To settle this problem, we take advantages of both greedy strategy and energy minimization strategy to propose a simple yet effective algorithm based on the assumption that an m-branched (i.e., perfect m-ary) tree which is constructed by collecting m-nearest neighbor points in each node has a high probability of containing the near-exact subspace. Specifically, at first, subspace candidates are enumerated by multiple m-branched trees. Each tree starts with a data point and grows by collecting nearest neighbors in the breadth-first search order. Then, subspace proposals are further selected from the enumeration to initialize the energy minimization algorithm. Eventually, both the proposals and the labeling result are finalized by iterative re-estimation and labeling. Experiments with both synthetic and real-world data show that the proposed method can outperform state-of-the-art methods and is practical in real application.

  • Hardware-Aware Sum-Product Decoding in the Decision Domain Open Access

    Mizuki YAMADA  Keigo TAKEUCHI  Kiyoyuki KOIKE  

     
    PAPER-Coding Theory

      Vol:
    E102-A No:12
      Page(s):
    1980-1987

    We propose hardware-aware sum-product (SP) decoding for low-density parity-check codes. To simplify an implementation using a fixed-point number representation, we transform SP decoding in the logarithm domain to that in the decision domain. A polynomial approximation is proposed to implement an update rule of the proposed SP decoding efficiently. Numerical simulations show that the approximate SP decoding achieves almost the same performance as the exact SP decoding when an appropriate degree in the polynomial approximation is used, that it improves the convergence properties of SP and normalized min-sum decoding in the high signal-to-noise ratio regime, and that it is robust against quantization errors.

  • Skew-Aware Collective Communication for MapReduce Shuffling

    Harunobu DAIKOKU  Hideyuki KAWASHIMA  Osamu TATEBE  

     
    PAPER-Computer System

      Pubricized:
    2019/07/29
      Vol:
    E102-D No:12
      Page(s):
    2389-2399

    This paper proposes and examines the three in-memory shuffling methods designed to address problems in MapReduce shuffling caused by skewed data. Coupled Shuffle Architecture (CSA) employs a single pairwise all-to-all exchange to shuffle both blocks, units of shuffle transfer, and meta-blocks, which contain the metadata of corresponding blocks. Decoupled Shuffle Architecture (DSA) separates the shuffling of meta-blocks and blocks, and applies different all-to-all exchange algorithms to each shuffling process, attempting to mitigate the impact of stragglers in strongly skewed distributions. Decoupled Shuffle Architecture with Skew-Aware Meta-Shuffle (DSA w/ SMS) autonomously determines the proper placement of blocks based on the memory consumption of each worker process. This approach targets extremely skewed situations where some worker processes could exceed their node memory limitation. This study evaluates implementations of the three shuffling methods in our prototype in-memory MapReduce engine, which employs high performance interconnects such as InfiniBand and Intel Omni-Path. Our results suggest that DSA w/ SMS is the only viable solution for extremely skewed data distributions. We also present a detailed investigation of the performance of CSA and DSA in various skew situations.

  • On the Minimum Distance of Some Improper Array Codes

    Haiyang LIU  Lianrong MA  Hao ZHANG  

     
    LETTER-Coding Theory

      Vol:
    E102-A No:12
      Page(s):
    2021-2026

    For an odd prime q and an integer m≤q, we can construct a regular quasi-cyclic parity-check matrix HI(m,q) that specifies a linear block code CI(m,q), called an improper array code. In this letter, we prove the minimum distance of CI(4,q) is equal to 10 for any q≥11. In addition, we prove the minimum distance of CI(5,q) is upper bounded by 12 for any q≥11 and conjecture the upper bound is tight.

  • An Efficient Blacklistable Anonymous Credentials without TTP of Tracing Authority Using Pairing-Based Accumulator

    Yuu AIKOU  Shahidatul SADIAH  Toru NAKANISHI  

     
    PAPER-Cryptography and Information Security

      Vol:
    E102-A No:12
      Page(s):
    1968-1979

    In conventional ID-based user authentications, privacy issues may occur, since users' behavior histories are collected in Service Providers (SPs). Although anonymous authentications such as group signatures have been proposed, these schemes rely on a Trusted Third Party (TTP) capable of tracing misbehaving users. Thus, the privacy is not high, because the TTP of tracing authority can always trace users. Therefore, the anonymous credential system using a blacklist without the TTP of tracing authority has been proposed, where blacklisted anonymous users can be blocked. Recently, an RSA-based blacklistable anonymous credential system with efficiency improvement has been proposed. However, this system still has an efficiency problem: The data size in the authentication is O(K'), where K' is the maximum number of sessions in which the user can conduct. Furthermore, the O(K')-size data causes the user the computational cost of O(K') exponentiations. In this paper, a blacklistable anonymous credential system using a pairing-based accumulator is proposed. In the proposed system, the data size in the authentication is constant for parameters. Although the user's computational cost depends on parameters, the dependent cost is O(δBL·K) multiplications, instead of exponentiations, where δBL is the number of sessions added to the blacklist after the last authentication of the user, and K is the number of past sessions of the user. The demerit of the proposed system is O(n)-size public key, where n corresponds to the total number of all sessions of all users in the system. But, the user only has to download the public key once.

  • Accelerating the Smith-Waterman Algorithm Using the Bitwise Parallel Bulk Computation Technique on the GPU

    Takahiro NISHIMURA  Jacir Luiz BORDIM  Yasuaki ITO  Koji NAKANO  

     
    PAPER-Fundamentals of Information Systems

      Pubricized:
    2019/07/09
      Vol:
    E102-D No:12
      Page(s):
    2400-2408

    The bulk execution of a sequential algorithm is to execute it for many different inputs in turn or at the same time. It is known that the bulk execution of an oblivious sequential algorithm can be implemented to run efficiently on a GPU. The bulk execution supports fine grained bitwise parallelism, allowing it to achieve high acceleration over a straightforward sequential computation. The main contribution of this work is to present a Bitwise Parallel Bulk Computation (BPBC) to accelerate the Smith-Waterman Algorithm (SWA) using the affine gap penalty. Thus, our idea is to convert this computation into a circuit simulation using the BPBC technique to compute multiple instances simultaneously. The proposed BPBC technique for the SWA has been implemented on the GPU and CPU. Experimental results show that the proposed BPBC for the SWA accelerates the computation by over 646 times as compared to a single CPU implementation and by 6.9 times as compared to a multi-core CPU implementation with 160 threads.

641-660hit(8249hit)