The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] PU(3318hit)

901-920hit(3318hit)

  • VAWS: Constructing Trusted Open Computing System of MapReduce with Verified Participants Open Access

    Yan DING  Huaimin WANG  Lifeng WEI  Songzheng CHEN  Hongyi FU  Xinhai XU  

     
    PAPER

      Vol:
    E97-D No:4
      Page(s):
    721-732

    MapReduce is commonly used as a parallel massive data processing model. When deploying it as a service over the open systems, the computational integrity of the participants is becoming an important issue due to the untrustworthy workers. Current duplication-based solutions can effectively solve non-collusive attacks, yet most of them require a centralized worker to re-compute additional sampled tasks to defend collusive attacks, which makes the worker a bottleneck. In this paper, we try to explore a trusted worker scheduling framework, named VAWS, to detect collusive attackers and assure the integrity of data processing without extra re-computation. Based on the historical results of verification, we construct an Integrity Attestation Graph (IAG) in VAWS to identify malicious mappers and remove them from the framework. To further improve the efficiency of identification, a verification-couple selection method with the IAG guidance is introduced to detect the potential accomplices of the confirmed malicious worker. We have proven the effectiveness of our proposed method on the improvement of system performance in theoretical analysis. Intensive experiments show the accuracy of VAWS is over 97% and the overhead of computation is closed to the ideal value of 2 with the increasing of the number of map tasks in our scheme.

  • ILP-Based Bitwidth-Aware Subexpression Sharing for Area Minimization in Multiple Constant Multiplication

    Bu-Ching LIN  Juinn-Dar HUANG  Jing-Yang JOU  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E97-A No:4
      Page(s):
    931-939

    The notion of multiple constant multiplication (MCM) is extensively adopted in digital signal processing (DSP) applications such as finite impulse filter (FIR) designs. A set of adders is utilized to replace regular multipliers for the multiplications between input data and constant filter coefficients. Though many algorithms have been proposed to reduce the total number of adders in an MCM block for area minimization, they do not consider the actual bitwidth of each adder, which may not estimate the hardware cost well enough. Therefore, in this article we propose a bitwidth-aware MCM optimization algorithm that focuses on minimizing the total number of adder bits rather than the adder count. It first builds a subexpression graph based on the given coefficients, derives a set of constraints for adder bitwidth minimization, and then optimally solves the problem through integer linear programming (ILP). Experimental results show that the proposed algorithm can effectively reduce the required adder bit count and outperforms the existing state-of-the-art techniques.

  • Message Passing Decoder with Decoding on Zigzag Cycles for Non-binary LDPC Codes

    Takayuki NOZAKI  Kenta KASAI  Kohichi SAKANIWA  

     
    PAPER-Coding Theory

      Vol:
    E97-A No:4
      Page(s):
    975-984

    In this paper, we propose a message passing decoding algorithm which lowers decoding error rates in the error floor regions for non-binary low-density parity-check (LDPC) codes transmitted over the binary erasure channel (BEC) and the memoryless binary-input output-symmetric (MBIOS) channels. In the case for the BEC, this decoding algorithm is a combination with belief propagation (BP) decoding and maximum a posteriori (MAP) decoding on zigzag cycles, which cause decoding errors in the error floor region. We show that MAP decoding on the zigzag cycles is realized by means of a message passing algorithm. Moreover, we extend this decoding algorithm to the MBIOS channels. Simulation results demonstrate that the decoding error rates in the error floor regions by the proposed decoding algorithm are lower than those by the BP decoder.

  • Probabilistic Frequent Itemset Mining on a GPU Cluster Open Access

    Yusuke KOZAWA  Toshiyuki AMAGASA  Hiroyuki KITAGAWA  

     
    PAPER

      Vol:
    E97-D No:4
      Page(s):
    779-789

    Probabilistic frequent itemset mining, which discovers frequent itemsets from uncertain data, has attracted much attention due to inherent uncertainty in the real world. Many algorithms have been proposed to tackle this problem, but their performance is not satisfactory because handling uncertainty incurs high processing cost. To accelerate such computation, we utilize GPUs (Graphics Processing Units). Our previous work accelerated an existing algorithm with a single GPU. In this paper, we extend the work to employ multiple GPUs. Proposed methods minimize the amount of data that need to be communicated among GPUs, and achieve load balancing as well. Based on the methods, we also present algorithms on a GPU cluster. Experiments show that the single-node methods realize near-linear speedups, and the methods on a GPU cluster of eight nodes achieve up to a 7.1 times speedup.

  • An Intelligent Fighting Videogame Opponent Adapting to Behavior Patterns of the User

    Koichi MORIYAMA  Simón Enrique ORTIZ BRANCO  Mitsuhiro MATSUMOTO  Ken-ichi FUKUI  Satoshi KURIHARA  Masayuki NUMAO  

     
    PAPER-Information Network

      Vol:
    E97-D No:4
      Page(s):
    842-851

    In standard fighting videogames, users usually prefer playing against other users rather than against machines because opponents controlled by machines are in a rut and users can memorize their behaviors after repetitive plays. On the other hand, human players adapt to each other's behaviors, which makes fighting videogames interesting. Thus, in this paper, we propose an artificial agent for a fighting videogame that can adapt to its users, allowing users to enjoy the game even when playing alone. In particular, this work focuses on combination attacks, or combos, that give great damage to the opponent. The agent treats combos independently, i.e., it is composed of a subagent for predicting combos the user executes, that for choosing combos the agent executes, and that for controlling the whole agent. Human users evaluated the agent compared to static opponents, and the agent received minimal negative ratings.

  • Large-Scale Integrated Circuit Design Based on a Nb Nine-Layer Structure for Reconfigurable Data-Path Processors Open Access

    Akira FUJIMAKI  Masamitsu TANAKA  Ryo KASAGI  Katsumi TAKAGI  Masakazu OKADA  Yuhi HAYAKAWA  Kensuke TAKATA  Hiroyuki AKAIKE  Nobuyuki YOSHIKAWA  Shuichi NAGASAWA  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    INVITED PAPER

      Vol:
    E97-C No:3
      Page(s):
    157-165

    We describe a large-scale integrated circuit (LSI) design of rapid single-flux-quantum (RSFQ) circuits and demonstrate several reconfigurable data-path (RDP) processor prototypes based on the ISTEC Advanced Process (ADP2). The ADP2 LSIs are made up of nine Nb layers and Nb/AlOx/Nb Josephson junctions with a critical current density of 10kA/cm2, allowing higher operating frequencies and integration. To realize truly large-scale RSFQ circuits, careful design is necessary, with several compromises in the device structure, logic gates, and interconnects, balancing the competing demands of integration density, design flexibility, and fabrication yield. We summarize numerical and experimental results related to the development of a cell-based design in the ADP2, which features a unit cell size reduced to 30-µm square and up to four strip line tracks in the unit cell underneath the logic gates. The ADP LSIs can achieve ∼10 times the device density and double the operating frequency with the same power consumption per junction as conventional LSIs fabricated using the Nb four-layer process. We report the design and test results of RDP processor prototypes using the ADP2 cell library. The RDP processors are composed of many arrays of floating-point units (FPUs) and switch networks, and serve as accelerators in a high-performance computing system. The prototypes are composed of two-dimensional arrays of several arithmetic logic units instead of FPUs. The experimental results include a successful demonstration of full operation and reconfiguration in a 2×2 RDP prototype made up of 11.5k junctions at 45GHz after precise timing design. Partial operation of a 4×4 RDP prototype made up of 28.5k-junctions is also demonstrated, indicating the scalability of our timing design.

  • Effective Frame Selection for Blind Source Separation Based on Frequency Domain Independent Component Analysis

    Yusuke MIZUNO  Kazunobu KONDO  Takanori NISHINO  Norihide KITAOKA  Kazuya TAKEDA  

     
    PAPER-Engineering Acoustics

      Vol:
    E97-A No:3
      Page(s):
    784-791

    Blind source separation is a technique that can separate sound sources without such information as source location, the number of sources, and the utterance content. Multi-channel source separation using many microphones separates signals with high accuracy, even if there are many sources. However, these methods have extremely high computational complexity, which must be reduced. In this paper, we propose a computational complexity reduction method for blind source separation based on frequency domain independent component analysis (FDICA) and examine temporal data that are effective for source separation. A frame with many sound sources is effective for FDICA source separation. We assume that a frame with a low kurtosis has many sound sources and preferentially select such frames. In our proposed method, we used the log power spectrum and the kurtosis of the magnitude distribution of the observed data as selection criteria and conducted source separation experiments using speech signals from twelve speakers. We evaluated the separation performances by the signal-to-interference ratio (SIR) improvement score. From our results, the SIR improvement score was 24.3dB when all the frames were used, and 23.3dB when the 300 frames selected by our criteria were used. These results clarified that our proposed selection criteria based on kurtosis and magnitude is effective. Furthermore, we significantly reduced the computational complexity because it is proportional to the number of selected frames.

  • Observer Design for Feedforward Nonlinear Systems with Delayed Output

    Sungryul LEE  

     
    LETTER-Systems and Control

      Vol:
    E97-A No:3
      Page(s):
    869-872

    This paper proposes the state observer design for feedforward nonlinear systems with delayed output. It is shown that by using the Lyapunov-Krasovskii functional, the proposed design method ensures the asymptotic stability of estimation error for an arbitrarily large output delay. Finally, an illustrative example is given in order to show the effectiveness of our design method.

  • A New Evolutionary Approach to Recommender Systems

    Hyun-Tae KIM  Jinung AN  Chang Wook AHN  

     
    LETTER-Biocybernetics, Neurocomputing

      Vol:
    E97-D No:3
      Page(s):
    622-625

    In this paper, a new evolutionary approach to recommender systems is presented. The aim of this work is to develop a new recommendation method that effectively adapts and immediately responds to the user's preference. To this end, content-based filtering is judiciously utilized in conjunction with interactive evolutionary computation (IEC). Specifically, a fitness-based truncation selection and a feature-wise crossover are devised to make full use of desirable properties of promising items within the IEC framework. Moreover, to efficiently search for proper items, the content-based filtering is modified in cooperation with data grouping. The experimental results demonstrate the effectiveness of the proposed approach, compared with existing methods.

  • Method for Reduction of Field Computation Time for Discrete Ray Tracing Method

    Masafumi TAKEMATSU  Junichi HONDA  Yuki KIMURA  Kazunori UCHIDA  

     
    PAPER-Electromagnetic Theory

      Vol:
    E97-C No:3
      Page(s):
    198-206

    This paper is concerned with a method to reduce the computation time of the Discrete Ray Tracing Method (DRTM) which was proposed to numerically analyze electromagnetic fields above Random Rough Surfaces (RRSs). The essence of DRTM is firstly to search rays between source and receiver and secondly to compute electric fields based on the traced rays. In the DRTM, the method discretizes not only RRSs but also ray tracing procedure. In order to reduce computation time for ray searching, the authors propose to modify the conventional algorithm discretizing RRSs with equal intervals to a new one which discretizes them with unequal intervals according to their profiles. The authors also use an approximation of Fresnel function which enables us to reduce field computation time. The authors discuss the reduction rate for computation time of the DRTM from the numerical view points of ray searching and field computation. Finally, this paper shows how much computation time is reduced by the new method.

  • A Robust Multimode Transmission Strategy for PU2RC with Quantized CQI Using Hierarchical Codebook

    Lei LV  Zhongpei ZHANG  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E97-B No:3
      Page(s):
    638-646

    Per-User Unitary Rate Control (PU2RC) performs poorly when the number of users is small and suffers from the sum-rate ceiling effect in the high signal-to-noise ratio (SNR) regime. In this paper, we propose a multimode transmission (MMT) strategy to overcome these inherent shortcomings of PU2RC. In the proposed MMT strategy, the transmitter finds out the optimal transmission mode and schedules users using each user's instantaneous channel quality information (CQI) parameters. First we assume that each user's CQI parameters are perfectly reported in order to introduce the proposed MMT strategy. Then we consider the quantization of CQI parameters using codebooks designed by the Lloyd algorithm. Moreover, we modify the CQI parameters to improve the system's robustness against quantization error. Finally, in order to reduce the quantization error, we design a hierarchical codebook to jointly quantize the modified CQI parameters by considering the correlation between them. Simulation results show that the proposed MMT strategy effectively overcomes the shortcomings of PU2RC and is robust against low quantization level of CQI parameters.

  • The Impact of Opportunistic User Scheduling on Outage Probability of CR-MIMO Systems

    Donghun LEE  Byung Jang JEONG  

     
    PAPER-Wireless Communication Technologies

      Vol:
    E97-B No:3
      Page(s):
    686-690

    In this paper, we study the impact of opportunistic user scheduling on the outage probability of cognitive radio (CR) multiple-input multiple-output (MIMO) systems in the high power region where the peak transmit power constraint is higher than the peak interference constraint. The primary contributions of this paper are the derivation of exact closed-form expressions of the proposed scheduled CR-MIMO systems for outage probability and asymptotic analysis to quantify the diversity order and signal to noise ratio (SNR) gain. Through exact analytical results, we provide the achievable outage probability of the proposed scheduled systems as a function of SNR. Also, through asymptotic analysis, we show that the scheduled CR-MIMO systems provide some diversity order gain over the non-scheduled CR-MIMO systems which comes from multi-user diversity (MUD). Also, the SNR gain of the proposed scheduled systems is identical to that of the non-scheduled CR-MIMO systems.

  • Asynchronous Memory Machine Models with Barrier Synchronization

    Koji NAKANO  

     
    PAPER-Parallel and Distributed Computing

      Vol:
    E97-D No:3
      Page(s):
    431-441

    The Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM) are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. It is assumed that warps (or groups of threads) on the DMM and the UMM work synchronously in a round-robin manner. However, warps work asynchronously in real GPUs, in the sense that they are randomly (or arbitrarily) dispatched for execution. The first contribution of this paper is to introduce asynchronous versions of these models in which warps are arbitrarily dispatched. In addition, we assume that threads can execute the “syncthreads” instruction for barrier synchronization. Since the barrier synchronization operation may be costly, we should evaluate and minimize the number of barrier synchronization operations executed by parallel algorithms. The second contribution of this paper is to show a parallel algorithm to the sum of n numbers in optimal computing time and few barrier synchronization steps. Our parallel algorithm computes the sum of n numbers in O(n/w+llog n) time units and O(log l/log w+log log w) barrier synchronization steps using wl threads on the asynchronous UMM with width w and latency l. Since the computation of the sum takes at least Ω(n/w+llog n) time units, this algorithm is time optimal. Finally, we show that the prefix-sums of n numbers can also be computed in O(n/w+llog n) time units and O(log l/log w+log log w) barrier synchronization steps using wl threads.

  • A Formulation of Composition for Cellular Automata on Groups

    Shuichi INOKUCHI  Takahiro ITO  Mitsuhiko FUJIO  Yoshihiro MIZOGUCHI  

     
    PAPER-Cellular Automata

      Vol:
    E97-D No:3
      Page(s):
    448-454

    We introduce the notion of 'Composition', 'Union' and 'Division' of cellular automata on groups. A kind of notions of compositions was investigated by Sato [10] and Manzini [6] for linear cellular automata, we extend the notion to general cellular automata on groups and investigated their properties. We observe the all unions and compositions generated by one-dimensional 2-neighborhood cellular automata over Z2 including non-linear cellular automata. Next we prove that the composition is right-distributive over union, but is not left-distributive. Finally, we conclude by showing reformulation of our definition of cellular automata on group which admit more than three states. We also show our formulation contains the representation using formal power series for linear cellular automata in Manzini [6].

  • Efficient Update Activation for Virtual Machines in IaaS Cloud Computing Environments

    Hiroshi YAMADA  Shuntaro TONOSAKI  Kenji KONO  

     
    PAPER-Software System

      Vol:
    E97-D No:3
      Page(s):
    469-479

    Infrastructure as a Service (IaaS), a form of cloud computing, is gaining attention for its ability to enable efficient server administration in dynamic workload environments. In such environments, however, updating the software stack or content files of virtual machines (VMs) is a time-consuming task, discouraging administrators from frequently enhancing their services and fixing security holes. This is because the administrator has to upload the whole new disk image to the cloud platform via the Internet, which is not yet fast enough that large amounts of data can be transferred smoothly. Although the administrator can apply incremental updates directly to the running VMs, he or she has to carefully consider the type of update and perform operations on all running VMs, such as application restarts. This is a tedious and error-prone task. This paper presents a technique for synchronizing VMs with less time and lower administrative burden. We introduce the Virtual Disk Image Repository, which runs on the cloud platform and automatically updates the virtual disk image and the running VMs with only the incremental update information. We also show a mechanism that performs necessary operations on the running VM such as restarting server processes, based on the types of files that are updated. We implement a prototype on Linux 2.6.31.14 and Amazon Elastic Compute Cloud. An experiment shows that our technique can synchronize VMs in an order-of-magnitude shorter time than the conventional disk-image-based VM method. Also, we discuss limitations of our technique and some directions for more efficient VM updates.

  • Performance Evaluation on RSSI-Based Wireless Capsule Endoscope Location Tracking with Particle Filter

    Takahiro ITO  Daisuke ANZAI  Jianqing WANG  

     
    PAPER

      Vol:
    E97-B No:3
      Page(s):
    579-586

    Tracking capsule endoscope location is one of the promising applications offered by implant body area networks (BANs). When tracking the capsule endoscope location, i.e., continuously localize it, it is effective to take the weighted sum of its past locations to its present location, in other words, to low-pass filter its past locations. Furthermore, creating an exact mathematical model of location transition will improve tracking performance. Therefore, in this paper, we investigate two tracking methods with received signal strength indicator (RSSI)-based localization in order to solve the capsule endoscope location tracking problem. One of the two tracking methods is finite impulse response (FIR) filter-based tracking, which tracks the capsule endoscope location by averaging its past locations. The other one is particle filter-based tracking in order to deal with a nonlinear transition model on the capsule endoscope. However, the particle filter requires that the particle weight is calculated according to its condition (namely, its likelihood value), while the transition model on capsule endoscope location has some model parameters which cannot be estimated from the received wireless signal. Therefore, for the purpose of applying the particle filter to capsule endoscope tracking, this paper makes some modifications in the resampling step of the particle filter algorithm. Our computer simulation results demonstrate that the two tracking methods can improve the performance as compared with the conventional maximum likelihood (ML) localization. Furthermore, we confirm that the particle filter-based tracking outperforms the conventional FIR filter-based tracking by taking the realistic capsule endoscope transition model into consideration.

  • Stabilization Technique for Region-of-Interest Trajectories Made from Video Watching Manipulations

    Daisuke OCHI  Hideaki KIMATA  Yoshinori KUSACHI  Kosuke TAKAHASHI  Akira KOJIMA  

     
    PAPER-Human-computer Interaction

      Vol:
    E97-D No:2
      Page(s):
    266-274

    Due to the recent progress made in camera and network environments, on-line video services enable people around the world to watch or share high-quality HD videos that can record a wider angle without losing objects' details in each image. As a result, users of these services can watch videos in different ways with different ROIs (Regions of Interest), especially when there are multiple objects in a scene, and thus there are few common ways for them to transfer their impressions for each scene directly. Posting messages is currently the usual way but it does not sufficiently enable all users to transfer their impressions. To transfer a user's impressions directly and provide users with a richer video watching experience, we propose a system that enables them to extract their favorite parts of videos as ROI trajectories through simple and intuitive manipulation of their tablet device. It also enables them to share a recorded trajectory with others after stabilizing it in a manner that should be satisfactory to every user. Using statistical analysis of user manipulations, we have demonstrated an approach to trajectory stabilization that can eliminate undesirable or uncomfortable elements due to tablet-specific manipulations. The system's validity has been confirmed by subjective evaluations.

  • Optimizing Virtual Machine Migration for Energy-Efficient Clouds

    Marat ZHANIKEEV  

     
    PAPER-Network

      Vol:
    E97-B No:2
      Page(s):
    450-458

    This paper proposes a new optimization problem and several implementation algorithms for energy-efficient clouds where energy efficiency is measured by the number of physical machines that can be removed from operation and turned off. The optimization problem is formulated is such a way that solutions are considered favorable not only when the number of migrations is minimized but also when the resulting layout has more free physical machines which can therefore be turned off to save electricity.

  • Pose-Free Face Swapping Based on a Deformable 3D Shape Morphable Model

    Yuan LIN  Shengjin WANG  

     
    PAPER-Computer Graphics

      Vol:
    E97-D No:2
      Page(s):
    305-314

    Traditional face swapping technologies require that the faces of source images and target images have similar pose and appearance (usually frontal). For overcoming this limit in applications this paper presents a pose-free face swapping method based on personalized 3D face modeling. By using a deformable 3D shape morphable model, a photo-realistic 3D face is reconstructed from a single frontal view image. With the aid of the generated 3D face, a virtual source image of the person with the same pose as the target face can be rendered, which is used as a source image for face swapping. To solve the problem of illumination difference between the target face and the source face, a color transfer merging method is proposed. It outperforms the original color transfer method in dealing with the illumination gap problem. An experiment shows that the proposed face reconstruction method is fast and efficient. In addition, we have conducted experiments of face swapping in a variety of scenarios such as children's story book, role play, and face de-identification stripping facial information used for identification, and promising results have been obtained.

  • Security Analysis of Collusion-Resistant Nearest Neighbor Query Scheme on Encrypted Cloud Data

    Youwen ZHU  Tsuyoshi TAKAGI  Rong HU  

     
    LETTER-Data Engineering, Web Information Systems

      Vol:
    E97-D No:2
      Page(s):
    326-330

    Recently, Yuan et al. (IEEE Infocom'13, pp.2652-2660) proposed an efficient secure nearest neighbor (SNN) search scheme on encrypted cloud database. Their scheme is claimed to be secure against the collusion attack of query clients and cloud server, because the colluding attackers cannot infer the encryption/decryption key. In this letter, we observe that the encrypted dataset in Yuan's scheme can be broken by the collusion attack without deducing the key, and present a simple but powerful attack to their scheme. Experiment results validate the high efficiency of our attacking approach. Additionally, we also indicate an upper bound of collusion-resistant ability of any accurate SNN query scheme.

901-920hit(3318hit)