
Keyword Search Result

[Keyword] ACH (1072 hits)

Showing results 421-440 of 1072

  • EPWM-OFDM Signal Transmission against Nonlinearities of E/O Converters in Radio over Fiber Channel

    Xiaoxue YU  Yasushi YAMAO  Motoharu MATSUURA  

     
    PAPER-Wireless Communication Technologies

    Vol: E97-B No:2, Page(s): 484-494

    Radio over Fiber (RoF) is a promising technology that is suitable for broadband wireless access systems covering in-building areas and outdoor dead spots. However, one issue that should be considered in RoF transmission is the nonlinear distortion caused by Electrical/Optical (E/O) converters. Multicarrier RF (Radio Frequency) signal formats such as Orthogonal Frequency Division Multiplexing (OFDM), which are commonly employed in broadband wireless communications, are vulnerable to nonlinearities. To enable linear transmission of OFDM signals over the RoF channel, we propose to employ the Envelope Pulse Width Modulation (EPWM) transmission scheme. Two commonly used E/O converters, a Mach-Zehnder modulator and a directly modulated Distributed Feedback Laser Diode (DFB LD), are employed to validate the proposal. The E/O converters are mathematically modeled from their measured nonlinearities, and their transmission performance is analyzed. A modified Rapp model is developed for modeling the DFB LD. Through simulations and experiments, the proposed scheme is shown to be effective in dealing with the nonlinearities of the E/O converters.
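    For reference, the sketch below shows the standard Rapp AM/AM model that is commonly used for this kind of saturating nonlinearity. The paper's modified Rapp model for the DFB LD is not reproduced here; the functional form shown and all parameter values are assumptions for illustration only.

      # Standard Rapp AM/AM nonlinearity, often used to model saturating E/O
      # converters and amplifiers; parameter values are illustrative assumptions.
      import numpy as np

      def rapp_am_am(a_in, gain=1.0, a_sat=1.0, smoothness=2.0):
          """Output amplitude of the Rapp model for input amplitude a_in."""
          return gain * a_in / (1.0 + (gain * a_in / a_sat) ** (2 * smoothness)) ** (1.0 / (2 * smoothness))

      # Apply the nonlinearity to a toy OFDM-like baseband signal (QPSK subcarriers).
      rng = np.random.default_rng(0)
      symbols = (rng.choice([-1, 1], 256) + 1j * rng.choice([-1, 1], 256)) / np.sqrt(2)
      ofdm = np.fft.ifft(symbols) * np.sqrt(len(symbols))
      distorted = rapp_am_am(np.abs(ofdm)) * np.exp(1j * np.angle(ofdm))
      papr_db = 10 * np.log10(np.max(np.abs(ofdm) ** 2) / np.mean(np.abs(ofdm) ** 2))
      print("PAPR of the OFDM signal (dB):", round(papr_db, 2))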

  • Polynomial Time Verification of Reachability in Sound Extended Free-Choice Workflow Nets

    Shingo YAMAGUCHI  

     
    PAPER

    Vol: E97-A No:2, Page(s): 468-475

    Workflow nets are a standard way of modeling and analyzing workflows. A workflow has two aspects: its definition and its instances. In the form of workflow nets, a workflow definition and a workflow instance are represented as a net structure and a marking, respectively. The correctness of a workflow definition can be checked by using a property of workflow nets called soundness, while the correctness of a workflow instance can be checked by using a well-known property of Petri nets called reachability. The reachability problem is known to be intractable in general. In this paper, we show that the reachability problem for (i) sound extended free-choice workflow nets with a marking representing one workflow instance or (ii) acyclic well-structured workflow nets with a marking representing one or more workflow instances can be solved in polynomial time.
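    As a point of reference, the sketch below checks marking reachability in a small Petri net by brute-force breadth-first search over the reachability graph. This is the naive, generally intractable baseline that motivates the paper; it is not the polynomial-time verification method proposed for sound extended free-choice workflow nets, and the toy net is hypothetical.

      # Naive marking reachability in a Petri net by BFS over the reachability
      # graph -- the exponential baseline, not the paper's polynomial-time method.
      from collections import deque

      def fire(marking, pre, post):
          """Fire a transition if enabled; return the new marking or None."""
          if all(marking.get(p, 0) >= n for p, n in pre.items()):
              new = dict(marking)
              for p, n in pre.items():
                  new[p] -= n
              for p, n in post.items():
                  new[p] = new.get(p, 0) + n
              return {p: n for p, n in new.items() if n > 0}
          return None

      def reachable(initial, target, transitions):
          seen = {frozenset(initial.items())}
          queue = deque([initial])
          while queue:
              m = queue.popleft()
              if m == target:
                  return True
              for pre, post in transitions:
                  nxt = fire(m, pre, post)
                  if nxt is not None and frozenset(nxt.items()) not in seen:
                      seen.add(frozenset(nxt.items()))
                      queue.append(nxt)
          return False

      # Hypothetical two-task workflow net: source i -> t1 -> p -> t2 -> sink o.
      transitions = [({"i": 1}, {"p": 1}), ({"p": 1}, {"o": 1})]
      print(reachable({"i": 1}, {"o": 1}, transitions))  # True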

  • Security Evaluation of RG-DTM PUF Using Machine Learning Attacks

    Mitsuru SHIOZAKI  Kousuke OGAWA  Kota FURUHASHI  Takahiko MURAYAMA  Masaya YOSHIKAWA  Takeshi FUJINO  

     
    PAPER-Hardware Based Security

    Vol: E97-A No:1, Page(s): 275-283

    In modern hardware security applications, silicon physical unclonable functions (PUFs) are of interest for their potential use as a unique identity or secret key generated from inherent device characteristics caused by process variations. However, arbiter-based PUFs, which exploit the relative delay-time difference between equivalent paths, have a security issue: the generated challenge-response pairs (CRPs) can be predicted by machine learning attacks. We previously proposed the RG-DTM PUF, in which a response is determined from divided time domains allocated to response 0 or 1, to improve the uniqueness of the conventional arbiter-PUF with a small circuit. However, its resistance against machine learning attacks has not yet been studied. In this paper, we evaluate this resistance using a support vector machine (SVM) and logistic regression (LR) in both simulations and measurements, and we compare the RG-DTM PUF with the conventional arbiter-PUF and with the XOR arbiter-PUF, which strengthens resistance by XORing the outputs of multiple arbiter-PUFs. In numerical simulations, the prediction rates using both SVM and LR exceeded 90% within 1,000 training CRPs on the conventional arbiter-PUF. The machine learning attack using the SVM could never predict responses on the XOR arbiter-PUF with over six arbiter-PUFs, whereas the prediction rate eventually reached 95% using the LR with many training CRPs. On the RG-DTM PUF, when the division number of the time domains was over eight, the prediction rates using the SVM were equal to the probability of a random guess. The machine learning attack using LR can still potentially predict responses, although an adversary would need to steal a significant number of CRPs; moreover, the resistance can be strengthened exponentially by increasing the division number, just as with the XOR arbiter-PUF. Over one million CRPs are required to attack the 16-divided RG-DTM PUF. The RG-DTM PUF and the XOR arbiter-PUF differ in their area and power penalties: the XOR arbiter-PUF gains resistance against machine learning attacks by increasing the circuit area, whereas the RG-DTM PUF achieves this resistance with smaller area and power penalties, since only capacitors are added to the conventional arbiter-PUF. We also attacked RG-DTM PUF chips fabricated with 0.18-µm CMOS technology to evaluate the effect of physical variations and unstable responses. The resistance against machine learning attacks was related to the delay-time difference distribution, but unstable responses had little influence on the attack results.
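    The sketch below illustrates the generic setup of such a modeling attack: a conventional arbiter-PUF is simulated with the standard linear additive delay model, challenges are mapped to parity feature vectors, and logistic regression is trained on the resulting CRPs. It reflects the attack style commonly reported in the literature, not the authors' chips, simulator, or the RG-DTM response mapping.

      # Generic machine-learning modeling attack on a simulated arbiter PUF
      # (linear additive delay model + parity feature transform). Illustrative only.
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(1)
      n_stages, n_crps = 64, 5000
      weights = rng.normal(size=n_stages + 1)            # simulated stage delay differences

      def parity_features(challenges):
          """Map 0/1 challenges to the +-1 parity feature vectors of the arbiter-PUF model."""
          phi = 1 - 2 * challenges                        # 0/1 -> +1/-1
          feats = np.cumprod(phi[:, ::-1], axis=1)[:, ::-1]
          return np.hstack([feats, np.ones((len(challenges), 1))])

      challenges = rng.integers(0, 2, size=(n_crps, n_stages))
      X = parity_features(challenges)
      y = (X @ weights > 0).astype(int)                  # simulated responses

      model = LogisticRegression(max_iter=2000).fit(X[:1000], y[:1000])
      print("prediction rate on held-out CRPs:", model.score(X[1000:], y[1000:]))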

  • Sentence-Level Combination of Machine Translation Outputs with Syntactically Hybridized Translations

    Bo WANG  Yuanyuan ZHANG  Qian XU  

     
    LETTER-Natural Language Processing

    Vol: E97-D No:1, Page(s): 164-167

    We describe a novel idea for improving machine translation by combining multiple candidate translations with extra translations. Without manual work, extra translations can be generated by identifying and hybridizing the syntactic equivalents in the candidate translations. Candidate and extra translations are then combined at the sentence level for better overall translation performance.

  • A Router-Aided Hierarchical P2P Traffic Localization Based on Variable Additional Delay Insertion

    Hiep HOANG-VAN  Yuki SHINOZAKI  Takumi MIYOSHI  Olivier FOURMAUX  

     
    PAPER

    Vol: E97-B No:1, Page(s): 29-39

    Most peer-to-peer (P2P) systems build their own overlay networks for implementing peer selection strategies without taking the locality of the underlay network into account. As a result, a large amount of traffic crossing internet service providers (ISPs) or autonomous systems (ASes) is generated on the Internet. Controlling P2P traffic is therefore becoming a major challenge for ISPs. To control the cost of cross-ISP/AS traffic, ISPs often throttle or even block P2P applications in their networks. In this paper, we propose a router-aided approach that localizes P2P traffic hierarchically; it inserts an additional delay into each P2P packet based on the geographical location of its destination. Compared with existing approaches that solve the problem at the application layer, the proposed method requires no dedicated servers, no cooperation between ISPs and P2P users, and no modification of existing P2P application software, so it can easily be applied to all types of P2P applications. Experiments on P2P streaming applications indicate that our hierarchical traffic localization method not only significantly reduces inter-domain traffic but also maintains good P2P application performance.
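    A minimal sketch of the delay-insertion idea follows: destinations that are farther away in the network hierarchy receive a larger additional delay, which biases peer selection toward nearby peers. The hierarchy tiers, the classification logic, and the delay values are hypothetical and not taken from the paper.

      # Hypothetical mapping from destination locality to an inserted delay;
      # tiers and millisecond values are illustrative assumptions only.
      HIERARCHY_DELAY_MS = {
          "same_isp": 0,        # destination inside the same ISP/AS
          "same_country": 20,   # different ISP, same country
          "international": 80,  # destination abroad
      }

      def classify_destination(dst_geo, local_geo):
          """Classify a destination relative to the router's own location (toy logic)."""
          if dst_geo["asn"] == local_geo["asn"]:
              return "same_isp"
          if dst_geo["country"] == local_geo["country"]:
              return "same_country"
          return "international"

      def additional_delay_ms(dst_geo, local_geo):
          return HIERARCHY_DELAY_MS[classify_destination(dst_geo, local_geo)]

      local = {"asn": 64500, "country": "JP"}
      print(additional_delay_ms({"asn": 64500, "country": "JP"}, local))   # 0
      print(additional_delay_ms({"asn": 64501, "country": "FR"}, local))   # 80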

  • An Accurate Packer Identification Method Using Support Vector Machine

    Ryoichi ISAWA  Tao BAN  Shanqing GUO  Daisuke INOUE  Koji NAKAO  

     
    PAPER-Foundations

    Vol: E97-A No:1, Page(s): 253-263

    PEiD is a packer identification tool widely used for malware analysis, but its accuracy has recently been declining. There are two major reasons for this. The first is that PEiD adopts a signature-based approach but does not provide a way to create signatures; signatures must be created manually, which makes it difficult to keep up with packers that are created or upgraded rapidly. The second is that PEiD uses exact matching, so if a signature contains any error, PEiD cannot identify the packer that corresponds to it. In this paper, we propose a new automated packer identification method that overcomes these limitations of PEiD and report the results of our numerical study. Our method applies a string-kernel-based support vector machine (SVM): it can measure the similarity between packed programs without manual operations such as signature creation, and it provides an error-tolerant mechanism that significantly reduces detection failures caused by minor signature violations. In addition, we use the byte sequence starting from the entry point of a packed program as the packer feature given to the SVM. That is, our method combines the advantages of the signature-based approach and the machine learning (ML) based approach. The numerical results on 3902 samples with 26 packer classes and 3 unpacked (not-packed) classes show that our method achieves a high accuracy of 99.46%, outperforming PEiD and an existing ML-based method proposed by Sun et al.
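    The sketch below mirrors the overall classification setup: each program is represented by the byte sequence starting at its entry point and an SVM is trained on these representations. A character n-gram vectorizer over hex strings stands in for the paper's string kernel, and the samples are synthetic placeholders rather than real packed binaries.

      # Byte sequences at the entry point -> n-gram features -> SVM classifier.
      # The n-gram vectorizer is a simple substitute for the paper's string kernel,
      # and the "samples" below are synthetic placeholders.
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.svm import SVC
      from sklearn.pipeline import make_pipeline

      def entry_point_hex(byte_seq, length=64):
          """Hex-encode the first `length` bytes after the entry point."""
          return byte_seq[:length].hex()

      samples = [bytes([0x60, 0xBE, i % 7, 0x00]) * 16 for i in range(20)] + \
                [bytes([0x55, 0x8B, 0xEC, i % 5]) * 16 for i in range(20)]
      labels = ["packerA"] * 20 + ["packerB"] * 20

      clf = make_pipeline(
          CountVectorizer(analyzer="char", ngram_range=(2, 4)),  # char n-grams over hex strings
          SVC(kernel="linear"),
      )
      clf.fit([entry_point_hex(s) for s in samples], labels)
      print(clf.predict([entry_point_hex(bytes([0x60, 0xBE, 1, 0x00]) * 16)]))  # ['packerA']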

  • A Method of Parallelizing Consensuses for Accelerating Byzantine Fault Tolerance

    Junya NAKAMURA  Tadashi ARARAGI  Toshimitsu MASUZAWA  Shigeru MASUYAMA  

     
    PAPER-Dependable Computing

    Vol: E97-D No:1, Page(s): 53-64

    We propose a new method that accelerates asynchronous Byzantine Fault Tolerant (BFT) protocols designed on the principle of state machine replication. State machine replication protocols ensure consistency among replicas by applying operations in the same order to all of them. A naive way to determine the application order of the operations is to execute BFT consensus repeatedly to decide the next operation to be applied, but this introduces inefficiency, since each consensus must wait for the previous execution of the consensus protocol to complete. To reduce this inefficiency, our method allows the consensuses to be executed in parallel while keeping the consensus results consistent across the replicas. In this paper, we also prove the correctness of our method and experimentally compare it with an existing method in terms of latency and throughput. The evaluation results show that our method makes a BFT protocol three to four times faster than the existing one when some machines or message transmissions are delayed.

  • Blind CFO Estimation Based on Decision Directed MVDR Approach for Interleaved OFDMA Uplink Systems

    Chih-Chang SHEN  Ann-Chen CHANG  

     
    PAPER-Wireless Communication Technologies

    Vol: E97-B No:1, Page(s): 137-145

    This paper deals with carrier frequency offset (CFO) estimation based on the minimum variance distortionless response (MVDR) criterion, without using specific training sequences, for interleaved orthogonal frequency division multiple access (OFDMA) uplink systems. In the presence of large CFOs, the proposed estimator finds a new CFO vector based on a first-order Taylor series expansion of the initially given one. Finding the new CFO vector is formulated in closed form as a generalized eigenvalue problem, which can readily be solved. Since raising the accuracy of residual CFO estimation allows more accurate residual CFO compensation, this paper also presents a decision-directed MVDR approach that improves the CFO estimation performance. Moreover, the proposed estimator estimates CFOs with a lower computational load. Several computer simulation results are provided to illustrate the effectiveness of the proposed blind estimation approach.
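    For orientation, the generic MVDR criterion underlying this kind of blind CFO search is shown below in LaTeX. The notation (covariance matrix R, effective steering vector a(ε) of the candidate offset ε, weight vector w) is assumed here and may differ from the paper's formulation.

      % Generic MVDR criterion (notation assumed, not taken from the paper):
      % pass the candidate user undistorted while minimizing total output power,
      % then scan the candidate CFO \varepsilon for the peak of the spectrum.
      \min_{\mathbf{w}} \; \mathbf{w}^{H}\mathbf{R}\,\mathbf{w}
        \quad \text{subject to} \quad \mathbf{w}^{H}\mathbf{a}(\varepsilon) = 1,
      \qquad
      P_{\mathrm{MVDR}}(\varepsilon) = \frac{1}{\mathbf{a}^{H}(\varepsilon)\,\mathbf{R}^{-1}\,\mathbf{a}(\varepsilon)},
      \qquad
      \hat{\varepsilon} = \arg\max_{\varepsilon} \, P_{\mathrm{MVDR}}(\varepsilon).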

  • Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models

    Koji NAKANO  

     
    PAPER

    Vol: E96-D No:12, Page(s): 2626-2634

    The main contribution of this paper is to show optimal parallel algorithms for computing the sum, the prefix-sums, and the summed area table on two memory machine models, the Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM). The DMM and the UMM are theoretical parallel computing models that capture the essence of the shared memory and the global memory of GPUs. These models have three parameters: the number $p$ of threads, the width $w$ of the memory, and the memory access latency $l$. We first show that the sum of $n$ numbers can be computed in $O(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units on the DMM and the UMM. We then show that $\Omega(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units are necessary to compute the sum. We also present a parallel algorithm that computes the prefix-sums of $n$ numbers in $O(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units on the DMM and the UMM. Finally, we show that the summed area table of size $\sqrt{n}\times\sqrt{n}$ can be computed in $O(\frac{n}{w}+\frac{nl}{p}+l\log n)$ time units on the DMM and the UMM. Since computing the prefix-sums and the summed area table is at least as hard as computing the sum, these parallel algorithms are also optimal.
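    The sketch below shows the familiar two-level (blocked) prefix-sums structure that parallel scan implementations typically follow: scan within blocks, scan the block totals, then add the offsets. It only illustrates this general structure; it is not the paper's width- and latency-optimal DMM/UMM algorithm.

      # Sequential sketch of the standard two-level (blocked) prefix-sums pattern;
      # not the paper's DMM/UMM-optimal algorithm.
      from itertools import accumulate

      def blocked_prefix_sums(values, block_size=4):
          blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
          local_scans = [list(accumulate(b)) for b in blocks]                       # step 1: scan each block
          block_offsets = [0] + list(accumulate(s[-1] for s in local_scans))[:-1]   # step 2: scan block totals
          return [x + off for scan, off in zip(local_scans, block_offsets) for x in scan]  # step 3: add offsets

      data = list(range(1, 11))
      print(blocked_prefix_sums(data))   # [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
      print(list(accumulate(data)))      # same result from a direct scan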

  • Teachability of a Subclass of Simple Deterministic Languages

    Yasuhiro TAJIMA  

     
    PAPER-Fundamentals of Information Systems

    Vol: E96-D No:12, Page(s): 2733-2742

    We show the teachability of a subclass of simple deterministic languages. The subclass we define is called the stack uniform simple deterministic languages. Teachability is derived by presenting a query learning algorithm for this language class; our algorithm uses membership, equivalence, and superset queries, and it terminates in polynomial time. It is already known that simple deterministic languages are polynomial-time query learnable via context-free grammars. In contrast, our algorithm produces its hypothesis as a stack uniform simple deterministic grammar, so our result establishes strict teachability of this subclass of simple deterministic languages. In addition, we discuss the parameters of the polynomial bound for teachability. The "thickness" is an important parameter for parsing and should be one of the parameters used to evaluate the time complexity.

  • A Practical and Optimal Path Planning for Autonomous Parking Using Fast Marching Algorithm and Support Vector Machine

    Quoc Huy DO  Seiichi MITA  Keisuke YONEDA  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E96-D No:12, Page(s): 2795-2804

    This paper proposes a novel and practical path planning framework for autonomous parking in cluttered environments with narrow passages. The proposed global path planning method is based on an improved Fast Marching algorithm that generates a path while taking forward and backward maneuvers into account. In addition, a Support Vector Machine is utilized to provide the maximum clearance from obstacles while considering the vehicle dynamics, yielding a safe and feasible path. The algorithm considers the most critical points in the map, and its complexity is not affected by the shape of the obstacles. We also propose an autonomous parking scheme for different parking situations. The method is implemented on an autonomous vehicle platform and validated in a real environment with narrow passages.
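    The toy sketch below illustrates the maximum-clearance idea: obstacle points on the two sides of a corridor are given opposite labels, and the SVM decision boundary is traced as a path with maximum margin from both sides. The obstacle layout, kernel, and parameters are assumptions, and the vehicle dynamics handled in the paper are ignored.

      # Obstacles on each side of a corridor get opposite labels; the SVM decision
      # boundary (decision_function == 0) serves as a maximum-clearance path.
      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(3)
      left_wall = np.column_stack([np.linspace(0, 10, 30), rng.uniform(2.5, 3.0, 30)])
      right_wall = np.column_stack([np.linspace(0, 10, 30), rng.uniform(-3.0, -2.5, 30)])
      X = np.vstack([left_wall, right_wall])
      y = np.array([1] * 30 + [-1] * 30)

      svm = SVC(kernel="rbf", C=100.0, gamma=0.5).fit(X, y)

      # Trace the boundary: for each x, pick the y whose decision value is closest to 0.
      xs = np.linspace(0, 10, 50)
      ys = np.linspace(-3, 3, 400)
      path = [(x, ys[np.argmin(np.abs(svm.decision_function(
              np.column_stack([np.full_like(ys, x), ys]))))]) for x in xs]
      print(path[:3])   # waypoints near the corridor centerline (y close to 0)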

  • A Characterization of Optimal FF Coding Rate Using a New Optimistically Optimal Code

    Mitsuharu ARIMURA  Hiroki KOGA  Ken-ichi IWATA  

     
    LETTER-Source Coding

    Vol: E96-A No:12, Page(s): 2443-2446

    In this letter, we first introduce a stronger notion of the optimistic achievable coding rate and discuss a coding theorem. Next, we give a necessary and sufficient condition under which the coding rates of all the optimal FF codes asymptotically converge to a constant.

  • Offline Permutation Algorithms on the Discrete Memory Machine with Performance Evaluation on the GPU

    Akihiko KASAGI  Koji NAKANO  Yasuaki ITO  

     
    PAPER

    Vol: E96-D No:12, Page(s): 2617-2625

    The Discrete Memory Machine (DMM) is a theoretical parallel computing model that captures the essence of the shared memory access of GPUs. Bank conflicts should be avoided to maximize the bandwidth of shared memory access. Offline permutation of an array is the task of copying all elements of array a into array b according to a permutation given in advance. The main contribution of this paper is to implement a conflict-free permutation algorithm for the DMM on a GPU. We have also implemented straightforward permutation algorithms on the GPU. The experimental results for 1024 double (64-bit) numbers on an NVIDIA GeForce GTX-680 show that the straightforward permutation algorithm takes 247.8 ns for a random permutation and 1684 ns for the worst permutation, which involves the maximum number of bank conflicts. Our conflict-free permutation algorithm runs in 167 ns for any permutation, including the random permutation and the worst permutation, although it performs more memory accesses. It follows that our conflict-free permutation algorithm is 1.48 times faster for the random permutation and 10.0 times faster for the worst permutation.
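    The following sketch shows why different permutations have very different costs on banked shared memory: simultaneous accesses that map to the same bank (address mod w) are serialized. The bank count of 32 and the access patterns are illustrative assumptions, not the exact DMM parameters of the experiments.

      # Worst-case bank-conflict (serialization) factor for groups of simultaneous
      # accesses on a banked memory with bank = address mod num_banks.
      def max_bank_conflicts(addresses, num_banks=32):
          worst = 1
          for i in range(0, len(addresses), num_banks):
              group = addresses[i:i + num_banks]
              counts = {}
              for a in group:
                  counts[a % num_banks] = counts.get(a % num_banks, 0) + 1
              worst = max(worst, max(counts.values()))
          return worst

      n = 1024
      identity = list(range(n))
      stride_32 = [(i * 32) % n + i // 32 for i in range(n)]   # column-major style access

      print(max_bank_conflicts(identity))   # 1  -> conflict-free
      print(max_bank_conflicts(stride_32))  # 32 -> each access group fully serialized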

  • Improving Cache Partitioning Algorithms for Pseudo-LRU Policies

    Xi ZHANG  Chuanyi LIU  Zhenyu LIU  Dongsheng WANG  

     
    PAPER

    Vol: E96-D No:12, Page(s): 2514-2523

    As the number of concurrently running applications on chip multiprocessors (CMPs) increases, efficient management of the shared last-level cache (LLC) is crucial to guarantee overall performance. Recent studies have shown that cache partitioning can provide benefits in throughput, fairness, and quality of service. Most prior work applies true Least Recently Used (LRU) as the underlying cache replacement policy and relies on its stack property to work properly. However, commodity processors commonly use pseudo-LRU policies, which lack the stack property, instead of true LRU because of their simplicity and low storage overhead. This study therefore sets out to understand whether LRU-based cache partitioning techniques can be applied to commodity processors. In this work, we propose a cache partitioning mechanism for two popular pseudo-LRU policies instead: Not Recently Used (NRU) and Binary Tree (BT). Without the help of true LRU's stack property, we propose a profiling logic that applies curve approximation methods to derive the hit curve (hit counts under varied way allocations) for an application. We then propose a hybrid partitioning mechanism that mitigates the gap between the predicted hit curve and the actual statistics. Simulation results demonstrate that our proposal improves throughput by 15.3% on average and outperforms the stack-estimate proposal by 12.6% on average; similar results are achieved in weighted speedup. For the cache configurations under study, it requires less than 0.5% storage overhead compared to the last-level cache. In addition, we show that a profiling mechanism with only one true-LRU ATD achieves comparable performance and can further reduce the hardware cost by nearly two thirds compared with the hybrid mechanism.
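    The sketch below shows how hit curves drive a partitioning decision in the generic utility-based style: each additional cache way is greedily granted to the application that gains the most hits from it. It is not the paper's hybrid NRU/BT mechanism, and the hit curves are made-up numbers.

      # Greedy way allocation driven by per-application hit curves (generic
      # utility-based idea, not the paper's mechanism; curves are made up).
      def partition_ways(hit_curves, total_ways):
          """hit_curves[a][k] = predicted hits for app `a` with k ways (k = 0..total_ways)."""
          alloc = {app: 0 for app in hit_curves}
          for _ in range(total_ways):
              best = max(hit_curves,
                         key=lambda a: hit_curves[a][alloc[a] + 1] - hit_curves[a][alloc[a]])
              alloc[best] += 1
          return alloc

      hit_curves = {
          "appA": [0, 50, 90, 120, 140, 150, 155, 158, 160],   # saturates quickly
          "appB": [0, 10, 25, 45, 70, 100, 135, 170, 200],     # keeps benefiting
      }
      print(partition_ways(hit_curves, total_ways=8))  # {'appA': 5, 'appB': 3}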

  • Synchronization-Aware Virtual Machine Scheduling for Parallel Applications in Xen

    Cheol-Ho HONG  Chuck YOO  

     
    LETTER

    Vol: E96-D No:12, Page(s): 2720-2723

    In this paper, we propose a synchronization-aware VM scheduler for parallel applications in Xen. The proposed scheduler prevents threads from waiting for a significant amount of time during synchronization. For this purpose, we propose an identification scheme that can identify threads that have been waiting for other threads for a long time. In this scheme, a detection module that infers the internal status of the guest OSs is developed. We also present a scheduling policy that accelerates the bottlenecks of concurrent VMs. We implemented our VM scheduler in a recent Xen hypervisor with para-virtualized Linux-based operating systems, and we show that our approach improves the performance of concurrent VMs by up to 43% compared with the credit scheduler.

  • A WAN-Optimized Live Storage Migration Mechanism toward Virtual Machine Evacuation upon Severe Disasters

    Takahiro HIROFUCHI  Mauricio TSUGAWA  Hidemoto NAKADA  Tomohiro KUDOH  Satoshi ITOH  

     
    PAPER

    Vol: E96-D No:12, Page(s): 2663-2674

    Wide-area VM migration is a technology with the potential to aid IT service recovery, since it can be used to evacuate virtualized servers to safe locations upon a critical disaster. However, the amount of data involved in a wide-area VM migration is substantially larger than in a VM migration within a LAN, because the virtualized storage must be transferred in addition to the memory and CPU states. This increase in data makes it challenging to relocate VMs within the limited time window during which electrical power is still available. In this paper, we propose a mechanism to improve live storage migration across a WAN. The key idea is to reduce the amount of data to be transferred by proactively caching virtual disk blocks at a backup site during regular VM operation. Thanks to the pre-cached disk blocks, the proposed mechanism dramatically reduces the amount of data, and consequently the time, required to live migrate the entire VM state. The mechanism was evaluated using a prototype implementation under different workloads and network conditions, and we confirmed that it dramatically reduces the time to complete a VM live migration. Using the proposed mechanism, it is possible to relocate a VM from Japan to the United States in just under 40 seconds; this relocation would otherwise take over 1500 seconds, demonstrating that the proposed mechanism reduced the migration time by 97.5%.
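    A toy model of the proactive caching idea is sketched below: blocks written during normal operation are trickled to the backup site in the background, so at migration time only blocks that were never cached, or were dirtied after caching, still need to be transferred. The block counts and workload are arbitrary illustrations, not the prototype's actual mechanism.

      # Toy bookkeeping of pre-cached vs. dirty disk blocks; at migration time only
      # the uncached-or-dirty set must be sent over the WAN.
      class ProactiveCache:
          def __init__(self, total_blocks):
              self.total_blocks = total_blocks
              self.cached = set()      # blocks already present at the backup site
              self.dirty = set()       # blocks modified since they were last cached

          def write(self, block):      # VM writes during regular operation
              self.dirty.add(block)
              self.cached.discard(block)

          def trickle(self, block):    # background transfer to the backup site
              self.cached.add(block)
              self.dirty.discard(block)

          def blocks_left_to_migrate(self):
              return (set(range(self.total_blocks)) - self.cached) | self.dirty

      cache = ProactiveCache(total_blocks=1000)
      for b in range(1000):
          cache.trickle(b)             # steady state: everything pre-cached
      for b in (10, 20, 30):
          cache.write(b)               # a few blocks dirtied just before evacuation
      print(len(cache.blocks_left_to_migrate()))  # 3 instead of 1000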

  • Periodic Pattern Coding for Last Level Cache Data Compression

    Haruhiko KANEKO  

     
    PAPER-Data Compression

    Vol: E96-A No:12, Page(s): 2351-2359

    Despite continuous improvements in the computational power of multi/many-core processors, the memory access performance of processors has not improved sufficiently, and the overall performance of recent processors is therefore often limited by the delay of off-chip memory accesses. Low-delay data compression for the last-level cache (LLC) is effective for improving processor performance, because compression increases the effective size of the LLC and thus reduces the number of off-chip memory accesses. This paper proposes a novel data compression method suitable for high-speed parallel decoding in the LLC. Since cache line data often have periodicity of certain lengths, such as 32- or 64-bit instructions, 32-bit integers, and 64-bit floating point numbers, an information word is encoded as a base pattern and a differential pattern between the original word and the base pattern. Evaluation using a GPU simulator shows that the compression ratio of the proposed coding is comparable to those of LZSS coding and X-Match Pro, and superior to those of other conventional compression algorithms for cache memories. This paper also presents an experimental decoder designed for ASIC; the synthesized result shows that the decoder can decompress cache line data of length 32 bytes in four clock cycles. Evaluation of the IPC on the GPU simulator shows that, for several benchmark programs, the IPC achieved by the proposed coding is higher than that of the conventional BΔI coding, with a maximum IPC improvement of 20%.
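    The sketch below shows a simplified base-plus-delta encoding of a cache line in the spirit of the periodic-pattern idea: one base word is stored together with a narrow delta per word. The word size, delta width, and layout are assumptions for illustration and do not reproduce the paper's exact code format.

      # Simplified base + delta compression of a 32-byte cache line; word size and
      # delta width are illustrative, not the paper's code format.
      import struct

      def compress_line(line_bytes, word_size=4, delta_bytes=1):
          words = [int.from_bytes(line_bytes[i:i + word_size], "little")
                   for i in range(0, len(line_bytes), word_size)]
          base = words[0]
          deltas = [w - base for w in words]
          limit = 1 << (8 * delta_bytes - 1)
          if all(-limit <= d < limit for d in deltas):
              payload = base.to_bytes(word_size, "little") + b"".join(
                  d.to_bytes(delta_bytes, "little", signed=True) for d in deltas)
              return ("base+delta", payload)
          return ("uncompressed", line_bytes)

      # 32-byte line of 32-bit integers clustered around 0x1000 (e.g. nearby indices).
      line = b"".join(struct.pack("<I", 0x1000 + v) for v in (0, 4, 8, 12, 16, 20, 24, 28))
      kind, payload = compress_line(line)
      print(kind, len(line), "->", len(payload))   # base+delta 32 -> 12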

  • A Novel Pedestrian Detector on Low-Resolution Images: Gradient LBP Using Patterns of Oriented Edges

    Ahmed BOUDISSA  Joo Kooi TAN  Hyoungseop KIM  Takashi SHINOMIYA  Seiji ISHIKAWA  

     
    LETTER-Pattern Recognition

    Vol: E96-D No:12, Page(s): 2882-2887

    This paper introduces a simple algorithm for pedestrian detection in low-resolution images. The main objective is to provide an effective means of real-time pedestrian detection. While the framework of the system consists of edge orientations combined with the local binary pattern (LBP) feature extractor, a novel way of selecting the threshold is introduced. Based on the mean and variance of the background examples, this threshold significantly improves the detection rate as well as the processing time, and it furthermore makes the system robust to uniformly cluttered backgrounds, noise, and light variations. The test data is the INRIA pedestrian dataset, and a support vector machine with a radial basis function (RBF) kernel is used for classification. The system performs at state-of-the-art detection rates while being intuitive as well as very fast, which leaves sufficient processing time for further operations such as tracking and danger estimation.
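    The sketch below gives a rough flavor of an LBP-style descriptor computed on gradient magnitudes, with the comparison threshold derived from the mean and variance of background examples instead of being fixed at zero. It does not reproduce the paper's exact Gradient LBP formulation; the neighborhood, threshold rule, and random data are assumptions.

      # LBP-like codes over gradient magnitudes with a mean/variance-based threshold;
      # a rough illustration only, not the paper's Gradient LBP definition.
      import numpy as np

      def gradient_magnitude(img):
          gy, gx = np.gradient(img.astype(float))
          return np.hypot(gx, gy)

      def lbp_code(patch3x3, threshold):
          """8-bit code: each neighbor contributes 1 if it exceeds center by `threshold`."""
          center = patch3x3[1, 1]
          neighbors = patch3x3[[0, 0, 0, 1, 2, 2, 2, 1], [0, 1, 2, 2, 2, 1, 0, 0]]
          return int(np.packbits((neighbors > center + threshold).astype(np.uint8))[0])

      rng = np.random.default_rng(5)
      background = rng.normal(0.0, 1.0, (64, 64))             # stand-in background examples
      grad_bg = gradient_magnitude(background)
      threshold = grad_bg.mean() + grad_bg.var()               # mean-variance based threshold

      image = rng.normal(0.0, 1.0, (32, 32))
      grad = gradient_magnitude(image)
      codes = [lbp_code(grad[i - 1:i + 2, j - 1:j + 2], threshold)
               for i in range(1, 31) for j in range(1, 31)]
      hist, _ = np.histogram(codes, bins=256, range=(0, 256))  # feature vector for the SVM
      print(hist.sum())                                        # 900 interior pixels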

  • Training Multiple Support Vector Machines for Personalized Web Content Filters

    Dung Duc NGUYEN  Maike ERDMANN  Tomoya TAKEYOSHI  Gen HATTORI  Kazunori MATSUMOTO  Chihiro ONO  

     
    PAPER-Artificial Intelligence, Data Mining

    Vol: E96-D No:11, Page(s): 2376-2384

    The abundance of information published on the Internet makes filtering of hazardous Web pages a difficult yet important task. Supervised learning methods such as Support Vector Machines (SVMs) can be used to identify hazardous Web content. However, scalability is a major challenge, especially when multiple classifiers must be trained, since different policies exist on what kind of information is hazardous. We therefore propose two strategies for training multiple SVMs for personalized Web content filters. The first strategy identifies common data clusters and then performs optimization on these clusters to obtain good initial solutions for the individual problems; this initialization shortens the path to the optimal solutions and reduces the training time on the individual training sets. The second strategy trains all SVMs simultaneously. We introduce an SMO-based kernel-biased heuristic that balances the reduction rate of the individual objective functions against the computational cost of the kernel matrix. The heuristic relies primarily on the optimality conditions of all optimization problems and secondarily on the pre-calculated part of the whole kernel matrix. This strategy increases the amount of information shared among the learning tasks, thus reducing the number of kernel calculations and the training time. In our experiments on inconsistently labeled training examples, both strategies were able to predict hazardous Web pages accurately (>91%) with training times of only 26% and 18%, respectively, of that of normal sequential training.

  • Utilizing Multiple Data Sources for Localization in Wireless Sensor Networks Based on Support Vector Machines

    Prakit JAROENKITTICHAI  Ekachai LEELARASMEE  

     
    PAPER-Mobile Information Network and Personal Communications

    Vol: E96-A No:11, Page(s): 2081-2088

    Localization in wireless sensor networks is the problem of estimating the geographical locations of wireless sensor nodes. We propose a framework for utilizing multiple data sources in a localization scheme based on support vector machines. The framework can be used with both the classification and the regression formulations of support vector machines. The proposed method uses only connectivity information. Multiple hop-count data sources can be generated by adjusting the transmission power of the sensor nodes to change their communication ranges, and the optimal choice of communication ranges can be determined by evaluating mutual information. We consider two methods for integrating the multiple data sources: the unif method and the align method. The improved localization accuracy of the proposed framework is verified by a simulation study.
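    The toy sketch below illustrates the data flow of connectivity-only SVM localization: hop counts to a few beacon nodes serve as features, two communication ranges yield two concatenated data sources, and support vector regression maps features to coordinates. The network layout, ranges, and SVR parameters are illustrative assumptions, and the simple concatenation is a stand-in rather than the paper's unif/align methods.

      # Hop counts to beacons (under two communication ranges) as SVR features;
      # layout, ranges, and parameters are illustrative assumptions.
      import numpy as np
      from collections import deque
      from sklearn.svm import SVR

      rng = np.random.default_rng(7)
      positions = rng.uniform(0, 10, size=(150, 2))
      anchors = list(range(5))                    # hop-count sources (beacon nodes)
      reference = np.arange(30)                   # nodes with known positions for training

      def hop_counts(positions, radius, sources):
          n = len(positions)
          dist = np.linalg.norm(positions[:, None] - positions[None, :], axis=2)
          adj = [np.flatnonzero((dist[i] <= radius) & (dist[i] > 0)) for i in range(n)]
          hops = np.full((len(sources), n), n, dtype=float)     # n stands for "unreachable"
          for k, s in enumerate(sources):
              hops[k, s], queue = 0, deque([s])
              while queue:
                  u = queue.popleft()
                  for v in adj[u]:
                      if hops[k, v] > hops[k, u] + 1:
                          hops[k, v] = hops[k, u] + 1
                          queue.append(v)
          return hops.T                                          # shape (n_nodes, n_anchors)

      X = np.hstack([hop_counts(positions, r, anchors) for r in (1.5, 2.5)])  # two data sources
      models = [SVR(kernel="rbf", C=10.0).fit(X[reference], positions[reference, d]) for d in (0, 1)]
      estimates = np.column_stack([m.predict(X[30:]) for m in models])
      print(np.mean(np.linalg.norm(estimates - positions[30:], axis=1)))  # mean localization error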

Showing results 421-440 of 1072