The search functionality is under construction.

Keyword Search Result

[Keyword] Y (22683 hits)

1061-1080 of 22683 hits

  • A Case for Low-Latency Communication Layer for Distributed Operating Systems

    Sang-Hoon KIM  

     
    LETTER-Software System

      Publicized:
    2021/09/06
      Vol:
    E104-D No:12
      Page(s):
    2244-2247

    There have been increasing demands for distributed operating systems to better utilize resources scattered over multiple nodes. This paper highlights the challenges and requirements for the communication layers of distributed operating systems, and makes a case for a versatile, high-performance communication layer over an InfiniBand network.

  • LTL Model Checking for Register Pushdown Systems

    Ryoma SENDA  Yoshiaki TAKATA  Hiroyuki SEKI  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2021/08/31
      Vol:
    E104-D No:12
      Page(s):
    2131-2144

    A pushdown system (PDS) is known as an abstract model of recursive programs. For PDS, model checking methods have been studied and applied to various software verification tasks such as interprocedural data flow analysis and malware detection. However, PDS cannot manipulate data values from an infinite domain. A register PDS (RPDS) is an extension of PDS that adds registers to deal with data values in a restricted way. This paper proposes algorithms for LTL model checking problems for RPDS with simple and regular valuations, which are labelings of atomic propositions to configurations with reasonable restrictions. First, we introduce RPDS and related models, and then define the LTL model checking problems for RPDS. Second, we give algorithms for solving these problems and also show that the problems are EXPTIME-complete. As practical examples, we show solutions to a malware detection problem and an XML schema checking problem in the proposed framework.

  • Interleaved Weighted Round-Robin: A Network Calculus Analysis Open Access

    Seyed Mohammadhossein TABATABAEE  Jean-Yves LE BOUDEC  Marc BOYER  

     
    INVITED PAPER

      Publicized:
    2021/07/01
      Vol:
    E104-B No:12
      Page(s):
    1479-1493

    Weighted Round-Robin (WRR) is often used, due to its simplicity, for scheduling packets or tasks. With WRR, a number of packets equal to the weight allocated to a flow can be served consecutively, which leads to a bursty service. Interleaved Weighted Round-Robin (IWRR) is a variant that mitigates this effect. We are interested in finding bounds on worst-case delay obtained with IWRR. To this end, we use a network calculus approach and find a strict service curve for IWRR. The result is obtained using the pseudo-inverse of a function. We show that the strict service curve is the best obtainable one, and that delay bounds derived from it are tight (i.e., worst-case) for flows of packets of constant size. Furthermore, the IWRR strict service curve dominates the strict service curve for WRR that was previously published. We provide some numerical examples to illustrate the reduction in worst-case delays caused by IWRR compared to WRR.

  • Reliability Enhancement for 5G End-to-End Network Slice Provisioning to Survive Physical Node Failures Open Access

    Xiang WANG  Xin LU  Meiming FU  Jiayi LIU  Hongyan YANG  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2021/06/01
      Vol:
    E104-B No:12
      Page(s):
    1494-1505

    Leveraging Network Function Virtualization (NFV) and Software Defined Networking (SDN), network slicing (NS) is recognized as a key technology that enables the 5G Infrastructure Provider (InP) to support diversified vertical services over a shared common physical infrastructure. A 5G end-to-end (E2E) NS is a logical virtual network that spans the 5G network. Existing works on improving the reliability of 5G mainly focus on reliable wireless communications; however, the reliability of an NS also refers to the ability of the NS system to provide continued service. Hence, in this work, we focus on enhancing the reliability of the NS to cope with physical network node failures, and we investigate the NS deployment problem to improve the reliability of the system represented by the NS. The reliability of an NS is enhanced by two means: firstly, by considering the topology information of an NS, critical virtual nodes are backed up to allow failure recovery; secondly, the embedding of the augmented NS virtual network is optimized for failure avoidance. We formulate the embedding of the augmented virtual network (AVN) to maximize the survivability of the NS system as the survivable AVN embedding (S-AVNE) problem through an Integer Linear Program (ILP) formulation. Due to the complexity of the problem, a heuristic algorithm is introduced. Finally, we conduct intensive simulations to evaluate the performance of our algorithm with regard to improving the reliability of the NS system.

  • Radar Emitter Identification Based on Auto-Correlation Function and Bispectrum via Convolutional Neural Network

    Zhiling XIAO  Zhenya YAN  

     
    PAPER-Fundamental Theories for Communications

      Publicized:
    2021/06/10
      Vol:
    E104-B No:12
      Page(s):
    1506-1513

    This article proposes applying the auto-correlation function (ACF), bispectrum analysis, and convolutional neural networks (CNN) to implement radar emitter identification (REI) based on intrapulse features. In this work, we combine ACF with bispectrum for signal feature extraction. We first calculate the ACF of each emitter signal, then compute the bispectrum of the ACF to obtain the spectrograms. The spectrum images are taken as the feature maps of the radar emitters and fed into the CNN classifier to realize automatic identification. We simulate signal samples of different modulation types in experiments. We also consider the feature extraction method that directly uses bispectrum analysis for comparison. The simulation results demonstrate that by combining ACF with bispectrum analysis, the proposed scheme attains stronger robustness to noise, the spectrograms of our approach have more pronounced features, and our approach achieves better identification performance at low signal-to-noise ratios.

  • Trace Representation of r-Ary Sequences Derived from Euler Quotients Modulo 2p

    Rayan MOHAMMED  Xiaoni DU  Wengang JIN  Yanzhong SUN  

     
    PAPER-Coding Theory

      Publicized:
    2021/06/21
      Vol:
    E104-A No:12
      Page(s):
    1698-1703

    We introduce the r-ary sequence with period 2p^2 derived from Euler quotients modulo 2p (where p is an odd prime), where r is an odd prime divisor of (p-1). Then, based on cyclotomic theory and the theory of trace functions in finite fields, we give the trace representation of the proposed sequence by determining the corresponding defining polynomial. Our results will be helpful for the implementation and the analysis of the pseudo-random properties of the sequences.

  • New Binary Quantum Codes Derived from Quasi-Twisted Codes with Hermitian Inner Product

    Yu YAO  Yuena MA  Jingjie LV  Hao SONG  Qiang FU  

     
    LETTER-Coding Theory

      Publicized:
    2021/05/28
      Vol:
    E104-A No:12
      Page(s):
    1718-1722

    In this paper, a special class of two-generator quasi-twisted (QT) codes with index 2 is presented. We explore the algebraic structure of this class of QT codes and the form of their Hermitian dual codes. A sufficient condition for self-orthogonality with respect to the Hermitian inner product is derived. Using the class of Hermitian self-orthogonal QT codes, we construct two new binary quantum codes, [[70, 42, 7]]_2 and [[78, 30, 10]]_2. According to Theorem 6 of Ref.[2], we can further obtain 9 new binary quantum codes. So a total of 11 new binary quantum codes are obtained, and there are 10 quantum codes that break the quantum Gilbert-Varshamov (GV) bound.

  • A Low-Latency Inference of Randomly Wired Convolutional Neural Networks on an FPGA

    Ryosuke KURAMOCHI  Hiroki NAKAHARA  

     
    PAPER

      Publicized:
    2021/06/24
      Vol:
    E104-D No:12
      Page(s):
    2068-2077

    Convolutional neural networks (CNNs) are widely used for image processing tasks in both embedded systems and data centers. In data centers, high accuracy and low latency are desired for various tasks such as image processing of streaming videos. We propose an FPGA-based low-latency CNN inference for randomly wired convolutional neural networks (RWCNNs), whose layer structures are based on random graph models. Because RWCNNs have several convolution layers that have no direct dependencies between them, our architecture can process them efficiently using a pipeline method. At each layer, we need to use the calculation results of multiple layers as the input. We use an FPGA with HBM2 to enable parallel access to the input data with multiple HBM2 channels. We schedule the order of execution of the layers to improve the pipeline efficiency. We build a conflict graph using the scheduling results. Then, we allocate the calculation results of each layer to the HBM2 channels by coloring the graph. Because the pipeline execution needs to be properly controlled, we developed an automatic generation tool for hardware functions. We implemented the proposed architecture on the Alveo U50 FPGA. We investigated a trade-off between latency and recognition accuracy for the ImageNet classification task by comparing the inference performances for different input image sizes. We compared our accelerator with a conventional accelerator for ResNet-50. The results show that our accelerator reduces the latency by 2.21 times. We also obtained 12.6 and 4.93 times better efficiency than CPU and GPU, respectively. Thus, our accelerator for RWCNNs is suitable for low-latency inference.

  • Fogcached: A DRAM/NVMM Hybrid KVS Server for Edge Computing

    Kouki OZAWA  Takahiro HIROFUCHI  Ryousei TAKANO  Midori SUGAYA  

     
    PAPER

      Publicized:
    2021/08/18
      Vol:
    E104-D No:12
      Page(s):
    2089-2096

    With the development of IoT devices and sensors, edge computing is leading towards new services like autonomous cars and smart cities. Low-latency data access is an essential requirement for such services, and a large-capacity cache server is needed on the edge side. However, it is not realistic to build a large-capacity cache server using only DRAM because DRAM is expensive and consumes a substantial amount of power. A hybrid main memory system, in which main memory consists of DRAM and non-volatile memory, is a promising way to address this issue. It achieves a large main memory capacity within the power supply capabilities of current servers. In this paper, we propose Fogcached, an extension of a widely used KVS (Key-Value Store) server program (i.e., Memcached) that exploits both DRAM and non-volatile main memory (NVMM). We used Intel Optane DCPM as the NVMM for its prototype. Fogcached implements a Dual-LRU (Least Recently Used) mechanism that seamlessly extends the memory management of Memcached to hybrid main memory. Fogcached reuses the segmented LRU of Memcached to manage cached objects in DRAM, adds another segmented LRU for those in DCPM, and bridges the LRUs with a mechanism that automatically replaces cached objects between DRAM and DCPM. Cached objects are autonomously moved between the two memory devices according to their access frequencies. Through experiments, we confirmed that Fogcached improved the peak value of the latency distribution by about 40% compared to Memcached.

  • Proposal and Evaluation of IO Concentration-Aware Mechanisms to Improve Efficiency of Hybrid Storage Systems

    Kazuichi OE  Takeshi NANRI  

     
    PAPER

      Publicized:
    2021/07/30
      Vol:
    E104-D No:12
      Page(s):
    2109-2120

    Hybrid storage techniques are useful methods to improve the cost performance for input-output (IO) intensive workloads. These techniques choose areas of concentrated IO accesses and migrate them to an upper tier to extract as much performance as possible through greater use of upper tier areas. Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system situated between non-volatile memories (NVMs) and solid-state drives (SSDs). ATSMF aims to reduce the average response time for IO accesses by migrating areas of concentrated IO access from an SSD to an NVM. When a concentrated IO access finishes, the system migrates these areas from the NVM back to the SSD. Unfortunately, the published ATSMF implementation temporarily consumes a large amount of NVM capacity when migrating concentrated IO access areas to NVM, because its algorithm executes NVM migration with high priority. As a result, it often delays evicting areas in which IO concentrations have ended to the SSD. Therefore, to reduce the consumption of NVM while maintaining the average response time, we developed new techniques for making ATSMF more practical. The first is a queue handling technique based on the number of IO accesses for NVM migration and eviction. The second is an eviction method that selects only write-accessed partial regions in finished areas. The third is a technique for variable eviction timing to balance the NVM consumption and average response time. Experimental results indicate that the average response times of the proposed ATSMF are almost the same as those of the published ATSMF, while the NVM consumption is three times lower in the best case.

  • Performance Comparison of Training Datasets for System Call-Based Malware Detection with Thread Information

    Yuki KAJIWARA  Junjun ZHENG  Koichi MOURI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/09/21
      Vol:
    E104-D No:12
      Page(s):
    2173-2183

    The amount of malware, including variants and new types, has been increasing dramatically over the years, posing one of the greatest cybersecurity threats today. To counteract such security threats, it is crucial to detect malware accurately and early enough. Recent advances in machine learning technology have brought increasing interest in malware detection, and a number of research studies have been conducted in the field. It is well known that malware detection accuracy largely depends on the training dataset used. Creating a suitable training dataset for efficient malware detection is thus crucial. Different works usually use their own datasets; therefore, a dataset is only effective for one detection method, and strictly comparing several methods using a common training dataset is difficult. In this paper, we focus on how to create a training dataset for efficiently detecting malware. To achieve our goal, the first step is to clarify the information that can accurately characterize malware. This paper concentrates on threads, treating them as important information for characterizing malware. Specifically, on the basis of the dynamic analysis log from Alkanet, a system call tracer, we obtain the thread information and classify the thread information processing into four patterns. Malware detection is then performed using the number of transitions of system calls appearing in the thread as a feature. Our comparative experimental results showed that the primary thread information is important and useful for detecting malware with high accuracy.

  • Lempel-Ziv Factorization in Linear-Time O(1)-Workspace for Constant Alphabets

    Weijun LIU  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2021/08/30
      Vol:
    E104-D No:12
      Page(s):
    2145-2153

    Computing the Lempel-Ziv Factorization (LZ77) of a string is one of the most important problems in computer science. It is widely used in many applications such as data compression, text indexing and pattern discovery, and has already become the heart of many file compressors like gzip and 7zip. In this paper, we show a linear-time algorithm called Xone for computing the LZ77, which has the same space requirement as BGone, the linear-time LZ77 factorization algorithm with the previous best space requirement. Xone greatly improves the efficiency of BGone. Experiments show that the two versions of Xone, XoneT and XoneSA, are about 27% and 31% faster than BGoneT and BGoneSA, respectively.

  • Efficient Reboot-Based Recovery of In-Memory Databases

    Yuto JUMONJI  Hiroshi YAMADA  

     
    PAPER-Dependable Computing

      Publicized:
    2021/08/26
      Vol:
    E104-D No:12
      Page(s):
    2164-2172

    Reboot-based recovery is a simple but powerful method to recover applications from failures and unstable states. However, it is challenging to apply reboot-based recovery to a new type of application, in-memory databases (DBs). Unlike with legacy applications, rebooting in-memory DBs loses memory objects including key-value pairs and DB blocks, so they must be restored, causing severe performance degradation after the reboot. This paper presents an approach that allows us to perform reboot-based recovery of in-memory DBs with lower performance degradation. Our key insight is to decouple data content objects from all the memory objects. Our approach treats data items as data content objects, preserves data content objects in memory across reboots, and makes restarted in-memory DBs attach to them. To show the effectiveness of our approach, we implement the idea in two real-world DBs, MyRocks and memcached. The prototypes successfully mitigate performance degradation after their reboot-based recovery.

  • Performance Modeling of Bitcoin Blockchain: Mining Mechanism and Transaction-Confirmation Process Open Access

    Shoji KASAHARA  

     
    INVITED PAPER

      Publicized:
    2021/06/09
      Vol:
    E104-B No:12
      Page(s):
    1455-1464

    Bitcoin is one of the popular cryptocurrencies widely used around the world, and its blockchain technology has attracted considerable attention. In the Bitcoin system, it has been reported that transactions are prioritized according to transaction fees, and that transactions with high priorities are likely to be confirmed faster than those with low priorities. In this paper, we consider performance modeling of the Bitcoin-blockchain system in order to characterize the transaction-confirmation time. We first introduce the Bitcoin system, focusing on proof-of-work, the consensus mechanism of the Bitcoin blockchain. Then, we show some queueing models and their analytical results, discussing the implications and insights obtained from the queueing models.

  • Design of Ultra-Thin Wave Absorber with Square Patch Array Considering Electromagnetic Coupling between Patch Array and Back-Metal

    Sota MATSUMOTO  Ryosuke SUGA  Kiyomichi ARAKI  Osamu HASHIMOTO  

     
    BRIEF PAPER-Electromagnetic Theory

      Publicized:
    2021/06/07
      Vol:
    E104-C No:12
      Page(s):
    681-684

    In this paper, an ultra-thin wave absorber using a resistive patch array placed closely in front of a back-metal is designed. A large positive susceptance is required for the patch array to cancel out the large negative input susceptance of the short-circuited ultra-thin spacer behind the array. It is found that the array needs a gap of 1 mm, a sheet resistance of less than 20 Ω/sq., and a patch width of more than 15 mm to obtain zero input susceptance of the absorber with the 1/30 wavelength spacer. Moreover, these parameters were designed considering the electromagnetic coupling between the array and back-metal, and square patch array absorbers with thicknesses from 1/30 to 1/150 wavelength were designed.

  • CLAHE Implementation and Evaluation on a Low-End FPGA Board by High-Level Synthesis

    Koki HONDA  Kaijie WEI  Masatoshi ARAI  Hideharu AMANO  

     
    PAPER

      Publicized:
    2021/07/12
      Vol:
    E104-D No:12
      Page(s):
    2048-2056

    Automobile companies have been trying to replace the side mirrors of cars with small cameras to reduce air resistance. This makes it possible to apply image processing to improve the quality of the image. Contrast Limited Adaptive Histogram Equalization (CLAHE) is one such technique for improving the quality of the image from the side mirror camera, but it requires high computational performance. Here, an implementation method of CLAHE on a low-end FPGA board by high-level synthesis is proposed. CLAHE has two main processing parts: cumulative distribution function (CDF) generation, and bilinear interpolation. During the CDF generation, the effect of increasing the loop initiation interval can be greatly reduced by placing multiple Processing Elements (PEs), and during the interpolation, latency and BRAM usage were reduced by revising how the CDF is held and the calculation method. Finally, by connecting each module with streaming interfaces, using data flow pragmas, overlapping processing, and hiding data transfer, our HLS implementation achieved a result comparable to that of HDL. We parameterized the components of the algorithm so that the number of tiles and the size of the image can be easily changed. The source code for this research can be downloaded from https://github.com/kokihonda/fpga_clahe.

  • An Improved U-Net Architecture for Image Dehazing

    Wenyi GE  Yi LIN  Zhitao WANG  Guigui WANG  Shihan TAN  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/09/14
      Vol:
    E104-D No:12
      Page(s):
    2218-2225

    In this paper, we present a simple yet powerful deep neural network for natural image dehazing. The proposed method is based on the U-Net architecture, with several design changes to improve it. First, we use Group Normalization to replace Batch Normalization to solve the problem of insufficient batch size due to hardware limitations. Second, we introduce FReLU activation into the U-Net block, which can capture complicated visual layouts with regular convolutions. Experimental results on public benchmarks demonstrate the effectiveness of the modified components. On the SOTS Indoor and Outdoor datasets, it obtains PSNR of 32.23 and 31.64 respectively, which is comparable to state-of-the-art methods. The code will be made publicly available online soon.

  • Analysis on Asymptotic Optimality of Round-Robin Scheduling for Minimizing Age of Information with HARQ Open Access

    Zhiyuan JIANG  Yijie HUANG  Shunqing ZHANG  Shugong XU  

     
    INVITED PAPER

      Publicized:
    2021/07/01
      Vol:
    E104-B No:12
      Page(s):
    1465-1478

    In a heterogeneous unreliable multiaccess network, wherein terminals share a common wireless channel with distinct error probabilities, existing works have shown that a persistent round-robin (RR-P) scheduling policy can be arbitrarily worse than the optimum in terms of Age of Information (AoI) under standard Automatic Repeat reQuest (ARQ). In this paper, practical Hybrid ARQ (HARQ) schemes, which are widely used in today's wireless networks, are considered. We show that RR-P is very close to optimum with asymptotically many terminals in this case, by explicitly deriving tight, closed-form AoI gaps between the optimum and the AoI achievable by RR-P. In particular, it is rigorously proved that for RR-P, under HARQ models concerning fading channels (resp. finite-blocklength regime), the relative AoI gap compared with the optimum is within a constant of 6.4% (resp. 6.2% with error exponential decay rate of 0.5). In addition, RR-P enjoys the distinctive advantage of implementation simplicity with channel-unaware and easy-to-decentralize operations, making it favorable in practice. A further investigation considering a constraint imposed on the number of retransmissions is presented. The performance gap is indicated through numerical simulations.

  • Experimental Demonstration of a Hard-Type Oscillator Using a Resonant Tunneling Diode and Its Comparison with a Soft-Type Oscillator

    Koichi MAEZAWA  Tatsuo ITO  Masayuki MORI  

     
    BRIEF PAPER-Semiconductor Materials and Devices

      Publicized:
    2021/06/07
      Vol:
    E104-C No:12
      Page(s):
    685-688

    A hard-type oscillator is defined as an oscillator having stable fixed points within a stable limit cycle. For resonant tunneling diode (RTD) oscillators, using a hard-type configuration has the significant advantage that it can suppress spurious oscillations in a bias line. We have fabricated hard-type oscillators using an InGaAs-based RTD, and demonstrated proper operation. Furthermore, the oscillating properties have been compared with those of a soft-type oscillator having the same parameters. It has been demonstrated that the same level of phase noise can be obtained with a much smaller power consumption of approximately 1/20.

  • Weighted PCA-LDA Based Color Quantization Method Suppressing Saturation Decrease

    Seiichi KOJIMA  Momoka HARADA  Yoshiaki UEDA  Noriaki SUETAKE  

     
    LETTER-Image

      Publicized:
    2021/06/02
      Vol:
    E104-A No:12
      Page(s):
    1728-1732

    In this letter, we propose a new color quantization method suppressing saturation decrease. In the proposed method, saturation-based weight and intensity-based weight are used so that vivid colors are selected as the representative colors preferentially. Experiments show that the proposed method tends to select vivid colors even if they occupy only a small area in the image.
