Keyword Search Result

[Keyword] (42807hit)

1981-2000hit(42807hit)

  • Proposal and Evaluation of IO Concentration-Aware Mechanisms to Improve Efficiency of Hybrid Storage Systems

    Kazuichi OE  Takeshi NANRI  

     
    PAPER

      Publicized:
    2021/07/30
      Vol:
    E104-D No:12
      Page(s):
    2109-2120

    Hybrid storage techniques are useful methods to improve the cost performance for input-output (IO) intensive workloads. These techniques choose areas of concentrated IO accesses and migrate them to an upper tier to extract as much performance as possible through greater use of upper tier areas. Automated tiered storage with fast memory and slow flash storage (ATSMF) is a hybrid storage system situated between non-volatile memories (NVMs) and solid-state drives (SSDs). ATSMF aims to reduce the average response time for IO accesses by migrating areas of concentrated IO access from an SSD to an NVM. When a concentrated IO access finishes, the system migrates these areas from the NVM back to the SSD. Unfortunately, the published ATSMF implementation temporarily consumes a large amount of NVM capacity upon migrating concentrated IO access areas to NVM, because its algorithm executes NVM migration with high priority. As a result, it often delays evicting areas in which IO concentrations have ended to the SSD. Therefore, to reduce the consumption of NVM while maintaining the average response time, we developed new techniques for making ATSMF more practical. The first is a queue handling technique based on the number of IO accesses for NVM migration and eviction. The second is an eviction method that selects only write-accessed partial regions in finished areas. The third is a technique for variable eviction timing to balance the NVM consumption and average response time. Experimental results indicate that the average response times of the proposed ATSMF are almost the same as those of the published ATSMF, while the NVM consumption is three times lower in the best case.

  • Performance Comparison of Training Datasets for System Call-Based Malware Detection with Thread Information

    Yuki KAJIWARA  Junjun ZHENG  Koichi MOURI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/09/21
      Vol:
    E104-D No:12
      Page(s):
    2173-2183

    The amount of malware, including variants and new types, has increased dramatically over the years, posing one of the greatest cybersecurity threats today. To counteract such security threats, it is crucial to detect malware accurately and early enough. The recent advances in machine learning technology have brought increasing interest in malware detection, and a number of research studies have been conducted in the field. It is well known that malware detection accuracy largely depends on the training dataset used. Creating a suitable training dataset for efficient malware detection is thus crucial. Different works usually use their own dataset; therefore, a dataset is only effective for one detection method, and strictly comparing several methods using a common training dataset is difficult. In this paper, we focus on how to create a training dataset for efficiently detecting malware. To achieve our goal, the first step is to clarify the information that can accurately characterize malware. This paper concentrates on threads, treating them as important information for characterizing malware. Specifically, on the basis of the dynamic analysis log from Alkanet, a system call tracer, we obtain the thread information and classify the thread information processing into four patterns. Malware detection is then performed using the number of transitions of system calls appearing in the thread as a feature. Our comparative experimental results showed that the primary thread information is important and useful for detecting malware with high accuracy.
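
    The feature named in this abstract, the number of system call transitions per thread, can be illustrated with a minimal sketch. The log format (a list of `(thread_id, syscall_name)` pairs), the function name `transition_counts_per_thread`, and the sample call names are assumptions made for illustration; they are not Alkanet's actual output format or the authors' implementation.

```python
from collections import Counter, defaultdict


def transition_counts_per_thread(log):
    """Count consecutive system call pairs (transitions) within each thread.

    `log` is a hypothetical trace: an iterable of (thread_id, syscall_name)
    pairs in the order the calls were observed.
    """
    # Group the calls by thread while preserving their order.
    calls_by_thread = defaultdict(list)
    for thread_id, syscall in log:
        calls_by_thread[thread_id].append(syscall)

    # A transition is a pair of adjacent calls issued by the same thread.
    return {
        thread_id: Counter(zip(calls, calls[1:]))
        for thread_id, calls in calls_by_thread.items()
    }


# Illustrative trace with two interleaved threads.
sample_log = [
    (1, "NtOpenFile"),
    (2, "NtCreateFile"),
    (1, "NtReadFile"),
    (1, "NtClose"),
    (2, "NtWriteFile"),
    (1, "NtOpenFile"),
    (1, "NtReadFile"),
]
features = transition_counts_per_thread(sample_log)
```

    Each per-thread counter could then be flattened into a fixed-length vector for a classifier; how the paper selects or combines threads is not specified here.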

  • Lempel-Ziv Factorization in Linear-Time O(1)-Workspace for Constant Alphabets

    Weijun LIU  

     
    PAPER-Fundamentals of Information Systems

      Publicized:
    2021/08/30
      Vol:
    E104-D No:12
      Page(s):
    2145-2153

    Computing the Lempel-Ziv Factorization (LZ77) of a string is one of the most important problems in computer science. It is widely used in applications such as data compression, text indexing, and pattern discovery, and has already become the heart of many file compressors such as gzip and 7zip. In this paper, we show a linear-time algorithm called Xone for computing the LZ77, which has the same space requirement as BGone, the linear-time LZ77 factorization algorithm with the previous best space requirement. Xone greatly improves the efficiency of BGone. Experiments show that the two versions of Xone, XoneT and XoneSA, are about 27% and 31% faster than BGoneT and BGoneSA, respectively.
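
    For readers unfamiliar with the factorization itself, the following is a deliberately naive sketch of LZ77 factorization. It runs in quadratic time or worse and is not Xone or BGone; it only shows what the output looks like. The factor representation (a single character for a first occurrence, otherwise a `(source_position, length)` pair) and the function names are assumptions chosen for illustration.

```python
def lz77_factorize(text):
    """Naive LZ77 factorization (illustration only, not linear time).

    Each factor is either a single character that has not occurred before,
    or a (source_position, length) pair for the longest match that starts
    at an earlier position. Matches may overlap the current position.
    """
    factors = []
    n = len(text)
    i = 0
    while i < n:
        best_len, best_src = 0, -1
        # Try every earlier starting position and keep the longest match.
        for j in range(i):
            k = 0
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            if k > best_len:
                best_len, best_src = k, j
        if best_len == 0:
            factors.append(text[i])
            i += 1
        else:
            factors.append((best_src, best_len))
            i += best_len
    return factors


def lz77_decode(factors):
    """Rebuild the original string from the factors produced above."""
    out = []
    for factor in factors:
        if isinstance(factor, str):
            out.append(factor)
        else:
            src, length = factor
            # Copy one character at a time so overlapping matches work.
            for k in range(length):
                out.append(out[src + k])
    return "".join(out)
```

    For example, `lz77_factorize("abababab")` yields the two literal characters followed by one self-overlapping copy factor, and `lz77_decode` reverses the process.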

  • Neural Incremental Speech Recognition Toward Real-Time Machine Speech Translation

    Sashi NOVITASARI  Sakriani SAKTI  Satoshi NAKAMURA  

     
    PAPER-Speech and Hearing

      Publicized:
    2021/08/27
      Vol:
    E104-D No:12
      Page(s):
    2195-2208

    Real-time machine speech translation systems mimic human interpreters and translate incoming speech from a source language to the target language in real-time. Such systems can be achieved by performing low-latency processing in the ASR (automatic speech recognition) module before passing the output to the MT (machine translation) and TTS (text-to-speech synthesis) modules. Although several studies recently proposed sequence mechanisms for neural incremental ASR (ISR), these frameworks have a more complicated training mechanism than the standard attention-based ASR because they have to decide the incremental step and learn the alignment between speech and text. In this paper, we propose attention-transfer ISR (AT-ISR) that learns the knowledge from attention-based non-incremental ASR for low-delay end-to-end speech recognition. ISR comes with a trade-off between delay and performance, so we investigate how to reduce AT-ISR delay without a significant performance drop. Our experiment shows that AT-ISR achieves a comparable performance to the non-incremental ASR when the incremental recognition begins after the speech utterance reaches 25% of the complete utterance length. Additional experiments to investigate the effect of ISR on translation tasks are also performed. The focus is to find the optimum granularity of the output unit. The results reveal that our end-to-end subword-level ISR resulted in the best translation quality with the lowest WER and the lowest uncovered-word rate.

  • Efficient Reboot-Based Recovery of In-Memory Databases

    Yuto JUMONJI  Hiroshi YAMADA  

     
    PAPER-Dependable Computing

      Publicized:
    2021/08/26
      Vol:
    E104-D No:12
      Page(s):
    2164-2172

    Reboot-based recovery is a simple but powerful method to recover applications from failures and unstable states. However, applying it to a new type of application, in-memory databases (DBs), is challenging. Unlike with legacy applications, rebooting in-memory DBs loses memory objects, including key-value pairs and DB blocks, which must then be restored, causing severe performance degradation after the reboot. This paper presents an approach that allows us to perform reboot-based recovery of in-memory DBs with lower performance degradation. Our key insight is to decouple data content objects from all the memory objects. Our approach treats data items as data content objects, preserves data content objects in memory across reboots, and makes restarted in-memory DBs attach them. To show the effectiveness of our approach, we apply the idea to two real-world DBs, MyRocks and memcached. The prototypes successfully mitigate performance degradation after their reboot-based recovery.

  • Performance Modeling of Bitcoin Blockchain: Mining Mechanism and Transaction-Confirmation Process Open Access

    Shoji KASAHARA  

     
    INVITED PAPER

      Publicized:
    2021/06/09
      Vol:
    E104-B No:12
      Page(s):
    1455-1464

    Bitcoin is one of the popular cryptocurrencies widely used around the world, and its blockchain technology has attracted considerable attention. In the Bitcoin system, it has been reported that transactions are prioritized according to transaction fees, and that transactions with high priorities are likely to be confirmed faster than those with low priorities. In this paper, we consider performance modeling of the Bitcoin-blockchain system in order to characterize the transaction-confirmation time. We first introduce the Bitcoin system, focusing on proof-of-work, the consensus mechanism of the Bitcoin blockchain. Then, we show some queueing models and their analytical results, discussing the implications and insights obtained from the queueing models.

  • Design of Ultra-Thin Wave Absorber with Square Patch Array Considering Electromagnetic Coupling between Patch Array and Back-Metal

    Sota MATSUMOTO  Ryosuke SUGA  Kiyomichi ARAKI  Osamu HASHIMOTO  

     
    BRIEF PAPER-Electromagnetic Theory

      Publicized:
    2021/06/07
      Vol:
    E104-C No:12
      Page(s):
    681-684

    In this paper, an ultra-thin wave absorber using a resistive patch array placed closely in front of a back-metal is designed. A large positive susceptance is required for the patch array to cancel out the large negative input susceptance of the short-circuited ultra-thin spacer behind the array. It is found that the array needs a gap of 1 mm, a sheet resistance of less than 20 Ω/sq., and a patch width of more than 15 mm to obtain zero input susceptance of the absorber with the 1/30 wavelength spacer. Moreover, these parameters were designed considering the electromagnetic coupling between the array and back-metal, and square patch array absorbers with thicknesses from 1/30 to 1/150 wavelength were designed.

  • Statistical-Mechanical Analysis of Adaptive Volterra Filter with the LMS Algorithm Open Access

    Kimiko MOTONAKA  Tomoya KOSEKI  Yoshinobu KAJIKAWA  Seiji MIYOSHI  

     
    PAPER-Digital Signal Processing

      Publicized:
    2021/06/01
      Vol:
    E104-A No:12
      Page(s):
    1665-1674

    The Volterra filter is one of the digital filters that can describe nonlinearity. In this paper, we analyze the dynamic behaviors of an adaptive signal-processing system including the Volterra filter by a statistical-mechanical method. On the basis of the self-averaging property that holds when the tapped delay line is assumed to be infinitely long, we derive simultaneous differential equations in a deterministic and closed form, which describe the behaviors of macroscopic variables. We obtain the exact solution by solving the equations analytically. In addition, the validity of the theory derived is confirmed by comparison with numerical simulations.

  • CLAHE Implementation and Evaluation on a Low-End FPGA Board by High-Level Synthesis

    Koki HONDA  Kaijie WEI  Masatoshi ARAI  Hideharu AMANO  

     
    PAPER

      Publicized:
    2021/07/12
      Vol:
    E104-D No:12
      Page(s):
    2048-2056

    Automobile companies have been trying to replace the side mirrors of cars with small cameras to reduce air resistance. This makes it possible to apply image processing to improve the quality of the image. Contrast Limited Adaptive Histogram Equalization (CLAHE) is one such technique for improving the image quality of the side mirror camera, but it requires high computation performance. Here, an implementation method of CLAHE on a low-end FPGA board by high-level synthesis is proposed. CLAHE has two main processing parts: cumulative distribution function (CDF) generation and bilinear interpolation. During the CDF generation, the effect of increasing the loop initiation interval can be greatly reduced by placing multiple Processing Elements (PEs), and during the interpolation, latency and BRAM usage were reduced by revising how the CDF is held and the calculation method. Finally, by connecting each module with streaming interfaces, using data flow pragmas, overlapping processing, and hiding data transfer, our HLS implementation achieved a result comparable to that of HDL. We parameterized the components of the algorithm so that the number of tiles and the size of the image can be easily changed. The source code for this research can be downloaded from https://github.com/kokihonda/fpga_clahe.
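
    The CDF generation step of CLAHE can be sketched in software for a single tile as follows. This is a plain Python illustration under stated assumptions, not the paper's HLS design: the function name `clipped_cdf`, the flat list input, and the single-pass uniform redistribution of clipped counts (a common simplification, after which a bin can end up slightly above the limit) are all choices made here for illustration.

```python
def clipped_cdf(tile, clip_limit, levels=256):
    """Build the contrast-limited CDF of one tile.

    `tile` is a flat list of integer pixel values in [0, levels).
    Counts above `clip_limit` are cut off and spread evenly over all bins.
    """
    # 1. Histogram of the tile.
    hist = [0] * levels
    for pixel in tile:
        hist[pixel] += 1

    # 2. Clip each bin and collect the excess counts.
    excess = 0
    for value in range(levels):
        if hist[value] > clip_limit:
            excess += hist[value] - clip_limit
            hist[value] = clip_limit

    # 3. Redistribute the excess uniformly (single pass); the first
    #    `remainder` bins receive one extra count.
    share, remainder = divmod(excess, levels)
    for value in range(levels):
        hist[value] += share + (1 if value < remainder else 0)

    # 4. Cumulative sum gives the CDF used as the tile's mapping function.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    return cdf


# Tiny example with 4 gray levels instead of 256.
example_cdf = clipped_cdf([0] * 6 + [1] * 2, clip_limit=3, levels=4)
```

    In full CLAHE, each output pixel is then obtained by bilinearly interpolating the mappings of the neighboring tiles; that step is omitted here.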

  • An Improved U-Net Architecture for Image Dehazing

    Wenyi GE  Yi LIN  Zhitao WANG  Guigui WANG  Shihan TAN  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2021/09/14
      Vol:
    E104-D No:12
      Page(s):
    2218-2225

    In this paper, we present a simple yet powerful deep neural network for natural image dehazing. The proposed method is based on the U-Net architecture, with several design changes to improve it. First, we replace Batch Normalization with Group Normalization to solve the problem of insufficient batch size due to hardware limitations. Second, we introduce FReLU activation into the U-Net block, which can capture complicated visual layouts with regular convolutions. Experimental results on public benchmarks demonstrate the effectiveness of the modified components. On the SOTS Indoor and Outdoor datasets, it obtains PSNR of 32.23 and 31.64 respectively, which is comparable to state-of-the-art methods. The code will be made publicly available online soon.

  • Analysis on Asymptotic Optimality of Round-Robin Scheduling for Minimizing Age of Information with HARQ Open Access

    Zhiyuan JIANG  Yijie HUANG  Shunqing ZHANG  Shugong XU  

     
    INVITED PAPER

      Publicized:
    2021/07/01
      Vol:
    E104-B No:12
      Page(s):
    1465-1478

    In a heterogeneous unreliable multiaccess network, wherein terminals share a common wireless channel with distinct error probabilities, existing works have shown that a persistent round-robin (RR-P) scheduling policy can be arbitrarily worse than the optimum in terms of Age of Information (AoI) under standard Automatic Repeat reQuest (ARQ). In this paper, practical Hybrid ARQ (HARQ) schemes which are widely-used in today's wireless networks are considered. We show that RR-P is very close to optimum with asymptotically many terminals in this case, by explicitly deriving tight, closed-form AoI gaps between optimum and achievable AoI by RR-P. In particular, it is rigorously proved that for RR-P, under HARQ models concerning fading channels (resp. finite-blocklength regime), the relative AoI gap compared with the optimum is within a constant of 6.4% (resp. 6.2% with error exponential decay rate of 0.5). In addition, RR-P enjoys the distinctive advantage of implementation simplicity with channel-unaware and easy-to-decentralize operations, making it favorable in practice. A further investigation considering constraint imposed on the number of retransmissions is presented. The performance gap is indicated through numerical simulations.

  • Experimental Demonstration of a Hard-Type Oscillator Using a Resonant Tunneling Diode and Its Comparison with a Soft-Type Oscillator

    Koichi MAEZAWA  Tatsuo ITO  Masayuki MORI  

     
    BRIEF PAPER-Semiconductor Materials and Devices

      Publicized:
    2021/06/07
      Vol:
    E104-C No:12
      Page(s):
    685-688

    A hard-type oscillator is defined as an oscillator having stable fixed points within a stable limit cycle. For resonant tunneling diode (RTD) oscillators, using a hard-type configuration has the significant advantage that it can suppress spurious oscillations in a bias line. We have fabricated hard-type oscillators using an InGaAs-based RTD and demonstrated proper operation. Furthermore, the oscillating properties have been compared with those of a soft-type oscillator having the same parameters. It has been demonstrated that the same level of phase noise can be obtained with a much smaller power consumption of approximately 1/20.

  • A Design Methodology of Wi-Fi RTT Ranging for Lateration

    Tetsuya MANABE  Koichi AIHARA  Naoki KOJIMA  Yusuke HIRAYAMA  Taichi SUZUKI  

     
    PAPER-Intelligent Transport System

      Publicized:
    2021/06/01
      Vol:
    E104-A No:12
      Page(s):
    1704-1713

    This paper presents a design methodology of Wi-Fi round-trip time (RTT) ranging for lateration through performance evaluation experiments. Wi-Fi RTT-based lateration needs to operate multiple access points (APs) at the same time. However, the relationship between the number of APs in operation and ranging performance has not been clarified in conventional research. We therefore evaluate the ranging performance of Wi-Fi RTT for lateration, focusing on the number of APs and channel-usage conditions. As a result, we confirm that the ranging result acquisition rate decreases as the number of simultaneously operated APs and/or the channel-usage rate increases. In addition, based on a positioning performance comparison between Wi-Fi RTT-based lateration and the Wi-Fi fingerprint method, we clarify the points to note where positioning by Wi-Fi RTT-based lateration differs from conventional radio-intensity-based positioning. Consequently, we show a design methodology of Wi-Fi RTT ranging for lateration in terms of the following three points: the important indicators for evaluation, the strictness of channel selection, and the number of APs to use. The design methodology will help to realize high-quality location-based services.

  • Weighted PCA-LDA Based Color Quantization Method Suppressing Saturation Decrease

    Seiichi KOJIMA  Momoka HARADA  Yoshiaki UEDA  Noriaki SUETAKE  

     
    LETTER-Image

      Publicized:
    2021/06/02
      Vol:
    E104-A No:12
      Page(s):
    1728-1732

    In this letter, we propose a new color quantization method suppressing saturation decrease. In the proposed method, saturation-based weight and intensity-based weight are used so that vivid colors are selected as the representative colors preferentially. Experiments show that the proposed method tends to select vivid colors even if they occupy only a small area in the image.

  • GECNN for Weakly Supervised Semantic Segmentation of 3D Point Clouds

    Zifen HE  Shouye ZHU  Ying HUANG  Yinhui ZHANG  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/09/24
      Vol:
    E104-D No:12
      Page(s):
    2237-2243

    This paper presents a novel method for weakly supervised semantic segmentation of 3D point clouds using a novel graph and edge convolutional neural network (GECNN) with labels for only 1% and 10% of the points. Our general framework facilitates semantic segmentation by encoding both global and local scale features via a parallel graph and edge aggregation scheme. More specifically, global scale graph structure cues of point clouds are captured by a graph convolutional neural network, which is propagated from pairwise affinity representation over the whole graph established in a d-dimensional feature embedding space. We integrate local scale features derived from a dynamic edge feature aggregation convolutional neural network, which allows us to fuse both global and local cues of 3D point clouds. The proposed GECNN model is trained by using a comprehensive objective which consists of incomplete, inexact, self-supervision and smoothness constraints based on partially labeled points. The proposed approach enforces global and local consistency constraints directly on the objective losses. It inherently handles the challenges of segmenting sparse 3D point clouds with limited annotations in a large scale point cloud space. Our experiments on the ShapeNet and S3DIS benchmarks demonstrate the effectiveness of the proposed approach for efficient (within 20 epochs) learning of large scale point cloud semantics despite very limited labels.

  • Multimodal-Based Stream Integrated Neural Networks for Pain Assessment

    Ruicong ZHI  Caixia ZHOU  Junwei YU  Tingting LI  Ghada ZAMZMI  

     
    PAPER-Human-computer Interaction

      Publicized:
    2021/09/10
      Vol:
    E104-D No:12
      Page(s):
    2184-2194

    Pain is an essential physiological phenomenon of human beings. Accurate assessment of pain is important to develop proper treatment. Although self-report method is the gold standard in pain assessment, it is not applicable to individuals with communicative impairment. Non-verbal pain indicators such as pain related facial expressions and changes in physiological parameters could provide valuable insights for pain assessment. In this paper, we propose a multimodal-based Stream Integrated Neural Network with Different Frame Rates (SINN) that combines facial expression and biomedical signals for automatic pain assessment. The main contributions of this research are threefold. (1) There are four-stream inputs of the SINN for facial expression feature extraction. The variant facial features are integrated with biomedical features, and the joint features are utilized for pain assessment. (2) The dynamic facial features are learned in both implicit and explicit manners to better represent the facial changes that occur during pain experience. (3) Multiple modalities are utilized to identify various pain states, including facial expression and biomedical signals. The experiments are conducted on publicly available pain datasets, and the performance is compared with several deep learning models. The experimental results illustrate the superiority of the proposed model, and it achieves the highest accuracy of 68.2%, which is up to 5% higher than the basic deep learning models on pain assessment with binary classification.

  • Weight Sparseness for a Feature-Map-Split-CNN Toward Low-Cost Embedded FPGAs

    Akira JINGUJI  Shimpei SATO  Hiroki NAKAHARA  

     
    PAPER

      Publicized:
    2021/09/27
      Vol:
    E104-D No:12
      Page(s):
    2040-2047

    Convolutional neural networks (CNNs) have a high recognition rate in image recognition and are used in embedded systems such as smartphones, robots and self-driving cars. Low-end FPGAs are candidates for embedded image recognition platforms because they achieve real-time performance at a low cost. However, a CNN has a large number of parameters called weights and internal data called feature maps, which pose a challenge for FPGAs in terms of performance and memory capacity. To solve these problems, we exploit a split-CNN and weight sparseness. The split-CNN reduces the memory footprint by splitting the feature map into smaller patches and allows the feature map to be stored in the FPGA's high-throughput on-chip memory. Weight sparseness reduces computational costs and achieves even higher performance. We designed a dedicated architecture of a sparse CNN and a memory buffering scheduling for a split-CNN and implemented this on the PYNQ-Z1 FPGA board with a low-end FPGA. An experiment on classification using VGG16 shows that our implementation is 3.1 times faster than the GPU, and 5.4 times faster than an existing FPGA implementation.

  • Achieving Ultra-Low Latency for Network Coding-Aware Multicast Fronthaul Transmission in Cache-Enabled C-RANs

    Qinglong LIU  Chongfu ZHANG  

     
    LETTER-Coding Theory

      Publicized:
    2021/06/15
      Vol:
    E104-A No:12
      Page(s):
    1723-1727

    In cloud radio access networks (C-RANs) architecture, the Hybrid Automatic Repeat Request (HARQ) protocol imposes a strict limit on the latency between the baseband unit (BBU) pool and the remote radio head (RRH), which is a key challenge in the adoption of C-RANs. In this letter, we propose a joint edge caching and network coding strategy (ENC) in the C-RANs with multicast fronthaul to improve the performance of HARQ and thus achieve ultra-low latency in 5G cellular systems. We formulate the edge caching design as an optimization problem for maximizing caching utility so as to obtain the optimal caching time. Then, for real-time data flows with different latency constraints, we propose a scheduling policy based on network coding group (NCG) to maximize coding opportunities and thus improve the overall latency performance of multicast fronthaul transmission. We evaluate the performance of ENC by conducting simulation experiments based on NS-3. Numerical results show that ENC can efficiently reduce the delivery delay.

  • Formalization and Analysis of Ceph Using Process Algebra

    Ran LI  Huibiao ZHU  Jiaqi YIN  

     
    PAPER-Software System

      Publicized:
    2021/09/28
      Vol:
    E104-D No:12
      Page(s):
    2154-2163

    Ceph is an object-based parallel distributed file system that provides excellent performance, reliability, and scalability. Additionally, Ceph provides its Cephx authentication system to authenticate users, so that it can identify users and realize authentication. In this paper, we first model the basic architecture of Ceph using process algebra CSP (Communicating Sequential Processes). With the help of the model checker PAT (Process Analysis Toolkit), we feed the constructed model to PAT and then verify several related properties, including Deadlock Freedom, Data Reachability, Data Write Integrity, Data Consistency and Authentication. The verification results show that the original model cannot cater to the Authentication property. Therefore, we formalize a new model of Ceph where Cephx is adopted. In the light of the new verification results, it can be found that Cephx satisfies all these properties.

  • Time-Optimal Self-Stabilizing Leader Election on Rings in Population Protocols Open Access

    Daisuke YOKOTA  Yuichi SUDO  Toshimitsu MASUZAWA  

     
    PAPER-Algorithms and Data Structures

      Publicized:
    2021/06/03
      Vol:
    E104-A No:12
      Page(s):
    1675-1684

    We propose a self-stabilizing leader election protocol on directed rings in the model of population protocols. Given an upper bound N on the population size n, the proposed protocol elects a unique leader within O(nN) expected steps starting from any configuration and uses O(N) states. This convergence time is optimal if a given upper bound N is asymptotically tight, i.e., N=O(n).
