
IEICE TRANSACTIONS on Information

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • Article Influence

    0.1

  • CiteScore

    1.4


Volume E104-D No.11 (Publication Date: 2021/11/01)

    Special Section on Next-generation Security Applications and Practice
  • FOREWORD Open Access

    Ilsun YOU  

     
    FOREWORD

      Page(s):
    1793-1794
  • MPTCP-meLearning: A Multi-Expert Learning-Based MPTCP Extension to Enhance Multipathing Robustness against Network Attacks

    Yuanlong CAO  Ruiwen JI  Lejun JI  Xun SHAO  Gang LEI  Hao WANG  

     
    PAPER

      Publicized:
    2021/07/08
      Page(s):
    1795-1804

    With multiple network interfaces now widely equipped in modern mobile devices, Multipath TCP (MPTCP) is increasingly becoming the preferred transport technique, since it can use multiple network interfaces simultaneously to spread data across multiple network paths for throughput improvement. However, MPTCP performance can be seriously degraded by the use of a poor-performing path in multipath transmission, especially in the presence of network attacks, under which an MPTCP path can abruptly and frequently become underperforming. In this paper, we propose a multi-expert learning-based MPTCP variant, called MPTCP-meLearning, to enhance MPTCP performance robustness against network attacks. MPTCP-meLearning introduces a new kind of predictor that can potentially achieve better path-quality prediction accuracy by leveraging a group of representative formula-based predictors for each of the multiple paths. MPTCP-meLearning also includes a novel mechanism to intelligently manage multiple paths in order to help mitigate the out-of-order reception and receive buffer blocking problems. Experimental results demonstrate that MPTCP-meLearning achieves better transmission performance and quality of service than the baseline MPTCP scheme.
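
    The multi-expert predictor is not specified in detail in the abstract. As a rough, hedged illustration of the general idea only (not the authors' implementation), the sketch below blends several simple formula-based path-quality predictors and re-weights them after each observation; all names, the experts, and the update rule are assumptions made for this example.

      import math

      class MultiExpertPredictor:
          """Toy multi-expert predictor: a weighted blend of simple formula-based
          experts, with exponential (Hedge-style) weight updates after each sample."""

          def __init__(self, experts, eta=0.5):
              self.experts = experts            # callables: history -> prediction
              self.weights = [1.0] * len(experts)
              self.eta = eta                    # learning rate for weight updates

          def predict(self, history):
              preds = [e(history) for e in self.experts]
              total = sum(self.weights)
              blended = sum(w * p for w, p in zip(self.weights, preds)) / total
              return blended, preds

          def update(self, preds, observed):
              # penalize each expert in proportion to its squared error
              for i, p in enumerate(preds):
                  self.weights[i] *= math.exp(-self.eta * (p - observed) ** 2)

      # Two toy experts: the last observed path-quality sample, and a short moving average.
      experts = [lambda h: h[-1], lambda h: sum(h[-3:]) / len(h[-3:])]
      predictor = MultiExpertPredictor(experts)
      history = [10.0, 12.0, 11.0]
      blended, preds = predictor.predict(history)
      predictor.update(preds, observed=11.5)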

  • A Design of Automated Vulnerability Information Management System for Secure Use of Internet-Connected Devices Based on Internet-Wide Scanning Methods

    Taeeun KIM  Hwankuk KIM  

     
    PAPER

      Publicized:
    2021/08/02
      Page(s):
    1805-1813

    Any Internet-connected device is vulnerable to being hacked and misused. Hackers can find vulnerable IoT devices, infect them with malicious code, build massive IoT botnets, and remotely control the devices through C&C servers. Many studies have attempted to apply various security features on IoT devices to prevent them from being exploited by attackers. However, unlike high-performance PCs, IoT devices are lightweight, low-power, and low-cost, and their limited processing and memory capacity makes it difficult to install heavy security functions. Instead of applying security functions on the devices themselves, Internet-wide scanning studies (e.g., Shodan) have attempted to quickly discover massive numbers of weakly secured IoT devices and take security measures. However, remote scanning over the Internet also has practical limitations, such as low accuracy in analyzing security vulnerabilities due to a lack of device information or traffic being filtered by network security devices. In this paper, we propose a system for remotely collecting information from Internet-connected devices and using scanning techniques to identify and manage vulnerability information of IoT devices. The proposed system improves the open-source Zmap engine to solve practical problems that arise when scanning across the real Internet. As a result, performance measurements show equal or superior results compared with previous Shodan- and Zmap-based scanning.
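
    The paper's scanner builds on the Zmap engine, which is not reproduced here. As a minimal, unrelated illustration of the kind of reachability and banner information an Internet-wide scanner collects per host and port, the sketch below probes a single TCP port; the address is a placeholder from the documentation range, and only hosts you are authorized to scan should be probed.

      import socket

      def probe(host, port, timeout=3.0):
          """Check whether a TCP port answers and grab a short service banner."""
          try:
              with socket.create_connection((host, port), timeout=timeout) as s:
                  s.settimeout(timeout)
                  try:
                      banner = s.recv(256)   # many services send a greeting first
                  except socket.timeout:
                      banner = b""
                  return True, banner.decode(errors="replace").strip()
          except OSError:
              return False, ""

      # 192.0.2.0/24 is reserved for documentation; replace with an authorized target.
      reachable, banner = probe("192.0.2.10", 22)
      print(reachable, banner)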

  • An Effective Feature Extraction Mechanism for Intrusion Detection System

    Cheng-Chung KUO  Ding-Kai TSENG  Chun-Wei TSAI  Chu-Sing YANG  

     
    PAPER

      Publicized:
    2021/07/27
      Page(s):
    1814-1827

    The development of an efficient detection mechanism for malicious network traffic has been a critical research topic in the field of network security in recent years. This study implemented an intrusion-detection system (IDS) based on a machine learning algorithm that periodically converts and analyzes real network traffic in a campus environment in almost real time. The study focuses on how to improve the detection rate of an IDS and how to detect more attacks on non-well-known ports than a traditional rule-based system. Four new features are used to increase the discriminant accuracy. In addition, an algorithm for balancing the data set was used to construct the training data set, which also enables the learning model to more accurately reflect situations in real environments.
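
    The abstract does not name the balancing algorithm. As one hedged example of what "balancing the data set" can look like (not necessarily the method used in the paper), the sketch below randomly oversamples minority classes until all classes are equally represented; function and variable names are illustrative.

      import random
      from collections import defaultdict

      def balance_by_oversampling(samples, labels, seed=0):
          """Randomly oversample minority classes up to the size of the largest class."""
          rng = random.Random(seed)
          by_class = defaultdict(list)
          for x, y in zip(samples, labels):
              by_class[y].append(x)
          target = max(len(v) for v in by_class.values())
          bal_x, bal_y = [], []
          for y, xs in by_class.items():
              picked = xs + [rng.choice(xs) for _ in range(target - len(xs))]
              bal_x.extend(picked)
              bal_y.extend([y] * target)
          return bal_x, bal_y

      X = [[0.1], [0.2], [0.3], [0.9]]
      y = ["benign", "benign", "benign", "attack"]
      Xb, yb = balance_by_oversampling(X, y)   # "attack" is oversampled to 3 samples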

  • The Uncontrolled Web: Measuring Security Governance on the Web

    Yuta TAKATA  Hiroshi KUMAGAI  Masaki KAMIZONO  

     
    PAPER

      Publicized:
    2021/07/08
      Page(s):
    1828-1838

    While websites are becoming more and more complex every day, the difficulty of managing them is also increasing. Regular maintenance of such complex websites is important for strengthening their security and improving their cyber resilience. However, misconfigurations and vulnerabilities are still being discovered on some pages of websites, and cyberattacks against them are never-ending. In this paper, we take the novel approach of applying the concept of security governance to websites and, as part of this, measure the consistency of software settings and versions used on these websites. More precisely, we analyze multiple web pages under the same domain name and identify differences in the security settings of HTTP headers and software versions among them. After analyzing over 8,000 websites of popular global organizations, our measurement results show that over half of the tested websites exhibit differences. For example, we found websites running on a web server whose reported version changes depending on access, and websites using a JavaScript library with different versions across over half of the tested pages. We identify the causes of such governance failures and propose improvement plans.
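
    The authors' measurement pipeline is not reproduced here. As a small, hedged sketch of the kind of per-page comparison involved (collecting security-relevant HTTP response headers for several pages under one domain and flagging inconsistent values), consider the following; the URLs are placeholders and the header list is only an example.

      import requests

      HEADERS = ["Strict-Transport-Security", "Content-Security-Policy",
                 "X-Frame-Options", "Server"]

      def header_profile(url):
          """Collect a small profile of security-relevant response headers."""
          resp = requests.get(url, timeout=10)
          return {h: resp.headers.get(h) for h in HEADERS}

      def find_inconsistencies(urls):
          """Report headers whose values differ across pages of the same site."""
          profiles = {u: header_profile(u) for u in urls}
          return {h: {u: profiles[u][h] for u in urls}
                  for h in HEADERS
                  if len({profiles[u][h] for u in urls}) > 1}

      pages = ["https://example.com/", "https://example.com/about",
               "https://example.com/careers"]
      print(find_inconsistencies(pages))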

  • PDPM: A Patient-Defined Data Privacy Management with Nudge Theory in Decentralized E-Health Environments

    Seolah JANG  Sandi RAHMADIKA  Sang Uk SHIN  Kyung-Hyune RHEE  

     
    PAPER

      Publicized:
    2021/08/24
      Page(s):
    1839-1849

    A private decentralized e-health environment, empowered by blockchain technology, allows authorized healthcare entities to legitimately access a patient's medical data without relying on a centralized node. Every activity of authorized entities is recorded immutably in blockchain transactions. In terms of privacy, the e-health system keeps a default privacy option as the initial state for every patient, since patients may frequently customize their medical data over time for several purposes. Moreover, adjustments to a patient's privacy settings are often made solely on the patient's initiative, without any recommendation from doctors or other stakeholders. To tackle these issues, we design, implement, and evaluate PDPM, a patient-defined data privacy management scheme utilizing nudge theory for decentralized e-health systems. Patients can restrict the visibility of their medical records to certain parties. Data privacy management is dynamic and can be executed on the blockchain via the smart contract feature. Tamper-proof user-defined data privacy can resolve disputes between e-health entities related to privacy management and adjustments; the authorized entities cannot deny any changes, since every activity is recorded in the ledgers. Meanwhile, the nudge theory technique provides patients with privacy recommendations based on their behavioural activities, although the final decision rests with the patient. Finally, we demonstrate how to use PDPM to realize user-defined data privacy management in decentralized e-health environments.

  • Analysis against Security Issues of Voice over 5G

    Hyungjin CHO  Seongmin PARK  Youngkwon PARK  Bomin CHOI  Dowon KIM  Kangbin YIM  

     
    PAPER

      Publicized:
    2021/07/13
      Page(s):
    1850-1856

    As of February 2021, with competition for the commercialization of 5G mobile communication increasing, 5G SA networks and Vo5G are expected to be commercialized soon. 5G mobile communication aims to provide a transmission speed of 20 Gbps, which is 20 times faster than 4G, connection of at least 1 million devices per 1 km², and a transmission delay of 1 ms, which is 10 times shorter than 4G. Meeting these goals required various technological developments, and technologies such as Massive MIMO (Multiple-Input and Multiple-Output), mmWave, and small cell networks were developed and applied in the 5G access network area. However, in the core network area, the components constituting the LTE (Long Term Evolution) core network are utilized as they are in the NSA (Non-Standalone) architecture, and changes have occurred only in the SA (Standalone) architecture. Also, in the network area for providing voice service, the IMS (IP Multimedia Subsystem) infrastructure is still used in the SA architecture. The issue here is that while 5G mobile communication is evolving openly to provide various services, its security elements remain vulnerable to various cyber-attacks because they maintain the same form as before. Therefore, in this paper, we examine what the network standard for 5G voice service provision consists of and which of its elements are vulnerable in terms of security. We then suggest possible attack scenarios exploiting these security issues, consider whether such problems can actually occur, and discuss countermeasures.

  • Verifiable Credential Proof Generation and Verification Model for Decentralized SSI-Based Credit Scoring Data

    Kang Woo CHO  Byeong-Gyu JEONG  Sang Uk SHIN  

     
    PAPER

      Publicized:
    2021/07/27
      Page(s):
    1857-1868

    The continuous development of the mobile computing environment has led to the emergence of fintech to enable convenient financial transactions in this environment. Previously proposed financial identity services mostly adopted centralized servers, which are prone to single-point-of-failure problems and performance bottlenecks. Blockchain-based self-sovereign identity (SSI), which emerged to address this problem, is a technology that solves the problems of centralization and allows decentralized identification. However, the verifiable credential (VC), the unit of SSI data transactions, guarantees an unlimited right to erasure for the sake of self-sovereignty. This does not suit the specific nature of financial transaction networks, which require the right to erasure to be restricted for credit evaluation. This paper proposes a model for VC generation and revocation verification for credit scoring data. The proposed model includes a double zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK) proof in the VC generation process between the holder and the issuer. In addition, cross-revocation verification takes place between the holder and the verifier. As a result, the proposed model builds a trust platform among the holder, issuer, and verifier while maintaining the decentralized SSI attributes and focusing on the VC life cycle. The model also improves the way credit evaluation data are processed as VCs by granting opt-in and a special right to erasure.

  • An Efficient Public Verifiable Certificateless Multi-Receiver Signcryption Scheme for IoT Environments

    Dae-Hwi LEE  Won-Bin KIM  Deahee SEO  Im-Yeong LEE  

     
    PAPER

      Publicized:
    2021/07/14
      Page(s):
    1869-1879

    Lightweight cryptographic systems for services delivered by the recently developed Internet of Things (IoT) are being continuously researched. However, existing Public Key Infrastructure (PKI)-based cryptographic algorithms are difficult to apply to IoT services delivered using lightweight devices. Therefore, encryption, authentication, and signature systems based on Certificateless Public Key Cryptography (CL-PKC), which are lightweight because they do not use the certificates of existing PKI-based cryptographic algorithms, are being studied. Of the various public key cryptosystems, signcryption is efficient, and ensures integrity and confidentiality. Recently, CL-based signcryption (CL-SC) schemes have been intensively studied, and a multi-receiver signcryption (MRSC) protocol for environments with multiple receivers, i.e., not involving end-to-end communication, has been proposed. However, when using signcryption, confidentiality and integrity may be violated by public key replacement attacks. In this paper, we develop an efficient CL-based MRSC (CL-MRSC) scheme using CL-PKC for IoT environments. Existing signcryption schemes do not offer public verifiability, which is required if digital signatures are used, because only the receiver can verify the validity of the message; sender authenticity is not guaranteed by a third party. Therefore, we propose a CL-MRSC scheme in which communication participants (such as the gateways through which messages are transmitted) can efficiently and publicly verify the validity of encrypted messages.

  • Leakage-Resilient and Proactive Authenticated Key Exchange (LRP-AKE), Reconsidered

    SeongHan SHIN  

     
    PAPER

      Publicized:
    2021/08/05
      Page(s):
    1880-1893

    In [31], Shin et al. proposed a Leakage-Resilient and Proactive Authenticated Key Exchange (LRP-AKE) protocol for credential services that provides not only a higher level of security against leakage of stored secrets but also secrecy of the private key with respect to the involved server. In this paper, we discuss a problem in the security proof of the LRP-AKE protocol, and then propose a modified LRP-AKE protocol with a simple and effective countermeasure for the problem. We also formally prove the AKE security and mutual authentication of the entire modified LRP-AKE protocol. In addition, we describe several extensions of the (modified) LRP-AKE protocol, including 1) the synchronization issue between the client's and server's stored secrets; 2) a randomized ID for the provision of client privacy; and 3) a solution for preventing server compromise-impersonation attacks. Finally, we evaluate the performance overhead of the LRP-AKE protocol and show its test vectors. From the performance evaluation, we confirm that the LRP-AKE protocol has almost the same efficiency as the (plain) Diffie-Hellman protocol, which does not provide authentication at all.

  • Provable-Security Analysis of Authenticated Encryption Based on Lesamnta-LW in the Ideal Cipher Model

    Shoichi HIROSE  Hidenori KUWAKADO  Hirotaka YOSHIDA  

     
    PAPER

      Publicized:
    2021/07/08
      Page(s):
    1894-1901

    Hirose, Kuwakado and Yoshida proposed a nonce-based authenticated encryption scheme Lae0 based on Lesamnta-LW in 2019. Lesamnta-LW is a block-cipher-based iterated hash function included in the ISO/IEC 29192-5 lightweight hash-function standard. They also showed that Lae0 satisfies both privacy and authenticity if the underlying block cipher is a pseudorandom permutation. Unfortunately, their result implies only about 64-bit security for instantiation with the dedicated block cipher of Lesamnta-LW. In this paper, we analyze the security of Lae0 in the ideal cipher model. Our result implies about 120-bit security for instantiation with the block cipher of Lesamnta-LW.

  • CoLaFUZE: Coverage-Guided and Layout-Aware Fuzzing for Android Drivers

    Tianshi MU  Huabing ZHANG  Jian WANG  Huijuan LI  

     
    PAPER

      Publicized:
    2021/07/28
      Page(s):
    1902-1912

    With the commercialization of 5G mobile phones, Android drivers are increasing rapidly to support a large quantity of newly emerging feature-rich hardware. Most of these drivers are developed by third-party vendors and lack a proper vulnerability review, posing a number of new potential risks to security and privacy. However, the complexity and diversity of Android drivers make traditional analysis methods inefficient. For example, driver-specific argument formats make it difficult for traditional syscall fuzzers to generate valid inputs, pointer-heavy code makes static analysis results incomplete, and pointer casting hides the actual type. Triggering code deep in Android drivers remains challenging. We present CoLaFUZE, a coverage-guided and layout-aware fuzzing tool for automatically generating valid inputs and exploring driver code. CoLaFUZE employs a kernel module to capture the data copy operation and redirect it to the fuzzing engine, ensuring that the correct size of the required data is transferred to the driver. CoLaFUZE leverages dynamic analysis and symbolic execution to recover the driver interfaces and generates valid inputs for the interfaces. Furthermore, the seed mutation module of CoLaFUZE leverages coverage information to achieve better seed quality and expose bugs deep in the driver. We evaluate CoLaFUZE on 5 modern Android mobile phones from the top vendors, including Google, Xiaomi, Samsung, Sony, and Huawei. The results show that CoLaFUZE achieves more code coverage compared with the state-of-the-art fuzzer, and it successfully found 11 vulnerabilities in the testing devices.
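
    CoLaFUZE itself targets kernel drivers and is not reproduced here. As a generic, hedged sketch of the coverage-guided loop that such fuzzers share (mutate a corpus input, run the target, keep mutants that reach new coverage), consider the following; how coverage is actually collected from a driver is implementation-specific and is abstracted behind the target() callable.

      import random

      def mutate(data: bytes, rng):
          """Deliberately minimal mutation operator: replace one random byte."""
          if not data:
              return bytes([rng.randrange(256)])
          i = rng.randrange(len(data))
          return data[:i] + bytes([rng.randrange(256)]) + data[i + 1:]

      def coverage_guided_fuzz(target, seeds, iterations=1000, seed=0):
          """Keep mutants that cover edges not seen before; target(input) -> set of edges."""
          rng = random.Random(seed)
          corpus = list(seeds)
          seen = set()
          for _ in range(iterations):
              child = mutate(rng.choice(corpus), rng)
              cov = target(child)
              if cov - seen:                     # new coverage discovered
                  seen |= cov
                  corpus.append(child)
          return corpus, seen

      # Toy target: "coverage" is just the set of byte values present in the input.
      corpus, cov = coverage_guided_fuzz(lambda b: set(b), [b"seed"], iterations=500)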

  • Regular Section
  • Solving 3D Container Loading Problems Using Physics Simulation for Genetic Algorithm Evaluation

    Shuhei NISHIYAMA  Chonho LEE  Tomohiro MASHITA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/08/06
      Page(s):
    1913-1922

    In this work, an optimization method for the 3D container loading problem with multiple constraints is proposed. The method consists of a genetic algorithm to generate an arrangement of cargo and a fitness evaluation using a physics simulation. The fitness function considers not only the maximization of container density but also several constraints, such as the weight, stackability, fragility, and orientation of the cargo pieces. We employ a container-shaking simulation in the fitness evaluation to include the effects of these constraints during loading and transportation. We verified that the proposed method successfully provides an optimal cargo arrangement for small-scale problems with about 10 pieces of cargo.
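
    The paper couples the GA with a physics (container-shaking) simulation for fitness evaluation; that simulation is not reproduced here. The hedged sketch below shows only the surrounding GA skeleton (tournament selection, crossover, mutation) with a toy fitness function standing in for the simulated evaluation; all operators and parameters are illustrative assumptions.

      import random

      def genetic_algorithm(fitness, random_individual, crossover, mutate,
                            pop_size=30, generations=50, seed=0):
          """Plain generational GA with tournament selection. In the paper's setting,
          fitness would wrap the physics-based evaluation; here it is any callable
          returning a score to maximize."""
          rng = random.Random(seed)
          pop = [random_individual(rng) for _ in range(pop_size)]
          for _ in range(generations):
              scored = [(fitness(ind), ind) for ind in pop]
              def pick():                        # tournament of size 3
                  return max(rng.sample(scored, 3))[1]
              pop = [mutate(crossover(pick(), pick(), rng), rng)
                     for _ in range(pop_size)]
          return max(pop, key=fitness)

      # Toy stand-in: an "arrangement" is a permutation of cargo indices and the
      # "simulation" simply rewards sorted order (purely illustrative).
      N = 8

      def rand_perm(rng):
          return rng.sample(range(N), N)

      def order_crossover(a, b, rng):
          head = a[:N // 2]
          return head + [x for x in b if x not in head]

      def swap_mutation(p, rng):
          q = list(p)
          i, j = rng.randrange(N), rng.randrange(N)
          q[i], q[j] = q[j], q[i]
          return q

      best = genetic_algorithm(lambda p: -sum(abs(v - i) for i, v in enumerate(p)),
                               rand_perm, order_crossover, swap_mutation)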

  • Frank-Wolfe Algorithm for Learning SVM-Type Multi-Category Classifiers

    Kenya TAJIMA  Yoshihiro HIROHASHI  Esmeraldo ZARA  Tsuyoshi KATO  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/08/11
      Page(s):
    1923-1929

    The multi-category support vector machine (MC-SVM) is one of the most popular machine learning algorithms. Numerous MC-SVM variants exist, and different optimization algorithms have been developed for the diverse learning machines. In this study, we developed a new optimization algorithm that can be applied to several MC-SVM variants. The algorithm is based on the Frank-Wolfe framework, which requires two subproblems, direction finding and line search, in each iteration. The contribution of this study is the discovery that both subproblems have a closed-form solution if the Frank-Wolfe framework is applied to the dual problem. Additionally, the closed-form solutions of both the direction-finding and line-search subproblems exist even for the Moreau envelopes of the loss functions. We used several large datasets to demonstrate that the proposed optimization algorithm converges rapidly and thereby improves pattern recognition performance.
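
    The paper's closed-form subproblems arise in the MC-SVM dual, which is not reproduced here. As a hedged illustration of the same two per-iteration steps on a simpler problem, the sketch below runs Frank-Wolfe on a quadratic objective over the probability simplex, where direction finding (a linear minimization) and the exact line search both have closed forms.

      import numpy as np

      def frank_wolfe_simplex(A, b, iters=200):
          """Minimize f(x) = 0.5 * ||A x - b||^2 over the probability simplex.
          Each iteration solves the two Frank-Wolfe subproblems in closed form:
            direction finding: s = argmin_{s in simplex} <grad f(x), s>  (a vertex)
            line search:       gamma minimizing the quadratic f(x + gamma (s - x))."""
          n = A.shape[1]
          x = np.full(n, 1.0 / n)                  # start at the simplex center
          for _ in range(iters):
              grad = A.T @ (A @ x - b)
              s = np.zeros(n)
              s[np.argmin(grad)] = 1.0             # best vertex of the simplex
              d = s - x
              Ad = A @ d
              denom = Ad @ Ad
              if denom <= 1e-12:
                  break
              gamma = np.clip(-(grad @ d) / denom, 0.0, 1.0)   # exact line search
              x = x + gamma * d
          return x

      rng = np.random.default_rng(0)
      A = rng.standard_normal((20, 5))
      b = rng.standard_normal(20)
      x = frank_wolfe_simplex(A, b)
      print(x, 0.5 * np.linalg.norm(A @ x - b) ** 2)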

  • Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network

    Mariana RODRIGUES MAKIUCHI  Tifani WARNITA  Nakamasa INOUE  Koichi SHINODA  Michitaka YOSHIMURA  Momoko KITAZAWA  Kei FUNAKI  Yoko EGUCHI  Taishiro KISHIMOTO  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/08/03
      Page(s):
    1930-1940

    We propose a non-invasive and cost-effective method to automatically detect dementia utilizing solely speech audio data. We extract paralinguistic features from short speech segments and use Gated Convolutional Neural Networks (GCNNs) to classify them as dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method achieves an accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. On the PROMPT Database, our method achieves an accuracy of 74.7% using 4 seconds of speech data, which improves to 80.8% when we use all of the patient's speech data. Furthermore, we evaluate our method on a three-class classification problem that includes the Mild Cognitive Impairment (MCI) class and achieve an accuracy of 60.6% with 40 seconds of speech data.
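
    The GCNN architecture is not detailed in the abstract. As a hedged sketch of the gating mechanism commonly used in gated convolutional networks (one convolution's output modulated by the sigmoid of a second convolution), consider the block below; channel counts, kernel size, and names are arbitrary, and the block is not the authors' model.

      import torch
      import torch.nn as nn

      class GatedConv1d(nn.Module):
          """Gated convolution block: output = conv(x) * sigmoid(gate(x))."""
          def __init__(self, in_ch, out_ch, kernel_size=3):
              super().__init__()
              pad = kernel_size // 2
              self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, padding=pad)
              self.gate = nn.Conv1d(in_ch, out_ch, kernel_size, padding=pad)

          def forward(self, x):                    # x: (batch, channels, time)
              return self.conv(x) * torch.sigmoid(self.gate(x))

      # Toy usage: 16 paralinguistic feature channels over 100 frames.
      x = torch.randn(8, 16, 100)
      block = GatedConv1d(16, 32)
      print(block(x).shape)                        # torch.Size([8, 32, 100])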

  • Influence of Access to Reading Material during Concept Map Recomposition in Reading Comprehension and Retention

    Pedro GABRIEL FONTELES FURTADO  Tsukasa HIRASHIMA  Nawras KHUDHUR  Aryo PINANDITO  Yusuke HAYASHI  

     
    PAPER-Educational Technology

      Publicized:
    2021/08/02
      Page(s):
    1941-1950

    This study investigated the influence of reading time while building a closed concept map on reading comprehension and retention. It also investigated the effect of having access to the text during closed concept map creation on reading comprehension and retention. Participants from Amazon Mechanical Turk (N=101) read a text, took an after-text test, took part in one of three conditions (“Map & Text”, “Map only”, or “Double Text”), and took an after-activity test, followed by a two-week retention period and one final delayed test. Analysis revealed that higher reading times were associated with better reading comprehension and better retention. Furthermore, when comparing the “Map & Text” condition to the “Map only” condition, short-term reading comprehension was improved, but long-term retention was not. This suggests that having access to the text while building closed concept maps can improve reading comprehension, but long-term learning can only be improved if students invest time accessing both the map and the text.

  • Gait Phase Partitioning and Footprint Detection Using Mutually Constrained Piecewise Linear Approximation with Dynamic Programming

    Makoto YASUKAWA  Yasushi MAKIHARA  Toshinori HOSOI  Masahiro KUBO  Yasushi YAGI  

     
    PAPER-Rehabilitation Engineering and Assistive Technology

      Publicized:
    2021/08/02
      Page(s):
    1951-1962

    Human gait analysis has been widely used in medical and health fields. It is essential to extract spatio-temporal gait features (e.g., single support duration, step length, and toe angle) by partitioning the gait phase and estimating the footprint position/orientation in such fields. Therefore, we propose a method to partition the gait phase given a foot position sequence using mutually constrained piecewise linear approximation with dynamic programming, which not only represents normal gait well but also pathological gait without training data. We also propose a method to detect footprints by accumulating toe edges on the floor plane during stance phases, which enables us to detect footprints more clearly than a conventional method. Finally, we extract four spatial/temporal gait parameters for accuracy evaluation: single support duration, double support duration, toe angle, and step length. We conducted experiments to validate the proposed method using two types of gait patterns, that is, healthy and mimicked hemiplegic gait, from 10 subjects. We confirmed that the proposed method could estimate the spatial/temporal gait parameters more accurately than a conventional skeleton-based method regardless of the gait pattern.

  • A Multi-Task Scheme for Supervised DNN-Based Single-Channel Speech Enhancement by Using Speech Presence Probability as the Secondary Training Target

    Lei WANG  Jie ZHU  Kangbo SUN  

    This paper has been cancelled due to violation of the duplicate submission policy of IEICE Transactions on Information and Systems.
     
    PAPER-Speech and Hearing

      Publicized:
    2021/08/05
      Page(s):
    1963-1970

    To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) have been investigated to estimate different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation techniques. Further, a mask such as the Wiener gain can be estimated directly or derived from the estimated interference power spectral density (PSD) or the estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target to assist the target estimation in the main task. The domain-specific information is shared between the two tasks to learn a more generalizable representation. Since the performance of a multi-task network is sensitive to the weight parameters of the loss function, homoscedastic uncertainty is introduced to adaptively learn the weights, which is shown to outperform the fixed weighting method. Simulation results show that the proposed multi-task scheme improves the overall speech enhancement performance compared with conventional single-task methods, and that the joint direct mask and SPP estimation yields the best performance among all the considered techniques.
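
    The homoscedastic-uncertainty weighting mentioned above is commonly written as below (following Kendall et al.'s multi-task formulation, with learnable uncertainties for the main mask-estimation task and the secondary SPP task); whether the paper uses exactly this form is an assumption.

      % Multi-task loss with learnable homoscedastic uncertainties \sigma_1, \sigma_2
      \mathcal{L}(\theta, \sigma_1, \sigma_2)
        = \frac{1}{2\sigma_1^{2}}\,\mathcal{L}_{\mathrm{mask}}(\theta)
        + \frac{1}{2\sigma_2^{2}}\,\mathcal{L}_{\mathrm{SPP}}(\theta)
        + \log \sigma_1 + \log \sigma_2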

  • DNN-Based Low-Musical-Noise Single-Channel Speech Enhancement Based on Higher-Order-Moments Matching

    Satoshi MIZOGUCHI  Yuki SAITO  Shinnosuke TAKAMICHI  Hiroshi SARUWATARI  

     
    PAPER-Speech and Hearing

      Publicized:
    2021/07/30
      Page(s):
    1971-1980

    We propose deep neural network (DNN)-based speech enhancement that reduces musical noise and achieves better auditory impressions. Musical noise is an artifact generated by nonlinear signal processing that negatively affects auditory impressions. We aim to develop musical-noise-free speech enhancement methods that suppress musical noise generation and produce perceptually comfortable enhanced speech. DNN-based speech enhancement using a soft mask achieves high noise reduction but generates musical noise in non-speech regions. Therefore, we first define kurtosis matching for DNN-based low-musical-noise speech enhancement. Kurtosis is the fourth-order standardized moment and is known to correlate with the amount of musical noise. Kurtosis matching is a penalty term in DNN training that works to reduce the amount of musical noise. We further extend this scheme to standardized-moment matching. The extended scheme uses moments of order higher than kurtosis and generalizes the conventional musical-noise-free method based on kurtosis matching. We formulate standardized-moment matching and explore how effectively the higher-order moments reduce the amount of musical noise. Experimental evaluation results 1) demonstrate that kurtosis matching can reduce musical noise without negatively affecting noise suppression and 2) newly reveal that sixth-order-moment matching achieves low-musical-noise speech enhancement as effectively as kurtosis matching.
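
    For reference, the k-th standardized moment targeted by the matching penalty is given below; the squared-difference penalty shown is only one hedged example of how such a term could enter the training loss, and k = 4 recovers kurtosis.

      % k-th standardized moment of the amplitude x (k = 4 gives kurtosis)
      \tilde{\mu}_k = \frac{\mathbb{E}\left[(x - \mu)^{k}\right]}{\sigma^{k}},
      \qquad
      \mathcal{L}_{\mathrm{match}} = \left(\tilde{\mu}_k^{\,\mathrm{enhanced}} - \tilde{\mu}_k^{\,\mathrm{target}}\right)^{2}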

  • Flexible Bayesian Inference by Weight Transfer for Robust Deep Neural Networks

    Thi Thu Thao KHONG  Takashi NAKADA  Yasuhiko NAKASHIMA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/07/28
      Page(s):
    1981-1991

    Adversarial attacks are viewed as a danger to Deep Neural Networks (DNNs) and reveal a weakness of deep learning models in security-critical applications. Recent work has presented adversarial training as an outstanding defense method against adversaries. Nonetheless, adversarial training is challenging for big datasets and large networks, and it is believed that, without making DNN architectures larger, it is hard to strengthen their robustness to adversarial examples. To avoid iterative adversarial training, we propose Bayes without Bayesian Learning (BwoBL), an algorithm that performs ensemble inference to improve robustness. As an application of transfer learning, we use the learned parameters of pretrained DNNs to build Bayesian Neural Networks (BNNs) and focus on Bayesian inference without the cost of Bayesian learning. Without any adversarial training, our method is more robust than activation functions designed to enhance adversarial robustness. Moreover, BwoBL can easily be integrated into any pretrained DNN, not only Convolutional Neural Networks (CNNs) but also other DNNs such as Self-Attention Networks (SANs) that outperform their convolutional counterparts. BwoBL is also convenient to apply to scaling networks, e.g., ResNet and EfficientNet, with better performance. In particular, our algorithm employs a variety of DNN architectures to construct BNNs against a diversity of adversarial attacks on a large-scale dataset. Under an l∞-norm PGD attack with pixel perturbation ε=4/255 and 100 iterations on ImageNet, our proposal, combined with naturally pretrained ResNets, SANs, and EfficientNets, increases top-5 accuracy by 58.18% on average. This enhancement is 62.26% on average under an l2-norm C&W attack. The combination of our proposed method with EfficientNets pretrained on both natural and adversarial images (EfficientNet-ADV) drastically boosts the robustness against PGD and C&W attacks without additional training. Our EfficientNet-ADV-B7 achieves cutting-edge top-5 accuracy of 92.14% and 94.20% on adversarial ImageNet examples generated by powerful PGD and C&W attacks, respectively.

  • Smaller Residual Network for Single Image Depth Estimation

    Andi HENDRA  Yasushi KANAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/08/17
      Page(s):
    1992-2001

    We propose a new framework for estimating depth information from a single image. Our framework is relatively small and straightforward, employing a two-stage architecture: a residual network and a simple decoder network. Our residual network is a remodeled version of the original ResNet-50 architecture, consisting of only thirty-eight convolution layers in the residual blocks followed by a pair of up-sampling layers. The simple decoder network, a stack of five convolution layers, accepts the initial depth and refines it into the final output depth. During training, we monitor the loss behavior and adjust the learning-rate hyperparameter in order to improve performance. Furthermore, instead of using a single common pixel-wise loss, we also compute losses based on the gradient direction and structural similarity. This setting significantly reduces the number of network parameters while simultaneously producing a more accurate depth map. The performance of our approach has been evaluated through both quantitative and qualitative comparisons with several prior methods on the public NYU and KITTI datasets.

  • Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition

    Fuma HORIE  Hideaki GOTO  Takuo SUGANUMA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/08/24
      Page(s):
    2002-2010

    Scene character recognition has been intensively investigated for a couple of decades because it has great potential in many applications, including automatic translation, signboard recognition, and reading assistance for the visually impaired. However, scene characters are difficult to recognize with sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers have proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose the Random Filter, a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has been no large Japanese scene character dataset for evaluating recognition systems, we have developed an open dataset, JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that accuracy is improved from 70.9% to 83.1% by introducing the RI-Feature method into the ensemble scheme.

  • Usage Log-Based Testing of Embedded Software and Identification of Dependencies among Environmental Components

    Sooyong JEONG  Sungdeok CHA  Woo Jin LEE  

     
    LETTER-Software Engineering

      Publicized:
    2021/07/28
      Page(s):
    2011-2014

    Embedded software often interacts with multiple inputs from various sensors whose dependencies are complex or only partially known to developers. With incomplete dependency information, testing is likely to be insufficient for detecting errors. We propose a method to enhance the testing coverage of embedded software by identifying subtle and often neglected dependencies using information contained in usage logs. Usage logs, traditionally used primarily for investigative purposes following accidents, can also make a useful contribution during the testing of embedded software. Our approach relies on first developing a behavioral model for each environmental input individually, performing compositional analysis while identifying feasible but untested dependencies from the usage log, and generating additional test cases that correspond to untested or insufficiently tested dependencies. Experimental evaluation was performed on an Android application named Gravity Screen as well as an Arduino-based wearable glove app. Whereas a conventional CTM-based testing technique achieved average branch coverage of 26% and 68% on these applications, respectively, the proposed technique achieved 100% coverage on both.

  • Dynamic Incentive Mechanism for Industrial Network Congestion Control

    Zhentian WU  Feng YAN  Zhihua YANG  Jingya YANG  

     
    LETTER-Information Network

      Publicized:
    2021/07/29
      Page(s):
    2015-2018

    This paper studies the use of price incentives to shift bandwidth demand from peak to non-peak periods. In particular, fee discounts decrease as peak monthly usage increases. We take into account the delay sensitivity of different apps: during peak hours, the usage of hard real-time applications (HRAS) is not counted toward the user's monthly data cap, while the usage of other applications (OAS) is counted. As a result, users may voluntarily delay or abandon OAS in order to obtain a higher fee discount. A new data rate control algorithm is then proposed. The algorithm allocates the data rate according to the priority of the source, which is determined by two factors: (I) the allocated data rate and (II) the waiting time.

  • Detecting Depression from Speech through an Attentive LSTM Network

    Yan ZHAO  Yue XIE  Ruiyu LIANG  Li ZHANG  Li ZHAO  Chengyu LIU  

     
    LETTER-Speech and Hearing

      Publicized:
    2021/08/24
      Page(s):
    2019-2023

    Depression, as a mental disorder, endangers people's health and affects the social order. As an efficient means of diagnosis, automatic depression detection has attracted much research interest. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection that makes full use of the differences between depressed and non-depressed speech across time frames. The proposed model uses frame-level features, which capture the temporal information of depressive speech, to replace traditional statistical features as the input of the LSTM layers. To obtain more multi-dimensional deep feature representations, the LSTM output is then passed to attention layers on both the time and feature dimensions. We concatenate the outputs of the attention layers and feed the fused feature representation into a fully connected layer, whose output is finally passed to a softmax layer. Experiments conducted on the DAIC-WOZ database demonstrate that the proposed attentive LSTM model achieves an average accuracy of 90.2% and outperforms the traditional LSTM network and LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.
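
    The exact attention formulation is not given in the abstract. As one hedged sketch of time-dimension attention over LSTM outputs (a learned per-frame score, a softmax over frames, and a weighted sum feeding a classifier), consider the block below; feature dimensions and names are illustrative, and the feature-dimension attention branch is omitted.

      import torch
      import torch.nn as nn

      class TimeAttentionLSTM(nn.Module):
          """LSTM followed by attention over the time dimension."""
          def __init__(self, feat_dim=40, hidden=128, n_classes=2):
              super().__init__()
              self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
              self.score = nn.Linear(hidden, 1)        # one scalar score per frame
              self.fc = nn.Linear(hidden, n_classes)

          def forward(self, x):                        # x: (batch, frames, feat_dim)
              h, _ = self.lstm(x)                      # h: (batch, frames, hidden)
              alpha = torch.softmax(self.score(h), dim=1)   # attention over frames
              context = (alpha * h).sum(dim=1)         # (batch, hidden)
              return self.fc(context)                  # class logits

      model = TimeAttentionLSTM()
      logits = model(torch.randn(4, 300, 40))          # 4 utterances, 300 frames each
      print(logits.shape)                              # torch.Size([4, 2])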

  • A Hybrid Retinex-Based Algorithm for UAV-Taken Image Enhancement

    Xinran LIU  Zhongju WANG  Long WANG  Chao HUANG  Xiong LUO  

     
    LETTER-Image Processing and Video Processing

      Publicized:
    2021/08/05
      Page(s):
    2024-2027

    A hybrid Retinex-based image enhancement algorithm is proposed in this paper to improve the quality of images captured by unmanned aerial vehicles (UAVs). The hyperparameters of the employed multi-scale Retinex with chromaticity preservation (MSRCP) model are automatically tuned via a two-phase evolutionary computing algorithm. In the two-phase optimization algorithm, the Rao-2 algorithm is applied to perform the global search, and a solution is obtained by maximizing the objective function; the Nelder-Mead simplex method is then used to improve the solution via local search. Real UAV-taken images of poor quality are collected to verify the performance of the proposed algorithm. Four well-known image enhancement algorithms, Multi-Scale Retinex, Multi-Scale Retinex with Color Restoration, Automated Multi-Scale Retinex, and MSRCP, are utilized as benchmarking methods. In addition, two commonly used evolutionary computing algorithms, particle swarm optimization and the flower pollination algorithm, are considered to verify the efficiency of the proposed method in tuning the parameters of the MSRCP model. Experimental results demonstrate that the proposed method achieves the best performance compared with the benchmarks, and thus the proposed method is applicable to real UAV-based applications.
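
    The Rao-2 update rule is not reproduced here; the hedged sketch below substitutes a plain random global search for that phase and then refines the best candidate with SciPy's Nelder-Mead implementation, mirroring only the two-phase structure. The objective and the two hypothetical MSRCP hyperparameters are placeholders.

      import numpy as np
      from scipy.optimize import minimize

      def two_phase_tune(objective, bounds, n_global=200, seed=0):
          """Phase 1: global random search within bounds (stand-in for Rao-2).
          Phase 2: local refinement of the best candidate with Nelder-Mead.
          The objective is minimized (negate a quality score to maximize it)."""
          rng = np.random.default_rng(seed)
          lo = np.array([b[0] for b in bounds])
          hi = np.array([b[1] for b in bounds])
          candidates = rng.uniform(lo, hi, size=(n_global, len(bounds)))
          best = min(candidates, key=objective)
          result = minimize(objective, best, method="Nelder-Mead")
          return result.x, result.fun

      # Placeholder objective over two hypothetical MSRCP hyperparameters.
      def objective(p):
          sigma, gain = p
          return (sigma - 80.0) ** 2 / 1e3 + (gain - 1.5) ** 2   # toy surrogate

      params, score = two_phase_tune(objective, bounds=[(1.0, 300.0), (0.1, 5.0)])
      print(params, score)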