The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] (42807hit)

2021-2040hit(42807hit)

  • Usage Log-Based Testing of Embedded Software and Identification of Dependencies among Environmental Components

    Sooyong JEONG  Sungdeok CHA  Woo Jin LEE  

     
    LETTER-Software Engineering

      Pubricized:
    2021/07/28
      Vol:
    E104-D No:11
      Page(s):
    2011-2014

    Embedded software often interacts with multiple inputs from various sensors whose dependency is often complex or partially known to developers. With incomplete information on dependency, testing is likely to be insufficient in detecting errors. We propose a method to enhance testing coverage of embedded software by identifying subtle and often neglected dependencies using information contained in usage log. Usage log, traditionally used primarily for investigative purpose following accidents, can also make useful contribution during testing of embedded software. Our approach relies on first individually developing behavioral model for each environmental input, performing compositional analysis while identifying feasible but untested dependencies from usage log, and generating additional test cases that correspond to untested or insufficiently tested dependencies. Experimental evaluation was performed on an Android application named Gravity Screen as well as an Arduino-based wearable glove app. Whereas conventional CTM-based testing technique achieved average branch coverage of 26% and 68% on these applications, respectively, proposed technique achieved 100% coverage in both.

  • A Beam-Switchable Self-Oscillating Active Integrated Array Antenna Using Gunn Oscillator and Magic-T

    Maodudul HASAN  Eisuke NISHIYAMA  Ichihiko TOYODA  

     
    PAPER-Antennas and Propagation

      Pubricized:
    2021/05/14
      Vol:
    E104-B No:11
      Page(s):
    1419-1428

    Herein, a novel self-oscillating active integrated array antenna (AIAA) is proposed for beam switching X-band applications. The proposed AIAA comprises four linearly polarized microstrip antenna elements, a Gunn oscillator, two planar magic-Ts, and two single-pole single-throw (SPST) switches. The in/anti-phase signal combination approach employing planar magic-Ts is adopted to attain bidirectional radiation patterns in the φ =90° plane with a simple structure. The proposed antenna can switch its beam using the SPST switches. The antenna is analyzed through simulations, and a prototype of the antenna is fabricated and tested to validate the concept. The proposed concept is found to be feasible; the prototype has an effective isotropic radiated power of +15.98dBm, radiated power level of +4.28dBm, and cross-polarization suppression of better than 15dB. The measured radiation patterns are in good agreement with the simulation results.

  • Low-Power Reconfigurable Architecture of Elliptic Curve Cryptography for IoT

    Xianghong HU  Hongmin HUANG  Xin ZHENG  Yuan LIU  Xiaoming XIONG  

     
    PAPER-Electronic Circuits

      Pubricized:
    2021/05/14
      Vol:
    E104-C No:11
      Page(s):
    643-650

    Elliptic curve cryptography (ECC), one of the asymmetric cryptography, is widely used in practical security applications, especially in the Internet of Things (IoT) applications. This paper presents a low-power reconfigurable architecture for ECC, which is capable of resisting simple power analysis attacks (SPA) and can be configured to support all of point operations and modular operations on 160/192/224/256-bit field orders over GF(p). Point multiplication (PM) is the most complex and time-consuming operation of ECC, while modular multiplication (MM) and modular division (MD) have high computational complexity among modular operations. For decreasing power dissipation and increasing reconfigurable capability, a Reconfigurable Modular Multiplication Algorithm and Reconfigurable Modular Division Algorithm are proposed, and MM and MD are implemented by two adder units. Combining with the optimization of operation scheduling of PM, on 55 nm CMOS ASIC platform, the proposed architecture takes 0.96, 1.37, 1.87, 2.44 ms and consumes 8.29, 11.86, 16.20, 21.13 uJ to perform one PM on 160-bit, 192-bit, 224-bit, 256-bit field orders. It occupies 56.03 k gate area and has a power of 8.66 mW. The implementation results demonstrate that the proposed architecture outperforms the other contemporary designs reported in the literature in terms of area and configurability.

  • Electromagnetic Field Theory Interpretation on Light Extraction of Organic Light Emitting Diodes (OLEDs)

    Yoshinari ISHIDO  Wataru MIZUTANI  

     
    BRIEF PAPER-Electromagnetic Theory

      Pubricized:
    2021/05/31
      Vol:
    E104-C No:11
      Page(s):
    663-666

    Focusing on the planar slab structure of OLEDs, it is found the threshold value of the in-plane wave number at which the spectrum component of the electromagnetic field at the outermost boundary is divided into a radiation mode and a guided (confined) mode. This is equivalent to the total reflection condition in the ray optics. The spectral integral of the Poynting power was calculated from the boundary values of the electromagnetic fields in each. Both become average power and reactive power respectively, and the sum of them becomes the total volt-amperes from the light emitting dipole. Therefore, the ratio of average power to this total is the power factor that can be a quantitative index of light extraction.

  • Joint Wireless and Computational Resource Allocation Based on Hierarchical Game for Mobile Edge Computing

    Weiwei XIA  Zhuorui LAN  Lianfeng SHEN  

     
    PAPER-Network

      Pubricized:
    2021/05/14
      Vol:
    E104-B No:11
      Page(s):
    1395-1407

    In this paper, we propose a hierarchical Stackelberg game based resource allocation algorithm (HGRAA) to jointly allocate the wireless and computational resources of a mobile edge computing (MEC) system. The proposed HGRAA is composed of two levels: the lower-level evolutionary game (LEG) minimizes the cost of mobile terminals (MTs), and the upper-level exact potential game (UEPG) maximizes the utility of MEC servers. At the lower-level, the MTs are divided into delay-sensitive MTs (DSMTs) and non-delay-sensitive MTs (NDSMTs) according to their different quality of service (QoS) requirements. The competition among DSMTs and NDSMTs in different service areas to share the limited available wireless and computational resources is formulated as a dynamic evolutionary game. The dynamic replicator is applied to obtain the evolutionary equilibrium so as to minimize the costs imposed on MTs. At the upper level, the exact potential game is formulated to solve the resource sharing problem among MEC servers and the resource sharing problem is transferred to nonlinear complementarity. The existence of Nash equilibrium (NE) is proved and is obtained through the Karush-Kuhn-Tucker (KKT) condition. Simulations illustrate that substantial performance improvements such as average utility and the resource utilization of MEC servers can be achieved by applying the proposed HGRAA. Moreover, the cost of MTs is significantly lower than other existing algorithms with the increasing size of input data, and the QoS requirements of different kinds of MTs are well guaranteed in terms of average delay and transmission data rate.

  • Analysis of Signal Distribution in ASE-Limited Optical On-Off Keying Direct-Detection Systems

    Hiroki KAWAHARA  Kyo INOUE  Koji IGARASHI  

     
    PAPER-Fiber-Optic Transmission for Communications

      Pubricized:
    2021/05/14
      Vol:
    E104-B No:11
      Page(s):
    1386-1394

    This paper provides on a theoretical and numerical study of the probability density function (PDF) of the on-off keying (OOK) signals in ASE-limited systems. We present simple closed formulas of PDFs for the optical intensity and the received baseband signal. To confirm the validity of our model, the calculation results yielded by the proposed formulas are compared with those of numerical simulations and the conventional Gaussian model. Our theoretical and numerical results confirm that the signal distribution differs from a Gaussian profile. It is also demonstrated that our model can properly evaluate the signal distribution and the resultant BER performance, especially for systems with an optical bandwidth close to the receiver baseband width.

  • Gait Phase Partitioning and Footprint Detection Using Mutually Constrained Piecewise Linear Approximation with Dynamic Programming

    Makoto YASUKAWA  Yasushi MAKIHARA  Toshinori HOSOI  Masahiro KUBO  Yasushi YAGI  

     
    PAPER-Rehabilitation Engineering and Assistive Technology

      Pubricized:
    2021/08/02
      Vol:
    E104-D No:11
      Page(s):
    1951-1962

    Human gait analysis has been widely used in medical and health fields. It is essential to extract spatio-temporal gait features (e.g., single support duration, step length, and toe angle) by partitioning the gait phase and estimating the footprint position/orientation in such fields. Therefore, we propose a method to partition the gait phase given a foot position sequence using mutually constrained piecewise linear approximation with dynamic programming, which not only represents normal gait well but also pathological gait without training data. We also propose a method to detect footprints by accumulating toe edges on the floor plane during stance phases, which enables us to detect footprints more clearly than a conventional method. Finally, we extract four spatial/temporal gait parameters for accuracy evaluation: single support duration, double support duration, toe angle, and step length. We conducted experiments to validate the proposed method using two types of gait patterns, that is, healthy and mimicked hemiplegic gait, from 10 subjects. We confirmed that the proposed method could estimate the spatial/temporal gait parameters more accurately than a conventional skeleton-based method regardless of the gait pattern.

  • Flexible Bayesian Inference by Weight Transfer for Robust Deep Neural Networks

    Thi Thu Thao KHONG  Takashi NAKADA  Yasuhiko NAKASHIMA  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2021/07/28
      Vol:
    E104-D No:11
      Page(s):
    1981-1991

    Adversarial attacks are viewed as a danger to Deep Neural Networks (DNNs), which reveal a weakness of deep learning models in security-critical applications. Recent findings have been presented adversarial training as an outstanding defense method against adversaries. Nonetheless, adversarial training is a challenge with respect to big datasets and large networks. It is believed that, unless making DNN architectures larger, DNNs would be hard to strengthen the robustness to adversarial examples. In order to avoid iteratively adversarial training, our algorithm is Bayes without Bayesian Learning (BwoBL) that performs the ensemble inference to improve the robustness. As an application of transfer learning, we use learned parameters of pretrained DNNs to build Bayesian Neural Networks (BNNs) and focus on Bayesian inference without costing Bayesian learning. In comparison with no adversarial training, our method is more robust than activation functions designed to enhance adversarial robustness. Moreover, BwoBL can easily integrate into any pretrained DNN, not only Convolutional Neural Networks (CNNs) but also other DNNs, such as Self-Attention Networks (SANs) that outperform convolutional counterparts. BwoBL is also convenient to apply to scaling networks, e.g., ResNet and EfficientNet, with better performance. Especially, our algorithm employs a variety of DNN architectures to construct BNNs against a diversity of adversarial attacks on a large-scale dataset. In particular, under l∞ norm PGD attack of pixel perturbation ε=4/255 with 100 iterations on ImageNet, our proposal in ResNets, SANs, and EfficientNets increase by 58.18% top-5 accuracy on average, which are combined with naturally pretrained ResNets, SANs, and EfficientNets. This enhancement is 62.26% on average below l2 norm C&W attack. The combination of our proposed method with pretrained EfficientNets on both natural and adversarial images (EfficientNet-ADV) drastically boosts the robustness resisting PGD and C&W attacks without additional training. Our EfficientNet-ADV-B7 achieves the cutting-edge top-5 accuracy, which is 92.14% and 94.20% on adversarial ImageNet generated by powerful PGD and C&W attacks, respectively.

  • MPTCP-meLearning: A Multi-Expert Learning-Based MPTCP Extension to Enhance Multipathing Robustness against Network Attacks

    Yuanlong CAO  Ruiwen JI  Lejun JI  Xun SHAO  Gang LEI  Hao WANG  

     
    PAPER

      Pubricized:
    2021/07/08
      Vol:
    E104-D No:11
      Page(s):
    1795-1804

    With multiple network interfaces are being widely equipped in modern mobile devices, the Multipath TCP (MPTCP) is increasingly becoming the preferred transport technique since it can uses multiple network interfaces simultaneously to spread the data across multiple network paths for throughput improvement. However, the MPTCP performance can be seriously affected by the use of a poor-performing path in multipath transmission, especially in the presence of network attacks, in which an MPTCP path would abrupt and frequent become underperforming caused by attacks. In this paper, we propose a multi-expert Learning-based MPTCP variant, called MPTCP-meLearning, to enhance MPTCP performance robustness against network attacks. MPTCP-meLearning introduces a new kind of predictor to possibly achieve better quality prediction accuracy for each of multiple paths, by leveraging a group of representative formula-based predictors. MPTCP-meLearning includes a novel mechanism to intelligently manage multiple paths in order to possibly mitigate the out-of-order reception and receive buffer blocking problems. Experimental results demonstrate that MPTCP-meLearning can achieve better transmission performance and quality of service than the baseline MPTCP scheme.

  • A Design of Automated Vulnerability Information Management System for Secure Use of Internet-Connected Devices Based on Internet-Wide Scanning Methods

    Taeeun KIM  Hwankuk KIM  

     
    PAPER

      Pubricized:
    2021/08/02
      Vol:
    E104-D No:11
      Page(s):
    1805-1813

    Any Internet-connected device is vulnerable to being hacked and misused. Hackers can find vulnerable IoT devices, infect malicious codes, build massive IoT botnets, and remotely control IoT devices through C&C servers. Many studies have been attempted to apply various security features on IoT devices to prevent IoT devices from being exploited by attackers. However, unlike high-performance PCs, IoT devices are lightweight, low-power, and low-cost devices and have limitations on performance of processing and memory, making it difficult to install heavy security functions. Instead of access to applying security functions on IoT devices, Internet-wide scanning (e.g., Shodan) studies have been attempted to quickly discover and take security measures massive IoT devices with weak security. Over the Internet, scanning studies remotely also exist realistic limitations such as low accuracy in analyzing security vulnerabilities due to a lack of device information or filtered by network security devices. In this paper, we propose a system for remotely collecting information from Internet-connected devices and using scanning techniques to identify and manage vulnerability information from IoT devices. The proposed system improves the open-source Zmap engine to solve a realistic problem when attempting to scan through real Internet. As a result, performance measurements show equal or superior results compared to previous Shodan, Zmap-based scanning.

  • Verifiable Credential Proof Generation and Verification Model for Decentralized SSI-Based Credit Scoring Data

    Kang Woo CHO  Byeong-Gyu JEONG  Sang Uk SHIN  

     
    PAPER

      Pubricized:
    2021/07/27
      Vol:
    E104-D No:11
      Page(s):
    1857-1868

    The continuous development of the mobile computing environment has led to the emergence of fintech to enable convenient financial transactions in this environment. Previously proposed financial identity services mostly adopted centralized servers that are prone to single-point-of-failure problems and performance bottlenecks. Blockchain-based self-sovereign identity (SSI), which emerged to address this problem, is a technology that solves centralized problems and allows decentralized identification. However, the verifiable credential (VC), a unit of SSI data transactions, guarantees unlimited right to erasure for self-sovereignty. This does not suit the specificity of the financial transaction network, which requires the restriction of the right to erasure for credit evaluation. This paper proposes a model for VC generation and revocation verification for credit scoring data. The proposed model includes double zero knowledge - succinct non-interactive argument of knowledge (zk-SNARK) proof in the VC generation process between the holder and the issuer. In addition, cross-revocation verification takes place between the holder and the verifier. As a result, the proposed model builds a trust platform among the holder, issuer, and verifier while maintaining the decentralized SSI attributes and focusing on the VC life cycle. The model also improves the way in which credit evaluation data are processed as VCs by granting opt-in and the special right to erasure.

  • An Efficient Public Verifiable Certificateless Multi-Receiver Signcryption Scheme for IoT Environments

    Dae-Hwi LEE  Won-Bin KIM  Deahee SEO  Im-Yeong LEE  

     
    PAPER

      Pubricized:
    2021/07/14
      Vol:
    E104-D No:11
      Page(s):
    1869-1879

    Lightweight cryptographic systems for services delivered by the recently developed Internet of Things (IoT) are being continuously researched. However, existing Public Key Infrastructure (PKI)-based cryptographic algorithms are difficult to apply to IoT services delivered using lightweight devices. Therefore, encryption, authentication, and signature systems based on Certificateless Public Key Cryptography (CL-PKC), which are lightweight because they do not use the certificates of existing PKI-based cryptographic algorithms, are being studied. Of the various public key cryptosystems, signcryption is efficient, and ensures integrity and confidentiality. Recently, CL-based signcryption (CL-SC) schemes have been intensively studied, and a multi-receiver signcryption (MRSC) protocol for environments with multiple receivers, i.e., not involving end-to-end communication, has been proposed. However, when using signcryption, confidentiality and integrity may be violated by public key replacement attacks. In this paper, we develop an efficient CL-based MRSC (CL-MRSC) scheme using CL-PKC for IoT environments. Existing signcryption schemes do not offer public verifiability, which is required if digital signatures are used, because only the receiver can verify the validity of the message; sender authenticity is not guaranteed by a third party. Therefore, we propose a CL-MRSC scheme in which communication participants (such as the gateways through which messages are transmitted) can efficiently and publicly verify the validity of encrypted messages.

  • Provable-Security Analysis of Authenticated Encryption Based on Lesamnta-LW in the Ideal Cipher Model

    Shoichi HIROSE  Hidenori KUWAKADO  Hirotaka YOSHIDA  

     
    PAPER

      Pubricized:
    2021/07/08
      Vol:
    E104-D No:11
      Page(s):
    1894-1901

    Hirose, Kuwakado and Yoshida proposed a nonce-based authenticated encryption scheme Lae0 based on Lesamnta-LW in 2019. Lesamnta-LW is a block-cipher-based iterated hash function included in the ISO/IEC 29192-5 lightweight hash-function standard. They also showed that Lae0 satisfies both privacy and authenticity if the underlying block cipher is a pseudorandom permutation. Unfortunately, their result implies only about 64-bit security for instantiation with the dedicated block cipher of Lesamnta-LW. In this paper, we analyze the security of Lae0 in the ideal cipher model. Our result implies about 120-bit security for instantiation with the block cipher of Lesamnta-LW.

  • CoLaFUZE: Coverage-Guided and Layout-Aware Fuzzing for Android Drivers

    Tianshi MU  Huabing ZHANG  Jian WANG  Huijuan LI  

     
    PAPER

      Pubricized:
    2021/07/28
      Vol:
    E104-D No:11
      Page(s):
    1902-1912

    With the commercialization of 5G mobile phones, Android drivers are increasing rapidly to utilize a large quantity of newly emerging feature-rich hardware. Most of these drivers are developed by third-party vendors and lack proper vulnerabilities review, posing a number of new potential risks to security and privacy. However, the complexity and diversity of Android drivers make the traditional analysis methods inefficient. For example, the driver-specific argument formats make traditional syscall fuzzers difficult to generate valid inputs, the pointer-heavy code makes static analysis results incomplete, and pointer casting hides the actual type. Triggering code deep in Android drivers remains challenging. We present CoLaFUZE, a coverage-guided and layout-aware fuzzing tool for automatically generating valid inputs and exploring the driver code. CoLaFUZE employs a kernel module to capture the data copy operation and redirect it to the fuzzing engine, ensuring that the correct size of the required data is transferred to the driver. CoLaFUZE leverages dynamic analysis and symbolic execution to recover the driver interfaces and generates valid inputs for the interfaces. Furthermore, the seed mutation module of CoLaFUZE leverages coverage information to achieve better seed quality and expose bugs deep in the driver. We evaluate CoLaFUZE on 5 modern Android mobile phones from the top vendors, including Google, Xiaomi, Samsung, Sony, and Huawei. The results show that CoLaFUZE can explore more code coverage compared with the state-of-the-art fuzzer, and CoLaFUZE successfully found 11 vulnerabilities in the testing devices.

  • Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network

    Mariana RODRIGUES MAKIUCHI  Tifani WARNITA  Nakamasa INOUE  Koichi SHINODA  Michitaka YOSHIMURA  Momoko KITAZAWA  Kei FUNAKI  Yoko EGUCHI  Taishiro KISHIMOTO  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2021/08/03
      Vol:
    E104-D No:11
      Page(s):
    1930-1940

    We propose a non-invasive and cost-effective method to automatically detect dementia by utilizing solely speech audio data. We extract paralinguistic features for a short speech segment and use Gated Convolutional Neural Networks (GCNN) to classify it into dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method yields the accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. In the PROMPT Database, our method yields the accuracy of 74.7% using 4 seconds of speech data and it improves to 80.8% when we use all the patient's speech data. Furthermore, we evaluate our method on a three-class classification problem in which we included the Mild Cognitive Impairment (MCI) class and achieved the accuracy of 60.6% with 40 seconds of speech data.

  • Influence of Access to Reading Material during Concept Map Recomposition in Reading Comprehension and Retention

    Pedro GABRIEL FONTELES FURTADO  Tsukasa HIRASHIMA  Nawras KHUDHUR  Aryo PINANDITO  Yusuke HAYASHI  

     
    PAPER-Educational Technology

      Pubricized:
    2021/08/02
      Vol:
    E104-D No:11
      Page(s):
    1941-1950

    This study investigated the influence of reading time while building a closed concept map on reading comprehension and retention. It also investigated the effect of having access to the text during closed concept map creation on reading comprehension and retention. Participants from Amazon Mechanical Turk (N =101) read a text, took an after-text test, and took part in one of three conditions, “Map & Text”, “Map only”, and “Double Text”, took an after-activity test, followed by a two-week retention period and then one final delayed test. Analysis revealed that higher reading times were associated with better reading comprehension and better retention. Furthermore, when comparing “Map & Text” to the “Map only” condition, short-term reading comprehension was improved, but long-term retention was not improved. This suggests that having access to the text while building closed concept maps can improve reading comprehension, but long term learning can only be improved if students invest time accessing both the map and the text.

  • DNN-Based Low-Musical-Noise Single-Channel Speech Enhancement Based on Higher-Order-Moments Matching

    Satoshi MIZOGUCHI  Yuki SAITO  Shinnosuke TAKAMICHI  Hiroshi SARUWATARI  

     
    PAPER-Speech and Hearing

      Pubricized:
    2021/07/30
      Vol:
    E104-D No:11
      Page(s):
    1971-1980

    We propose deep neural network (DNN)-based speech enhancement that reduces musical noise and achieves better auditory impressions. The musical noise is an artifact generated by nonlinear signal processing and negatively affects the auditory impressions. We aim to develop musical-noise-free speech enhancement methods that suppress the musical noise generation and produce perceptually-comfortable enhanced speech. DNN-based speech enhancement using a soft mask achieves high noise reduction but generates musical noise in non-speech regions. Therefore, first, we define kurtosis matching for DNN-based low-musical-noise speech enhancement. Kurtosis is the fourth-order moment and is known to correlate with the amount of musical noise. The kurtosis matching is a penalty term of the DNN training and works to reduce the amount of musical noise. We further extend this scheme to standardized-moment matching. The extended scheme involves using moments whose orders are higher than kurtosis and generalizes the conventional musical-noise-free method based on kurtosis matching. We formulate standardized-moment matching and explore how effectively the higher-order moments reduce the amount of musical noise. Experimental evaluation results 1) demonstrate that kurtosis matching can reduce musical noise without negatively affecting noise suppression and 2) newly reveal that the sixth-moment matching also achieves low-musical-noise speech enhancement as well as kurtosis matching.

  • Smaller Residual Network for Single Image Depth Estimation

    Andi HENDRA  Yasushi KANAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2021/08/17
      Vol:
    E104-D No:11
      Page(s):
    1992-2001

    We propose a new framework for estimating depth information from a single image. Our framework is relatively small and straightforward by employing a two-stage architecture: a residual network and a simple decoder network. Our residual network in this paper is a remodeled of the original ResNet-50 architecture, which consists of only thirty-eight convolution layers in the residual block following by pair of two up-sampling and layers. While the simple decoder network, stack of five convolution layers, accepts the initial depth to be refined as the final output depth. During training, we monitor the loss behavior and adjust the learning rate hyperparameter in order to improve the performance. Furthermore, instead of using a single common pixel-wise loss, we also compute loss based on gradient-direction, and their structure similarity. This setting in our network can significantly reduce the number of network parameters, and simultaneously get a more accurate image depth map. The performance of our approach has been evaluated by conducting both quantitative and qualitative comparisons with several prior related methods on the publicly NYU and KITTI datasets.

  • Synthetic Scene Character Generator and Ensemble Scheme with the Random Image Feature Method for Japanese and Chinese Scene Character Recognition

    Fuma HORIE  Hideaki GOTO  Takuo SUGANUMA  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2021/08/24
      Vol:
    E104-D No:11
      Page(s):
    2002-2010

    Scene character recognition has been intensively investigated for a couple of decades because it has a great potential in many applications including automatic translation, signboard recognition, and reading assistance for the visually-impaired. However, scene characters are difficult to recognize at sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for the evaluation of the recognition systems, we have developed an open dataset JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy has been improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.

  • Detecting Depression from Speech through an Attentive LSTM Network

    Yan ZHAO  Yue XIE  Ruiyu LIANG  Li ZHANG  Li ZHAO  Chengyu LIU  

     
    LETTER-Speech and Hearing

      Pubricized:
    2021/08/24
      Vol:
    E104-D No:11
      Page(s):
    2019-2023

    Depression endangers people's health conditions and affects the social order as a mental disorder. As an efficient diagnosis of depression, automatic depression detection has attracted lots of researcher's interest. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection to make full use of the difference between depression and non-depression between timeframes. The proposed model uses frame-level features, which capture the temporal information of depressive speech, to replace traditional statistical features as an input of the LSTM layers. To achieve more multi-dimensional deep feature representations, the LSTM output is then passed on attention layers on both time and feature dimensions. Then, we concat the output of the attention layers and put the fused feature representation into the fully connected layer. At last, the fully connected layer's output is passed on to softmax layer. Experiments conducted on the DAIC-WOZ database demonstrate that the proposed attentive LSTM model achieves an average accuracy rate of 90.2% and outperforms the traditional LSTM network and LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.

2021-2040hit(42807hit)