Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Yuanlong CAO Ruiwen JI Lejun JI Xun SHAO Gang LEI Hao WANG
As modern mobile devices are widely equipped with multiple network interfaces, Multipath TCP (MPTCP) is increasingly becoming the preferred transport technique, since it can use multiple network interfaces simultaneously to spread data across multiple network paths for throughput improvement. However, MPTCP performance can be seriously degraded by the use of a poorly performing path in multipath transmission, especially in the presence of network attacks, under which an MPTCP path may abruptly and frequently become underperforming. In this paper, we propose a multi-expert learning-based MPTCP variant, called MPTCP-meLearning, to enhance the robustness of MPTCP performance against network attacks. MPTCP-meLearning introduces a new kind of predictor that aims to achieve better quality-prediction accuracy for each of the multiple paths by leveraging a group of representative formula-based predictors. MPTCP-meLearning also includes a novel mechanism that intelligently manages multiple paths in order to mitigate the out-of-order reception and receive-buffer blocking problems. Experimental results demonstrate that MPTCP-meLearning can achieve better transmission performance and quality of service than the baseline MPTCP scheme.
Any Internet-connected device is vulnerable to being hacked and misused. Hackers can find vulnerable IoT devices, infect them with malicious code, build massive IoT botnets, and remotely control the devices through C&C servers. Many studies have attempted to apply various security features to IoT devices to prevent them from being exploited by attackers. However, unlike high-performance PCs, IoT devices are lightweight, low-power, low-cost devices with limited processing performance and memory, which makes it difficult to install heavy security functions. Instead of applying security functions on the IoT devices themselves, Internet-wide scanning studies (e.g., Shodan) have attempted to quickly discover massive numbers of weakly secured IoT devices and take security measures for them. Remote scanning over the Internet also has practical limitations, such as low accuracy in analyzing security vulnerabilities due to a lack of device information or filtering by network security devices. In this paper, we propose a system that remotely collects information from Internet-connected devices and uses scanning techniques to identify and manage vulnerability information on IoT devices. The proposed system improves the open-source Zmap engine to solve practical problems that arise when scanning over the real Internet. Performance measurements show results equal or superior to previous Shodan- and Zmap-based scanning.
Cheng-Chung KUO Ding-Kai TSENG Chun-Wei TSAI Chu-Sing YANG
The development of an efficient mechanism for detecting malicious network traffic has been a critical research topic in the field of network security in recent years. This study implemented an intrusion-detection system (IDS) based on a machine learning algorithm to periodically convert and analyze real network traffic in a campus environment in near real time. This study focuses on how to improve the detection rate of an IDS and how to detect more attacks on non-well-known ports than a traditional rule-based system can. Four new features are used to increase the discriminant accuracy. In addition, an algorithm for balancing the data set was used to construct the training data set, which also enables the learning model to more accurately reflect situations in a real environment.
Yuta TAKATA Hiroshi KUMAGAI Masaki KAMIZONO
While websites are becoming more and more complex daily, the difficulty of managing them is also increasing. It is important to conduct regular maintenance against these complex websites to strengthen their security and improve their cyber resilience. However, misconfigurations and vulnerabilities are still being discovered on some pages of websites and cyberattacks against them are never-ending. In this paper, we take the novel approach of applying the concept of security governance to websites; and, as part of this, measuring the consistency of software settings and versions used on these websites. More precisely, we analyze multiple web pages with the same domain name and identify differences in the security settings of HTTP headers and versions of software among them. After analyzing over 8,000 websites of popular global organizations, our measurement results show that over half of the tested websites exhibit differences. For example, we found websites running on a web server whose version changes depending on access and using a JavaScript library with different versions across over half of the tested pages. We identify the cause of such governance failures and propose improvement plans.
Seolah JANG Sandi RAHMADIKA Sang Uk SHIN Kyung-Hyune RHEE
A private decentralized e-health environment, empowered by blockchain technology, grants authorized healthcare entities to legitimately access the patient's medical data without relying on a centralized node. Every activity from authorized entities is recorded immutably in the blockchain transactions. In terms of privacy, the e-health system preserves a default privacy option as an initial state for every patient since the patients may frequently customize their medical data over time for several purposes. Moreover, adjustments in the patient's privacy contexts are often solely from the patient's initiative without any doctor or stakeholders' recommendation. Therefore, we design, implement, and evaluate user-defined data privacy utilizing nudge theory for decentralized e-health systems named PDPM to tackle these issues. Patients can determine the privacy of their medical records to be closed to certain parties. Data privacy management is dynamic, which can be executed on the blockchain via the smart contract feature. Tamper-proof user-defined data privacy can resolve the dispute between the e-health entities related to privacy management and adjustments. In short, the authorized entities cannot deny any changes since every activity is recorded in the ledgers. Meanwhile, the nudge theory technique supports providing the best patient privacy recommendations based on their behaviour activities even though the final decision rests on the patient. Finally, we demonstrate how to use PDPM to realize user-defined data privacy management in decentralized e-health environments.
Hyungjin CHO Seongmin PARK Youngkwon PARK Bomin CHOI Dowon KIM Kangbin YIM
As of February 2021, with competition over the commercialization of 5G mobile communication increasing, the 5G SA network and Vo5G are expected to be commercialized soon. 5G mobile communication aims to provide a transmission speed of 20 Gbps, which is 20 times faster than 4G mobile communication, connection of at least 1 million devices per 1 km², and a transmission delay of 1 ms, which is 10 times shorter than 4G. To meet these goals, various technological developments were required, and technologies such as Massive MIMO (Multiple-Input and Multiple-Output), mmWave, and small cell networks were developed and applied in the 5G access network. However, in the core network, the components constituting the LTE (Long Term Evolution) core network are used as they are in the NSA (Non-Standalone) architecture, and changes have occurred only in the SA (Standalone) architecture. Also, in the network that provides the voice service, the IMS (IP Multimedia Subsystem) infrastructure is still used in the SA architecture. The issue is that, while 5G mobile communication is evolving openly to provide various services, its security elements are vulnerable to various cyber-attacks because they maintain the same form as before. Therefore, in this paper, we examine what the network standard for 5G voice service provision consists of and what security vulnerabilities it has. We also suggest possible attack scenarios that exploit these security issues, and consider whether these problems can actually occur and what countermeasures are available.
Kang Woo CHO Byeong-Gyu JEONG Sang Uk SHIN
The continuous development of the mobile computing environment has led to the emergence of fintech to enable convenient financial transactions in this environment. Previously proposed financial identity services mostly adopted centralized servers that are prone to single-point-of-failure problems and performance bottlenecks. Blockchain-based self-sovereign identity (SSI), which emerged to address this problem, is a technology that solves centralized problems and allows decentralized identification. However, the verifiable credential (VC), a unit of SSI data transactions, guarantees unlimited right to erasure for self-sovereignty. This does not suit the specificity of the financial transaction network, which requires the restriction of the right to erasure for credit evaluation. This paper proposes a model for VC generation and revocation verification for credit scoring data. The proposed model includes double zero knowledge - succinct non-interactive argument of knowledge (zk-SNARK) proof in the VC generation process between the holder and the issuer. In addition, cross-revocation verification takes place between the holder and the verifier. As a result, the proposed model builds a trust platform among the holder, issuer, and verifier while maintaining the decentralized SSI attributes and focusing on the VC life cycle. The model also improves the way in which credit evaluation data are processed as VCs by granting opt-in and the special right to erasure.
Dae-Hwi LEE Won-Bin KIM Deahee SEO Im-Yeong LEE
Lightweight cryptographic systems for services delivered by the recently developed Internet of Things (IoT) are being continuously researched. However, existing Public Key Infrastructure (PKI)-based cryptographic algorithms are difficult to apply to IoT services delivered using lightweight devices. Therefore, encryption, authentication, and signature systems based on Certificateless Public Key Cryptography (CL-PKC), which are lightweight because they do not use the certificates of existing PKI-based cryptographic algorithms, are being studied. Of the various public key cryptosystems, signcryption is efficient, and ensures integrity and confidentiality. Recently, CL-based signcryption (CL-SC) schemes have been intensively studied, and a multi-receiver signcryption (MRSC) protocol for environments with multiple receivers, i.e., not involving end-to-end communication, has been proposed. However, when using signcryption, confidentiality and integrity may be violated by public key replacement attacks. In this paper, we develop an efficient CL-based MRSC (CL-MRSC) scheme using CL-PKC for IoT environments. Existing signcryption schemes do not offer public verifiability, which is required if digital signatures are used, because only the receiver can verify the validity of the message; sender authenticity is not guaranteed by a third party. Therefore, we propose a CL-MRSC scheme in which communication participants (such as the gateways through which messages are transmitted) can efficiently and publicly verify the validity of encrypted messages.
In [31], Shin et al. proposed a Leakage-Resilient and Proactive Authenticated Key Exchange (LRP-AKE) protocol for credential services which provides not only a higher level of security against leakage of stored secrets but also secrecy of private key with respect to the involving server. In this paper, we discuss a problem in the security proof of the LRP-AKE protocol, and then propose a modified LRP-AKE protocol that has a simple and effective measure to the problem. Also, we formally prove its AKE security and mutual authentication for the entire modified LRP-AKE protocol. In addition, we describe several extensions of the (modified) LRP-AKE protocol including 1) synchronization issue between the client and server's stored secrets; 2) randomized ID for the provision of client's privacy; and 3) a solution to preventing server compromise-impersonation attacks. Finally, we evaluate the performance overhead of the LRP-AKE protocol and show its test vectors. From the performance evaluation, we can confirm that the LRP-AKE protocol has almost the same efficiency as the (plain) Diffie-Hellman protocol that does not provide authentication at all.
Shoichi HIROSE Hidenori KUWAKADO Hirotaka YOSHIDA
Hirose, Kuwakado and Yoshida proposed a nonce-based authenticated encryption scheme Lae0 based on Lesamnta-LW in 2019. Lesamnta-LW is a block-cipher-based iterated hash function included in the ISO/IEC 29192-5 lightweight hash-function standard. They also showed that Lae0 satisfies both privacy and authenticity if the underlying block cipher is a pseudorandom permutation. Unfortunately, their result implies only about 64-bit security for instantiation with the dedicated block cipher of Lesamnta-LW. In this paper, we analyze the security of Lae0 in the ideal cipher model. Our result implies about 120-bit security for instantiation with the block cipher of Lesamnta-LW.
Tianshi MU Huabing ZHANG Jian WANG Huijuan LI
With the commercialization of 5G mobile phones, Android drivers are increasing rapidly to utilize a large quantity of newly emerging feature-rich hardware. Most of these drivers are developed by third-party vendors and lack proper vulnerability review, posing a number of new potential risks to security and privacy. However, the complexity and diversity of Android drivers make traditional analysis methods inefficient. For example, driver-specific argument formats make it difficult for traditional syscall fuzzers to generate valid inputs, pointer-heavy code makes static analysis results incomplete, and pointer casting hides the actual type. Triggering code deep in Android drivers remains challenging. We present CoLaFUZE, a coverage-guided and layout-aware fuzzing tool for automatically generating valid inputs and exploring the driver code. CoLaFUZE employs a kernel module to capture the data copy operation and redirect it to the fuzzing engine, ensuring that the correct size of the required data is transferred to the driver. CoLaFUZE leverages dynamic analysis and symbolic execution to recover the driver interfaces and generates valid inputs for the interfaces. Furthermore, the seed mutation module of CoLaFUZE leverages coverage information to achieve better seed quality and expose bugs deep in the driver. We evaluate CoLaFUZE on 5 modern Android mobile phones from the top vendors, including Google, Xiaomi, Samsung, Sony, and Huawei. The results show that CoLaFUZE achieves higher code coverage than the state-of-the-art fuzzer and successfully found 11 vulnerabilities in the tested devices.
Shuhei NISHIYAMA Chonho LEE Tomohiro MASHITA
In this work, an optimization method for the 3D container loading problem with multiple constraints is proposed. The method consists of a genetic algorithm to generate an arrangement of cargo and a fitness evaluation using a physics simulation. The fitness function considers not only the maximization of the container density and fitness value but also several different constraints such as weight, stack-ability, fragility, and orientation of cargo pieces. We employed a container shaking simulation for the fitness evaluation to include constraint effects during loading and transportation. We verified that the proposed method successfully provides the optimal cargo arrangement for small-scale problems with about 10 pieces of cargo.
Kenya TAJIMA Yoshihiro HIROHASHI Esmeraldo ZARA Tsuyoshi KATO
The multi-category support vector machine (MC-SVM) is one of the most popular machine learning algorithms. There are numerous MC-SVM variants, although different optimization algorithms were developed for diverse learning machines. In this study, we developed a new optimization algorithm that can be applied to several MC-SVM variants. The algorithm is based on the Frank-Wolfe framework that requires two subproblems, direction-finding and line search, in each iteration. The contribution of this study is the discovery that both subproblems have a closed form solution if the Frank-Wolfe framework is applied to the dual problem. Additionally, the closed form solutions on both the direction-finding and line search exist even for the Moreau envelopes of the loss functions. We used several large datasets to demonstrate that the proposed optimization algorithm rapidly converges and thereby improves the pattern recognition performance.
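The two Frank-Wolfe subproblems named in the abstract above, direction finding and line search, can be illustrated on a toy problem. The following is a minimal sketch, not the authors' MC-SVM algorithm: it assumes a simple quadratic objective, 0.5 * ||x - b||^2, minimized over the probability simplex, chosen only because both subproblems then have closed-form solutions. The function name `frank_wolfe_simplex` and the objective are illustrative assumptions.

```python
def frank_wolfe_simplex(b, iterations=200):
    """Minimize 0.5 * ||x - b||^2 over the probability simplex (toy assumption)."""
    n = len(b)
    x = [1.0 / n] * n  # start at the barycenter, which is feasible
    for _ in range(iterations):
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Direction finding: a linear function over the simplex is minimized
        # at the vertex whose gradient coordinate is smallest (closed form).
        i = min(range(n), key=lambda k: grad[k])
        d = [(1.0 if k == i else 0.0) - x[k] for k in range(n)]
        dd = sum(v * v for v in d)
        gd = sum(g * v for g, v in zip(grad, d))
        if dd == 0.0 or gd >= 0.0:
            break  # no descent direction left: x is optimal
        # Line search: for a quadratic objective the best step along d is
        # -<grad, d> / <d, d>, clipped to [0, 1] to stay feasible.
        step = min(1.0, max(0.0, -gd / dd))
        x = [xi + step * v for xi, v in zip(x, d)]
    return x


if __name__ == "__main__":
    print(frank_wolfe_simplex([0.2, 0.3, 0.5]))
```

Because every update is a convex combination of the current point and a simplex vertex, the iterate stays feasible without any projection step, which is the usual motivation for the Frank-Wolfe framework.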
Mariana RODRIGUES MAKIUCHI Tifani WARNITA Nakamasa INOUE Koichi SHINODA Michitaka YOSHIMURA Momoko KITAZAWA Kei FUNAKI Yoko EGUCHI Taishiro KISHIMOTO
We propose a non-invasive and cost-effective method to automatically detect dementia using solely speech audio data. We extract paralinguistic features from a short speech segment and use Gated Convolutional Neural Networks (GCNN) to classify it as dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method yields an accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. On the PROMPT Database, our method yields an accuracy of 74.7% using 4 seconds of speech data, and this improves to 80.8% when we use all of the patient's speech data. Furthermore, we evaluate our method on a three-class classification problem in which we included the Mild Cognitive Impairment (MCI) class and achieved an accuracy of 60.6% with 40 seconds of speech data.
Pedro GABRIEL FONTELES FURTADO Tsukasa HIRASHIMA Nawras KHUDHUR Aryo PINANDITO Yusuke HAYASHI
This study investigated the influence of reading time while building a closed concept map on reading comprehension and retention. It also investigated the effect of having access to the text during closed concept map creation on reading comprehension and retention. Participants from Amazon Mechanical Turk (N = 101) read a text, took an after-text test, took part in one of three conditions (“Map & Text”, “Map only”, and “Double Text”), took an after-activity test, and then, after a two-week retention period, took one final delayed test. Analysis revealed that higher reading times were associated with better reading comprehension and better retention. Furthermore, when comparing the “Map & Text” condition to the “Map only” condition, short-term reading comprehension was improved, but long-term retention was not. This suggests that having access to the text while building closed concept maps can improve reading comprehension, but long-term learning can only be improved if students invest time accessing both the map and the text.
Makoto YASUKAWA Yasushi MAKIHARA Toshinori HOSOI Masahiro KUBO Yasushi YAGI
Human gait analysis has been widely used in medical and health fields. It is essential to extract spatio-temporal gait features (e.g., single support duration, step length, and toe angle) by partitioning the gait phase and estimating the footprint position/orientation in such fields. Therefore, we propose a method to partition the gait phase given a foot position sequence using mutually constrained piecewise linear approximation with dynamic programming, which not only represents normal gait well but also pathological gait without training data. We also propose a method to detect footprints by accumulating toe edges on the floor plane during stance phases, which enables us to detect footprints more clearly than a conventional method. Finally, we extract four spatial/temporal gait parameters for accuracy evaluation: single support duration, double support duration, toe angle, and step length. We conducted experiments to validate the proposed method using two types of gait patterns, that is, healthy and mimicked hemiplegic gait, from 10 subjects. We confirmed that the proposed method could estimate the spatial/temporal gait parameters more accurately than a conventional skeleton-based method regardless of the gait pattern.
To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) are investigated to estimate different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation techniques. Further, a mask such as the Wiener gain can be estimated directly or derived from the estimated interference power spectral density (PSD) or the estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target to assist the target estimation in the main task. The domain-specific information is shared between the two tasks to learn a more generalizable representation. Since the performance of a multi-task network is sensitive to the weight parameters of the loss function, homoscedastic uncertainty is introduced to adaptively learn the weights, which is shown to outperform the fixed weighting method. Simulation results show that the proposed multi-task scheme improves the overall speech enhancement performance compared to the conventional single-task methods. Moreover, the joint direct mask and SPP estimation yields the best performance among all the considered techniques.
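To make the idea of learning loss weights from homoscedastic uncertainty concrete, here is a minimal sketch. The abstract does not state the exact loss, so the form below, 0.5 * exp(-s_i) * L_i + 0.5 * s_i with a learnable log-variance s_i per task, is an assumption: it is one commonly used formulation from the multi-task learning literature. The function names are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using learnable log-variances (assumed form).

    Each task i contributes 0.5 * exp(-s_i) * L_i + 0.5 * s_i, so a task with
    a large s_i is down-weighted, while the 0.5 * s_i term stops s_i from
    growing without bound.
    """
    total = 0.0
    for loss, s in zip(task_losses, log_variances):
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total


def optimal_log_variance(loss):
    """For a fixed positive loss, the term above is minimized at s = log(loss)."""
    return math.log(loss)


if __name__ == "__main__":
    # Two hypothetical task losses: main mask estimation and SPP estimation.
    print(uncertainty_weighted_loss([2.0, 0.5], [0.0, 0.0]))
```

In training, the log-variances would be optimized jointly with the network parameters, which is what makes the weighting adaptive rather than fixed.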
Satoshi MIZOGUCHI Yuki SAITO Shinnosuke TAKAMICHI Hiroshi SARUWATARI
We propose deep neural network (DNN)-based speech enhancement that reduces musical noise and achieves better auditory impressions. The musical noise is an artifact generated by nonlinear signal processing and negatively affects the auditory impressions. We aim to develop musical-noise-free speech enhancement methods that suppress the musical noise generation and produce perceptually-comfortable enhanced speech. DNN-based speech enhancement using a soft mask achieves high noise reduction but generates musical noise in non-speech regions. Therefore, first, we define kurtosis matching for DNN-based low-musical-noise speech enhancement. Kurtosis is the fourth-order moment and is known to correlate with the amount of musical noise. The kurtosis matching is a penalty term of the DNN training and works to reduce the amount of musical noise. We further extend this scheme to standardized-moment matching. The extended scheme involves using moments whose orders are higher than kurtosis and generalizes the conventional musical-noise-free method based on kurtosis matching. We formulate standardized-moment matching and explore how effectively the higher-order moments reduce the amount of musical noise. Experimental evaluation results 1) demonstrate that kurtosis matching can reduce musical noise without negatively affecting noise suppression and 2) newly reveal that the sixth-moment matching also achieves low-musical-noise speech enhancement as well as kurtosis matching.
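Kurtosis and its higher-order generalization can be computed directly from samples. The sketch below shows the k-th standardized moment (kurtosis is the case k = 4, the sixth moment is k = 6) and a hypothetical squared-difference penalty between two signals. The penalty form and the function names are illustrative assumptions; the abstract does not specify how the matching term is defined.

```python
def standardized_moment(samples, order):
    """Return E[(x - mean)^order] / std^order for a list of numbers."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    if variance == 0.0:
        raise ValueError("standardized moments are undefined for constant data")
    central = sum((x - mean) ** order for x in samples) / n
    return central / variance ** (order / 2)


def moment_matching_penalty(enhanced, reference, order=4):
    """Hypothetical penalty: squared gap between two standardized moments."""
    gap = standardized_moment(enhanced, order) - standardized_moment(reference, order)
    return gap ** 2


if __name__ == "__main__":
    signal = [0.0, 0.0, 0.0, 4.0]
    print(standardized_moment(signal, 4))  # kurtosis
    print(standardized_moment(signal, 6))  # sixth standardized moment
```

Dividing by the appropriate power of the standard deviation makes the moments scale-invariant, so the penalty compares the shapes of the two distributions rather than their energies.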
Thi Thu Thao KHONG Takashi NAKADA Yasuhiko NAKASHIMA
Adversarial attacks are viewed as a danger to Deep Neural Networks (DNNs) and reveal a weakness of deep learning models in security-critical applications. Recent findings have presented adversarial training as an outstanding defense method against adversaries. Nonetheless, adversarial training is challenging for big datasets and large networks. It is believed that, unless DNN architectures are made larger, it is hard to strengthen the robustness of DNNs to adversarial examples. To avoid iterative adversarial training, we propose Bayes without Bayesian Learning (BwoBL), an algorithm that performs ensemble inference to improve robustness. As an application of transfer learning, we use the learned parameters of pretrained DNNs to build Bayesian Neural Networks (BNNs) and focus on Bayesian inference without the cost of Bayesian learning. Among approaches that do not use adversarial training, our method is more robust than activation functions designed to enhance adversarial robustness. Moreover, BwoBL can be easily integrated into any pretrained DNN, not only Convolutional Neural Networks (CNNs) but also other DNNs such as Self-Attention Networks (SANs), which outperform their convolutional counterparts. BwoBL is also convenient to apply to scaled networks, e.g., ResNet and EfficientNet, with better performance. In particular, our algorithm employs a variety of DNN architectures to construct BNNs against a diversity of adversarial attacks on a large-scale dataset. Specifically, under an l∞-norm PGD attack with pixel perturbation ε=4/255 and 100 iterations on ImageNet, our proposal combined with naturally pretrained ResNets, SANs, and EfficientNets increases top-5 accuracy by 58.18% on average. This enhancement is 62.26% on average under the l2-norm C&W attack. The combination of our proposed method with EfficientNets pretrained on both natural and adversarial images (EfficientNet-ADV) drastically boosts robustness against PGD and C&W attacks without additional training. Our EfficientNet-ADV-B7 achieves cutting-edge top-5 accuracy of 92.14% and 94.20% on adversarial ImageNet generated by powerful PGD and C&W attacks, respectively.
We propose a new framework for estimating depth information from a single image. Our framework is relatively small and straightforward, employing a two-stage architecture: a residual network and a simple decoder network. The residual network is a remodeled version of the original ResNet-50 architecture that consists of only thirty-eight convolution layers in the residual blocks, followed by a pair of two up-sampling and layers. The simple decoder network, a stack of five convolution layers, accepts the initial depth and refines it into the final output depth. During training, we monitor the loss behavior and adjust the learning rate hyperparameter in order to improve the performance. Furthermore, instead of using a single common pixel-wise loss, we also compute losses based on gradient direction and structural similarity. This design significantly reduces the number of network parameters while producing a more accurate depth map. The performance of our approach has been evaluated through both quantitative and qualitative comparisons with several prior related methods on the public NYU and KITTI datasets.
Fuma HORIE Hideaki GOTO Takuo SUGANUMA
Scene character recognition has been intensively investigated for a couple of decades because it has a great potential in many applications including automatic translation, signboard recognition, and reading assistance for the visually-impaired. However, scene characters are difficult to recognize at sufficient accuracy owing to various noise and image distortions. In addition, Japanese scene character recognition is more challenging and requires a large amount of character data for training because thousands of character classes exist in the language. Some researchers proposed training data augmentation techniques using Synthetic Scene Character Data (SSCD) to compensate for the shortage of training data. In this paper, we propose a Random Filter which is a new method for SSCD generation, and introduce an ensemble scheme with the Random Image Feature (RI-Feature) method. Since there has not been a large Japanese scene character dataset for the evaluation of the recognition systems, we have developed an open dataset JPSC1400, which consists of a large number of real Japanese scene characters. It is shown that the accuracy has been improved from 70.9% to 83.1% by introducing the RI-Feature method to the ensemble scheme.
Sooyong JEONG Sungdeok CHA Woo Jin LEE
Embedded software often interacts with multiple inputs from various sensors whose dependencies are often complex or only partially known to developers. With incomplete information on dependencies, testing is likely to be insufficient for detecting errors. We propose a method to enhance the testing coverage of embedded software by identifying subtle and often neglected dependencies using information contained in usage logs. Usage logs, traditionally used primarily for investigative purposes following accidents, can also make a useful contribution during testing of embedded software. Our approach relies on first individually developing a behavioral model for each environmental input, then performing compositional analysis while identifying feasible but untested dependencies from the usage log, and finally generating additional test cases that correspond to untested or insufficiently tested dependencies. Experimental evaluation was performed on an Android application named Gravity Screen as well as an Arduino-based wearable glove app. Whereas a conventional CTM-based testing technique achieved average branch coverage of 26% and 68% on these applications, respectively, the proposed technique achieved 100% coverage on both.
Zhentian WU Feng YAN Zhihua YANG Jingya YANG
This paper studies using price incentives to shift bandwidth demand from peak to non-peak periods. In particular, cost discounts decrease as peak monthly usage increases. We take into account the delay sensitivity of different apps: during peak hours, the usage of hard real-time applications (HRAS) is not counted in the user's monthly data cap, while the usage of other applications (OAS) is counted in the user's monthly data cap. As a result, users may voluntarily delay or abandon OAS in order to get a higher fee discount. Then, a new data rate control algorithm is proposed. The algorithm allocates the data rate according to the priority of the source, which is determined by two factors: (I) the allocated data rate; and (II) the waiting time.
Yan ZHAO Yue XIE Ruiyu LIANG Li ZHANG Li ZHAO Chengyu LIU
Depression, as a mental disorder, endangers people's health and affects the social order. As an efficient means of diagnosing depression, automatic depression detection has attracted the interest of many researchers. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection that makes full use of the differences between depressive and non-depressive speech across time frames. The proposed model uses frame-level features, which capture the temporal information of depressive speech, in place of traditional statistical features as the input of the LSTM layers. To obtain richer multi-dimensional deep feature representations, the LSTM output is then passed to attention layers on both the time and feature dimensions. Then, we concatenate the outputs of the attention layers and feed the fused feature representation into the fully connected layer. Finally, the fully connected layer's output is passed to a softmax layer. Experiments conducted on the DAIC-WOZ database demonstrate that the proposed attentive LSTM model achieves an average accuracy rate of 90.2% and outperforms the traditional LSTM network and LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.
Xinran LIU Zhongju WANG Long WANG Chao HUANG Xiong LUO
In this paper, a hybrid Retinex-based image enhancement algorithm is proposed to improve the quality of images captured by unmanned aerial vehicles (UAVs). Hyperparameters of the employed multi-scale Retinex with chromaticity preservation (MSRCP) model are automatically tuned via a two-phase evolutionary computing algorithm. In the two-phase optimization algorithm, the Rao-2 algorithm is first applied to perform the global search, and a solution is obtained by maximizing the objective function. Next, the Nelder-Mead simplex method is used to improve the solution via local search. Real low-quality images taken by UAVs are collected to verify the performance of the proposed algorithm. Four well-known image enhancement algorithms, Multi-Scale Retinex, Multi-Scale Retinex with Color Restoration, Automated Multi-Scale Retinex, and MSRCP, are used as benchmarking methods. In addition, two commonly used evolutionary computing algorithms, particle swarm optimization and the flower pollination algorithm, are considered to verify the efficiency of the proposed method in tuning the parameters of the MSRCP model. Experimental results demonstrate that the proposed method achieves the best performance compared with the benchmarks and is thus applicable to real UAV-based applications.