Keyword Search Result

[Keyword] network(4507hit)

301-320hit(4507hit)

  • Flexible Bayesian Inference by Weight Transfer for Robust Deep Neural Networks

    Thi Thu Thao KHONG  Takashi NAKADA  Yasuhiko NAKASHIMA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/07/28
      Vol:
    E104-D No:11
      Page(s):
    1981-1991

    Adversarial attacks are viewed as a danger to Deep Neural Networks (DNNs) because they reveal a weakness of deep learning models in security-critical applications. Recent findings have presented adversarial training as an outstanding defense method against adversaries. Nonetheless, adversarial training is challenging for big datasets and large networks. It is believed that, without making DNN architectures larger, it is hard to strengthen the robustness of DNNs to adversarial examples. To avoid iterative adversarial training, we propose Bayes without Bayesian Learning (BwoBL), an algorithm that performs ensemble inference to improve robustness. As an application of transfer learning, we use the learned parameters of pretrained DNNs to build Bayesian Neural Networks (BNNs) and focus on Bayesian inference without the cost of Bayesian learning. Without adversarial training, our method is more robust than activation functions designed to enhance adversarial robustness. Moreover, BwoBL can easily be integrated into any pretrained DNN, not only Convolutional Neural Networks (CNNs) but also other DNNs, such as Self-Attention Networks (SANs) that outperform their convolutional counterparts. BwoBL is also convenient to apply to scaling networks, e.g., ResNet and EfficientNet, with better performance. In particular, our algorithm employs a variety of DNN architectures to construct BNNs against a diversity of adversarial attacks on a large-scale dataset. Under an l∞-norm PGD attack with pixel perturbation ε=4/255 and 100 iterations on ImageNet, our proposal combined with naturally pretrained ResNets, SANs, and EfficientNets increases top-5 accuracy by 58.18% on average. This enhancement is 62.26% on average under the l2-norm C&W attack.
    The combination of our proposed method with EfficientNets pretrained on both natural and adversarial images (EfficientNet-ADV) drastically boosts robustness against PGD and C&W attacks without additional training. Our EfficientNet-ADV-B7 achieves cutting-edge top-5 accuracy of 92.14% and 94.20% on adversarial ImageNet generated by powerful PGD and C&W attacks, respectively.

  • Distributed Optimal Estimation with Scalable Communication Cost

    Ryosuke ADACHI  Yuh YAMASHITA  Koichi KOBAYASHI  

     
    PAPER

      Publicized:
    2021/05/18
      Vol:
    E104-A No:11
      Page(s):
    1470-1476

    This paper addresses distributed optimal estimation over wireless sensor networks with scalable communication. To realize scalable communication, a data-aggregation method is introduced. Since our previously proposed method cannot guarantee the global optimality of each estimator, a modified protocol is proposed. The modification is that weights are introduced in the data aggregation. For selecting the weight values in the data aggregation, a redundant output reduction method with minimum covariance is discussed. Based on the proposed protocol, all estimators can calculate the optimal estimate. Finally, numerical simulations show that the proposed method can realize both scalable communication and high-accuracy estimation.

  • Improving the Recognition Accuracy of a Sound Communication System Designed with a Neural Network

    Kosei OZEKI  Naofumi AOKI  Saki ANAZAWA  Yoshinori DOBASHI  Kenichi IKEDA  Hiroshi YASUDA  

     
    PAPER-Engineering Acoustics

      Publicized:
    2021/05/06
      Vol:
    E104-A No:11
      Page(s):
    1577-1584

    This study has developed a system that performs data communications using the high-frequency bands of sound signals. Unlike radio communication systems that use advanced wireless devices, it requires only legacy devices such as the microphones and speakers employed in ordinary telephony communication systems. In this study, we have investigated the possibility of a machine learning approach to improve the recognition accuracy in identifying binary symbols exchanged through sound media. This paper describes experimental results evaluating the performance of our proposed technique, which employs a neural network as its classifier of binary symbols. The experimental results indicate that the proposed technique may be suitable for designing an optimal classifier for the symbol identification task.

  • Analysis against Security Issues of Voice over 5G

    Hyungjin CHO  Seongmin PARK  Youngkwon PARK  Bomin CHOI  Dowon KIM  Kangbin YIM  

     
    PAPER

      Publicized:
    2021/07/13
      Vol:
    E104-D No:11
      Page(s):
    1850-1856

    As of February 2021, with competition for the commercialization of 5G mobile communication increasing, 5G SA networks and Vo5G are expected to be commercialized soon. 5G mobile communication aims to provide a transmission speed of 20 Gbps, which is 20 times faster than 4G mobile communication, connection of at least 1 million devices per 1 km², and a transmission delay of 1 ms, which is 10 times shorter than 4G. To meet these goals, various technological developments were required, and technologies such as Massive MIMO (Multiple-Input and Multiple-Output), mmWave, and small cell networks were developed and applied in the 5G access network. However, in the core network, the components constituting the LTE (Long Term Evolution) core network are utilized as they are in the NSA (Non-Standalone) architecture, and changes have occurred only in the SA (Standalone) architecture. Also, in the network that provides the voice service, the IMS (IP Multimedia Subsystem) infrastructure is still used in the SA architecture. The issue is that, while 5G mobile communication is evolving openly to provide various services, its security elements are vulnerable to various cyber-attacks because they maintain the same form as before. Therefore, in this paper, we look at what the network standard for 5G voice service provision consists of and what its security vulnerabilities are. We also suggest possible attack scenarios that exploit these security issues, and consider whether these problems can actually occur and what the countermeasures are.

  • Neural Network Calculations at the Speed of Light Using Optical Vector-Matrix Multiplication and Optoelectronic Activation

    Naoki HATTORI  Jun SHIOMI  Yutaka MASUDA  Tohru ISHIHARA  Akihiko SHINYA  Masaya NOTOMI  

     
    PAPER

      Publicized:
    2021/05/17
      Vol:
    E104-A No:11
      Page(s):
    1477-1487

    With the rapid progress of integrated nanophotonics technology, optical neural network architectures have been widely investigated. Since an optical neural network can complete inference processing just by propagating the optical signal through the network, it is expected to be more than one order of magnitude faster than electronics-only implementations of artificial neural networks (ANNs). In this paper, we first propose an optical vector-matrix multiplication (VMM) circuit using wavelength division multiplexing, which enables ultra-wideband inference processing at the speed of light. We next propose an optoelectronic circuit implementation of batch normalization and the activation function, which significantly improves the accuracy of inference processing without sacrificing speed. Finally, using a virtual environment for machine learning and an optoelectronic circuit simulator, we demonstrate the ultra-fast and accurate operation of the optical-electronic ANN circuit.

  • Evaluation Metrics for the Cost of Data Movement in Deep Neural Network Acceleration

    Hongjie XU  Jun SHIOMI  Hidetoshi ONODERA  

     
    PAPER

      Publicized:
    2021/06/01
      Vol:
    E104-A No:11
      Page(s):
    1488-1498

    Hardware accelerators are designed to support a specialized processing dataflow for ever-changing deep neural networks (DNNs) under various processing environments. This paper introduces two hardware properties to describe the cost of data movement at each level of the memory hierarchy. Based on these hardware properties, this paper proposes a set of evaluation metrics that estimate the number of memory accesses and the required memory capacity for a given processing dataflow. The proposed metrics can analytically predict the energy, throughput, and area of a hardware design without a detailed implementation. Once a processing dataflow and the constraints on hardware resources are determined, the proposed evaluation metrics quickly quantify the expected hardware benefits, thereby reducing design time.

  • A Multi-Task Scheme for Supervised DNN-Based Single-Channel Speech Enhancement by Using Speech Presence Probability as the Secondary Training Target

    Lei WANG  Jie ZHU  Kangbo SUN  

    This paper has been cancelled due to violation of duplicate submission policy on IEICE Transactions on Information and Systems.
     
    PAPER-Speech and Hearing

      Publicized:
    2021/08/05
      Vol:
    E104-D No:11
      Page(s):
    1963-1970

    To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) are investigated to estimate different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation techniques. Further, a mask such as the Wiener gain can be estimated directly or derived from the estimated interference power spectral density (PSD) or the estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target to assist the target estimation in the main task. Domain-specific information is shared between the two tasks to learn a more generalizable representation. Since the performance of a multi-task network is sensitive to the weight parameters of the loss function, homoscedastic uncertainty is introduced to adaptively learn the weights, which is shown to outperform the fixed weighting method. Simulation results show that the proposed multi-task scheme improves overall speech enhancement performance compared to conventional single-task methods. Moreover, joint direct mask and SPP estimation yields the best performance among all the considered techniques.
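
    The abstract above does not spell out its loss function. As a rough illustration of how homoscedastic uncertainty can weight two task losses adaptively, the sketch below uses one widely used formulation, total = sum_i exp(-s_i) * L_i + s_i with a learnable log-variance s_i per task. The function names, the fixed example losses, and the plain gradient-descent loop are assumptions made for illustration and are not taken from the paper.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses using log-variances s_i = log(sigma_i^2).

    total = sum_i exp(-s_i) * L_i + s_i
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("one log-variance per task is required")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def grad_wrt_log_vars(task_losses, log_vars):
    """Analytic gradient of the total loss with respect to each s_i."""
    return [1.0 - math.exp(-s) * loss for loss, s in zip(task_losses, log_vars)]


def fit_log_vars(task_losses, steps=2000, lr=0.05):
    """Gradient descent on s_i with the task losses held fixed (illustration only)."""
    log_vars = [0.0] * len(task_losses)
    for _ in range(steps):
        grads = grad_wrt_log_vars(task_losses, log_vars)
        log_vars = [s - lr * g for s, g in zip(log_vars, grads)]
    return log_vars


if __name__ == "__main__":
    # Hypothetical losses for a main (mask) task and a secondary (SPP) task.
    losses = [0.8, 0.2]
    s = fit_log_vars(losses)
    weights = [math.exp(-v) for v in s]
    print("learned log-variances:", s)
    print("effective task weights:", weights)
```

    With the losses held fixed, each s_i settles at log(L_i), so the effective weight exp(-s_i) approaches 1/L_i. In real training the losses change at every step and the s_i are updated jointly with the network parameters.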

  • Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network

    Mariana RODRIGUES MAKIUCHI  Tifani WARNITA  Nakamasa INOUE  Koichi SHINODA  Michitaka YOSHIMURA  Momoko KITAZAWA  Kei FUNAKI  Yoko EGUCHI  Taishiro KISHIMOTO  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2021/08/03
      Vol:
    E104-D No:11
      Page(s):
    1930-1940

    We propose a non-invasive and cost-effective method to automatically detect dementia using only speech audio data. We extract paralinguistic features from a short speech segment and use Gated Convolutional Neural Networks (GCNN) to classify it as dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method yields an accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. On the PROMPT Database, our method yields an accuracy of 74.7% using 4 seconds of speech data, which improves to 80.8% when we use all of the patient's speech data. Furthermore, we evaluate our method on a three-class classification problem that includes the Mild Cognitive Impairment (MCI) class and achieve an accuracy of 60.6% with 40 seconds of speech data.

  • Simple Oblivious Routing Method to Balance Load in Network-on-Chip

    Jiao GUAN  Jueping CAI  Ruilian XIE  Yequn WANG  Jinzhi LAI  

     
    LETTER-Computer System

      Publicized:
    2021/06/30
      Vol:
    E104-D No:10
      Page(s):
    1749-1752

    This letter presents an oblivious and load-balanced routing (OLBR) method without virtual channels for 2D mesh Network-on-Chip (NoC). To balance the traffic load of the network and avoid deadlock, OLBR divides the network nodes into two regions. One region contains the nodes on the east and west sides of the NoC, in which packets are routed by the odd-even turn rule with Y-direction preference (OE-YX). The remaining nodes form the other region, in which packets are routed by the odd-even turn rule with alterable priority arbitration (OE-APA). Simulation results show that OLBR can improve saturation throughput over related works by 11.73% and balances the traffic load over the entire network.

  • FL-GAN: Feature Learning Generative Adversarial Network for High-Quality Face Sketch Synthesis

    Lin CAO  Kaixuan LI  Kangning DU  Yanan GUO  Peiran SONG  Tao WANG  Chong FU  

     
    PAPER-Image

      Publicized:
    2021/04/05
      Vol:
    E104-A No:10
      Page(s):
    1389-1402

    Face sketch synthesis refers to transforming facial photos into sketches. Recent research on face sketch synthesis has achieved great success due to the development of Generative Adversarial Networks (GANs). However, these generative methods are prone to neglecting detailed information and thus lose some individual-specific features, such as glasses and headdresses. In this paper, we propose a novel method called Feature Learning Generative Adversarial Network (FL-GAN) to synthesize detail-preserving, high-quality sketches. Specifically, the proposed FL-GAN consists of one Feature Learning (FL) module and one Adversarial Learning (AL) module. The FL module aims to learn the detailed information of the image in a latent space and to guide the AL module to synthesize detail-preserving sketches. The AL module aims to learn the structure and texture of sketches and to improve the quality of synthetic sketches through an adversarial learning strategy. Quantitative and qualitative comparisons with seven state-of-the-art methods (LLE, MRF, MWF, RSLCR, RL, FCN, and GAN) on four facial sketch datasets demonstrate the superiority of this method.

  • DeepSIP: A System for Predicting Service Impact of Network Failure by Temporal Multimodal CNN

    Yoichi MATSUO  Tatsuaki KIMURA  Ken NISHIMATSU  

     
    PAPER-Network Management/Operation

      Publicized:
    2021/04/01
      Vol:
    E104-B No:10
      Page(s):
    1288-1298

    When a failure occurs in a network element, such as a switch, router, or server, network operators need to recognize the service impact, such as the time to recovery from the failure or the severity of the failure, since service impact is essential information for handling failures. In this paper, we propose the Deep learning based Service Impact Prediction system (DeepSIP), which predicts the service impact of a network failure in a network element using a temporal multimodal convolutional neural network (CNN). More precisely, DeepSIP predicts the time to recovery from the failure and the loss of traffic volume due to the failure in a network on the basis of information from syslog messages and traffic volume. Since the time to recovery is useful information for a service level agreement (SLA) and the loss of traffic volume is directly related to the severity of the failure, we regard the time to recovery and the loss of traffic volume as the service impact. The service impact is challenging to predict, since it depends on the type of network failure and the traffic volume when the failure occurs. Moreover, network elements do not explicitly contain any information about the service impact. To extract the type of network failure and predict the service impact, we use syslog messages and past traffic volume. However, syslog messages and traffic volume are also challenging to analyze because these data are multimodal, are strongly correlated, and have temporal dependencies. To extract useful features for prediction, we develop a temporal multimodal CNN. We experimentally evaluated the accuracy of DeepSIP by comparing it with other NN-based methods on synthetic and real datasets. For both datasets, the results show that DeepSIP outperformed the baselines.

  • Eigenvalue Based Relay Selection for XOR-Physical Layer Network Coding in Bi-Directional Wireless Relaying Networks

    Satoshi DENNO  Kazuma YAMAMOTO  Yafei HOU  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2021/03/25
      Vol:
    E104-B No:10
      Page(s):
    1336-1344

    This paper proposes relay selection techniques for XOR physical layer network coding with MMSE-based non-linear precoding in MIMO bi-directional wireless relaying networks. The proposed selection techniques are derived under different assumptions about the characteristics of the MMSE-based non-linear precoding in the wireless network. We show that the signal-to-noise power ratio (SNR) depends on the product of all the eigenvalues of the channels from the terminals to the relays. This paper shows that the best of the proposed selection techniques is to select the group of relays that maximizes this product. Therefore, the selection technique is called “product of all eigenvalues (PAE)” in this paper. The performance of the proposed relay selection techniques is evaluated in a MIMO bi-directional wireless relaying network where two terminals with 2 antennas each exchange their information via relays. When the PAE is applied to select a group of 2 relays out of 10 single-antenna relays, the PAE attains a gain of more than 13 dB at a BER of 10⁻³.
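
    To make the selection rule concrete, the sketch below exhaustively searches all pairs of relays for the one that maximizes a product of eigenvalues. It rests on several assumptions that the abstract does not confirm: each terminal has 2 antennas, each relay has 1 antenna, the 2x2 channel matrix H of a relay pair is formed by stacking the two relays' channel vectors, and the metric is the product of the eigenvalues of H H^H for both terminals, which for a square H equals |det H|^2. All function names are hypothetical.

```python
import itertools
import random


def det2(m):
    """Determinant of a 2x2 complex matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def eigenvalue_product(h):
    """Product of the eigenvalues of H H^H for a square H, i.e. |det H|^2."""
    return abs(det2(h)) ** 2


def pae_select(channels_a, channels_b):
    """Return the relay pair with the largest eigenvalue product, and its metric.

    channels_a[k] and channels_b[k] are the 2-element channel vectors from
    terminal A and terminal B to single-antenna relay k (assumed layout).
    """
    best_pair, best_metric = None, -1.0
    for pair in itertools.combinations(range(len(channels_a)), 2):
        h_a = [channels_a[k] for k in pair]  # 2x2 channel, terminal A to pair
        h_b = [channels_b[k] for k in pair]  # 2x2 channel, terminal B to pair
        metric = eigenvalue_product(h_a) * eigenvalue_product(h_b)
        if metric > best_metric:
            best_pair, best_metric = pair, metric
    return best_pair, best_metric


def rayleigh_channels(num_relays, rng):
    """Draw i.i.d. complex Gaussian channel vectors (an assumed fading model)."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
            for _ in range(num_relays)]


if __name__ == "__main__":
    rng = random.Random(0)
    ch_a = rayleigh_channels(10, rng)
    ch_b = rayleigh_channels(10, rng)
    pair, metric = pae_select(ch_a, ch_b)
    print("selected relays:", pair, "metric:", metric)
```

    Choosing 2 relays out of 10 means only 45 candidate pairs, so exhaustive search is cheap at this scale. Larger groups would need a general determinant or eigenvalue routine in place of `det2`.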

  • A Reinforcement Learning Approach for Self-Optimization of Coverage and Capacity in Heterogeneous Cellular Networks

    Junxuan WANG  Meng YU  Xuewei ZHANG  Fan JIANG  

     
    PAPER-Antennas and Propagation

      Publicized:
    2021/04/13
      Vol:
    E104-B No:10
      Page(s):
    1318-1327

    Heterogeneous networks (HetNets) are emerging as an inevitable means of tackling the capacity crunch of cellular networks. Due to the complicated network environment and the large number of configured parameters, coverage and capacity optimization (CCO) is a challenging issue in heterogeneous cellular networks. By combining a self-optimizing algorithm for radio frequency (RF) parameters with the power control mechanism of small cells, this paper addresses the CCO problem of self-organizing networks. First, the optimization of RF parameters is solved with reinforcement learning (RL), where the base station is modeled as an agent that can learn effective strategies to control the tunable parameters by interacting with the surrounding environment. Second, the small cell can autonomously change the state of wireless transmission by comparing its distance from the user equipment with the virtual cell size. Simulation results show that the proposed algorithm achieves better user throughput than several conventional methods.

  • Siamese Visual Tracking with Dual-Pipeline Correlated Fusion Network

    Ying KANG  Cong LIU  Ning WANG  Dianxi SHI  Ning ZHOU  Mengmeng LI  Yunlong WU  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/07/09
      Vol:
    E104-D No:10
      Page(s):
    1702-1711

    Siamese visual tracking, viewed as a problem of max-similarity matching to the target template, has attracted increasing attention in computer vision. However, current Siamese trackers find it hard to balance accuracy in real-time tracking against robustness in long-term tracking. This work proposes a new Siamese-based tracker with a dual-pipeline correlated fusion network (named ADF-SiamRPN), which uses an initial template for robust correlation and a transient template with adaptive optimal feature selection for accurate correlation. A learnable correlation-response fusion network then combines the two responses, aiming at an overall improvement in tracking performance. To compare the performance of ADF-SiamRPN with state-of-the-art trackers, we conduct extensive experiments on benchmarks including OTB100, UAV123, VOT2016, VOT2018, GOT-10k, LaSOT, and TrackingNet. The experimental results demonstrate that ADF-SiamRPN outperforms all the compared trackers and achieves the best balance between accuracy and robustness.

  • Document-Level Neural Machine Translation with Associated Memory Network

    Shu JIANG  Rui WANG  Zuchao LI  Masao UTIYAMA  Kehai CHEN  Eiichiro SUMITA  Hai ZHAO  Bao-liang LU  

     
    PAPER-Natural Language Processing

      Publicized:
    2021/06/24
      Vol:
    E104-D No:10
      Page(s):
    1712-1723

    Standard neural machine translation (NMT) is based on the assumption that the document-level context is independent. Most existing document-level NMT approaches are satisfied with a superficial sense of global document-level information, while this work focuses on exploiting detailed document-level context by means of a memory network. The capacity of the memory network to detect, from memory, the part most relevant to the current sentence provides a natural way to model rich document-level context. In this work, the proposed document-aware memory network is implemented to enhance the Transformer NMT baseline. Experiments on several tasks show that the proposed method significantly improves NMT performance over strong Transformer baselines and other related studies.

  • Multi-Task Learning for Improved Recognition of Multiple Types of Acoustic Information

    Jae-Won KIM  Hochong PARK  

     
    LETTER-Speech and Hearing

      Publicized:
    2021/07/14
      Vol:
    E104-D No:10
      Page(s):
    1762-1765

    We propose a new method for improving the recognition performance of phonemes, speech emotions, and music genres using multi-task learning. When tasks are closely related, multi-task learning can improve the performance of each task by learning a common feature representation for all the tasks. However, the recognition tasks considered in this study demand different input signals of speech and music at different time scales, resulting in input features with different characteristics. In addition, a training dataset with multiple labels for all information sources is not available. Considering these issues, we conduct multi-task learning in a sequential training process using input features with a single label for one information source. A comparative evaluation confirms that the proposed multi-task learning method provides higher performance on all recognition tasks than learning each task individually, as in conventional methods.

  • Research on a Prediction Method for Carbon Dioxide Concentration Based on an Optimized LSTM Network of Spatio-Temporal Data Fusion

    Jun MENG  Gangyi DING  Laiyang LIU  

     
    LETTER-Data Engineering, Web Information Systems

      Publicized:
    2021/07/08
      Vol:
    E104-D No:10
      Page(s):
    1753-1757

    In view of the different spatial and temporal resolutions of observed multi-source heterogeneous carbon dioxide data and the uncertain quality of observations, a data fusion prediction model for observed multi-scale carbon dioxide concentration data is studied. First, a wireless carbon sensor network is created, the gross error data in the original dataset are eliminated, and the remaining valid data are combined using the kriging method to generate a series of continuous surfaces that express specific features and provide unified, spatio-temporally normalized data for subsequent prediction models. Then, a long short-term memory network is used to process these continuous time- and space-normalized data to obtain a carbon dioxide concentration prediction model at any scale. Finally, the experimental results illustrate that the proposed method with spatio-temporal features is more accurate than the single-sensor monitoring method without spatio-temporal features.

  • Triplet Attention Network for Video-Based Person Re-Identification

    Rui SUN  Qili LIANG  Zi YANG  Zhenghui ZHAO  Xudong ZHANG  

     
    LETTER-Image Recognition, Computer Vision

      Publicized:
    2021/07/21
      Vol:
    E104-D No:10
      Page(s):
    1775-1779

    Video-based person re-identification (re-ID) aims at retrieving a person across non-overlapping cameras and has achieved promising results owing to deep convolutional neural networks. Due to the dynamic properties of video, the problems of background clutter and occlusion are more serious than in image-based person re-ID. In this letter, we present a novel triple attention network (TriANet) that simultaneously utilizes temporal, spatial, and channel context information by employing the self-attention mechanism to obtain robust and discriminative features. Specifically, the network has two parts. The first part introduces a residual attention subnetwork, which contains a channel attention module that captures cross-dimension dependencies by using rotation and transformation, and a spatial attention module that focuses on pedestrian features. In the second part, a time attention module is designed to judge the quality score of each pedestrian image and to reduce the weight of incomplete pedestrian images to alleviate the occlusion problem. We evaluate our proposed architecture on three datasets: iLIDS-VID, PRID2011, and MARS. Extensive comparative experimental results show that our proposed method achieves state-of-the-art results.

  • Discovering Multiple Clusters of Sightseeing Spots to Improve Tourist Satisfaction Using Network Motifs

    Tengfei SHAO  Yuya IEIRI  Reiko HISHIYAMA  

     
    PAPER-Office Information Systems, e-Business Modeling

      Publicized:
    2021/07/09
      Vol:
    E104-D No:10
      Page(s):
    1640-1650

    Tourist satisfaction plays a very important role in the development of local community tourism. For the development of tourist destinations in local communities, it is important to measure, maintain, and improve tourist destination loyalty over the medium to long term. It has been shown that improving tourist satisfaction is a major factor in improving tourist destination loyalty. Therefore, to improve tourist satisfaction in local communities, we identified multiple clusters of sightseeing spots and determined that the satisfaction of tourists can be increased based on these clusters. Our discovery flow can be summarized as follows. First, we extracted tourism keywords from guidebooks on sightseeing spots. We then constructed a complex network of tourists and sightseeing spots based on data collected from experiments conducted in Kyoto. Next, we added the corresponding tourism keywords to each sightseeing spot. Finally, by analyzing network motifs, we successfully discovered multiple clusters of sightseeing spots that could be used to improve tourist satisfaction.

  • Gradient Corrected Approximation for Binary Neural Networks

    Song CHENG  Zixuan LI  Yongsen WANG  Wanbing ZOU  Yumei ZHOU  Delong SHANG  Shushan QIAO  

     
    LETTER-Biocybernetics, Neurocomputing

      Publicized:
    2021/07/05
      Vol:
    E104-D No:10
      Page(s):
    1784-1788

    Binary neural networks (BNNs), where both activations and weights are radically quantized to {-1, +1}, can massively accelerate the run-time performance of convolutional neural networks (CNNs) on edge devices by reducing computational complexity and memory footprint. However, the non-differentiable binarizing function used in BNNs makes the binarized models hard to optimize and introduces significant performance degradation compared with full-precision models. Many previous works have corrected the backward gradient of the binarizing function with various improved versions of straight-through estimation (STE), or with a gradual approximation approach, but the gradient suppression problem was not analyzed or handled. Thus, we propose a novel gradient corrected approximation (GCA) method to address the discrepancy between the binarizing function and its backward gradient in a gradual and stable way. Our work has two primary contributions. The first is to approximate the backward gradient of the binarizing function using a simple leaky-steep function with variable window size. The second is to correct the gradient approximation by standardizing the backward gradient propagated through the binarizing function. Experimental results show that the proposed method outperforms the baseline by 1.5% Top-1 accuracy on the ImageNet dataset without introducing extra computation cost.
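
    For readers unfamiliar with straight-through estimation, the sketch below shows the standard clipped STE that this line of work starts from. It is not the GCA method: the abstract does not define the leaky-steep function or the standardization step, so neither is reproduced here. The `window` parameter and all function names are assumptions introduced for illustration.

```python
def binarize(x):
    """Forward pass: map a real value to {-1, +1} (zero is mapped to +1)."""
    return 1.0 if x >= 0 else -1.0


def ste_backward(x, grad_output, window=1.0):
    """Backward pass of the clipped straight-through estimator.

    The true derivative of sign(x) is zero almost everywhere, so the incoming
    gradient is passed through unchanged inside [-window, window] and set to
    zero outside that range.
    """
    return grad_output if abs(x) <= window else 0.0


def forward_backward(inputs, upstream_grads, window=1.0):
    """Apply the binarizer and its surrogate gradient element by element."""
    outputs = [binarize(x) for x in inputs]
    grads = [ste_backward(x, g, window) for x, g in zip(inputs, upstream_grads)]
    return outputs, grads


if __name__ == "__main__":
    xs = [-2.0, -0.5, 0.0, 0.3, 1.5]
    outs, grads = forward_backward(xs, [1.0] * len(xs))
    print("binarized:", outs)
    print("surrogate gradients:", grads)
```

    The forward pass uses the hard sign function while the backward pass substitutes a rectangular window for its derivative. The mismatch between those two functions is the discrepancy that gradient-approximation methods try to reduce.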
