Keyword Search Result

[Keyword] EE(4053hit)

101-120hit(4053hit)

  • An Improved Real-Time Object Tracking Algorithm Based on Deep Learning Features

    Xianyu WANG  Cong LI  Heyi LI  Rui ZHANG  Zhifeng LIANG  Hai WANG  

     
    PAPER-Object Recognition and Tracking

      Publicized:
    2022/01/07
      Vol:
    E106-D No:5
      Page(s):
    786-793

    Visual object tracking is always a challenging task in computer vision. During the tracking, the shape and appearance of the target may change greatly, and because of the lack of sufficient training samples, most of the online learning tracking algorithms will have performance bottlenecks. In this paper, an improved real-time algorithm based on deep learning features is proposed, which combines multi-feature fusion, multi-scale estimation, adaptive updating of target model and re-detection after target loss. The effectiveness and advantages of the proposed algorithm are proved by a large number of comparative experiments with other excellent algorithms on large benchmark datasets.

  • Bearing Remaining Useful Life Prediction Using 2D Attention Residual Network

    Wenrong XIAO  Yong CHEN  Suqin GUO  Kun CHEN  

     
    LETTER-Smart Industry

      Publicized:
    2022/05/27
      Vol:
    E106-D No:5
      Page(s):
    818-820

    An attention residual network with a triple feature as input is proposed to predict the remaining useful life (RUL) of bearings. First, channel attention and spatial attention are connected in series within the residual connection of the residual neural network to obtain a new attention residual module, so that the newly constructed deep learning network can better attend to weak changes in the bearing state. Second, the “triple feature” is used as the input of the attention residual network, so that the deep learning network can better capture the trend of the bearing running state and better predict the RUL of the bearing. Finally, the method is verified on a set of experimental data. The results show that the method is simple and effective, has high prediction accuracy, and reduces manual intervention in RUL prediction.

  • Prediction of Driver's Visual Attention in Critical Moment Using Optical Flow

    Rebeka SULTANA  Gosuke OHASHI  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/01/26
      Vol:
    E106-D No:5
      Page(s):
    1018-1026

    In recent years, driver's visual attention has been actively studied for driving automation technology. However, few models provide insight into the driver's attention at various moments. All attention models process multi-level image representations with a two-stream/multi-stream network, which increases the computational cost because of the larger number of model parameters. However, multi-level image representation such as optical flow plays a vital role in tasks involving videos. Therefore, to reduce the computational cost of a two-stream network and still use multi-level image representation, this work proposes a single-stream driver's visual attention model for critical situations. The experiment was conducted using a publicly available critical driving dataset named BDD-A. Qualitative results confirm the effectiveness of the proposed model. Moreover, quantitative results show that the proposed model outperforms state-of-the-art visual attention models according to CC and SIM. Extensive ablation studies verify the presence of optical flow in the model, the position of optical flow in the spatial network, the convolution layers used to process optical flow, and the computational cost compared to a two-stream model.

  • Clustering-Based Neural Network for Carbon Dioxide Estimation

    Conghui LI  Quanlin ZHONG  Baoyin LI  

     
    LETTER-Intelligent Transportation Systems

      Publicized:
    2022/08/01
      Vol:
    E106-D No:5
      Page(s):
    829-832

    In recent years, applications of deep learning have facilitated the development of green intelligent transportation systems (ITS), and carbon dioxide estimation has been one of the important issues in green ITS. Furthermore, carbon dioxide estimation can be modelled as fuel consumption estimation. Therefore, a clustering-based neural network is proposed to analyze clusters in accordance with fuel consumption behaviors and obtain the estimated fuel consumption and the estimated carbon dioxide. In experiments, the mean absolute percentage error (MAPE) of the proposed method is only 5.61%, and the proposed method outperforms the other methods.
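The MAPE figure quoted in this abstract follows a standard definition, which can be sketched as below. This is only an illustration of the metric: the function name `mape` and the sample readings are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the true value;
    # the true values must be non-zero for the ratio to be defined.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical fuel-consumption readings (illustrative numbers only).
measured = [100.0, 200.0]
estimated = [110.0, 190.0]
print(mape(measured, estimated))
```

Because every error is divided by the true value, MAPE is scale-free, which makes it convenient for comparing estimators across vehicles with very different consumption levels.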

  • Performance Aware Egress Path Discovery for Content Provider with SRv6 Egress Peer Engineering

    Yasunobu TOYOTA  Wataru MISHIMA  Koichiro KANAYA  Osamu NAKAMURA  

     
    PAPER

      Publicized:
    2023/02/22
      Vol:
    E106-D No:5
      Page(s):
    927-939

    QoS of applications is essential for content providers, and it is required to improve the end-to-end communication quality from a content provider to users. Generally, a content provider's data center network is connected to multiple ASes and has multiple egress paths to reach the content user's network. However, on the Internet, the communication quality of network paths outside of the provider's administrative domain is a black box, so multiple egress paths cannot be quantitatively compared. In addition, it is impossible to determine a unique egress path within a network domain because the parameters that affect the QoS of the content are different for each network. We propose a “Performance Aware Egress Path Discovery” method to improve QoS for content providers. The proposed method uses two techniques: Egress Peer Engineering with Segment Routing over IPv6 and Passive End-to-End Measurement. The method is superior in that it allows various metrics depending on the type of content and can be used for measurements without affecting existing systems. To evaluate our method, we deployed the Performance Aware Egress Path Discovery System in an existing content provider network and conducted experiments to provide production services. Our findings from the experiment show that, in this network, 15.9% of users can expect a 30Mbps throughput improvement, and 13.7% of users can expect a 10ms RTT improvement.

  • A Fast Handover Mechanism for Ground-to-Train Free-Space Optical Communication using Station ID Recognition by Dual-Port Camera

    Kosuke MORI  Fumio TERAOKA  Shinichiro HARUYAMA  

     
    PAPER

      Publicized:
    2023/03/08
      Vol:
    E106-D No:5
      Page(s):
    940-951

    There are demands for high-speed and stable ground-to-train optical communication as a network environment for trains. The existing ground-to-train optical communication system developed by the authors uses a camera and a QPD (Quadrant photo diode) to capture beacon light. The problem with the existing system is that it is impossible to identify the ground station. In the system proposed in this paper, a beacon light modulated with the ID of the ground station is transmitted, and the ground station is identified by demodulating the image from the dual-port camera on the opposite side. In this paper, we developed an actual system and conducted experiments using a car on the road. The results showed that only one packet was lost with the ping command every 1 ms near handover. Although the communication device itself has a bandwidth of 100 Mbps, the throughput before and after the handover was about 94 Mbps, and only dropped to about 89.4 Mbps during the handover.

  • Learning Local Similarity with Spatial Interrelations on Content-Based Image Retrieval

    Longjiao ZHAO  Yu WANG  Jien KATO  Yoshiharu ISHIKAWA  

     
    PAPER-Image Processing and Video Processing

      Publicized:
    2023/02/14
      Vol:
    E106-D No:5
      Page(s):
    1069-1080

    Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.

  • Speech Emotion Recognition Using Multihead Attention in Both Time and Feature Dimensions

    Yue XIE  Ruiyu LIANG  Zhenlin LIANG  Xiaoyan ZHAO  Wenhao ZENG  

     
    LETTER-Speech and Hearing

      Publicized:
    2023/02/21
      Vol:
    E106-D No:5
      Page(s):
    1098-1101

    To enhance the emotion feature and improve the performance of speech emotion recognition, an attention mechanism is employed to recognize the important information in both the time and feature dimensions. In the time dimension, multi-head attention is modified with the last state of the long short-term memory (LSTM) output to match the time accumulation characteristic of LSTM. In the feature dimension, scaled dot-product attention is replaced with additive attention, modelled on the LSTM state update, to construct multi-head attention. This means that a nonlinear transformation replaces the linear mapping in classical multi-head attention. Experiments on the IEMOCAP dataset demonstrate that the attention mechanism can enhance emotional information and improve the performance of speech emotion recognition.
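The contrast between scaled dot-product scoring and additive scoring can be sketched as follows. This is a generic, single-head illustration of the two textbook score functions, not the paper's LSTM-based variant; all function names, weight matrices (`w_q`, `w_k`, `v`) and numbers are hypothetical.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def scaled_dot_product_scores(query, keys):
    # Linear in the query: score_i = (q . k_i) / sqrt(d)
    scale = math.sqrt(len(query))
    return [dot(query, k) / scale for k in keys]


def additive_scores(query, keys, w_q, w_k, v):
    # Nonlinear in the query: score_i = v . tanh(W_q q + W_k k_i)
    q_proj = matvec(w_q, query)
    scores = []
    for k in keys:
        k_proj = matvec(w_k, k)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(dot(v, hidden))
    return scores


def attend(scores, values):
    # Weighted sum of the value vectors using softmax-normalized scores.
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * val[j] for w, val in zip(weights, values)) for j in range(dim)]


# Toy inputs (illustrative only).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(attend(scaled_dot_product_scores(query, keys), values))
print(attend(additive_scores(query, keys, identity, identity, [1.0, 1.0]), values))
```

The only difference between the two paths is the score function: the additive form passes the combined query and key projections through `tanh`, which is the nonlinear replacement for the linear mapping that the abstract refers to.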

  • Speech Enhancement for Laser Doppler Vibrometer Dealing with Unknown Irradiated Objects

    Chengkai CAI  Kenta IWAI  Takanobu NISHIURA  

     
    PAPER-Digital Signal Processing

      Publicized:
    2022/09/30
      Vol:
    E106-A No:4
      Page(s):
    647-656

    The acquisition of distant sound has always been a hot research topic. Since sound is caused by vibration, one of the best methods for measuring distant sound is to use a laser Doppler vibrometer (LDV). The laser has high directivity, which enables it to acquire sound from far away and is of great practical use for disaster relief and other situations. However, due to the vibration characteristics of the irradiated object itself and the reflectivity of its surface (or other reasons), the acquired sound often lacks frequency components in certain frequency bands and is mixed with obvious noise. Therefore, when using LDV to acquire distant speech, if we want to recognize the actual content of the speech, it is necessary to enhance the acquired speech signal in some way. Conventional speech enhancement methods are not generally applicable due to the various types of degradation in observed speech. Moreover, while several speech enhancement methods for LDV have been proposed, they are only effective when the irradiated object is known. In this paper, we present a speech enhancement method for LDV that can deal with unknown irradiated objects. The proposed method is composed of noise reduction, pitch detection, power spectrum envelope estimation, power spectrum reconstruction, and phase estimation. Experimental results demonstrate the effectiveness of our method for enhancing the acquired speech with unknown irradiated objects.

  • A QR Decomposition Algorithm with Partial Greedy Permutation for Zero-Forcing Block Diagonalization

    Shigenori KINJO  Takayuki GAMOH  Masaaki YAMANAKA  

     
    PAPER-Communication Theory and Signals

      Publicized:
    2022/10/18
      Vol:
    E106-A No:4
      Page(s):
    665-673

    A new zero-forcing block diagonalization (ZF-BD) scheme that enables both a more simplified ZF-BD and further increase in sum rate of MU-MIMO channels is proposed in this paper. The proposed scheme provides the improvement in BER performance for equivalent SU-MIMO channels. The proposed scheme consists of two components. First, a permuted channel matrix (PCM), which is given by moving the submatrix related to a target user to the bottom of a downlink MIMO channel matrix, is newly defined to obtain a precoding matrix for ZF-BD. Executing QR decomposition alone for a given PCM provides null space for the target user. Second, a partial MSQRD (PMSQRD) algorithm, which adopts MSQRD only for a target user to provide improvement in bit rate and BER performance for the user, is proposed. Some numerical simulations are performed, and the results show improvement in sum rate performance of the total system. In addition, appropriate bit allocation improves the bit error rate (BER) performance in each equivalent SU-MIMO channel. A successive interference cancellation is applied to achieve further improvement in BER performance of user terminals.
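The idea that a QR decomposition of a permuted channel matrix yields null-space directions for the target user can be illustrated with a small real-valued sketch. It assumes that the factorization is applied to the transpose of the permuted matrix, and it uses Gram-Schmidt orthonormalization (one way of computing a QR factorization) on the rows. The channel values, the helper name `gram_schmidt_rows`, and the restriction to real numbers are all assumptions made for illustration, not the paper's algorithm.

```python
def gram_schmidt_rows(rows):
    """Orthonormalize row vectors in order (modified Gram-Schmidt)."""
    basis = []
    for row in rows:
        v = list(row)
        for q in basis:
            # Remove the component of v along each earlier direction.
            coeff = sum(a * b for a, b in zip(q, v))
            v = [a - coeff * b for a, b in zip(v, q)]
        norm = sum(a * a for a in v) ** 0.5
        if norm < 1e-12:
            raise ValueError("rows are linearly dependent")
        basis.append([a / norm for a in v])
    return basis


# Toy real-valued downlink channel: 4 transmit antennas, two users with
# 2 receive antennas each (hypothetical numbers).
h_user1 = [[1.0, 0.5, 0.0, 0.2], [0.0, 1.0, 0.3, 0.1]]
h_user2 = [[0.4, 0.0, 1.0, 0.5], [0.2, 0.1, 0.0, 1.0]]

# Permuted channel matrix for target user 2: its rows are moved to the bottom.
pcm = h_user1 + h_user2
q = gram_schmidt_rows(pcm)
precoder_dirs = q[len(h_user1):]  # directions derived from the target's rows

# Leakage of each direction into user 1's receive antennas (numerically ~0).
leakage = [[sum(a * b for a, b in zip(h, d)) for d in precoder_dirs]
           for h in h_user1]
print(leakage)
```

Because the target user's rows are processed last, the directions derived from them are orthogonal to everything spanned by the other users' rows, so transmitting along them causes no interference at the other users in this toy setting.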

  • A Lightweight Automatic Modulation Recognition Algorithm Based on Deep Learning

    Dong YI  Di WU  Tao HU  

     
    PAPER-Wireless Communication Technologies

      Publicized:
    2022/09/30
      Vol:
    E106-B No:4
      Page(s):
    367-373

    Automatic modulation recognition (AMR) plays a critical role in modern communication systems. Owing to the recent advancements of deep learning (DL) techniques, the application of DL has been widely studied in AMR, and a large number of DL-AMR algorithms with high recognition rates have been developed. Most DL-AMR algorithm models have high recognition accuracy but have numerous parameters and are huge, complex models, which make them hard to deploy on resource-constrained platforms, such as satellite platforms. Some lightweight and low-complexity DL-AMR algorithm models also struggle to meet the accuracy requirements. Based on this, this paper proposes a lightweight and high-recognition-rate DL-AMR algorithm model called Lightweight Densely Connected Convolutional Network (DenseNet) Long Short-Term Memory network (LDLSTM). The model cascade of DenseNet and LSTM can achieve the same recognition accuracy as other advanced DL-AMR algorithms, but the parameter volume is only 1/12 that of these algorithms. Thus, it is advantageous to deploy LDLSTM in resource-constrained systems.

  • Handover Experiment of 60-GHz-Band Wireless LAN in over 200-km/h High-Speed Mobility Environment

    Tatsuhiko IWAKUNI  Daisei UCHIDA  Takuto ARAI  Shuki WAI  Naoki KITA  

     
    PAPER-Terrestrial Wireless Communication/Broadcasting Technologies

      Publicized:
    2022/10/17
      Vol:
    E106-B No:4
      Page(s):
    384-391

    High-frequency wireless communication is drawing attention because of its potential to actualize huge transmission capacity in the next generation wireless system. The use of high-frequency bands requires dense deployment of access points to compensate for significant distance attenuation and diffraction loss. Dense deployment of access points in a mobility environment triggers an increase in the frequency of handover because the number of candidate access points increases. Therefore, simple handover schemes are needed. High-frequency wireless systems enable station position to be determined using their wideband and highly directional communication signals. Thus, simple handover based on position information estimated using the communication signal is possible. Interruptions caused by handover are also a huge barrier to actualizing stable high-frequency wireless communications. This paper proposes a seamless handover scheme using multiple radio units. This paper evaluates the combination of simple handover and the proposed scheme based on experiments using a formula racing car representing the fastest high-speed mobility environment. Experimental results show that seamless handover and high-speed wireless transmission over 200Mbps are achieved over a 400-m area even at station velocities of greater than 200km/h.

  • GConvLoc: WiFi Fingerprinting-Based Indoor Localization Using Graph Convolutional Networks

    Dongdeok KIM  Young-Joo SUH  

     
    LETTER-Information Network

      Publicized:
    2023/01/13
      Vol:
    E106-D No:4
      Page(s):
    570-574

    We propose GConvLoc, a WiFi fingerprinting-based indoor localization method utilizing graph convolutional networks. Using the graph structure, we can consider the fingerprint data of the reference points and their location labels in addition to the fingerprint data of the test point at inference time. Experimental results show that GConvLoc outperforms baseline methods that do not utilize graphs.

  • ConvNeXt-Haze: A Fog Image Classification Algorithm for Small and Imbalanced Sample Dataset Based on Convolutional Neural Network

    Fuxiang LIU  Chen ZANG  Lei LI  Chunfeng XU  Jingmin LUO  

     
    PAPER

      Publicized:
    2022/11/22
      Vol:
    E106-D No:4
      Page(s):
    488-494

    Because defogging algorithms perform differently at different fog concentrations, this paper proposes a fog image classification algorithm for a small and imbalanced sample dataset based on a convolutional neural network, which classifies fog images in advance so as to improve the effect and adaptive ability of image defogging algorithms in fog and haze weather. To address environmental interference, camera depth-of-field interference and uneven feature distribution in fog images, the CutBlur-Gauss data augmentation method and the focal loss and label smoothing strategies are used to improve classification accuracy. The algorithm is compared with the machine learning algorithm SVM and the classical convolutional neural network classification algorithms AlexNet, ResNet34, ResNet50 and ResNet101. It achieves 94.5% classification accuracy on the dataset in this paper, the best accuracy among the compared algorithms, which shows that the improved algorithm has better classification accuracy.
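Focal loss and label smoothing, the two training strategies named in this abstract, can be sketched in their textbook forms as follows. Combining them in one expression and the default values `epsilon=0.1` and `gamma=2.0` are assumptions for illustration; the abstract does not give the paper's exact formulation or hyperparameters.

```python
import math


def smooth_labels(true_class, num_classes, epsilon=0.1):
    # Keep (1 - epsilon) on the true class and spread epsilon uniformly
    # over all classes, so the targets still sum to 1.
    uniform = epsilon / num_classes
    return [(1.0 - epsilon if c == true_class else 0.0) + uniform
            for c in range(num_classes)]


def focal_loss(probs, targets, gamma=2.0):
    # Sum over classes of -t_c * (1 - p_c)^gamma * log(p_c).
    # With gamma = 0 this reduces to ordinary cross-entropy.
    eps = 1e-12  # guards against log(0)
    return -sum(t * (1.0 - p) ** gamma * math.log(max(p, eps))
                for p, t in zip(probs, targets))


# Hypothetical softmax output for a 3-class fog-level classifier.
probs = [0.7, 0.2, 0.1]
targets = smooth_labels(true_class=0, num_classes=3)
print(focal_loss(probs, targets))
```

The `(1 - p)^gamma` factor shrinks the contribution of well-classified samples, which is why focal loss is commonly used on imbalanced datasets, while smoothed targets discourage overconfident predictions on small datasets.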

  • CAMRI Loss: Improving the Recall of a Specific Class without Sacrificing Accuracy

    Daiki NISHIYAMA  Kazuto FUKUCHI  Youhei AKIMOTO  Jun SAKUMA  

     
    PAPER-Artificial Intelligence, Data Mining

      Publicized:
    2023/01/23
      Vol:
    E106-D No:4
      Page(s):
    523-537

    In real world applications of multiclass classification models, misclassification in an important class (e.g., stop sign) can be significantly more harmful than in other classes (e.g., no parking). Thus, it is crucial to improve the recall of an important class while maintaining overall accuracy. For this problem, we found that improving the separation of important classes relative to other classes in the feature space is effective. Existing methods that give a class-sensitive penalty for cross-entropy loss do not improve the separation. Moreover, the methods designed to improve separations between all classes are unsuitable for our purpose because they do not consider the important classes. To achieve the separation, we propose a loss function that explicitly gives loss for the feature space, called class-sensitive additive angular margin (CAMRI) loss. CAMRI loss is expected to reduce the variance of an important class due to the addition of a penalty to the angle between the important class features and the corresponding weight vectors in the feature space. In addition, concentrating the penalty on only the important class hardly sacrifices separating the other classes. Experiments on CIFAR-10, GTSRB, and AwA2 showed that CAMRI loss could improve the recall of a specific class without sacrificing accuracy. In particular, compared with GTSRB's second-worst class recall when trained with cross-entropy loss, CAMRI loss improved recall by 9%.
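The angular penalty described in this abstract can be illustrated with an ArcFace-style margin that is applied only to one important class. This is a hedged sketch of the general idea, not the CAMRI formulation itself: the function `margin_logits`, the `margin` and `scale` defaults, and the clamping of the angle to pi are all assumptions.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_logits(feature, class_weights, label, important_class,
                  margin=0.5, scale=10.0):
    logits = []
    for c, w in enumerate(class_weights):
        cos_theta = max(-1.0, min(1.0, cosine(feature, w)))
        theta = math.acos(cos_theta)
        # Penalize the angle only when the sample belongs to the important
        # class and this is the logit of that same class.
        if c == label == important_class:
            theta = min(math.pi, theta + margin)
        logits.append(scale * math.cos(theta))
    return logits


# Toy 2-D feature and class weight vectors (illustrative only).
weights = [[1.0, 0.0], [0.0, 1.0]]
print(margin_logits([1.0, 0.2], weights, label=0, important_class=0))
print(margin_logits([0.2, 1.0], weights, label=1, important_class=0))
```

Feeding these logits to a softmax cross-entropy loss forces important-class samples to sit at a smaller angle to their class weight vector before the loss is satisfied, while samples of other classes are scored without any margin.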

  • Speech Recognition for Air Traffic Control via Feature Learning and End-to-End Training

    Peng FAN  Xiyao HUA  Yi LIN  Bo YANG  Jianwei ZHANG  Wenyi GE  Dongyue GUO  

     
    PAPER-Speech and Hearing

      Publicized:
    2023/01/23
      Vol:
    E106-D No:4
      Page(s):
    538-544

    In this work, we propose a new automatic speech recognition (ASR) system based on feature learning and an end-to-end training procedure for air traffic control (ATC) systems. The proposed model integrates the feature learning block, recurrent neural network (RNN), and connectionist temporal classification loss to build an end-to-end ASR model. Facing the complex environments of ATC speech, instead of the handcrafted features, a learning block is designed to extract informative features from raw waveforms for acoustic modeling. Both the SincNet and 1D convolution blocks are applied to process the raw waveforms, whose outputs are concatenated to the RNN layers for the temporal modeling. Thanks to the ability to learn representations from raw waveforms, the proposed model can be optimized in a complete end-to-end manner, i.e., from waveform to text. Finally, the multilingual issue in the ATC domain is also considered to achieve the ASR task by constructing a combined vocabulary of Chinese characters and English letters. The proposed approach is validated on a multilingual real-world corpus (ATCSpeech), and the experimental results demonstrate that the proposed approach outperforms other baselines, achieving a 6.9% character error rate.
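The character error rate reported in this abstract is conventionally computed as the edit distance between the reference and the recognized text, divided by the reference length. The sketch below shows that standard definition; the function names and the sample strings are hypothetical and are not drawn from the ATCSpeech corpus.

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical reference and recognized transcripts.
print(character_error_rate("climb to 3000", "climb to 300"))
```

Python strings iterate over Unicode code points, so the same function handles a combined vocabulary of Chinese characters and English letters without modification.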

  • A New Analysis of the Kipnis-Shamir Method Solving the MinRank Problem

    Shuhei NAKAMURA  Yacheng WANG  Yasuhiko IKEMATSU  

     
    PAPER

      Publicized:
    2022/09/29
      Vol:
    E106-A No:3
      Page(s):
    203-211

    The MinRank problem is investigated as a problem related to rank attacks in multivariate cryptography and the decoding of rank codes in coding theory. The Kipnis-Shamir method is one of the methods to solve the problem, and recently, significant progress has been made in its complexity estimation by Verbel et al. As this method reduces the problem to an MQ problem, which asks for a solution to a system of quadratic equations, its complexity depends on the solving degree of a quadratic system deduced from the method. A theoretical value introduced by Verbel et al. approximates the minimal solving degree of the quadratic systems in the method although their value is defined under a certain limit for the system considered. A quadratic system outside their limitation often has a larger solving degree, but the solving complexity is not always higher because it has a smaller number of variables and equations. Thus, in order to discuss the best complexity of the Kipnis-Shamir method, a theoretical value is needed to approximate the solving degree of each quadratic system deduced from the method. A quadratic system deduced from the Kipnis-Shamir method always has a multi-degree, and the solving complexity is influenced by this property. In this study, we introduce a theoretical value defined by such a multi-degree and show that it approximates the solving degree of each quadratic system. Thus, the systems deduced from the method are compared, and the best complexity is discussed. As an application, for the MinRank attack using the Kipnis-Shamir method against the multivariate signature scheme Rainbow, we show a case in which a deduced quadratic system outside Verbel et al.'s limitation is the best. In particular, the complexity estimation of the MinRank attack using the KS method against the Rainbow parameter sets I, III and V is reduced by about 172, 140 and 212 bits, respectively, from Verbel et al.'s estimation.

  • Profiling Deep Learning Side-Channel Attacks Using Multi-Label against AES Circuits with RSM Countermeasure

    Yuta FUKUDA  Kota YOSHIDA  Hisashi HASHIMOTO  Kunihiro KURODA  Takeshi FUJINO  

     
    PAPER

      Publicized:
    2022/09/08
      Vol:
    E106-A No:3
      Page(s):
    294-305

    Deep learning side-channel attacks (DL-SCAs) have been actively studied in recent years. In DL-SCAs, deep neural networks (DNNs) are trained to predict the internal states of a cryptographic operation from side-channel information such as power traces. Selecting suitable DNN output labels that express the internal states is important for successful DL-SCAs. We focus on the multi-label method proposed by Zhang et al. for the hardware-implemented advanced encryption standard (AES). They used the power traces supplied by the AES-HD public dataset and reported revealing a single key byte under conditions in which the target key was the same as the key used for DNN training (profiling key). In this paper, we discuss an improvement for revealing all 16 key bytes under practical conditions in which the target key is different from the profiling key. We prepare hardware-implemented AES without SCA countermeasures on an ASIC as the experimental environment. First, our experimental results show that the DNN using multi-label does not learn side-channel leakage sufficiently from power traces acquired with only one key. Second, we report that the DNN using multi-label learns most of the side-channel leakage when three kinds of profiling keys are used, and all 16 target key bytes are successfully revealed even if the target key is different from the profiling keys. Finally, we applied the proposed method, DL-SCA using multi-label and three profiling keys, against hardware-implemented AES with rotating S-boxes masking (RSM) countermeasures. The experimental result shows that all 16 key bytes are successfully revealed using only 2,000 attack traces. We also studied the reasons for the high performance of the proposed method against RSM countermeasures and found that the information from the weak bits is effectively exploited.

  • Deep Learning of Damped AMP Decoding Networks for Sparse Superposition Codes via Annealing

    Toshihiro YOSHIDA  Keigo TAKEUCHI  

     
    PAPER-Communication Theory and Signals

      Publicized:
    2022/07/22
      Vol:
    E106-A No:3
      Page(s):
    414-421

    This paper addresses short-length sparse superposition codes (SSCs) over the additive white Gaussian noise channel. Damped approximate message-passing (AMP) is used to decode short SSCs with zero-mean independent and identically distributed Gaussian dictionaries. To design damping factors in AMP via deep learning, this paper constructs deep-unfolded damped AMP decoding networks. An annealing method for deep learning is proposed for designing nearly optimal damping factors with high probability. In annealing, damping factors are first optimized via deep learning in the low signal-to-noise ratio (SNR) regime. Then, the obtained damping factors are set to the initial values in stochastic gradient descent, which optimizes damping factors for slightly larger SNR. Repeating this annealing process designs damping factors in the high SNR regime. Numerical simulations show that annealing mitigates fluctuation in learned damping factors and outperforms exhaustive search based on an iteration-independent damping factor.

  • Vulnerability Estimation of DNN Model Parameters with Few Fault Injections

    Yangchao ZHANG  Hiroaki ITSUJI  Takumi UEZONO  Tadanobu TOBA  Masanori HASHIMOTO  

     
    PAPER

      Publicized:
    2022/11/09
      Vol:
    E106-A No:3
      Page(s):
    523-531

    The reliability of deep neural networks (DNNs) against hardware errors is essential as DNNs are increasingly employed in safety-critical applications such as automatic driving. Transient errors in memory, such as radiation-induced soft errors, may propagate through the inference computation, resulting in unexpected output, which can trigger catastrophic system failures. As a first step to tackle this problem, this paper proposes constructing a vulnerability model (VM) with a small number of fault injections to identify vulnerable model parameters in a DNN. We significantly reduce the number of bit locations for fault injection and develop a flow to incrementally collect the training data, i.e., the fault injection results, for VM accuracy improvement. We enumerate key features (KF) that characterize the vulnerability of the parameters and use KF and the collected training data to construct the VM. Experimental results show that the VM can estimate the vulnerabilities of all DNN model parameters with only 1/3490 of the computations required by traditional fault injection-based vulnerability estimation.