Takuya SAKAMOTO Itsuki IWATA Toshiki MINAMI Takuya MATSUMOTO
There has been a growing interest in the application of radar technology to the monitoring of humans and animals and their positions, motions, activities, and vital signs. Radar can be used, for example, to remotely measure vital signs such as respiration and heartbeat without contact. Radar-based human sensing is expected to be adopted in a variety of fields, such as medicine, healthcare, and entertainment, but what can be realized by radar-based animal sensing? This paper reviews the latest research trends in the noncontact sensing of animals using radar systems. We also present examples of our past radar experiments for the respiratory measurement of monkeys and the heartbeat measurement of chimpanzees. The trends in this field are reviewed in terms of the target animal species, type of vital sign, and radar type and selection of frequencies.
Huafei WANG Xianpeng WANG Xiang LAN Ting SU
Using deep learning (DL) to achieve direction-of-arrival (DOA) estimation is an open and meaningful exploration. Existing DL-based methods achieve DOA estimation by spectrum regression or multi-label classification task. While, both of them face the problem of off-grid errors. In this paper, we proposed a cascaded deep neural network (DNN) framework named as off-grid network (OGNet) to provide accurate DOA estimation in the case of off-grid. The OGNet is composed of an autoencoder consisted by fully connected (FC) layers and a deep convolutional neural network (CNN) with 2-dimensional convolutional layers. In the proposed OGNet, the off-grid error is modeled into labels to achieve off-grid DOA estimation based on its sparsity. As compared to the state-of-the-art grid-based methods, the OGNet shows advantages in terms of precision and resolution. The effectiveness and superiority of the OGNet are demonstrated by extensive simulation experiments in different experimental conditions.
To fully exploit the attribute information in graphs and dynamically fuse the features from different modalities, this letter proposes the Attributed Graph Clustering Network with Adaptive Feature Fusion (AGC-AFF) for graph clustering, where an Attribute Reconstruction Graph Autoencoder (ARGAE) with masking operation learns to reconstruct the node attributes and adjacency matrix simultaneously, and an Adaptive Feature Fusion (AFF) mechanism dynamically fuses the features from different modules based on node attention. Extensive experiments on various benchmark datasets demonstrate the effectiveness of the proposed method.
Chunbo LIU Liyin WANG Zhikai ZHANG Chunmiao XIANG Zhaojun GU Zhi WANG Shuang WANG
Aiming at the problem that large-scale traffic data lack labels and take too long for feature extraction in network intrusion detection, an unsupervised intrusion detection method ACOPOD based on Adam asymmetric autoencoder and COPOD (Copula-Based Outlier Detection) algorithm is proposed. This method uses the Adam asymmetric autoencoder with a reduced structure to extract features from the network data and reduce the data dimension. Then, based on the Copula function, the joint probability distribution of all features is represented by the edge probability of each feature, and then the outliers are detected. Experiments on the published NSL-KDD dataset with six other traditional unsupervised anomaly detection methods show that ACOPOD achieves higher precision and has obvious advantages in running speed. Experiments on the real civil aviation air traffic management network dataset further prove that the method can effectively detect intrusion behavior in the real network environment, and the results are interpretable and helpful for attack source tracing.
To improve the recognition rate of the end-to-end modulation recognition method based on deep learning, a modulation recognition method of communication signals based on a cascade network is proposed, which is composed of two networks: Stacked Denoising Auto Encoder (SDAE) network and DCELDNN (Dilated Convolution, ECA Mechanism, Long Short-Term Memory, Deep Neural Networks) network. SDAE network is used to denoise the data, reconstruct the input data through encoding and decoding, and extract deep information from the data. DCELDNN network is constructed based on the CLDNN (Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks) network. In the DCELDNN network, dilated convolution is used instead of normal convolution to enlarge the receptive field and extract signal features, the Efficient Channel Attention (ECA) mechanism is introduced to enhance the expression ability of the features, the feature vector information is integrated by a Global Average Pooling (GAP) layer, and signal features are extracted by the DCELDNN network efficiently. Finally, end-to-end classification recognition of communication signals is realized. The test results on the RadioML2018.01a dataset show that the average recognition accuracy of the proposed method reaches 63.1% at SNR of -10 to 15 dB, compared with CNN, LSTM, and CLDNN models, the recognition accuracy is improved by 25.8%, 12.3%, and 4.8% respectively at 10 dB SNR.
Shinsuke IBI Takumi TAKAHASHI Hisato IWAI
This paper proposes a novel differential active self-interference canceller (DASIC) algorithm for asynchronous in-band full-duplex (IBFD) Gaussian filtered frequency shift keying (GFSK), which is designed for wireless Internet of Things (IoT). In IBFD communications, where two terminals simultaneously transmit and receive signals in the same frequency band, there is an extremely strong self-interference (SI). The SI can be mitigated by an active SI canceller (ASIC), which subtracts an interference replica based on channel state information (CSI) from the received signal. The challenging problem is the realization of asynchronous IBFD for wireless IoT in indoor environments. In the asynchronous mode, pilot contamination is induced by the non-orthogonality between asynchronous pilot sequences. In addition, the transceiver suffers from analog front-end (AFE) impairments, such as phase noise. Due to these impairments, the SI cannot be canceled entirely at the receiver, resulting in residual interference. To address the above issue, the DASIC incorporates the principle of the differential codec, which enables to suppress SI without the CSI estimation of SI owing to the differential structure. Also, on the premise of using an error correction technique, iterative detection and decoding (IDD) is applied to improve the detection capability while exchanging the extrinsic log-likelihood ratio (LLR) between the maximum a-posteriori probability (MAP) detector and the channel decoder. Finally, the validity of using the DASIC algorithm is evaluated by computer simulations in terms of the packet error rate (PER). The results clearly demonstrate the possibility of realizing asynchronous IBFD.
Daichi ISHIKAWA Naoki HAYASHI Shigemasa TAKAI
In this paper, we consider a distributed stochastic nonconvex optimization problem for multiagent systems. We propose a distributed stochastic gradient-tracking method with event-triggered communication. A group of agents cooperatively finds a critical point of the sum of local cost functions, which are smooth but not necessarily convex. We show that the proposed algorithm achieves a sublinear convergence rate by appropriately tuning the step size and the trigger threshold. Moreover, we show that agents can effectively solve a nonconvex optimization problem by the proposed event-triggered algorithm with less communication than by the existing time-triggered gradient-tracking algorithm. We confirm the validity of the proposed method by numerical experiments.
Sohei SHIMOMAI Kei UEDA Shinji KIMURA
Recently, Quantum Annealing (QA) has attracted attention as an efficient algorithm for combinatorial optimization problems. In QA, the input data size becomes large and its reduction is important for accelerating by the hardware emulation since the usable memory size and its bandwidth are limited. The paper proposes the compression method of input sparse matrices for QA emulator. The proposed method uses the sparseness of the coefficient matrix and the reappearance of the same values. An independent table is introduced and data are compressed by the search and registration method of two consecutive data in the value table. The proposed method is applied to Traveling Salesman Problem (TSP) with 32, 64 and 96 cities and Nurse Scheduling Problem (NSP). The proposed method could reduce the amount of data by 1/40 for 96 city TSP and could manage 96 city TSP on the hardware emulator. When applied to NSP, we confirmed the effectiveness of the proposed method by the compression ratio ranging from 1/4 to 1/11.8. The data reduction is also useful for the simulation/emulation performance when using the compressed data directly and 1.9 times faster speed can be found on 96 city TSP than the CSR-based method.
Junya YOSHIDA Naoki HAYASHI Shigemasa TAKAI
This paper presents a quantized gradient descent algorithm for distributed nonconvex optimization in multiagent systems that takes into account the bandwidth limitation of communication channels. Each agent encodes its estimation variable using a zoom-in parameter and sends the quantized intermediate variable to the neighboring agents. Then, each agent updates the estimation by decoding the received information. In this paper, we show that all agents achieve consensus and their estimated variables converge to a critical point in the optimization problem. A numerical example of a nonconvex logistic regression shows that there is a trade-off between the convergence rate of the estimation and the communication bandwidth.
Yuxiang ZHANG Dehua LIU Chuanpeng SU Juncheng LIU
Uncovered muck truck detection aims to detect the muck truck and distinguish whether it is covered or not by dust-proof net to trace the source of pollution. Unlike traditional detection problem, recalling all uncovered trucks is more important than accurate locating for pollution traceability. When two objects are very close in an image, the occluded object may not be recalled because the non-maximum suppression (NMS) algorithm can remove the overlapped proposal. To address this issue, we propose a Location First NMS method to match the ground truth boxes and predicted boxes by position rather than class identifier (ID) in the training stage. Firstly, a box matching method is introduced to re-assign the predicted box ID using the closest ground truth one, which can avoid object missing when the IoU of two proposals is greater than the threshold. Secondly, we design a loss function to adapt the proposed algorithm. Thirdly, a uncovered muck truck detection system is designed using the method in a real scene. Experiment results show the effectiveness of the proposed method.
Jingzhao DAI Ming LI Xuejiao HU Yang LI Sidan DU
Gaze following is the task of estimating where an observer is looking inside a scene. Both the observer and scene information must be learned to determine the gaze directions and gaze points. Recently, many existing works have only focused on scenes or observers. In contrast, revealed frameworks for gaze following are limited. In this paper, a gaze following method using a hybrid transformer is proposed. Based on the conventional method (GazeFollow), we conduct three developments. First, a hybrid transformer is applied for learning head images and gaze positions. Second, the pinball loss function is utilized to control the gaze point error. Finally, a novel ReLU layer with the reborn mechanism (reborn ReLU) is conducted to replace traditional ReLU layers in different network stages. To test the performance of our developments, we train our developed framework with the DL Gaze dataset and evaluate the model on our collected set. Through our experimental results, it can be proven that our framework can achieve outperformance over our referred methods.
Yasuhiro MOCHIDA Daisuke SHIRAI Koichi TAKASUGI
The demand for low-latency transmission of large-capacity video, such as 4K and 8K, is increasing for various applications such as live-broadcast program production, sports viewing, and medical care. In the broadcast industry, low-latency video transmission is required in remote production, an emerging workflow for outside broadcasting. For ideal remote production, long-distance transmission of uncompressed 8K60p video signals, ultra-low latency less than 16.7 ms, and PTP synchronization through network are required; however, no existing video-transmission system fully satisfy these requirements. We focused on optical transport technologies capable of long-distance and large-capacity communication, which were previously used only in telecommunication-carrier networks. To fully utilize optical transport technologies, we propose the first-ever video-transmission system architecture capable of sending and receiving uncompressed 8K video directly through large-capacity optical paths. A transmission timing control in seamless protection switching is also proposed to improve the tolerance to network impairment. As a means of implementation, we focused on whitebox transponder, an emerging type of optical transponder with a disaggregation configuration. The disaggregation configuration enables flexible configuration changes, additional implementations, and cost reduction by separating various functions of optical transponders and controlling them with a standardized interface. We implemented the ultra-low-latency video-transmission system utilizing whitebox transponder Galileo. We developed a hardware plug-in unit for video transmission (VideoPIU), and software to control the VideoPIU. In the video-transmission experiments with 120-km optical fiber, we confirmed that it was capable of transmitting uncompressed 8K60p video stably in 1.3 ms latency and highly accurate PTP synchronization through the optical network, which was required in the ideal remote production. In addition, the application to immersive sports viewing is also presented. Consequently, excellent potential to support the unprecedented applications is demonstrated.
Lingjun KONG Haiyang LIU Lianrong MA
This letter is concerned with incorrigible sets of binary linear codes. For a given binary linear code C, we represent the numbers of incorrigible sets of size up to ⌈3/2d - 1⌉ using the weight enumerator of C, where d is the minimum distance of C. In addition, we determine the incorrigible set enumerators of binary Golay codes G23 and G24 through combinatorial methods.
To achieve object recognition, it is necessary to find the unique features of the objects to be recognized. Results in prior research suggest that methods that use multiple modalities information are effective to find the unique features. In this paper, the overview of the system that can extract the features of the objects to be recognized by integrating visual, tactile, and auditory information as multimodal sensor information with VRAE is shown. Furthermore, a discussion about changing the combination of modalities information is also shown.
Yaping SUN Gaoqi DOU Hao WANG Yufei ZHANG
With the advent of the Internet of Things (IoT), short packet transmissions will dominate the future wireless communication. However, traditional coherent demodulation and channel estimation schemes require large pilot overhead, which may be highly inefficient for short packets in multipath fading scenarios. This paper proposes a novel pilot-free short packet structure based on the association of modulation on conjugate-reciprocal zeros (MOCZ) and tail-biting convolutional codes (TBCC), where a noncoherent demodulation and decoding scheme is designed without the channel state information (CSI) at the transceivers. We provide a construction method of constellation sets and demodulation rule for M-ary MOCZ. By deriving low complexity log-likelihood ratios (LLR) for M-ary MOCZ, we offer a reasonable balance between energy and bandwidth efficiency for joint coding and modulation scheme. Simulation results show that our proposed scheme can attain significant performance and throughput gains compared to the pilot-based coherent modulation scheme over multipath fading channels.
Daisuke KOBAYASHI Ken NAKAMURA Masaki KITAHARA Tatsuya OSAWA Yuya OMORI Takayuki ONISHI Hiroe IWASAKI
This paper describes a novel low-latency 4K 60 fps HEVC (high efficiency video coding)/H.265 multi-channel encoding system with content-aware bitrate control for live streaming. Adaptive bitrate (ABR) streaming techniques, such as MPEG-DASH (dynamic adaptive streaming over HTTP) and HLS (HTTP live streaming), spread widely on Internet video streaming. Live content has increased with the expansion of streaming services, which has led to demands for traffic reduction and low latency. To reduce network traffic, we propose content-aware dynamic and seamless bitrate control that supports multi-channel real-time encoding for ABR, including 4K 60 fps video. Our method further supports chunked packaging transfer to provide low-latency streaming. We adopt a hybrid architecture consisting of hardware and software processing. The system consists of multiple 4K HEVC encoder LSIs that each LSI can encode 4K 60 fps or up to high-definition (HD) ×4 videos efficiently with the proposed bitrate control method. The software takes the packaging process according to the various streaming protocol. Experimental results indicate that our method reduces encoding bitrates obtained with constant bitrate encoding by as much as 56.7%, and the streaming latency over MPEG-DASH is 1.77 seconds.
Satoru JIMBO Daiki OKONOGI Kota ANDO Thiem Van CHU Jaehoon YU Masato MOTOMURA Kazushi KAWAMURA
For formulating Quadratic Knapsack Problems (QKPs) into the form of Quadratic Unconstrained Binary Optimization (QUBO), it is necessary to introduce an integer variable, which converts and incorporates the knapsack capacity constraint into the overall energy function. In QUBO, this integer variable is encoded with auxiliary binary variables, and the encoding method used for it affects the behavior of Simulated Annealing (SA) significantly. For improving the efficiency of SA for QKP instances, this paper first visualized and analyzed their annealing processes encoded by conventional binary and unary encoding methods. Based on this analysis, we proposed a novel hybrid encoding (HE), getting the best of both worlds. The proposed HE obtained feasible solutions in the evaluation, outperforming the others in small- and medium-scale models.
Order-preserving encryption using the hypergeomatric probability distribution leaks about the half bits of a plaintext and the distance between two arbitrary plaintexts. To solve these problems, Popa et al. proposed a mutable order-preserving encoding. This is a keyless encoding scheme that adopts an order-preserving index locating the corresponding ciphertext via tree-based data structures. Unfortunately, it has the following shortcomings. First, the frequency of the ciphertexts reveals that of the plaintexts. Second, the indices are highly correlated to the corresponding plaintexts. For these reasons, statistical cryptanalysis may identify the encrypted fields using public information. To overcome these limitations, we propose a multi-tree approach to the mutable order-preserving encoding. The cost of interactions increases by the increased number of trees, but the proposed scheme mitigates the distribution leakage of plaintexts and also reduces the problematic correlation to plaintexts.
Manaya TOMIOKA Tsuneo KATO Akihiro TAMURA
A neural conversational model (NCM) based on an encoder-decoder recurrent neural network (RNN) with an attention mechanism learns different sequence-to-sequence mappings from what neural machine translation (NMT) learns even when based on the same technique. In the NCM, we confirmed that target-word-to-source-word mappings captured by the attention mechanism are not as clear and stationary as those for NMT. Considering that vector norms indicate a magnitude of information in the processing, we analyzed the inner workings of an encoder-decoder GRU-based NCM focusing on the norms of word embedding vectors and hidden vectors. First, we conducted correlation analyses on the norms of word embedding vectors with frequencies in the training set and with conditional entropies of a bi-gram language model to understand what is correlated with the norms in the encoder and decoder. Second, we conducted correlation analyses on norms of change in the hidden vector of the recurrent layer with their input vectors for the encoder and decoder, respectively. These analyses were done to understand how the magnitude of information propagates through the network. The analytical results suggested that the norms of the word embedding vectors are associated with their semantic information in the encoder, while those are associated with the predictability as a language model in the decoder. The analytical results further revealed how the norms propagate through the recurrent layer in the encoder and decoder.
Yang WANG Hongliang FU Huawei TAO Jing YANG Hongyi GE Yue XIE
This letter focuses on the cross-corpus speech emotion recognition (SER) task, in which the training and testing speech signals in cross-corpus SER belong to different speech corpora. Existing algorithms are incapable of effectively extracting common sentiment information between different corpora to facilitate knowledge transfer. To address this challenging problem, a novel convolutional auto-encoder and adversarial domain adaptation (CAEADA) framework for cross-corpus SER is proposed. The framework first constructs a one-dimensional convolutional auto-encoder (1D-CAE) for feature processing, which can explore the correlation among adjacent one-dimensional statistic features and the feature representation can be enhanced by the architecture based on encoder-decoder-style. Subsequently the adversarial domain adaptation (ADA) module alleviates the feature distributions discrepancy between the source and target domains by confusing domain discriminator, and specifically employs maximum mean discrepancy (MMD) to better accomplish feature transformation. To evaluate the proposed CAEADA, extensive experiments were conducted on EmoDB, eNTERFACE, and CASIA speech corpora, and the results show that the proposed method outperformed other approaches.