Shoichi HIROSE Hidenori KUWAKADO Hirotaka YOSHIDA
Hirose, Kuwakado and Yoshida proposed a nonce-based authenticated encryption scheme Lae0 based on Lesamnta-LW in 2019. Lesamnta-LW is a block-cipher-based iterated hash function included in the ISO/IEC 29192-5 lightweight hash-function standard. They also showed that Lae0 satisfies both privacy and authenticity if the underlying block cipher is a pseudorandom permutation. Unfortunately, their result implies only about 64-bit security for instantiation with the dedicated block cipher of Lesamnta-LW. In this paper, we analyze the security of Lae0 in the ideal cipher model. Our result implies about 120-bit security for instantiation with the block cipher of Lesamnta-LW.
Mariana RODRIGUES MAKIUCHI Tifani WARNITA Nakamasa INOUE Koichi SHINODA Michitaka YOSHIMURA Momoko KITAZAWA Kei FUNAKI Yoko EGUCHI Taishiro KISHIMOTO
We propose a non-invasive and cost-effective method to automatically detect dementia using only speech audio data. We extract paralinguistic features from a short speech segment and use Gated Convolutional Neural Networks (GCNN) to classify it as dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. Our method achieves an accuracy of 73.1% on the Pitt Corpus using an average of 114 seconds of speech data. On the PROMPT Database, our method achieves an accuracy of 74.7% using 4 seconds of speech data, which improves to 80.8% when we use all of a patient's speech data. Furthermore, we evaluate our method on a three-class classification problem that includes the Mild Cognitive Impairment (MCI) class and achieve an accuracy of 60.6% with 40 seconds of speech data.
Yan ZHAO Yue XIE Ruiyu LIANG Li ZHANG Li ZHAO Chengyu LIU
As a mental disorder, depression endangers people's health and affects the social order. As an efficient means of diagnosing depression, automatic depression detection has attracted considerable research interest. This study presents an attention-based Long Short-Term Memory (LSTM) model for depression detection that makes full use of the differences between depressive and non-depressive speech across time frames. The proposed model uses frame-level features, which capture the temporal information of depressive speech, in place of traditional statistical features as the input to the LSTM layers. To obtain richer multi-dimensional deep feature representations, the LSTM output is then passed to attention layers along both the time and feature dimensions. We then concatenate the outputs of the attention layers and feed the fused feature representation into a fully connected layer. Finally, the output of the fully connected layer is passed to a softmax layer. Experiments conducted on the DAIC-WOZ database demonstrate that the proposed attentive LSTM model achieves an average accuracy of 90.2% and outperforms the traditional LSTM network and the LSTM with local attention by 0.7% and 2.3%, respectively, which indicates its feasibility.
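The processing chain described in this abstract (attention along the time axis, attention along the feature axis, concatenation, a fully connected layer, then softmax) can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the abstract does not give the attention equations, so the scoring vectors `w_time` and `w_feat`, the way each attention output is pooled, and the function name `attentive_head` are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attentive_head(lstm_out, w_time, w_feat, fc_weight, fc_bias):
    """Hypothetical head applied to an LSTM output of shape (T, D).

    w_time: (D,) scoring vector giving one attention score per time frame.
    w_feat: (T,) scoring vector giving one attention score per feature.
    fc_weight: (C, 2 * D), fc_bias: (C,) for C output classes.
    """
    # Time attention: weight the T frames and sum them into one (D,) vector.
    alpha = softmax(lstm_out @ w_time)
    time_ctx = alpha @ lstm_out

    # Feature attention: weight the D features of the time-averaged output.
    beta = softmax(w_feat @ lstm_out)
    feat_ctx = beta * lstm_out.mean(axis=0)

    # Concatenate both representations, then fully connected layer + softmax.
    fused = np.concatenate([time_ctx, feat_ctx])
    return softmax(fc_weight @ fused + fc_bias)
```

With an LSTM output of 5 frames and 3 features and two classes, `attentive_head` returns a length-2 probability vector. In a real model the scoring vectors and the fully connected weights would be learned jointly with the LSTM.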
This paper proposes a switched pinning control method with a multi-rating mechanism for vehicle platoons. The platoons are expressed as multi-agent systems consisting of mass-damper systems in which pinning agents receive target velocities from external devices (e.g., intelligent traffic signals). We construct a model predictive control (MPC) algorithm that switches pinning agents by solving mixed-integer quadratic programming (MIQP) problems. The optimization rate is determined according to the rate of convergence to the target velocities and the inter-vehicular distances. This multi-rating mechanism can reduce the computational load caused by iterative calculation. Numerical results demonstrate that our method reduces string instability by selecting the pinning agents so as to minimize the errors between the inter-vehicular distances and the target distances.
This study presents a method for constructing self-orthogonal and self-dual quasi-cyclic codes that relies on the factorization of the modulus polynomials for cyclicity. Smaller generator polynomial matrices are used in place of the generator matrices of the codes as linear codes. An algorithm based on the Chinese remainder theorem finds the generator polynomial matrix for the original modulus from those constructed for each factor. This method enables us to efficiently construct and search for these codes when the modulus polynomials are factored into reciprocal polynomials.
To cope with complicated interference scenarios in realistic acoustic environments, supervised deep neural networks (DNNs) have been investigated for estimating different user-defined targets. Such techniques can be broadly categorized into magnitude estimation and time-frequency mask estimation techniques. Further, a mask such as the Wiener gain can be estimated directly or derived from the estimated interference power spectral density (PSD) or the estimated signal-to-interference ratio (SIR). In this paper, we propose to incorporate multi-task learning into DNN-based single-channel speech enhancement by using the speech presence probability (SPP) as a secondary target to assist the target estimation in the main task. Domain-specific information is shared between the two tasks to learn a more generalizable representation. Since the performance of a multi-task network is sensitive to the weight parameters of the loss function, homoscedastic uncertainty is introduced to learn the weights adaptively, which is shown to outperform the fixed weighting method. Simulation results show that the proposed multi-task scheme improves overall speech enhancement performance compared with conventional single-task methods, and that joint direct mask and SPP estimation yields the best performance among all the techniques considered.
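To illustrate how homoscedastic uncertainty can replace fixed loss weights, the sketch below uses one commonly used form of uncertainty weighting, in which each task loss L_i is scaled by 0.5·exp(-s_i) and regularized by 0.5·s_i, with s_i = log(sigma_i^2) a learnable parameter. The abstract does not state the exact loss, so this form and the function names are assumptions.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine task losses with learnable log-variances s_i = log(sigma_i^2).

    Assumed form: sum_i 0.5 * exp(-s_i) * L_i + 0.5 * s_i.
    A large s_i down-weights task i, while the 0.5 * s_i term keeps
    the optimizer from ignoring the task altogether.
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("need exactly one log-variance per task loss")
    total = 0.0
    for loss, s in zip(task_losses, log_variances):
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total


def optimal_log_variance(loss):
    """Log-variance minimizing one task's term for a fixed positive loss.

    Setting the derivative of 0.5 * exp(-s) * L + 0.5 * s to zero
    gives exp(s) = L, that is, s = log(L).
    """
    return math.log(loss)
```

In training, the main-task loss (for example, mask estimation) and the secondary SPP loss would be passed as `task_losses`, and the log-variances would be updated by the same optimizer as the network weights rather than set in closed form.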
Hideya SO Kazuhiko FUKAWA Hayato SOYA Yuyuan CHANG
In unlicensed spectrum, wireless communications employing carrier sense multiple access with collision avoidance (CSMA/CA) suffer from longer transmission delays as the number of user terminals (UTs) increases, because packet collisions become more likely. To cope with this problem, this paper proposes a new multiuser detection (MUD) scheme that uses both request-to-send (RTS) and enhanced clear-to-send (eCTS) for highly reliable and low-latency wireless communications. As in a conventional MUD scheme, the metric-combining MUD (MC-MUD) calculates log likelihood functions called metrics and accumulates the metrics for maximum likelihood detection (MLD). To avoid increasing the number of states for MLD, MC-MUD forces the relevant UTs to retransmit their packets until all the collided packets are correctly detected, which requires a kind of central control and reduces the system throughput. To overcome these drawbacks, the proposed scheme, referred to as cancelling MC-MUD (CMC-MUD), removes replicas of some of the collided packets from the received signals once those packets are correctly detected during retransmission. This cancellation allows new UTs to transmit their packets and MLD to be performed without increasing the number of states, which improves the system throughput without increasing the complexity. In addition, the proposed scheme adopts RTS and eCTS. A UT that suffers a packet collision transmits an RTS before retransmission. The corresponding access point (AP) then transmits an eCTS that includes the addresses of the other UTs that experienced the same packet collision. To reproduce the same packet collision, these other UTs transmit their packets once they receive the eCTS.
Computer simulations under single-AP conditions evaluate the range of the average carrier-to-interference ratio (CIR) in which the proposed scheme is effective, and clarify that the transmission delay of the proposed scheme is shorter than that of the conventional schemes. In two-AP environments, which can cause the hidden terminal problem, it is demonstrated that the proposed scheme achieves shorter transmission delays than the conventional scheme with RTS and conventional CTS.
Cheng-Chung KUO Ding-Kai TSENG Chun-Wei TSAI Chu-Sing YANG
The development of an efficient mechanism for detecting malicious network traffic has been a critical research topic in the field of network security in recent years. This study implemented an intrusion-detection system (IDS) based on a machine learning algorithm to periodically convert and analyze real network traffic in a campus environment in near real time. The study focuses on how to improve the detection rate of an IDS and how to detect more attacks on non-well-known ports than a traditional rule-based system can. Four new features are used to increase the discriminant accuracy. In addition, an algorithm for balancing the data set was used to construct the training data set, which also enables the learning model to more accurately reflect situations in real environments.
Ying KANG Cong LIU Ning WANG Dianxi SHI Ning ZHOU Mengmeng LI Yunlong WU
Siamese visual tracking, viewed as a problem of maximum-similarity matching to the target template, has attracted increasing attention in computer vision. However, current Siamese trackers struggle to balance accuracy in real-time tracking against robustness in long-term tracking. This work proposes a new Siamese-based tracker with a dual-pipeline correlated fusion network (named ADF-SiamRPN), which uses an initial template for robust correlation and a transient template, with adaptive optimal feature selection, for accurate correlation. A learnable correlation-response fusion network then combines the two responses to improve overall tracking performance. To compare the performance of ADF-SiamRPN with state-of-the-art trackers, we conduct extensive experiments on benchmarks including OTB100, UAV123, VOT2016, VOT2018, GOT-10k, LaSOT and TrackingNet. The experimental results demonstrate that ADF-SiamRPN outperforms all the compared trackers and achieves the best balance between accuracy and robustness.
Kazuki KASAI Kaoru KAWAKITA Akira KUBOTA Hiroki TSURUSAKI Ryosuke WATANABE Masaru SUGANO
In this paper, we present an efficient and robust method for estimating the homography matrix for soccer field registration between a captured camera image and a soccer field model. The presented method first detects reliable field lines in the camera image through clustering. It then constructs a novel directional feature for the intersection points of the lines in both the camera image and the model and uses it to find matching pairs of these points between the image and the model. Finally, homography matrix estimation and validation are performed using the obtained matching pairs, which reduces the required number of homography matrix calculations. The method also uses possible intersection points that lie outside the image for point matching. This effectively improves the robustness and accuracy of homography estimation, as demonstrated by the experimental results.
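Once matching point pairs between the field model and the image are available, a homography can be estimated from them. The sketch below uses the standard direct linear transform (DLT) with NumPy; it is a generic illustration and not necessarily the estimator used by the authors. The function names are illustrative, and no coordinate normalization or outlier rejection is included.

```python
import numpy as np


def estimate_homography(src_pts, dst_pts):
    """Estimate H (3x3) such that dst ~ H @ src from at least 4 point pairs."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2:
        raise ValueError("expected two arrays of shape (N, 2)")
    if src.shape[0] < 4:
        raise ValueError("at least 4 point pairs are required")

    # Each correspondence (x, y) -> (u, v) gives two linear equations in
    # the nine entries of H.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    a = np.array(rows)

    # The solution is the right singular vector with the smallest
    # singular value.
    _, _, vt = np.linalg.svd(a)
    h = vt[-1].reshape(3, 3)
    # Scale so that H[2, 2] == 1 (assumes this entry is not close to zero).
    return h / h[2, 2]


def apply_homography(h, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Intersection points that fall outside the image can be passed to `estimate_homography` like any other correspondence, since the DLT does not require the points to lie within the image bounds.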
Akio KAWABATA Bijoy Chand CHATTERJEE Eiji OKI
In distributed processing for communication services, a proper server selection scheme is required to reduce delay while preserving the order in which events occur. Although a conservative synchronization algorithm (CSA) has been used to achieve this goal, an optimistic synchronization algorithm (OSA) is also a feasible way to synchronize distributed systems. In comparison with CSA, which rearranges events into their order of occurrence before the application processes them, OSA can realize low-delay communication because it processes events in the order in which they arrive. This paper proposes an optimal server selection scheme that uses OSA for distributed processing systems to minimize end-to-end delay under the condition that the maximum status holding time is limited. In other words, the end-to-end delay is minimized based on the allowed rollback time, which is determined by application design considerations and the availability of computing resources. Numerical results indicate that the proposed scheme reduces the delay compared with the conventional scheme.
Kaiyu WANG Sichen TAO Rong-Long WANG Yuki TODO Shangce GAO
In 2019, a new selection method, named fitness-distance balance (FDB), was proposed. FDB has been shown to significantly improve the search capability of evolutionary algorithms. However, it still suffers from poor flexibility across different optimization problems. To address this issue, we propose a functional weights-enhanced FDB (FW). These functional weights change the original weights in FDB from fixed values to values randomly generated by a distribution function, thereby enabling the algorithm to select more suitable individuals during the search. As a case study, FW is incorporated into the spherical search algorithm. Experimental results on various IEEE CEC2017 benchmark functions demonstrate the effectiveness of FW.
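The idea of replacing fixed FDB weights with randomly drawn ones can be sketched as follows. The score used here (a weighted sum of min-max normalized fitness and normalized distance to the current best individual, for a minimization problem) is a simplified reading of FDB, and the clipped normal distribution in `fw_select` is purely an assumption, since the abstract does not name the distribution function. All function names are illustrative.

```python
import numpy as np


def _min_max(values):
    """Scale values to [0, 1]; return zeros when all values are equal."""
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


def fdb_select(population, fitness, weight=0.5):
    """Index of the individual with the highest fitness-distance score.

    Assumes minimization: lower fitness is better. The score rewards
    good fitness and a large distance from the current best individual.
    """
    pop = np.asarray(population, dtype=float)
    fit = np.asarray(fitness, dtype=float)
    best = int(np.argmin(fit))
    dist = np.linalg.norm(pop - pop[best], axis=1)
    norm_fit = 1.0 - _min_max(fit)   # 1 = best fitness in the population
    norm_dist = _min_max(dist)       # 1 = farthest from the best individual
    score = weight * norm_fit + (1.0 - weight) * norm_dist
    return int(np.argmax(score))


def fw_select(population, fitness, rng):
    """FDB selection with a randomly drawn weight (assumed distribution)."""
    weight = float(np.clip(rng.normal(loc=0.5, scale=0.15), 0.0, 1.0))
    return fdb_select(population, fitness, weight)
```

A call such as `fw_select(pop, fit, np.random.default_rng(0))` draws a new weight on every invocation, so the balance between fitness and distance changes from one selection to the next instead of staying fixed.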
Yue LI Xiaosheng YU Haijun CAO Ming XU
An autoencoder is trained to generate the background from the surveillance image by using the shuffled input as the training label, instead of the input itself as in a traditional autoencoder. Multi-scale features are then extracted by a sparse autoencoder from the surveillance image and the corresponding background to detect the foreground.
Tengfei SHAO Yuya IEIRI Reiko HISHIYAMA
Tourist satisfaction plays a very important role in the development of local community tourism. For the development of tourist destinations in local communities, it is important to measure, maintain, and improve tourist destination loyalty over the medium to long term. It has been shown that improving tourist satisfaction is a major factor in improving tourist destination loyalty. Therefore, to improve tourist satisfaction in local communities, we identified multiple clusters of sightseeing spots and determined that the satisfaction of tourists can be increased based on these clusters. Our discovery flow can be summarized as follows. First, we extracted tourism keywords from guidebooks on sightseeing spots. We then constructed a complex network of tourists and sightseeing spots based on the data collected from experiments conducted in Kyoto. Next, we added the corresponding tourism keywords to each sightseeing spot. Finally, by analyzing network motifs, we successfully discovered multiple clusters of sightseeing spots that could be used to improve tourist satisfaction.
Enze YANG Shuoyan LIU Yuxin LIU Kai FANG
Crowd flow prediction in high-density urban scenes is involved in a wide range of intelligent transportation and smart city applications, and it has become a significant topic in urban computing. In this letter, a CNN-based framework called Pyramidal Spatio-Temporal Network (PSTNet) is proposed for crowd flow prediction. Spatial encoding is employed for the spatial representation of external factors, and a prior pyramid enhances the feature dependence across spatial scale distances and temporal spans. A post pyramid then fuses the heterogeneous spatio-temporal features of multiple scales. Experimental results on TaxiBJ and MobileBJ demonstrate that the proposed PSTNet outperforms state-of-the-art methods.
Atomu SAKAI Keiichi MIZUTANI Takeshi MATSUMURA Hiroshi HARADA
The Dynamic Spectrum Sharing (DSS) system, which uses the frequency band allocated to incumbent systems (i.e., primary users), has attracted attention as a way to expand the available bandwidth of fifth-generation mobile communication (5G) systems in the sub-6 GHz band. In Japan, a DSS system in the 2.3 GHz band, in which the ARIB STD-B57-based Field Pickup Unit (FPU) is assigned as an incumbent system, has been studied for secondary use by 5G systems. In this case, the incumbent FPU is a mobile system, and thus the DSS system needs to use not only a spectrum sharing database but also radio sensors to detect primary signals with high accuracy, protect the primary system from interference, and achieve more secure spectrum sharing. This paper proposes highly efficient sensing methods for detecting ARIB STD-B57-based FPU signals in the 2.3 GHz band. The proposed methods can be applied to two types of FPU signal: those that use the Continuous Pilot (CP) mode and those that use the Scattered Pilot (SP) mode. Moreover, we apply a sample addition method and a symbol addition method to improve the detection performance. Even in the 3GPP EVA channel environment, the proposed method can detect an FPU signal with an SNR of -10 dB with a probability of more than 99%. In addition, we propose a quantized reference signal to reduce the implementation complexity of the complex cross-correlation circuit. The proposed approach reduces the number of quantization bits of the reference signal to 2 bits for the in-phase component and 3 bits for the quadrature component.
Motohiro SUNOUCHI Masaharu YOSHIOKA
This paper proposes new acoustic feature signatures based on the multiscale fractal dimension (MFD), which are robust against the diversity of environmental sounds, for content-based similarity search. The diversity of sound sources and acoustic compositions is a typical feature of environmental sounds. Several acoustic features have been proposed for environmental sounds. Among them are the widely used Mel-Frequency Cepstral Coefficients (MFCCs), which describe frequency-domain features. However, in addition to these features in the frequency domain, environmental sounds have other important features in the time domain at various time scales. In our previous paper, we proposed the enhanced multiscale fractal dimension signature (EMFD) for environmental sounds. This paper extends EMFD by using the kernel density estimation method, which improves performance in similarity search tasks. Furthermore, it proposes another acoustic feature signature based on MFD, namely the very-long-range multiscale fractal dimension signature (MFD-VL). The MFD-VL signature describes several features of the time-varying envelope over long periods of time. It is stable and robust against background noise and small fluctuations in the parameters of sound sources, which arise in field recordings. We discuss the effectiveness of these signatures in similarity sound search by comparing them with acoustic features proposed in the DCASE 2018 challenges. Owing to the unique descriptiveness of our proposed signatures, we confirmed that they are effective when used together with other acoustic features.
A method for detecting the timing of photodiode (PD) saturation without using an in-pixel time-to-digital converter (TDC) is proposed. Detecting the PD saturation time is an approach to extending the dynamic range of a CMOS image sensor (CIS) without multiple exposures. In addition to the charges accumulated in a PD, the PD saturation time can be used as a signal related to light intensity. However, previously reported CISs that detect the PD saturation time use an in-pixel TDC to detect and store it. This lowers the resolution of the CIS because an in-pixel TDC requires a large area. In the proposed pixel circuit, the PD saturation time is detected and stored as a voltage on a capacitor. The voltage is read out and converted to a digital code by a column ADC after the exposure. As a result, an in-pixel TDC is not required. A signal-processing and calibration method for linearly combining the two signals, saturation time and accumulated charges, is also proposed. Circuit simulations confirmed that the proposed method extends the dynamic range by 36 dB, bringing the total dynamic range to 95 dB. The effectiveness of the calibration was also confirmed through circuit simulations.
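As a rough check of these figures, the two reported values imply a baseline dynamic range, and the 36 dB extension can be converted to a signal ratio. The conversion assumes the 20 log10 convention commonly used for image-sensor dynamic range and assumes that the whole extension is gained on the bright (saturation) side; the abstract states neither, and the symbols below are introduced only for this illustration.

```latex
\begin{align*}
\mathrm{DR}_{\text{baseline}} &= 95\,\mathrm{dB} - 36\,\mathrm{dB} = 59\,\mathrm{dB}, \\
\frac{I_{\max}^{\text{extended}}}{I_{\max}^{\text{baseline}}} &= 10^{36/20} \approx 63 .
\end{align*}
```

Under these assumptions, the saturation-time signal lets the pixel resolve light roughly 63 times brighter than the level at which the accumulated-charge signal alone saturates.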
Yujin ZHENG Yan LIN Zhuo ZHANG Qinglin ZHANG Qiaoqiao XIA
Linear programming (LP) decoding based on the alternating direction method of multipliers (ADMM) has proved to be effective for low-density parity-check (LDPC) codes. However, for high-density parity-check (HDPC) codes, the ADMM-LP decoder encounters two problems, namely the high-density check matrix of HDPC codes and the large number of pseudocodewords in their fundamental polytope. The former problem makes the check polytope projection extremely complex, and the latter leads to poor frame error rate (FER) performance. To address these issues, we introduce the even vertex algorithm (EVA) into the ADMM-LP decoding algorithm for HDPC codes and name the resulting decoder HDPC-EVA. HDPC-EVA can reduce the complexity of the projection process and improve the FER performance. We further enhance the proposed decoder by using the automorphism groups of the codes, which creates diversity in the parity-check matrix. The simulation results show that the proposed decoder reduces the average decoding time per iteration by 30%-60% and achieves near maximum likelihood (ML) performance on some BCH codes.
Akira KITAYAMA Akira KURIYAMA Hideyuki NAGAISHI Hiroshi KURODA
Long-range radars (LRRs) for higher-level autonomous driving (AD) will require more antennas than those for simple driving assistance. The issue is that antennas occupy 50-60% of the LRR module area. To miniaturize LRR modules, we use horn and lens antennas with highly efficient gain. In this paper, we propose two high-density implementation techniques for a radio-frequency (RF) front-end using horn and lens antennas. In the first technique, the gap between antennas is eliminated by taking advantage of the high isolation performance of horn and lens antennas. In the second technique, the RF front-end, including micro-strip lines, monolithic microwave integrated circuits, and peripheral parts, is placed in the valley area of each horn. We fabricated a prototype LRR operating at 77 GHz with only one printed circuit board (PCB). To detect vehicles horizontally and vertically, this LRR has a minimum antenna configuration of one Tx antenna and four Rx antennas placed in a 2×2 array, and is 30 mm thick. Evaluation results revealed that vehicles could be detected up to 320 m away and that the horizontal and vertical angle errors were less than +/- 0.2 degrees, which is equivalent to the vehicle width at a distance of over 280 m. Thus, horn and lens antennas implemented using the proposed techniques are very suitable for higher-level AD LRRs.
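The relation between the angle error and a lateral distance follows from simple trigonometry. The sketch below assumes that "equivalent to the vehicle width" means that the +/- 0.2 degree error window spans roughly one vehicle width at 280 m; this reading and the function name are assumptions, not statements from the abstract.

```python
import math


def lateral_span_m(range_m, half_angle_deg):
    """Width in meters covered at range_m by an error of +/- half_angle_deg."""
    return 2.0 * range_m * math.tan(math.radians(half_angle_deg))


if __name__ == "__main__":
    # Error window at the 280 m range quoted in the abstract.
    print(f"{lateral_span_m(280.0, 0.2):.2f} m")
```

At 280 m the window comes to about 1.95 m, which is on the order of a passenger-car width and is consistent with this reading of the abstract.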