We consider the problem of finding the best subset of sensors in wireless sensor networks where linear Bayesian parameter estimation is conducted from the selected measurements corrupted by correlated noise. We aim to directly minimize the estimation error which is manipulated by using the QR and LU factorizations. We derive an analytic result which expedites the sensor selection in a greedy manner. We also provide the complexity of the proposed algorithm in comparison with previous selection methods. We evaluate the performance through numerical experiments using random measurements under correlated noise and demonstrate a competitive estimation accuracy of the proposed algorithm with a reasonable increase in complexity as compared with the previous selection methods.
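As a rough illustration of greedy selection under correlated noise, the sketch below adds one sensor at a time, each time picking the sensor that most reduces the Bayesian mean squared error. It is a naive baseline and not the QR/LU-based procedure of the abstract. The linear model y = Hθ + w, the Gaussian prior, and all names (`bayesian_mse`, `greedy_select`, `prior_cov`) are assumptions introduced here for illustration.

```python
import numpy as np

def bayesian_mse(H, R, prior_cov, subset):
    """Trace of the posterior covariance when only `subset` sensors are used."""
    idx = list(subset)
    H_s = H[idx, :]                    # rows of the selected sensors
    R_s = R[np.ix_(idx, idx)]          # correlated-noise covariance of those sensors
    info = np.linalg.inv(prior_cov) + H_s.T @ np.linalg.solve(R_s, H_s)
    return float(np.trace(np.linalg.inv(info)))

def greedy_select(H, R, prior_cov, k):
    """Pick k sensors, each step adding the one that gives the lowest MSE."""
    selected, history = [], []
    remaining = set(range(H.shape[0]))
    for _ in range(k):
        best = min(remaining,
                   key=lambda i: bayesian_mse(H, R, prior_cov, selected + [i]))
        selected.append(best)
        remaining.remove(best)
        history.append(bayesian_mse(H, R, prior_cov, selected))
    return selected, history

# Hypothetical random instance: 12 sensors, 3 parameters, correlated noise.
rng = np.random.default_rng(0)
n_sensors, n_params = 12, 3
H = rng.standard_normal((n_sensors, n_params))
A = rng.standard_normal((n_sensors, n_sensors))
R = A @ A.T + n_sensors * np.eye(n_sensors)   # symmetric positive definite
prior_cov = np.eye(n_params)

selected, history = greedy_select(H, R, prior_cov, k=5)
```

Each candidate evaluation here recomputes a matrix inverse from scratch, which is the cost that a factorization-based update would presumably avoid.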
Shijie WANG Xuejiao HU Sheng LIU Ming LI Yang LI Sidan DU
Detecting key frames in videos has garnered substantial attention in recent years. It is a point-level task with considerable research value and broad application prospects in daily life. For instance, video surveillance systems, video cover generation, and highlight moment flashback all demand key frame detection techniques. However, the task is beset by challenges such as the sparsity of key frame instances, the imbalance between target frames and background frames, and the absence of a suitable post-processing method. In response to these problems, we introduce a novel and effective Temporal Interval Guided (TIG) framework to precisely localize specific frames. The framework incorporates a proposed Point-Level-Soft non-maximum suppression (PLS-NMS) post-processing algorithm, which is suited to point-level tasks and relies on a well-designed confidence score decay function. Furthermore, we propose a TIG-loss, which is sensitive to the temporal interval from the target frame, to optimize the two-stage framework. The proposed method can be broadly applied to key frame detection in video understanding, including action start detection and static video summarization. Extensive experiments validate the efficacy of our approach on the action start detection benchmark datasets THUMOS’14 and ActivityNet v1.3, where it reaches state-of-the-art performance. Competitive results are also demonstrated on the SumMe and TVSum datasets for deep learning based static video summarization.
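The idea of soft non-maximum suppression on points rather than intervals can be sketched as follows. The abstract does not give the decay function, so the Gaussian decay in temporal distance, the parameters `sigma` and `score_thresh`, and the function name `point_soft_nms` are all assumptions made for this example and are not the authors' PLS-NMS.

```python
import numpy as np

def point_soft_nms(frames, scores, sigma=2.0, score_thresh=0.05):
    """Soft-NMS over frame indices: decay neighbours instead of deleting them."""
    frames = np.asarray(frames, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    candidates = list(range(len(frames)))
    kept = []
    while candidates:
        best = max(candidates, key=lambda i: scores[i])
        if scores[best] < score_thresh:
            break
        kept.append((frames[best], scores[best]))
        candidates.remove(best)
        for i in candidates:
            gap = abs(frames[i] - frames[best])
            # Assumed decay: close detections are suppressed strongly,
            # distant detections are left almost unchanged.
            scores[i] *= 1.0 - np.exp(-(gap ** 2) / (2.0 * sigma ** 2))
    return kept

# Hypothetical detections: two near-duplicates around frame 10, one at frame 50.
kept = point_soft_nms(frames=[10, 11, 50], scores=[0.9, 0.8, 0.7])
```

In this toy input the detection at frame 11 is pushed below the one at frame 50 but is not discarded outright, which is the behaviour that distinguishes soft suppression from hard suppression.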
KuanChao CHU Satoshi YAMAZAKI Hideki NAKAYAMA
This work focuses on training dataset enhancement of informative relational triplets for Scene Graph Generation (SGG). Due to the lack of effective supervision, current SGG models perform poorly on informative relational triplets with inadequate training samples. Therefore, we propose two novel training dataset enhancement modules: Feature Space Triplet Augmentation (FSTA) and Soft Transfer. FSTA leverages a feature generator trained to generate representations of an object in relational triplets. The biased-prediction-based sampling in FSTA efficiently augments artificial triplets, focusing on the challenging ones. In addition, we introduce Soft Transfer, which assigns soft predicate labels to general relational triplets to provide more effective supervision for informative predicate classes. Experimental results show that integrating FSTA and Soft Transfer achieves high levels of both Recall and mean Recall on the Visual Genome dataset. The mean of Recall and mean Recall is the highest among all existing model-agnostic methods.
Jia-ji JIANG Hai-bin WAN Hong-min SUN Tuan-fa QIN Zheng-qiang WANG
In this paper, Voxel-RCNN (Towards High Performance Voxel-based 3D Object Detection), a three-dimensional (3D) point cloud object detection model, is used as the baseline network. To address problems in current mainstream voxel-based 3D point cloud methods, such as weaknesses in the backbone and the lack of feature expression ability under the bird’s-eye view (BEV), a high-performance voxel-based 3D object detection network (Reinforced Voxel-RCNN) is proposed. Firstly, a 3D feature extraction module based on the integration of an inverted residual convolutional network and weight normalization is designed on the 3D backbone. This module not only retains more point cloud feature information and enhances the information interaction between convolutional layers, but also improves the feature extraction ability of the backbone network. Secondly, a spatial feature-semantic fusion module based on spatial and channel attention is proposed from a BEV perspective. The mixed use of channel features and semantic features further improves the network’s ability to express point cloud features. On the public KITTI dataset, the proposed network outperforms many voxel-based methods. Compared with the baseline network, the 3D average accuracy and BEV average accuracy on the three categories of Car, Cyclist, and Pedestrian are improved. In 3D average accuracy, the improvement is 0.23% for Car, 0.78% for Cyclist, and 2.08% for Pedestrian. In BEV average accuracy, the improvement is 0.32% for Car, 0.99% for Cyclist, and 2.38% for Pedestrian. These findings demonstrate that the proposed enhancements effectively improve detection accuracy for the target categories.
Zhishuo ZHANG Chengxiang TAN Xueyan ZHAO Min YANG
Entity alignment (EA) is a crucial task for integrating cross-lingual and cross-domain knowledge graphs (KGs), which aims to discover entities referring to the same real-world object from different KGs. Most existing embedding-based methods generate entity representations for alignment by mining the relevance of triple elements, paying little attention to triple indivisibility and entity role diversity. In this paper, a novel framework named TTEA (Type-enhanced Ensemble Triple Representation via Triple-aware Attention for Cross-lingual Entity Alignment) is proposed to overcome these shortcomings from the perspective of ensemble triple representation, considering triple specificity and the diversity of entity roles. Specifically, the ensemble triple representation is derived by regarding the relation as an information carrier between semantic and type spaces, so that the noise influence during spatial transformation and information propagation can be smoothly controlled via specificity-aware triple attention. Moreover, the role diversity of triple elements is modeled via triple-aware entity enhancement in TTEA for EA-oriented entity representation. Extensive experiments on three real-world cross-lingual datasets demonstrate that our framework achieves competitive results.
Peng WANG Guifen CHEN Zhiyao SUN
Unmanned Aerial Vehicle (UAV)-assisted Mobile Edge Computing (MEC) can provide mobile users (MUs) with additional computing services and a wide range of connectivity. This paper investigates the joint optimization strategy of task offloading and resource allocation for UAV-assisted MEC systems in complex scenarios with the goal of reducing the total system cost, which consists of task execution latency and energy consumption. We adopt a game-theoretic approach and model the interaction between the MEC server and the MUs as a Stackelberg bilayer game. Then, the original problem with complex multi-constraints is transformed into a dual problem using the Lagrangian duality method. Furthermore, we prove that the modeled Stackelberg bilayer game has a unique Nash equilibrium solution. To obtain an approximately optimal solution to the proposed problem, we propose a two-stage alternating iteration (TASR) algorithm based on the subgradient method and the marginal revenue optimization method. We evaluate the performance of the proposed algorithm through detailed simulation experiments. The simulation results show that the proposed algorithm is superior and robust compared to other benchmark methods and can effectively reduce the task execution latency and total system cost in different scenarios.
Batnasan LUVAANJALBA Elaine Yi-Ling WU
Emergency Medical Services (EMS) play a crucial role in healthcare systems, managing pre-hospital or out-of-hospital emergencies from the onset of an emergency call to the patient’s arrival at a healthcare facility. The design of an efficient ambulance location model is pivotal in enhancing survival rates, controlling morbidity, and preventing disability. Key factors in the classical models typically include travel time, demand zones, and the number of stations. While urban EMS systems have received extensive examination due to their centralized populations, rural areas pose distinct challenges. These include lower population density and longer response distances, contributing to a higher fatality rate due to sparse population distribution, limited EMS stations, and extended travel times. To address these challenges, we introduce a novel mathematical model that aims to optimize coverage and equity. A distinctive feature of our model is the integration of equity within the objective function, coupled with a focus on practical response time that includes the period required for personal protective equipment procedures, ensuring the model’s applicability and realism in emergency response scenarios. We tackle the proposed problem using a tailored genetic algorithm and propose a greedy algorithm for solution construction. The implementation of our tailored Genetic Algorithm promises efficient and effective EMS solutions, potentially enhancing emergency care and health outcomes in rural communities.
In this research, we investigated the digital/analog operation utilizing ferroelectric nondoped HfO2 (FeND-HfO2) as a blocking layer (BL) in the Hf-based metal/oxide/nitride/oxide/Si (MONOS) nonvolatile memory (NVM), the so-called FeNOS NVM. The Al/HfN0.5/HfN1.1/HfO2/p-Si(100) FeNOS diodes realized a small equivalent oxide thickness (EOT) of 4.5 nm with a density of interface states (Dit) of 5.3 × 10^10 eV^-1 cm^-2, which are suitable for high-speed and low-voltage operation. The flat-band voltage (VFB) was well controlled as 80-100 mV with the input pulses of ±3 V/100 ms controlled by the partial polarization of FeND-HfO2 BL at each 2-bit state operated by the charge injection with the input pulses of +8 V/1-100 ms.
To improve the recognition rate of the end-to-end modulation recognition method based on deep learning, a modulation recognition method for communication signals based on a cascade network is proposed. The cascade is composed of two networks: a Stacked Denoising Auto Encoder (SDAE) network and a DCELDNN (Dilated Convolution, ECA Mechanism, Long Short-Term Memory, Deep Neural Networks) network. The SDAE network is used to denoise the data, reconstruct the input data through encoding and decoding, and extract deep information from the data. The DCELDNN network is constructed based on the CLDNN (Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks) network. In the DCELDNN network, dilated convolution is used instead of normal convolution to enlarge the receptive field and extract signal features, the Efficient Channel Attention (ECA) mechanism is introduced to enhance the expression ability of the features, and the feature vector information is integrated by a Global Average Pooling (GAP) layer, so that signal features are extracted efficiently. Finally, end-to-end classification and recognition of communication signals is realized. The test results on the RadioML2018.01a dataset show that the average recognition accuracy of the proposed method reaches 63.1% at SNRs from -10 to 15 dB. Compared with the CNN, LSTM, and CLDNN models, the recognition accuracy is improved by 25.8%, 12.3%, and 4.8%, respectively, at 10 dB SNR.
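To make the receptive-field claim concrete, the following small sketch compares stacks of one-dimensional convolutions with and without dilation. It assumes stride 1 and a shared kernel size. The layer configurations are hypothetical and are not taken from the DCELDNN architecture.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 1D convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d samples.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Three ordinary layers versus three layers with growing dilation (hypothetical).
plain = receptive_field(kernel_size=3, dilations=[1, 1, 1])
dilated = receptive_field(kernel_size=3, dilations=[1, 2, 4])
```

With the same number of layers and weights, the dilated stack in this example covers roughly twice as many input samples as the plain one.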
Zixv SU Wei CHEN Yuanyuan YANG
In this paper, a cluster-based three-dimensional (3D) non-stationary vehicle-to-vehicle (V2V) channel model with circular arc motions and antenna rotations is proposed. The channel model simulates the complex urban communication scenario where clusters move with arbitrary velocities and directions. A novel cluster evolution algorithm with time-array consistency is developed to capture the non-stationarity. For time evolution, the birth-and-death (BD) property of clusters, including birth, death, and rebirth, is taken into account. Additionally, a visibility region (VR) method is proposed for array evolution, which is verified to be applicable to circular motions. Based on the Taylor expansion formula, a detailed derivation of the space-time correlation function (ST-CF) with circular arc motions is shown. Statistical properties including the ST-CF, Doppler power spectrum density (PSD), quasi-stationary interval, instantaneous Doppler frequency, root mean square delay spread (RMS-DS), delay PSD, and angular PSD are derived and analyzed. According to the simulated results, the non-stationarity in the time, space, delay, and angular domains is captured. The presented results show that the motion modes, including linear as well as circular motions, the dynamic property of the scattering environment, and the velocity of the vehicle all have significant impacts on the statistical properties.
Smart cities aim to improve the quality of life of citizens and the efficiency of city operations through the utilization of 5G communication technology. Based on various technologies such as IoT, cloud computing, artificial intelligence, and big data, they provide smart services in terms of urban planning, development, and management for solving problems such as fine dust, traffic congestion and safety, energy efficiency, water shortage, and an aging population. However, because a smart city has an open network structure, an adversary can easily try to gain illegal access and perform denial-of-service and sniffing attacks that can threaten the safety and privacy of citizens. In smart cities, the global mobility network (GLOMONET) supports mobile services between heterogeneous networks of mobile devices such as autonomous vehicles and drones. Recently, Chen et al. proposed a user authentication scheme for GLOMONET in smart cities. Nevertheless, we found some weaknesses in their scheme. In this study, we propose a secure lightweight authentication for roaming services in a smart city, called SLARS, to enhance security. Through security and performance analysis, we show that SLARS is more secure and efficient than the related authentication scheme for GLOMONET. Our analysis results show that SLARS satisfies all security requirements in GLOMONET and saves 72.7% of computation time compared to Chen et al.’s scheme.
Chen ZHONG Chegnyu WU Xiangyang LI Ao ZHAN Zhengqiang WANG
A novel temporal convolution network-gated recurrent unit (NTCN-GRU) algorithm is proposed for the greatest of constant false alarm rate (GO-CFAR) frequency hopping (FH) prediction, integrating GRU and Bayesian optimization (BO). GRU efficiently captures the semantic associations among long FH sequences and mitigates the phenomenon of gradient vanishing or explosion. In addition, BO improves the extraction of data features by optimizing hyperparameters. Simulations demonstrate that the proposed algorithm effectively reduces the loss in the training process, greatly improves the FH prediction effect, and outperforms the existing FH sequence prediction model. The model runtime is also reduced by three-quarters compared with other FH sequence prediction models.
Feng LIU Helin WANG Conggai LI Yanli XU
This letter proposes a scheme for the backward transmission of the propagation-delay based three-user X channel, which is reciprocal to the forward transmission. The given scheme successfully delivers 10 expected messages in 6 time-slots by cyclic interference alignment without loss of degrees of freedom, which supports efficient bidirectional transmission between the two ends of the three-user X channel.
We propose a pre-T event-triggered controller (ETC) for the stabilization of a chain of integrators. Our pre-T event-triggered controller modifies a conventional event-triggered controller by adding a pre-defined positive constant T to the event-triggering condition. With this pre-T, the immediate advantages are that (i) the often complicated additional analysis regarding the Zeno behavior is no longer needed, (ii) the positive lower bound of interexecution times can be specified, and (iii) the number of control input updates can be further reduced. We carry out rigorous system analysis and simulations to illustrate the advantages of our proposed method over the traditional event-triggered control method.
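The abstract does not state the triggering condition, so the sketch below shows only one possible reading: the control input is updated when a relative-error threshold is exceeded and at least T seconds have passed since the last update. The double integrator, the gain `K`, the threshold `sigma`, and the forward-Euler integration are all assumptions chosen for illustration and are not the authors' design.

```python
import numpy as np

def simulate(T=0.05, sigma=0.1, dt=1e-3, t_end=20.0):
    """Event-triggered state feedback for a double integrator with dwell time T."""
    A = np.array([[0.0, 1.0], [0.0, 0.0]])   # chain of two integrators
    B = np.array([0.0, 1.0])
    K = np.array([1.0, 2.0])                 # assumed gain, closed-loop poles at -1
    x = np.array([1.0, 0.0])
    x_held = x.copy()                        # state sampled at the last event
    u = -K @ x_held
    min_steps = int(round(T / dt))
    steps_since_update = 0
    event_times = [0.0]
    for step in range(1, int(round(t_end / dt)) + 1):
        x = x + dt * (A @ x + B * u)         # forward-Euler integration step
        steps_since_update += 1
        error = np.linalg.norm(x_held - x)
        # Assumed condition: error threshold AND at least T since the last event.
        if steps_since_update >= min_steps and error > sigma * np.linalg.norm(x):
            x_held = x.copy()
            u = -K @ x_held
            steps_since_update = 0
            event_times.append(step * dt)
    return x, np.array(event_times)

x_final, event_times = simulate()
```

Under this reading the interexecution times are bounded below by T by construction, so Zeno behavior cannot occur in the simulation.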
Izumi TSUNOKUNI Gen SATO Yusuke IKEDA Yasuhiro OIKAWA
This paper reports spatial extrapolation of the sound field with a physics-informed neural network. We investigate the spatial extrapolation of room impulse responses with a physics-informed SIREN architecture. Furthermore, we propose a noise-robust extrapolation method by introducing a tolerance term to the loss function.
Feng WANG Xiangyu WEN Lisheng LI Yan WEN Shidong ZHANG Yang LIU
The rapid advancement of cloud-edge-end collaboration offers a feasible solution to realize low-delay and low-energy-consumption data processing for internet of things (IoT)-based smart distribution grids. The major concern of cloud-edge-end collaboration lies in resource management. However, the joint optimization of heterogeneous resources involves multiple timescales, and the optimization decisions of different timescales are intertwined. In addition, burst electromagnetic interference affects the channel environment of the distribution grid, leading to inaccurate optimization decisions, which can result in slow convergence and strong fluctuations. Hence, we propose a cloud-edge-end collaborative multi-timescale multi-service resource management algorithm. Large-timescale device scheduling is optimized by sliding window pricing matching, which enables accurate matching estimation and effective conflict elimination. Small-timescale compression level selection and power control are jointly optimized by a disturbance-robust upper confidence bound (UCB) method, which perceives the presence of electromagnetic interference and adjusts the exploration tendency to improve convergence. Simulation results illustrate the excellent performance of the proposed algorithm.
Radar emitter identification (REI) is a crucial function of electronic radar warfare support systems. The challenge emphasizes identifying and locating unique transmitters, avoiding potential threats, and preparing countermeasures. Due to the remarkable effectiveness of deep learning (DL) in uncovering latent features within data and performing classifications, deep neural networks (DNNs) have seen widespread application in radar emitter identification (REI). In many real-world scenarios, obtaining a large number of annotated radar transmitter samples for training identification models is essential yet challenging. Given the issues of insufficient labeled datasets and abundant unlabeled training datasets, we propose a novel REI method based on a semi-supervised learning (SSL) framework with virtual adversarial training (VAT). Specifically, two objective functions are designed to extract the semantic features of radar signals: computing cross-entropy loss for labeled samples and virtual adversarial training loss for all samples. Additionally, a pseudo-labeling approach is employed for unlabeled samples. The proposed VAT-based SS-REI method is evaluated on a radar dataset. Simulation results indicate that the proposed VAT-based SS-REI method outperforms the latest SS-REI method in recognition performance.
Latin squares are a classical and well-studied topic of discrete mathematics, and recently Takeuti and Adachi (IACR ePrint, 2023) proposed (2, n)-threshold secret sharing based on mutually orthogonal Latin squares (MOLS). Hence, efficient constructions of sets of MOLS that are as large as possible are also important from a practical viewpoint. In this letter, we determine the maximum number of MOLS among a known class of Latin squares defined by weighted sums. We also mention some known properties of Latin squares interpreted via the relation to secret sharing, and a connection of Takeuti-Adachi’s scheme to Shamir’s secret sharing scheme.
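For readers unfamiliar with MOLS, the following sketch builds squares from a weighted sum and checks orthogonality by brute force. The abstract does not define the class it studies, so the form L(i, j) = (a·i + b·j) mod n and the function names are assumptions made for this example. The prime-order family with b = 1 is the standard textbook construction.

```python
from itertools import combinations

def weighted_sum_square(a, b, n):
    """Square with entry (a*i + b*j) mod n at row i, column j (assumed form)."""
    return [[(a * i + b * j) % n for j in range(n)] for i in range(n)]

def is_latin(square):
    """Every symbol 0..n-1 appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok

def are_orthogonal(s1, s2):
    """Superimposing the two squares yields every ordered pair exactly once."""
    n = len(s1)
    pairs = {(s1[i][j], s2[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

# Prime order: a = 1..p-1 with b = 1 gives p - 1 mutually orthogonal squares.
p = 5
family = [weighted_sum_square(a, 1, p) for a in range(1, p)]
all_latin = all(is_latin(s) for s in family)
all_mols = all(are_orthogonal(s, t) for s, t in combinations(family, 2))

# Composite order: both squares are Latin, yet the pair is not orthogonal.
s6, t6 = weighted_sum_square(1, 1, 6), weighted_sum_square(5, 1, 6)
```

The order-6 pair fails because the difference of the two row weights, 4, shares a factor with 6, which hints at why the maximum number of MOLS within such a class depends on the arithmetic of n.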
Tetsuya ARAKI Shin-ichi NAKANO
The dispersion problem is a variant of the facility location problem that has been extensively studied. Given a polygon with n edges on a plane, we want to find k points in the polygon so that the minimum pairwise Euclidean distance of the k points is maximized. We call the problem the k-dispersion problem in a polygon. Intuitively, for an island, we want to locate k drone bases far away from each other in flying distance to avoid congestion in the sky. In this paper, we give a polynomial-time approximation scheme (PTAS) for this problem when k is a constant and ε < 1 (where ε is a positive real number). Our proposed algorithm runs in O(((1/ε)^2 + n/ε)^k) time with a 1/(1 + ε) approximation ratio and is the first PTAS developed for this problem. Additionally, we consider three variations of the dispersion problem and design a PTAS for each of them.
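The objective can be illustrated with a naive search over grid points, which is exponential in k and therefore only reasonable when k is a small constant. This is not the PTAS of the abstract. The convex-polygon restriction, the grid spacing `step`, and the function names are assumptions introduced for this sketch.

```python
from itertools import combinations
from math import dist

def inside_convex(point, polygon):
    """Inclusive test for a convex polygon given with counter-clockwise vertices."""
    px, py = point
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        # A negative cross product means the point lies to the right of an edge.
        if (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1) < -1e-12:
            return False
    return True

def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two of the given points."""
    return min(dist(p, q) for p, q in combinations(points, 2))

def grid_dispersion(polygon, k, step):
    """Try every k-subset of grid points inside the polygon (requires k >= 2)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    nx = int(round((max(xs) - min(xs)) / step))
    ny = int(round((max(ys) - min(ys)) / step))
    candidates = [(min(xs) + i * step, min(ys) + j * step)
                  for i in range(nx + 1) for j in range(ny + 1)]
    candidates = [c for c in candidates if inside_convex(c, polygon)]
    best = max(combinations(candidates, k), key=min_pairwise_distance)
    return list(best), min_pairwise_distance(best)

# Hypothetical instance: unit square, four points, grid spacing 0.5.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
points, value = grid_dispersion(square, k=4, step=0.5)
```

A finer grid gives a better approximation at the cost of many more candidate subsets, which is the trade-off that the ε parameter of an approximation scheme controls.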
Hongliang FU Qianqian LI Huawei TAO Chunhua ZHU Yue XIE Ruxue GUO
Speech emotion recognition (SER) is a key research technology for realizing the third generation of artificial intelligence, and it is widely used in human-computer interaction, emotion diagnosis, interpersonal communication, and other fields. However, the aliasing of language and semantic information in speech tends to distort the alignment of emotion features, which degrades the performance of cross-corpus SER systems. This paper proposes a cross-corpus SER model based on causal emotion information representation (CEIR). The model uses the reconstruction loss of the deep autoencoder network and the source domain label information to realize the preliminary separation of causal features. Then, the causal correlation matrix is constructed and combined with the local maximum mean discrepancy (LMMD) feature alignment technique so that the causal features of different dimensions are independent in their joint distribution. Finally, supervised fine-tuning on labeled data is used to achieve effective extraction of causal emotion information. The experimental results show that the average unweighted average recall (UAR) of the proposed algorithm is increased by 3.4% to 7.01% compared with several recent algorithms in the field.