2D and 3D semantic segmentation play important roles in robotic scene understanding. However, current 3D semantic segmentation heavily relies on 3D point clouds, which are susceptible to factors such as point cloud noise, sparsity, estimation and reconstruction errors, and data imbalance. In this paper, a novel approach is proposed to enhance 3D semantic segmentation by incorporating 2D semantic segmentation from RGB-D sequences. Firstly, the RGB-D pairs are consistently segmented into 2D semantic maps using the tracking pipeline of Simultaneous Localization and Mapping (SLAM). This process effectively propagates object labels from full scans to corresponding labels in partial views with high probability. Subsequently, a novel Semantic Projection (SP) block is introduced, which integrates features extracted from localized 2D fragments across different camera viewpoints into their corresponding 3D semantic features. Lastly, the 3D semantic segmentation network utilizes a combination of 2D-3D fusion features to facilitate a merged semantic segmentation process for both 2D and 3D. Extensive experiments conducted on public datasets demonstrate the effective performance of the proposed 2D-assisted 3D semantic segmentation method.
Zhichao SHA Ziji MA Kunlai XIONG Liangcheng QIN Xueying WANG
Diagnosis at an early stage is clinically important for the cure of skin cancer. However, since some skin cancers have a similar appearance and dermatologists rely on subjective experience to distinguish skin cancer types, diagnostic accuracy is often suboptimal. Recently, computer-based methods have helped physicians improve the recognition rate, but some challenges remain. For massive dermoscopic image data, the residual network (ResNet) is well suited to learning feature relationships because of its greater depth. To address the deficiencies of ResNet, this paper proposes a multi-region feature extraction and dimension-raising matching method, which further improves the utilization of medical image features. The method first extracts rich and diverse features from multiple regions of the feature map, avoiding the deficiency of traditional residual modules, which repeatedly extract features from a few fixed regions. Then, the fused features are strengthened by raising the dimension of the branch-path information and stacking it with the main path, which solves the problem that the two paths fuse poorly because of their different dimensionality. The proposed method is evaluated on the International Skin Imaging Collaboration (ISIC) Archive dataset, which contains more than 40,000 images. Results on this dataset and on other datasets show improvements over networks containing traditional residual modules and over some popular networks.
Existing weakly-supervised segmentation approaches based on image-level annotations may focus on the most activated region in the image and tend to identify only part of the target object. Intuitively, high-level semantics among objects of the same category in different images could help to recognize corresponding activated regions of the query. In this study, a scheme called Cycle-Consistency of Semantics Network (CyCSNet) is proposed, which can enhance the activation of the potential inactive regions of the target object by utilizing the cycle-consistent semantics from images of the same category in the training set. Moreover, a Dynamic Correlation Feature Selection (DCFS) algorithm is derived to reduce the noise from pixel-wise samples of low relevance for better training. Experiments on the PASCAL VOC 2012 dataset show that the proposed CyCSNet achieves competitive results compared with state-of-the-art weakly-supervised segmentation approaches.
Hiroya HACHIYAMA Takamichi NAKAMOTO
Devices that present audiovisual information are widespread, but few present olfactory information. We have developed a device called an olfactory display that presents odors to users by mixing multiple fragrances. Previously developed olfactory displays had the problem that the ejection volume of liquid perfume droplets was large and the dynamic range of the blending ratio was small. In this study, we used an inkjet device that ejects small droplets in order to expand the dynamic range of blending ratios and present a variety of scents. By finely controlling the back pressure with an electro-osmotic pump (EO pump) and adjusting the timing of the EO pump and the inkjet device, we succeeded in stabilizing the ejection of the inkjet device and achieved a large dynamic range.
Yang XIAO Zhongyuan ZHOU Mingjie SHENG Qi ZHOU
Extracting the impedance parameters of surface-mounted device (SMD) electronic components by measurement is suitable for components whose model or material information is unknown, but it requires consideration of the errors caused by non-coaxial structures and measurement fixtures. In this paper, a fixture for impedance measurement is designed according to the characteristics of passive devices, and the fixture de-embedding method is used to eliminate errors and improve the test accuracy. The full-wave-simulation method proposed in this paper for obtaining the S parameters of the fixture offers one way of obtaining the S parameters needed for de-embedding. Taking a patch capacitor as an example, the S parameters for de-embedding were obtained using methods based on full-wave simulation, 2×Thru, and ADS simulation, and de-embedding tests were conducted. The results indicate that, compared with ADS simulation, obtaining the S parameters of the test fixture by full-wave simulation and then de-embedding extracts the impedance parameters of SMD electronic components more accurately, which provides a reference for the study of electromagnetic interference (EMI) coupling mechanisms.
Hongbo LI Aijun LIU Qiang YANG Zhe LYU Di YAO
To improve the direction-of-arrival (DOA) estimation performance of small-aperture arrays, we propose a source localization method inspired by the Ormia fly’s coupled ears and a MUSIC-like algorithm. The Ormia can localize its host cricket’s sound precisely despite the tremendous mismatch between the spacing of its ears and the sound wavelength. In this paper, we first implement a biologically inspired coupled system based on the coupled model of the Ormia’s ears and solve for its responses by the modal decomposition method. Then, we analyze the effect of the system on the received signals of the array. The analysis shows that the system amplifies the amplitude ratio and phase difference between the signals, which is equivalent to creating a virtual array with a larger aperture. Finally, we apply the MUSIC-like algorithm for DOA estimation to suppress the colored noise caused by the system. Numerical results demonstrate that the proposed method can improve the localization precision and resolution of the array.
Xiaolong ZHENG Bangjie LI Daqiao ZHANG Di YAO Xuguang YANG
High Frequency Surface Wave Radar holds significant potential in sea detection. However, the target signals are often masked by substantial sea clutter and ionospheric clutter, making it crucial to suppress clutter and extract weak target signals from the strong noise background. This study proposes a novel method for separating weak harmonic target signals based on local tangent space, leveraging the chaotic feature of ionospheric clutter. The effectiveness of this approach is demonstrated through the analysis of measured data, thereby validating its practicality and potential for real-world applications.
Yun JIANG Huiyang LIU Xiaopeng JIAO Ji WANG Qiaoqiao XIA
In this letter, a novel projection algorithm is proposed in which projection onto a triangle consisting of the three even-vertices closest to the vector to be projected replaces check polytope projection, achieving the same FER performance as the exact projection algorithm in both the high-iteration and low-iteration regimes. Simulation results show that, compared with the sparse affine projection algorithm (SAPA), it improves the FER performance by 0.2 dB and reduces the average number of iterations by 4.3%.
Hakan BERCAG Osman KUKRER Aykut HOCANIN
A new extended normalized least-mean-square (ENLMS) algorithm is proposed. A novel non-linear time-varying step-size (NLTVSS) formula is derived. The convergence rate of ENLMS increases due to NLTVSS as the number of data-reuse L is increased. ENLMS does not involve matrix inversion, and, thus, avoids numerical instability issues.
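As background for the abstract above, the following is a minimal sketch of the standard normalized least-mean-square (NLMS) update with a fixed step size. It is not the ENLMS algorithm and does not implement the NLTVSS formula or data reuse, neither of which is specified in the abstract. The function name `nlms_identify`, the step size `mu`, the regularizer `eps`, and the toy system `true_w` are illustrative assumptions.

```python
import random

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Standard NLMS (fixed step size): adapt FIR weights so w.u[n] tracks d[n]."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(x)):
        u = [x[n - k] for k in range(num_taps)]       # regressor, newest sample first
        y = sum(wk * uk for wk, uk in zip(w, u))      # filter output
        e = d[n] - y                                  # a-priori error
        norm = eps + sum(uk * uk for uk in u)         # input energy (regularized)
        step = mu * e / norm                          # normalized correction
        w = [wk + step * uk for wk, uk in zip(w, u)]  # only inner products, no matrix inverse
    return w

# Toy system identification: recover an assumed 3-tap FIR system from noise-free data.
random.seed(0)
true_w = [0.5, -0.3, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
d = [sum(true_w[k] * x[n - k] for k in range(3) if n - k >= 0) for n in range(len(x))]
w_hat = nlms_identify(x, d, num_taps=3)
```

The update uses only inner products of the regressor, which illustrates why LMS-type algorithms need no matrix inversion.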
Jun-Feng LIU Yuan FENG Zeng-Hui LI Jing-Wei TANG
To improve the control performance of the permanent magnet synchronous motor speed control system, fractional-order calculus theory is first combined with sliding mode control to design a fractional-order integral sliding mode surface (FOISM), which improves the robustness of the system. Secondly, considering the chattering phenomenon in sliding mode control, a new second-order sliding mode reaching law (NSOSMRL) is designed to improve the control accuracy of the system. Finally, the effectiveness of the proposed strategy is demonstrated by simulation.
Boolean functions play an important role in symmetric ciphers. One of the important open problems on Boolean functions is determining the maximum possible resiliency order of n-variable Boolean functions with optimal algebraic immunity. In this letter, we search Boolean functions in the rotation symmetric class, and determine the maximum possible resiliency order of 9-variable Boolean functions with optimal algebraic immunity. Moreover, the maximum possible nonlinearity of 9-variable rotation symmetric Boolean functions with optimal algebraic immunity-resiliency trade-off is determined to be 224.
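Two of the notions in the abstract above, rotation symmetry and nonlinearity, can be checked numerically for small functions. The sketch below computes nonlinearity from the Walsh spectrum and tests invariance under cyclic rotation of the input bits. It does not reproduce the search in the letter, and the helper names and the 3-variable majority example are illustrative assumptions.

```python
def walsh_spectrum(truth_table):
    """Fast Walsh-Hadamard transform of the sign function (-1)^f(x)."""
    w = [1 - 2 * bit for bit in truth_table]
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                a, b = w[j], w[j + h]
                w[j], w[j + h] = a + b, a - b   # butterfly step
        h *= 2
    return w

def nonlinearity(truth_table):
    """Hamming distance from f to the nearest affine function."""
    return len(truth_table) // 2 - max(abs(v) for v in walsh_spectrum(truth_table)) // 2

def is_rotation_symmetric(truth_table, n):
    """True if f is unchanged by every cyclic rotation of its n input bits."""
    mask = (1 << n) - 1
    for x in range(1 << n):
        rotated = ((x << 1) & mask) | (x >> (n - 1))   # rotate left by one position
        if truth_table[x] != truth_table[rotated]:
            return False
    return True

# Illustrative example: 3-variable majority function.
n = 3
majority = [1 if bin(x).count("1") >= 2 else 0 for x in range(1 << n)]
first_bit = [x & 1 for x in range(1 << n)]   # linear function, not rotation symmetric

maj_nl = nonlinearity(majority)
maj_rs = is_rotation_symmetric(majority, n)
lin_nl = nonlinearity(first_bit)
lin_rs = is_rotation_symmetric(first_bit, n)
```

Checking invariance under a single rotation step is sufficient, because every other rotation is a repetition of that step.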
You GAO Ming-Yue XIE Gang WANG Lin-Zhi SHEN
Mutually unbiased bases (MUBs) are widely used in quantum information processing and play an important role in quantum cryptography, quantum state tomography and communications. It is difficult to construct MUBs, and it remains unknown whether complete MUBs exist for any non-prime-power dimension. Therefore, researchers have proposed constructing approximately mutually unbiased bases (AMUBs) by weakening the inner product conditions. This paper constructs q AMUBs of ℂ^q, (q + 1) AMUBs of ℂ^(q-1) and q AMUBs of ℂ^(q-1) by using character sums over Galois rings and finite fields, where q is a power of a prime. The first construction, of q AMUBs of ℂ^q, is new and illustrates that K AMUBs of ℂ^K can be achieved. The second and third constructions in this paper include the partial results about AMUBs constructed by W. Wang et al. in [9].
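For reference, two orthonormal bases of a d-dimensional complex space are mutually unbiased when every cross inner product has magnitude exactly 1/√d. AMUBs relax this equality. The sketch below checks the exact condition for the textbook pair of the standard basis and the discrete Fourier basis. It does not reproduce the character-sum constructions of the paper, and the function names and the choice d = 5 are illustrative assumptions.

```python
import cmath
import math

def standard_basis(d):
    return [[1.0 if k == j else 0.0 for k in range(d)] for j in range(d)]

def fourier_basis(d):
    # Rows of the unitary DFT matrix: entry k of vector j is exp(2*pi*i*j*k/d)/sqrt(d).
    return [[cmath.exp(2j * math.pi * j * k / d) / math.sqrt(d) for k in range(d)]
            for j in range(d)]

def inner(u, v):
    """Hermitian inner product <u, v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * complex(b) for a, b in zip(u, v))

def unbiasedness_deviation(basis_a, basis_b):
    """Largest deviation of |<u, v>| from 1/sqrt(d) over all cross pairs (0 for exact MUBs)."""
    d = len(basis_a)
    target = 1.0 / math.sqrt(d)
    return max(abs(abs(inner(u, v)) - target) for u in basis_a for v in basis_b)

d = 5
deviation = unbiasedness_deviation(standard_basis(d), fourier_basis(d))
```

For an approximate construction the same function reports how far the bases are from exact unbiasedness.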
Xiuping PENG Yinna LIU Hongbin LIN
In this letter, we propose a novel direct construction of three-phase Z-complementary triads (ZCTs) with flexible lengths and various widths of the zero-correlation zone based on extended Boolean functions. The maximum width ratio of the zero-correlation zone of the construction can reach 3/4. Moreover, the proposed sequences exist for all lengths other than powers of three. We also investigate the peak-to-average power ratio properties of the proposed ZCTs.
Mengmeng ZHANG Zeliang ZHANG Yuan LI Ran CHENG Hongyuan JING Zhi LIU
Point cloud video contains not only color information but also spatial position information, and usually has a large volume of data. Typical rate distortion optimization algorithms based on the Human Visual System only consider the color information, which limits the coding performance. In this paper, a Coding Tree Unit (CTU) level quantization parameter (QP) adjustment algorithm based on just noticeable difference (JND) and spatial complexity is proposed to improve the subjective and objective quality of Video-Based Point Cloud Compression (V-PCC). Firstly, it is found that the JND model is degraded at CTU level for attribute video due to the pixel filling strategy of V-PCC, and an improved JND model is designed using the occupancy map. Secondly, a spatial complexity detection metric is designed to measure the visual importance of each CTU. Finally, a CTU-level QP adjustment scheme based on both JND levels and visual importance is proposed for geometry and attribute video. The experimental results show that, compared with the latest V-PCC (TMC2-18.0) anchors, the BD-rate is -2.8% and -3.2% for the D1 and D2 metrics, respectively, and the subjective quality is improved significantly.
Ngoc-Tan NGUYEN Trung-Duc NGUYEN Nam-Hoang NGUYEN Trong-Minh HOANG
Multi-access edge computing (MEC) is an emerging technology of 5G and beyond mobile networks which deploys computation services at edge servers to reduce service delay. However, edge servers may not have enough computation capability to satisfy the delay requirement of services. Thus, heavy computation tasks need to be offloaded to other MEC servers. In this paper, we propose an offloading solution, called the optimal delay offloading (ODO) solution, that can guarantee service delay requirements. Specifically, this method exploits an estimation of queuing delay among MEC servers to find a proper offloading server with the lowest service delay to which the computation task is offloaded. Simulation results show that the proposed ODO method outperforms the conventional methods, i.e., the non-offloading and the energy-efficient offloading [10] methods (up to 1.6 times) in terms of guaranteeing the service delay under a threshold.
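The selection step described above (pick the server with the lowest estimated service delay and check it against a threshold) can be sketched as follows. The delay model (link delay plus queuing delay plus cycles divided by CPU rate), the field names, and all numbers are assumptions made for illustration. The abstract does not give the paper's queuing-delay estimator.

```python
def estimated_delay(server, task_cycles):
    """Assumed delay model: transfer + queuing + processing time, in seconds."""
    return server["link_delay_s"] + server["queue_delay_s"] + task_cycles / server["cpu_hz"]

def choose_server(servers, task_cycles, threshold_s):
    """Return (name, delay) of the lowest-delay server, or (None, delay) if it misses the threshold."""
    best = min(servers, key=lambda s: estimated_delay(s, task_cycles))
    delay = estimated_delay(best, task_cycles)
    return (best["name"], delay) if delay <= threshold_s else (None, delay)

# Hypothetical servers: the local one is heavily queued, two neighbours are lightly loaded.
servers = [
    {"name": "local", "queue_delay_s": 0.080, "cpu_hz": 2e9, "link_delay_s": 0.0},
    {"name": "mec-2", "queue_delay_s": 0.010, "cpu_hz": 4e9, "link_delay_s": 0.005},
    {"name": "mec-3", "queue_delay_s": 0.030, "cpu_hz": 4e9, "link_delay_s": 0.002},
]
choice, delay = choose_server(servers, task_cycles=1e8, threshold_s=0.05)
```

In this toy setting the task is offloaded to `mec-2`, whose estimated delay of about 40 ms is the only one under the 50 ms threshold.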
In underwater acoustic communication systems based on orthogonal frequency division multiplexing (OFDM), clipping used to reduce the peak-to-average power ratio introduces nonlinear distortion of the signal, making the receiver unable to recover the faded signal accurately. In this letter, an Aquila optimizer-based convolutional attention block stacked network (AO-CABNet) is proposed to replace the receiver and improve the ability to recover the original signal. Simulation results show that the AO method has better optimization capability to quickly obtain the optimal parameters of the network model, and the proposed AO-CABNet structure outperforms existing schemes.
Terahertz (THz) ultra-massive multiple-input multiple-output (UM-MIMO) is envisioned as a key enabling technology of 6G wireless communication. In UM-MIMO systems, downlink channel state information (CSI) has to be fed to the base station for beamforming. However, the feedback overhead becomes unacceptable because of the large antenna array. In this letter, the characteristic of CSI is explored from the perspective of data distribution. Based on this characteristic, a novel network named Attention-GRU Net (AGNet) is proposed for CSI feedback. Simulation results show that the proposed AGNet outperforms other advanced methods in the quality of CSI feedback in UM-MIMO systems.
Pingping JI Lingge JIANG Chen HE Di HE Zhuxian LIAN
In this letter, we study dynamic antenna grouping and hybrid beamforming for high altitude platform (HAP) massive multiple-input multiple-output (MIMO) systems. We first exploit the fact that the ergodic sum rate is only related to statistical channel state information (SCSI) in the large-scale array regime, and then we utilize it to perform the dynamic antenna grouping and design the RF beamformer. By applying the Gershgorin Circle Theorem, the dynamic antenna grouping is realized based on a novel statistical distance metric instead of the values of the instantaneous channels. The RF beamformer is designed from the singular value decomposition of the statistical correlation matrix for the obtained dynamic antenna groups. Dynamic subarrays mean that each RF chain is linked with a dynamic antenna subset. The baseband beamformer is derived by utilizing zero forcing (ZF). Numerical results demonstrate the performance enhancement of our proposed dynamic hybrid precoding (DHP) algorithm.
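The Gershgorin Circle Theorem mentioned above states that every eigenvalue of a square matrix lies in at least one disc centred at a diagonal entry, with radius equal to the sum of the absolute values of the other entries in that row. The sketch below only illustrates the theorem on a small matrix. How the letter turns it into a statistical distance metric for antenna grouping is not described in the abstract and is not reproduced here. The function names and the example matrix are assumptions.

```python
def gershgorin_discs(matrix):
    """Return one (center, radius) pair per row of a square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)  # off-diagonal row sum
        discs.append((row[i], radius))
    return discs

def in_some_disc(value, discs, tol=1e-12):
    """True if the value lies inside at least one disc."""
    return any(abs(value - center) <= radius + tol for center, radius in discs)

# Example matrix with characteristic polynomial x^2 - 7x + 10, i.e. eigenvalues 2 and 5.
A = [[4.0, 1.0],
     [2.0, 3.0]]
discs = gershgorin_discs(A)
eigenvalues_covered = all(in_some_disc(ev, discs) for ev in (2.0, 5.0))
```

A strongly diagonally dominant matrix yields small discs, so its eigenvalues are tightly localized around the diagonal entries.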
Integrated Sensing and Communication at terahertz band (ISAC-THz) has been considered one of the promising technologies for future 6G. However, in phase-shifter (PS) based massive multiple-input multiple-output (MIMO) hybrid precoding systems, due to the ultra-large bandwidth of the terahertz frequency band, the subcarrier channels with different frequencies have different equivalent spatial directions. Therefore, hybrid beamforming at the transmitter will cause serious beam split problems. In this letter, we propose a dual-function radar communication (DFRC) precoding method based on the recently proposed delay-phase precoding structure for THz massive MIMO. By adding delay-phase components between the radio frequency chain and the frequency-independent PSs, the beam is aligned with the target physical direction over the entire bandwidth to reduce the loss caused by the beam splitting effect. Furthermore, we employ a hardware structure using true-time-delayers (TTDs) to realize the concept of frequency-dependent phase shifts. Theoretical analysis and simulation results show that the method can increase communication performance and, to a certain extent, make up for the performance loss caused by the dual-function trade-off between communication and radar.
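The beam split effect described above can be illustrated with a short calculation. The sketch assumes a uniform linear array with half-wavelength spacing at the centre frequency and frequency-independent phase shifters steered to a target angle. It then evaluates the normalized array gain toward that angle at a subcarrier whose frequency is `freq_ratio` times the centre frequency. This is a textbook model and not the DFRC precoder of the letter. The array size, angle, and frequency offset are illustrative assumptions.

```python
import cmath
import math

def normalized_array_gain(num_antennas, angle_deg, freq_ratio):
    """Gain toward the steered angle at frequency f = freq_ratio * fc (1.0 means no loss)."""
    s = math.sin(math.radians(angle_deg))
    # Residual phase per element: channel phase pi*n*freq_ratio*s minus shifter phase pi*n*s.
    total = sum(cmath.exp(1j * math.pi * n * (freq_ratio - 1.0) * s)
                for n in range(num_antennas))
    return abs(total) / num_antennas

gain_center = normalized_array_gain(256, 60.0, 1.0)   # subcarrier at the centre frequency
gain_edge = normalized_array_gain(256, 60.0, 1.1)     # subcarrier 10% above the centre
```

With 256 antennas the gain at the offset subcarrier collapses to a few percent of its centre-frequency value, which is the loss that frequency-dependent phase shifts such as those realized by true-time delays are meant to remove.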
Menglong WU Jianwen ZHANG Yongfa XIE Yongchao SHI Tianao YAO
Direct-current biased optical orthogonal frequency division multiplexing (DCO-OFDM) exhibits a high peak-to-average power ratio (PAPR), which leads to nonlinear distortion in the system. To address this, this study proposes a scheme that combines direct-current biased optical orthogonal frequency division multiplexing with index modulation (DCO-OFDM-IM) and convex optimization algorithms. The proposed scheme uses partially activated subcarriers of the system to transmit constellation-modulated symbol information, and transmits additional symbol information through the combination of activated carrier indices. Additionally, a dither signal is added to the system’s idle subcarriers, and the convex optimization algorithm is applied to solve for the optimal values of this dither signal. By keeping the system’s peak power unchanged, the scheme raises the system’s average transmission power and thus reduces the PAPR. Experimental results indicate that at a complementary cumulative distribution function (CCDF) of 10^-4, the proposed scheme reduces the PAPR by approximately 3.5 dB compared to the conventional DCO-OFDM system. Moreover, at a bit error rate (BER) of 10^-3, the proposed scheme lowers the required signal-to-noise ratio (SNR) by about 1 dB relative to the traditional DCO-OFDM system. Therefore, the proposed scheme enables a more substantial reduction in PAPR and an improvement in BER performance compared to the conventional DCO-OFDM approach.
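The two metrics quoted above can be made concrete with a small sketch: the PAPR of one block of time-domain samples, and an empirical CCDF, i.e. the fraction of blocks whose PAPR exceeds a threshold. This is generic measurement code and not the proposed dither optimization. The function names and the sample values are illustrative assumptions.

```python
import math

def papr_db(samples):
    """Peak-to-average power ratio of one block of samples, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))

def empirical_ccdf(papr_values_db, threshold_db):
    """Fraction of blocks whose PAPR exceeds the threshold."""
    return sum(1 for p in papr_values_db if p > threshold_db) / len(papr_values_db)

# Same peak, different average power: raising the average lowers the PAPR.
impulse_like = [1.0, 0.0, 0.0, 0.0]   # average power 0.25, peak power 1
filled_in = [1.0, 0.5, 0.5, 0.5]      # average power 0.4375, same peak power

papr_impulse = papr_db(impulse_like)
papr_filled = papr_db(filled_in)
ccdf_value = empirical_ccdf([1.0, 2.0, 3.0, 4.0], threshold_db=2.5)
```

The second block mirrors the idea in the abstract: holding the peak fixed while increasing the average power reduces the PAPR.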