Naranchimeg BOLD Chao ZHANG Takuya AKASHI
In the past decade, many state-of-the-art algorithms for image classification and audio classification have achieved notable success with the development of deep convolutional neural networks (CNNs). However, most of these works exploit only a single type of training data. In this paper, we present a study on classifying bird species by combining visual (image) and audio (sound) data using CNNs, an approach that has received little attention so far. Specifically, we propose CNN-based multimodal learning models with three types of fusion strategy (early, middle, and late) to address the problem of combining training data across domains. The advantage of our proposed method lies in the fact that we can use a CNN not only to extract features from image and audio data (spectrograms) but also to combine the features across modalities. In the experiments, we train and evaluate the network on the standard CUB-200-2011 data set combined with an audio data set that we collected ourselves to match its bird species. We observe that a model using the combination of both types of data outperforms models trained on only one type of data. We also show that transfer learning can significantly increase classification performance.
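As a rough illustration of how the fusion strategies differ, the sketch below contrasts early fusion (concatenating the two inputs before any model sees them) with late fusion (combining the class probabilities of two separately trained models). It is a generic sketch, not the authors' networks: the function names, the `image_weight` parameter, and the example logits are all hypothetical, and weighted averaging is only one common way to do late fusion. Middle fusion would apply the same concatenation to intermediate feature vectors rather than to raw inputs.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_fusion(image_input, audio_input):
    """Early fusion: join the two inputs into one vector for a single model."""
    return list(image_input) + list(audio_input)


def late_fusion(image_logits, audio_logits, image_weight=0.5):
    """Late fusion: weighted average of the two models' class probabilities.

    image_weight is a hypothetical tuning parameter, not a value from the paper.
    """
    p_image = softmax(image_logits)
    p_audio = softmax(audio_logits)
    return [
        image_weight * pi + (1.0 - image_weight) * pa
        for pi, pa in zip(p_image, p_audio)
    ]


def predict(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Made-up scores for three species from an image model and an audio model.
image_logits = [2.0, 1.0, 0.1]
audio_logits = [0.1, 3.0, 0.2]
fused = late_fusion(image_logits, audio_logits)
```

In this made-up example the image model alone favours class 0, while the confident audio model shifts the fused prediction to class 1, which is the kind of cross-modal correction a multimodal model is meant to provide.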
Ji-Gao ZHANG Jin-Chun GAO Xue-Yan LIN
Large numbers of electronic connectors are used in electronic and telecommunication systems of all kinds. Whether in optical telecommunications or mobile phone systems, connectors are important links for electronics. Unfortunately, unlike other electronic components, connector contacts are exposed to air, so they are strongly influenced by the environment in which they operate. In China, dust and corrosion products are the main contaminants that cause contact failure. Failed contacts seriously degrade the reliability of electronic and telecommunication systems. This paper summarizes the recent results obtained by our laboratory on the effect of dust and corrosion products on connector contact failure. Dust contamination is a very complex problem that is common not only in China but also in many other countries. Continued study will therefore be very useful for improving the contact reliability of connectors, establishing new and effective testing methods and standards, and building experimental and computer simulation systems.
Chengyu WU Jiangshan QIN Xiangyang LI Ao ZHAN Zhengqiang WANG
Real-time matting is a challenging research problem in deep learning. Conventional CNN (Convolutional Neural Network) approaches tend to misjudge foreground and background semantics and produce blurry matting edges, because the limited receptive field of a CNN restricts its attention to global context. We propose a real-time matting approach called RMViT (Real-time Matting with Vision Transformer), which uses a Transformer structure, attention, and content-aware guidance to solve these issues. Semantic accuracy improves substantially because global context and long-range pixel information are established. Experiments show that our approach achieves a reduction of more than 30% in error metrics compared with existing real-time matting approaches.