IEICE global.ieice.org Site

Keyword Search Result

[Keyword] Ti(30728hit)

701-720hit(30728hit)

A Visual Question Answering Network Merging High- and Low-Level Semantic Information
Huimin LI Dezhi HAN Chongqing CHEN Chin-Chen CHANG Kuan-Ching LI Dun LI

PAPER-Core Methods

Publicized: 2022/01/06
Vol: E106-D No:5
Page(s): 581-589
Visual Question Answering (VQA) usually uses deep attention mechanisms to learn fine-grained visual content of images and textual content of questions. However, the deep attention mechanism can only learn high-level semantic information while ignoring the impact of the low-level semantic information on answer prediction. To address this, we design a High- and Low-Level Semantic Information Network (HLSIN), which employs two strategies to fuse high-level and low-level semantic information. Adaptive weight learning is taken as the first strategy to allow different levels of semantic information to learn weights separately. The gate-sum mechanism is used as the second strategy to suppress invalid information in the various levels of information and fuse valid information. On the benchmark VQA-v2 dataset, we quantitatively and qualitatively evaluate HLSIN and conduct extensive ablation studies to explore the reasons behind HLSIN's effectiveness. Experimental results demonstrate that HLSIN significantly outperforms the previous state-of-the-art, with an overall accuracy of 70.93% on test-dev.
The Comparison of Attention Mechanisms with Different Embedding Modes for Performance Improvement of Fine-Grained Classification
Wujian YE Run TAN Yijun LIU Chin-Chen CHANG

PAPER-Core Methods

Publicized: 2021/12/22
Vol: E106-D No:5
Page(s): 590-600
Fine-grained image classification is one of the key basic tasks of computer vision. The appearance of traditional deep convolutional neural network (DCNN) combined with attention mechanism can focus on partial and local features of fine-grained images, but it still lacks the consideration of the embedding mode of different attention modules in the network, leading to the unsatisfactory result of classification model. To solve the above problems, three different attention mechanisms are introduced into the DCNN network (like ResNet, VGGNet, etc.), including SE, CBAM and ECA modules, so that DCNN could better focus on the key local features of salient regions in the image. At the same time, we adopt three different embedding modes of attention modules, including serial, residual and parallel modes, to further improve the performance of the classification model. The experimental results show that the three attention modules combined with three different embedding modes can improve the performance of DCNN network effectively. Moreover, compared with SE and ECA, CBAM has stronger feature extraction capability. Among them, the parallelly embedded CBAM can make the local information paid attention to by DCNN richer and more accurate, and bring the optimal effect for DCNN, which is 1.98% and 1.57% higher than that of original VGG16 and Resnet34 in CUB-200-2011 dataset, respectively. The visualization analysis also indicates that the attention modules can be easily embedded into DCNN networks, especially in the parallel mode, with stronger generality and universality.
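To make the SE module named in the abstract above concrete, here is a minimal pure-Python sketch of a squeeze-and-excitation gate: global average pooling per channel, a two-layer bottleneck, and a sigmoid gate that rescales each channel. It is an illustration under stated assumptions, not the authors' implementation: the function name `se_block`, the weight layout, and the omission of bias terms are all assumptions, and the serial, residual and parallel embedding modes compared in the paper are not modelled.

```python
import math


def se_block(feature_map, w1, w2):
    """Apply a squeeze-and-excitation gate to a feature map.

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: (C // r) x C weight matrix of the reduction layer (hypothetical layout).
    w2: C x (C // r) weight matrix of the expansion layer (hypothetical layout).
    Bias terms are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: reduction layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expansion layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    fmap = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
    w1 = [[0.5, -0.5]]          # C = 2, reduction ratio r = 2
    w2 = [[1.0], [-1.0]]
    _, gates = se_block(fmap, w1, w2)
    print(gates)
```

With all-zero weights every gate is sigmoid(0) = 0.5, so the output is simply half the input, which makes a convenient sanity check.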
A Novel Differential Evolution Algorithm Based on Local Fitness Landscape Information for Optimization Problems
Jing LIANG Ke LI Kunjie YU Caitong YUE Yaxin LI Hui SONG

PAPER-Core Methods

Publicized: 2023/02/13
Vol: E106-D No:5
Page(s): 601-616
The selection of mutation strategy greatly affects the performance of the differential evolution (DE) algorithm. For different types of optimization problems, different mutation strategies should be selected. How to choose a suitable mutation strategy for different problems is a challenging task. To deal with this challenge, this paper proposes a novel DE algorithm based on local fitness landscape, called FLIDE. In the proposed method, fitness landscape information is obtained to guide the selection of mutation operators. In this way, different problems can be solved with proper evolutionary mechanisms. Moreover, a population adjustment method is used to balance the search ability and population diversity. On one hand, the diversity of the population in the early stage is enhanced with a relatively large population. On the other hand, the computational cost is reduced in the later stage with a relatively small population. The evolutionary information is utilized as much as possible to guide the search direction. The proposed method is compared with five popular algorithms on 30 test functions with different characteristics. Experimental results show that the proposed FLIDE is more effective on problems with high dimensions.
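The two ingredients mentioned in the abstract above, a mutation strategy and a population that shrinks from large to small, can be sketched with a textbook DE/rand/1/bin loop plus linear population size reduction. This is not FLIDE: the landscape-guided operator selection is not reproduced, and the function name, parameter defaults, and the linear reduction schedule are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Simple test function: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def de_rand_1_bin(fitness, dim, bounds, pop_max=30, pop_min=8,
                  generations=200, f=0.5, cr=0.9, seed=0):
    """Classic DE/rand/1/bin with a linearly shrinking population (assumed schedule)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_max)]
    fit = [fitness(ind) for ind in pop]
    history = [min(fit)]  # best fitness after initialisation and after each generation

    for gen in range(generations):
        for i in range(len(pop)):
            # Mutation: three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(len(pop)) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection: the trial replaces the parent only if not worse.
            trial_fit = fitness(trial)
            if trial_fit <= fit[i]:
                pop[i], fit[i] = trial, trial_fit

        # Population adjustment: large early (diversity), small late (cost).
        target = round(pop_max - (pop_max - pop_min) * (gen + 1) / generations)
        if len(pop) > target:
            order = sorted(range(len(pop)), key=lambda k: fit[k])[:target]
            pop = [pop[k] for k in order]
            fit = [fit[k] for k in order]
        history.append(min(fit))

    best = min(range(len(pop)), key=lambda k: fit[k])
    return pop[best], fit[best], history


if __name__ == "__main__":
    best_x, best_f, _ = de_rand_1_bin(sphere, dim=5, bounds=(-5.0, 5.0))
    print(best_f)
```

Because selection is greedy and truncation always keeps the fittest individuals, the best fitness never gets worse from one generation to the next. An adaptive scheme would replace the fixed DE/rand/1 mutation with an operator chosen per generation.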
Prioritization of Lane-Specific Traffic Jam Detection for Automotive Navigation Framework Utilizing Suddenness Index and Automatic Threshold Determination
Aki HAYASHI Yuki YOKOHATA Takahiro HATA Kouhei MORI Masato KAMIYA

PAPER

Publicized: 2023/02/03
Vol: E106-D No:5
Page(s): 895-903
Car navigation systems provide traffic jam information. In this study, we attempt to provide more detailed traffic jam information that considers the lane in which a traffic jam occurs. This makes it possible for users to avoid long waits in queued traffic going toward an unintended destination. Lane-specific traffic jam detection utilizes image processing, which incurs long processing time and high cost. To reduce these, we propose a “suddenness index (SI)” to categorize candidate areas as sudden or periodic. Sudden traffic jams are prioritized as they may lead to accidents. This technology aggregates the number of connected cars for each mesh on a map and quantifies the degree of deviation from the ordinary state. In this paper, we evaluate the proposed method using actual global positioning system (GPS) data and find that the proposed index can cover 100% of sudden lane-specific traffic jams while excluding 82.2% of traffic jam candidates. We also demonstrate the time savings by integrating the proposed method into a demonstration framework. In addition, we improve the proposed method so that it automatically determines the SI threshold used to select the appropriate traffic jam candidates, which avoids manual parameter settings.
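The abstract above describes the suddenness index only as a per-mesh count of connected cars compared against the ordinary state. One plausible reading, shown here purely as an assumption, is a z-score of the current count against that mesh's historical counts. The function names, the z-score form, and the threshold value are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def suddenness_index(history, current):
    """Deviation of the current car count from a mesh's ordinary state.

    Assumed form: a z-score against the historical counts of the same mesh.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # No historical variation: any change at all is an unbounded deviation.
        return 0.0 if current == mu else math.inf
    return (current - mu) / sigma


def sudden_meshes(counts_by_mesh, history_by_mesh, threshold):
    """Return the mesh ids whose index reaches the (hypothetical) threshold."""
    return [
        mesh_id
        for mesh_id, count in counts_by_mesh.items()
        if suddenness_index(history_by_mesh[mesh_id], count) >= threshold
    ]


if __name__ == "__main__":
    history = {"mesh_A": [10, 12, 8, 10], "mesh_B": [10, 12, 8, 10]}
    now = {"mesh_A": 20, "mesh_B": 10}
    print(sudden_meshes(now, history, threshold=3.0))
```

Only meshes flagged this way would be passed to the expensive image-processing stage, which is how an index of this kind can cut processing time.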
Effectively Utilizing the Category Labels for Image Captioning
Junlong FENG Jianping ZHAO

PAPER-Core Methods

Publicized: 2021/12/13
Vol: E106-D No:5
Page(s): 617-624
As a further investigation of the image captioning task, some works have extended the vision-text dataset for specific subtasks, such as stylized caption generation. The corpus in such a dataset is usually composed of obvious sentiment-bearing words. In some special cases, however, the captions are classified depending on image category. This results in a latent problem: the generated sentences are close in semantic meaning but belong to different or even opposite categories. It is worthwhile to explore an effective way to utilize the image category label to boost the caption difference. Therefore, we propose an image captioning network with the label control mechanism (LCNET) in this paper. First, to further improve the caption difference, LCNET employs a semantic enhancement module to provide the decoder with global semantic vectors. Then, through the proposed label control LSTM, LCNET can dynamically modulate the caption generation depending on the image category labels. Finally, the decoder integrates the spatial image features with global semantic vectors to output the caption. Results on all the standard evaluation metrics show that our model outperforms the compared models. Caption analysis demonstrates that our approach can improve the performance of semantic representation. Compared with other label control mechanisms, our model is capable of boosting the caption difference according to the labels while maintaining better consistency with image content.
A Novel SSD-Based Detection Algorithm Suitable for Small Object
Xi ZHANG Yanan ZHANG Tao GAO Yong FANG Ting CHEN

PAPER-Core Methods

Publicized: 2022/01/06
Vol: E106-D No:5
Page(s): 625-634
The original single-shot multibox detector (SSD) algorithm has good detection accuracy and speed for regular object recognition. However, the SSD is not suitable for detecting small objects for two reasons: 1) the relationships among different feature layers with various scales are not considered, 2) the predicted results are solely determined by several independent feature layers. To enhance its detection capability for small objects, this study proposes an improved SSD-based algorithm called proportional channels' fusion SSD (PCF-SSD). Three enhancements are provided by this novel PCF-SSD algorithm. First, a fusion feature pyramid model is proposed by concatenating channels of certain key feature layers in a given proportion for object detection. Second, the default box sizes are adjusted properly for small object detection. Third, an improved loss function is suggested to train the above-proposed fusion model, which can further improve object detection performance. A series of experiments are conducted on the public database Pascal VOC to validate the PCF-SSD. On comparing with the original SSD algorithm, our algorithm improves the mean average precision and detection accuracy for small objects by 3.3% and 3.9%, respectively, with a detection speed of 40FPS. Furthermore, the proposed PCF-SSD can achieve a better balance of detection accuracy and efficiency than the original SSD algorithm, as demonstrated by a series of experimental results.
Intelligent Tool Condition Monitoring Based on Multi-Scale Convolutional Recurrent Neural Network
Xincheng CAO Bin YAO Binqiang CHEN Wangpeng HE Suqin GUO Kun CHEN

PAPER-Smart Industry

Publicized: 2022/06/16
Vol: E106-D No:5
Page(s): 644-652
Tool condition monitoring is one of the core tasks of intelligent manufacturing in the digital workshop. This paper presents an intelligent recognition method for tool condition based on deep learning. First, an industrial microphone is used to collect the acoustic signal during machining; then, a central fractal decomposition algorithm is proposed to extract sensitive information; finally, a multi-scale convolutional recurrent neural network is used for deep feature extraction and pattern recognition. Multi-process milling experiments show that the proposed method is superior to the existing methods, with a recognition accuracy of 88%.
Computer Vision-Based Tracking of Workers in Construction Sites Based on MDNet
Wen LIU Yixiao SHAO Shihong ZHAI Zhao YANG Peishuai CHEN

PAPER-Smart Industry

Publicized: 2022/10/20
Vol: E106-D No:5
Page(s): 653-661
Automatic continuous tracking of objects involved in a construction project is required for such tasks as productivity assessment, unsafe behavior recognition, and progress monitoring. Many computer-vision-based tracking approaches have been investigated and successfully tested on construction sites; however, their practical applications are hindered by the tracking accuracy limited by the dynamic, complex nature of construction sites (i.e. clutter with background, occlusion, varying scale and pose). To achieve better tracking performance, a novel deep-learning-based tracking approach called the Multi-Domain Convolutional Neural Networks (MD-CNN) is proposed and investigated. The proposed approach consists of two key stages: 1) multi-domain representation of learning; and 2) online visual tracking. To evaluate the effectiveness and feasibility of this approach, it is applied to a metro project in Wuhan China, and the results demonstrate good tracking performance in construction scenarios with complex background. The average distance error and F-measure for the MDNet are 7.64 pixels and 67, respectively. The results demonstrate that the proposed approach can be used by site managers to monitor and track workers for hazard prevention in construction sites.
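The two figures reported in the abstract above, average distance error in pixels and F-measure, are commonly computed as in the sketch below: the distance error as the mean Euclidean distance between predicted and ground-truth box centers, and the F-measure as the harmonic mean of precision and recall. The abstract does not say how the authors derive precision and recall, so the `(x, y, width, height)` box format and these definitions are assumptions.

```python
import math


def center(box):
    """Center of a box given as (x, y, width, height); the format is assumed."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def average_distance_error(pred_boxes, gt_boxes):
    """Mean Euclidean distance, in pixels, between predicted and true centers."""
    distances = []
    for pred, gt in zip(pred_boxes, gt_boxes):
        (px, py), (gx, gy) = center(pred), center(gt)
        distances.append(math.hypot(px - gx, py - gy))
    return sum(distances) / len(distances)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    predicted = [(0, 0, 10, 10), (20, 20, 10, 10)]
    truth = [(3, 4, 10, 10), (20, 20, 10, 10)]
    print(average_distance_error(predicted, truth))
    print(f_measure(0.5, 1.0))
```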
An Improved Insulator and Spacer Detection Algorithm Based on Dual Network and SSD
Yong LI Shidi WEI Xuan LIU Yinzheng LUO Yafeng LI Feng SHUANG

PAPER-Smart Industry

Publicized: 2022/10/17
Vol: E106-D No:5
Page(s): 662-672
The traditional manual inspection is gradually replaced by the unmanned aerial vehicles (UAV) automatic inspection. However, due to the limited computational resources carried by the UAV, the existing deep learning-based algorithm needs a large amount of computational resources, which makes it impossible to realize the online detection. Moreover, there is no effective online detection system at present. To realize the high-precision online detection of electrical equipment, this paper proposes an SSD (Single Shot Multibox Detector) detection algorithm based on the improved Dual network for the images of insulators and spacers taken by UAVs. The proposed algorithm uses MnasNet and MobileNetv3 to form the Dual network to extract multi-level features, which overcomes the shortcoming of single convolutional network-based backbone for feature extraction. Then the features extracted from the two networks are fused together to obtain the features with high-level semantic information. Finally, the proposed algorithm is tested on the public dataset of the insulator and spacer. The experimental results show that the proposed algorithm can detect insulators and spacers efficiently. Compared with other methods, the proposed algorithm has the advantages of smaller model size and higher accuracy. The object detection accuracy of the proposed method is up to 95.1%.
Image-to-Image Translation for Data Augmentation on Multimodal Medical Images
Yue PENG Zuqiang MENG Lina YANG

PAPER-Smart Healthcare

Publicized: 2022/03/01
Vol: E106-D No:5
Page(s): 686-696
Medical images play an important role in medical diagnosis. However, acquiring a large number of datasets with annotations is still a difficult task in the medical field. For this reason, research in the field of image-to-image translation is combined with computer-aided diagnosis, and data augmentation methods based on generative adversarial networks are applied to medical images. In this paper, we try to perform data augmentation on unimodal data. The designed StarGAN V2 based network has high performance in augmenting the dataset using a small number of original images, and the augmented data is expanded from unimodal data to multimodal medical images, and this multimodal medical image data can be applied to the segmentation task with some improvement in the segmentation results. Our experiments demonstrate that the generated multimodal medical image data can improve the performance of glioma segmentation.
MolHF: Molecular Heterogeneous Attributes Fusion for Drug-Target Affinity Prediction on Heterogeneity
Runze WANG Zehua ZHANG Yueqin ZHANG Zhongyuan JIANG Shilin SUN Guixiang MA

PAPER-Smart Healthcare

Publicized: 2022/05/31
Vol: E106-D No:5
Page(s): 697-706
Recent studies in protein structure prediction such as AlphaFold have drawn great attention to deep learning for the Drug-Target Affinity (DTA) task. Most works are dedicated to embedding a single molecular property and homogeneous information, ignoring the diverse heterogeneous information gains that are contained in the molecules and interactions. Motivated by this, we propose an end-to-end deep learning framework to perform Molecular Heterogeneous features Fusion (MolHF) for DTA prediction on heterogeneity. To address the challenge that biochemical attributes are located in different heterogeneous spaces, we design a Molecular Heterogeneous Information Learning module with multi-strategy learning. In particular, a Molecular Heterogeneous Attention Fusion module is presented to obtain the gains of molecular heterogeneous features. With these, the diversity of molecular structure information for drugs can be extracted. Extensive experiments on two benchmark datasets show that our method outperforms the baselines in all four metrics. Ablation studies validate the effect of attentive fusion and multiple groups of drug heterogeneous features. Visual presentations demonstrate the impact of protein embedding level and the model's ability to fit data. In summary, the diverse gains brought by heterogeneous information contribute to drug-target affinity prediction.
The Effectiveness of Data Augmentation for Mature White Blood Cell Image Classification in Deep Learning — Selection of an Optimal Technique for Hematological Morphology Recognition —
Hiroyuki NOZAKA Kosuke KAMATA Kazufumi YAMAGATA

PAPER-Smart Healthcare

Publicized: 2022/11/22
Vol: E106-D No:5
Page(s): 707-714
The data augmentation method is known as a helpful technique to generate a dataset with a large number of images from one with a small number of images for supervised training in deep learning. However, a low validity augmentation method for image recognition was reported in a recent study on artificial intelligence (AI). This study aimed to clarify the optimal data augmentation method in deep learning model generation for the recognition of white blood cells (WBCs). Study Design: We conducted three different data augmentation methods (rotation, scaling, and distortion) on original WBC images, with each AI model for WBC recognition generated by supervised training. The subjects of the clinical assessment were 51 healthy persons. Thin-layer blood smears were prepared from peripheral blood and subjected to May-Grünwald-Giemsa staining. Results: The only significantly effective technique among the AI models for WBC recognition was data augmentation with rotation. By contrast, the effectiveness of both image distortion and image scaling was poor, and improved accuracy was limited to a specific WBC subcategory. Conclusion: Although data augmentation methods are often used for achieving high accuracy in AI generation with supervised training, we consider that it is necessary to select the optimal data augmentation method for medical AI generation based on the characteristics of medical images.
Detection Method of Fat Content in Pig B-Ultrasound Based on Deep Learning
Wenxin DONG Jianxun ZHANG Shuqiu TAN Xinyue ZHANG

PAPER-Smart Agriculture

Publicized: 2022/02/07
Vol: E106-D No:5
Page(s): 726-734
In the pork fat content detection task, traditional physical or chemical methods are strongly destructive, have substantial technical requirements and cannot achieve nondestructive detection without slaughtering. To solve these problems, we propose a novel, convenient and economical method for detecting the fat content of pig B-ultrasound images based on hybrid attention and multiscale fusion learning, which extracts and fuses shallow detail information and deep semantic information at multiple scales. First, a deep learning network is constructed to learn the salient features of fat images through a hybrid attention mechanism. Then, the information describing pork fat is extracted at multiple scales, and the detailed information expressed in the shallow layer and the semantic information expressed in the deep layer are fused. Finally, a deep convolution network is used to predict the fat content, which is compared with the real label. The experimental results show that the determination coefficient is greater than 0.95 on the 130 groups of pork B-ultrasound image data sets, which is 2.90, 6.10 and 5.13 percentage points higher than that of VGGNet, ResNet and DenseNet, respectively. This indicates that the model can effectively identify the B-ultrasound image of pigs and predict the fat content with high accuracy.
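The determination coefficient quoted in the abstract above is presumably the standard coefficient of determination, R² = 1 − SS_res / SS_tot, although the abstract does not spell out the formula. A minimal sketch of that standard definition follows; the function name and the sample values are illustrative only and are not data from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    measured = [1.0, 2.0, 3.0, 4.0]       # illustrative values only
    predicted = [1.1, 1.9, 3.2, 3.8]
    print(r_squared(measured, predicted))
```

A value of 1 means perfect prediction, and 0 means the model does no better than predicting the mean of the measured values.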
Compression of Vehicle and Pedestrian Detection Network Based on YOLOv3 Model
Lie GUO Yibing ZHAO Jiandong GAO

PAPER-Intelligent Transportation Systems

Publicized: 2022/06/22
Vol: E106-D No:5
Page(s): 735-745
The commonly used object detection algorithms based on convolutional neural networks have difficulty meeting the real-time requirement on embedded platforms due to their large model size, large amount of calculation, and long inference time. It is necessary to use model compression to reduce the amount of network calculation and increase the speed of network inference. This paper compresses a vehicle and pedestrian detection network by pruning and removing redundant parameters. The vehicle and pedestrian detection network is trained based on the YOLOv3 model by using K-means++ to cluster the anchor boxes. The detection accuracy is improved by changing the proportion of categorical losses and regression losses for each category in the loss function because of the unbalanced number of targets in the dataset. A layer and channel pruning algorithm is proposed by combining global channel pruning thresholds and L1 norm, which can reduce the time cost of the network layer transfer process and the amount of computation. Network layer fusion based on TensorRT is performed and inference is performed using half-precision floating-point to improve the speed of inference. Results show that the compressed vehicle and pedestrian detection network, with 84% of channels and 15 Shortcut modules pruned, reduces the size by 32% and the amount of calculation by 17%. The network inference time decreases to 21 ms, which is 1.48 times faster than the network with only 84% of channels pruned.
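The combination of a global pruning threshold with the L1 norm mentioned in the abstract above can be illustrated with a small sketch: score every channel by the L1 norm of its filter weights, pick one threshold across all layers from the requested pruning ratio, and keep the channels at or above it. The data layout, the function names, and the safeguard that keeps at least one channel per layer are assumptions; the paper's layer pruning and YOLOv3-specific handling are not reproduced.

```python
def l1_norm(filter_weights):
    """L1 norm of one output channel's (flattened) filter weights."""
    return sum(abs(w) for w in filter_weights)


def global_l1_prune(layers, prune_ratio):
    """Select the channels to keep under a single global L1 threshold.

    layers: dict mapping a layer name to a list of flattened filters
            (hypothetical layout, one filter per output channel).
    prune_ratio: fraction of all channels, network-wide, to remove.
    Returns (threshold, {layer name: indices of kept channels}).
    """
    # Rank every channel in the network together, not layer by layer.
    all_norms = sorted(l1_norm(f) for filters in layers.values() for f in filters)
    cut = int(len(all_norms) * prune_ratio)
    threshold = all_norms[cut] if cut < len(all_norms) else float("inf")

    kept = {}
    for name, filters in layers.items():
        scores = [l1_norm(f) for f in filters]
        keep = [i for i, score in enumerate(scores) if score >= threshold]
        if not keep:
            # Safeguard (assumption): never remove a layer entirely,
            # keep its strongest channel so the network stays connected.
            keep = [max(range(len(filters)), key=lambda i: scores[i])]
        kept[name] = keep
    return threshold, kept


if __name__ == "__main__":
    toy_layers = {
        "conv1": [[1, -1], [10, 20], [5, 5]],
        "conv2": [[0.5, 0.0], [0.25, 0.0]],
    }
    print(global_l1_prune(toy_layers, prune_ratio=0.75))
```

A global threshold lets layers with many weak channels lose more than layers with strong ones, which is why a per-layer floor is usually needed.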
Dynamic Evolution Simulation of Bus Bunching Affected by Traffic Operation State
Shaorong HU Yuqi ZHANG Yuefei JIN Ziqi DOU

PAPER-Intelligent Transportation Systems

Publicized: 2022/04/13
Vol: E106-D No:5
Page(s): 746-755
Bus bunching often occurs in public transit system, resulting in a series of problems such as poor punctuality, long waiting time and low service quality. In this paper, we explore the influence of the discrete distribution of traffic operation state on the dynamic evolution of bus bunching. Firstly, we use self-organizing map (SOM) to find the threshold of bus bunching and analyze the factors that affect bus bunching based on GPS data of No. 600 bus line in Xi'an. Then, taking the bus headway as the research index, we construct the bus bunching mechanism model. Finally, a simulation platform is built by MATLAB to examine the trend of headway when various influencing factors show different distribution states along the bus line. In terms of influencing factors, inter vehicle speed, queuing time at intersection and loading time at station are shown to have a significant impact on headway between buses. In terms of the impact of the distribution of crowded road sections on headway, long-distance and concentrated crowded road sections will lead to large interval or bus bunching. When the traffic states along the bus line are randomly distributed among crowded, normal and free, the headway may fluctuate in a large range, which may result in bus bunching, or fluctuate in a small range and remain relatively stable. The headway change curve is determined by the distribution length of each traffic state along the bus line. The research results can help to formulate improvement measures according to traffic operation state for equilibrium bus headway and alleviating bus bunching.
SPSD: Semantics and Deep Reinforcement Learning Based Motion Planning for Supermarket Robot
Jialun CAI Weibo HUANG Yingxuan YOU Zhan CHEN Bin REN Hong LIU

PAPER-Positioning and Navigation

Publicized: 2022/09/15
Vol: E106-D No:5
Page(s): 765-772
Robot motion planning is an important part of the unmanned supermarket. The challenges of motion planning in supermarkets lie in the diversity of the supermarket environment, the complexity of obstacle movement, the vastness of the search space. This paper proposes an adaptive Search and Path planning method based on the Semantic information and Deep reinforcement learning (SPSD), which effectively improves the autonomous decision-making ability of supermarket robots. Firstly, based on the backbone of deep reinforcement learning (DRL), supermarket robots process real-time information from multi-modality sensors to realize high-speed and collision-free motion planning. Meanwhile, in order to solve the problem caused by the uncertainty of the reward in the deep reinforcement learning, common spatial semantic relationships between landmarks and target objects are exploited to define reward function. Finally, dynamics randomization is introduced to improve the generalization performance of the algorithm in the training. The experimental results show that the SPSD algorithm is excellent in the three indicators of generalization performance, training time and path planning length. Compared with other methods, the training time of SPSD is reduced by 27.42% at most, the path planning length is reduced by 21.08% at most, and the trained network of SPSD can be applied to unfamiliar scenes safely and efficiently. The results are motivating enough to consider the application of the proposed method in practical scenes. We have uploaded the video of the results of the experiment to https://www.youtube.com/watch?v=h1wLpm42NZk.
An Improved BPNN Method Based on Probability Density for Indoor Location
Rong FEI Yufan GUO Junhuai LI Bo HU Lu YANG

PAPER-Positioning and Navigation

Publicized: 2022/12/23
Vol: E106-D No:5
Page(s): 773-785
With the widespread use of indoor positioning technology, the need for high-precision positioning services is rising; nevertheless, there are several challenges, such as the difficulty of simulating the distribution of interior location data and the enormous inaccuracy of probability computation. As a result, this paper proposes and compares three different neural network models for indoor location based on WiFi fingerprint - an indoor location algorithm based on an improved back propagation neural network model, an RSSI indoor location algorithm based on neural network angle change, and an RSSI indoor location algorithm based on depth neural network angle change - to accurately predict indoor location coordinates. Changing the action range of the activation function in the standard back-propagation neural network model achieves the goal of accurately predicting location coordinates. The revised back-propagation neural network model has strong stability and enhances indoor positioning accuracy based on experimental comparisons of loss rate (loss), accuracy rate (acc), and cumulative distribution function (CDF).
An Improved Real-Time Object Tracking Algorithm Based on Deep Learning Features
Xianyu WANG Cong LI Heyi LI Rui ZHANG Zhifeng LIANG Hai WANG

PAPER-Object Recognition and Tracking

Publicized: 2022/01/07
Vol: E106-D No:5
Page(s): 786-793
Visual object tracking is always a challenging task in computer vision. During the tracking, the shape and appearance of the target may change greatly, and because of the lack of sufficient training samples, most of the online learning tracking algorithms will have performance bottlenecks. In this paper, an improved real-time algorithm based on deep learning features is proposed, which combines multi-feature fusion, multi-scale estimation, adaptive updating of target model and re-detection after target loss. The effectiveness and advantages of the proposed algorithm are proved by a large number of comparative experiments with other excellent algorithms on large benchmark datasets.
Learning Pixel Perception for Identity and Illumination Consistency Face Frontalization in the Wild
Yongtang BAO Pengfei ZHOU Yue QI Zhihui WANG Qing FAN

PAPER-Person Image Generation

Publicized: 2022/06/21
Vol: E106-D No:5
Page(s): 794-803
A frontal and realistic face image was synthesized from a single profile face image. It has a wide range of applications in face recognition. Although the frontal face method based on deep learning has made substantial progress in recent years, there is still no guarantee that the generated face has identity consistency and illumination consistency in a significant posture. This paper proposes a novel pixel-based feature regression generative adversarial network (PFR-GAN), which can learn to recover local high-frequency details and preserve identity and illumination frontal face images in an uncontrolled environment. We first propose a Reslu block to obtain richer feature representation and improve the convergence speed of training. We then introduce a feature conversion module to reduce the artifacts caused by face rotation discrepancy, enhance image generation quality, and preserve more high-frequency details of the profile image. We also construct a 30,000 face pose dataset to learn about various uncontrolled field environments. Our dataset includes ages of different races and wild backgrounds, allowing us to handle other datasets and obtain better results. Finally, we introduce a discriminator used for recovering the facial structure of the frontal face images. Quantitative and qualitative experimental results show our PFR-GAN can generate high-quality and high-fidelity frontal face images, and our results are better than the state-of-art results.
Multi-Scale Correspondence Learning for Person Image Generation
Shi-Long SHEN Ai-Guo WU Yong XU

PAPER-Person Image Generation

Publicized: 2022/04/15
Vol: E106-D No:5
Page(s): 804-812
A generative model is presented for two types of person image generation in this paper. First, this model is applied to pose-guided person image generation, i.e., converting the pose of a source person image to the target pose while preserving the texture of that source person image. Second, this model is also used for clothing-guided person image generation, i.e., changing the clothing texture of a source person image to the desired clothing texture. The core idea of the proposed model is to establish the multi-scale correspondence, which can effectively address the misalignment introduced by transferring pose, thereby preserving richer information on appearance. Specifically, the proposed model consists of two stages: 1) It first generates the target semantic map imposed on the target pose to provide more accurate guidance during the generation process. 2) After obtaining the multi-scale feature map by the encoder, the multi-scale correspondence is established, which is useful for a fine-grained generation. Experimental results show the proposed method is superior to state-of-the-art methods in pose-guided person image generation and show its effectiveness in clothing-guided person image generation.
