Image Captioning Algorithm Based on Multi-Branch CNN and Bi-LSTM

Shan HE, Yuanyao LU, Shengnan CHEN

Summary:

Advances in deep learning and neural networks have opened broad prospects for computer vision and natural language processing. Image captioning combines cutting-edge methods from both fields, and an end-to-end encoder-decoder model can greatly improve captioning performance. In this paper, a multi-branch deep convolutional neural network serves as the encoder to extract image features, and a recurrent neural network serves as the decoder to generate descriptive text that matches the input image. We conducted experiments on the Flickr8k, Flickr30k, and MSCOCO datasets. Analysis of the experimental results on the evaluation metrics shows that the proposed model generates image captions effectively and outperforms classic image captioning models such as neural image annotation models.
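
The summary does not name the evaluation metrics it uses. Captioning results on these datasets are commonly scored with n-gram overlap measures such as BLEU, so the sketch below illustrates one building block of that family: clipped n-gram precision against a single reference caption. It is an assumption for illustration, not the paper's evaluation code; the function name `clipped_ngram_precision` and the example captions are hypothetical, and the full BLEU score additionally applies a brevity penalty and combines several n-gram orders.

```python
from collections import Counter


def clipped_ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram precision of a candidate caption against one reference.

    Illustrative only: whitespace tokenization, lowercase, single reference.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()

    # Count every n-gram in the candidate and in the reference.
    cand_ngrams = Counter(
        tuple(cand_tokens[i:i + n]) for i in range(len(cand_tokens) - n + 1)
    )
    ref_ngrams = Counter(
        tuple(ref_tokens[i:i + n]) for i in range(len(ref_tokens) - n + 1)
    )

    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0

    # Clip each candidate count by the reference count so that repeating
    # a matching word cannot inflate the score.
    matched = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return matched / total


if __name__ == "__main__":
    generated = "a dog runs on the grass"
    ground_truth = "a dog is running on the grass"
    print(clipped_ngram_precision(generated, ground_truth, n=1))
    print(clipped_ngram_precision(generated, ground_truth, n=2))
```

Clipping is the design choice that matters here: a degenerate caption such as "the the the" matches the reference word "the" only once, so its unigram precision is 1/3 rather than 1.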

Publication
IEICE TRANSACTIONS on Information Vol.E104-D No.7 pp.941-947
Publication Date
2021/07/01
Publicized
2021/04/19
Online ISSN
1745-1361
DOI
10.1587/transinf.2020EDP7227
Type of Manuscript
PAPER
Category
Artificial Intelligence, Data Mining

Authors

Shan HE
  North China University of Technology
Yuanyao LU
  North China University of Technology
Shengnan CHEN
  North China University of Technology

Keyword