Food Image Recognition Using Covariance of Convolutional Layer Feature Maps

Atsushi TATSUMA; Masaki AONO

doi:10.1587/transinf.2015EDL8212

IEICE TRANSACTIONS on Information

Summary:

Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNNs) trained on various kinds of images. However, this CNN representation is not well suited to fine-grained image recognition tasks such as food image recognition. To improve the performance of the CNN representation in food image recognition, we propose a novel image representation composed of the covariances of convolutional layer feature maps. In an experiment on the ETHZ Food-101 dataset, our method achieved 58.65% average accuracy, outperforming previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.
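
The summary names the representation but does not spell out how it is computed. As a rough illustration only, the sketch below shows one common way to form a covariance descriptor from a stack of feature maps: each spatial position is treated as one sample with one value per channel, and the channel-by-channel sample covariance is flattened into a vector. The function names, the choice of layer, the unbiased (n - 1) normalization, and the upper-triangle flattening are all assumptions made for this example and are not taken from the paper.

```python
def covariance_descriptor(feature_maps):
    """Channel-by-channel sample covariance of a stack of feature maps.

    feature_maps: list of C maps, each a list of H rows of W floats
    (hypothetical input layout chosen for this sketch).
    Returns a C x C covariance matrix as nested lists.
    """
    # Flatten every channel map into a list of H*W activations.
    channels = [[v for row in fmap for v in row] for fmap in feature_maps]
    n = len(channels[0])
    if n < 2 or any(len(c) != n for c in channels):
        raise ValueError("need equally sized maps with at least 2 positions")

    # Center each channel around its own mean activation.
    means = [sum(c) / n for c in channels]
    centered = [[v - m for v in c] for c, m in zip(channels, means)]

    # Fill the symmetric covariance matrix (unbiased estimate, assumed here).
    d = len(channels)
    cov = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            s = sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            cov[i][j] = cov[j][i] = s
    return cov


def upper_triangle(cov):
    """Flatten the upper triangle (including the diagonal) into a vector."""
    d = len(cov)
    return [cov[i][j] for i in range(d) for j in range(i, d)]


# Toy input: two 2x2 "feature maps", the second being twice the first.
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]
descriptor = upper_triangle(covariance_descriptor(maps))
```

For C channels the flattened vector has C(C + 1)/2 entries, so its length is independent of the spatial size of the maps. A vector of this kind could then be passed to a standard classifier; any further normalization the authors may apply is not described in the summary.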

Publication: IEICE TRANSACTIONS on Information Vol.E99-D No.6 pp.1711-1715

Publication Date: 2016/06/01

Publicized: 2016/02/23

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2015EDL8212

Type of Manuscript: LETTER

Category: Image Recognition, Computer Vision

Authors

Atsushi TATSUMA
Toyohashi University of Technology
Masaki AONO
Toyohashi University of Technology

Keywords

food image recognition, convolutional neural networks, covariance descriptor, pattern recognition, deep learning

Cite this

Atsushi TATSUMA, Masaki AONO, "Food Image Recognition Using Covariance of Convolutional Layer Feature Maps" in IEICE TRANSACTIONS on Information, vol. E99-D, no. 6, pp. 1711-1715, June 2016, doi: 10.1587/transinf.2015EDL8212.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015EDL8212/_p

@ARTICLE{e99-d_6_1711,
author={Atsushi TATSUMA and Masaki AONO},
journal={IEICE TRANSACTIONS on Information},
title={Food Image Recognition Using Covariance of Convolutional Layer Feature Maps},
year={2016},
volume={E99-D},
number={6},
pages={1711--1715},
abstract={Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNN) trained with various kinds of images. However, the CNN representation is not very suitable for fine-grained image recognition tasks involving food image recognition. For improving performance of the CNN representation in food image recognition, we propose a novel image representation that is comprised of the covariances of convolutional layer feature maps. In the experiment on the ETHZ Food-101 dataset, our method achieved 58.65% averaged accuracy, which outperforms the previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.},
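keywords={food image recognition, convolutional neural networks, covariance descriptor, pattern recognition, deep learning},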
doi={10.1587/transinf.2015EDL8212},
ISSN={1745-1361},
month={June},
}

TY  - JOUR
TI  - Food Image Recognition Using Covariance of Convolutional Layer Feature Maps
T2  - IEICE TRANSACTIONS on Information
SP  - 1711
EP  - 1715
AU  - Atsushi TATSUMA
AU  - Masaki AONO
PY  - 2016
DO  - 10.1587/transinf.2015EDL8212
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E99-D
IS  - 6
JA  - IEICE TRANSACTIONS on Information
Y1  - June 2016
AB  - Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNN) trained with various kinds of images. However, the CNN representation is not very suitable for fine-grained image recognition tasks involving food image recognition. For improving performance of the CNN representation in food image recognition, we propose a novel image representation that is comprised of the covariances of convolutional layer feature maps. In the experiment on the ETHZ Food-101 dataset, our method achieved 58.65% averaged accuracy, which outperforms the previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.
ER  -
