
IEICE TRANSACTIONS on Information

Efficient Two-Step Middle-Level Part Feature Extraction for Fine-Grained Visual Categorization

Hideki NAKAYAMA, Tomoya TSUDA

Summary:

Fine-grained visual categorization (FGVC) has drawn increasing attention as an emerging research field in recent years. In contrast to generic-domain visual recognition, FGVC is characterized by high intra-class variation and subtle inter-class variation. To distinguish conceptually and visually similar categories, highly discriminative visual features must be extracted. Moreover, FGVC has a highly specialized and task-specific nature, and it is not always easy to obtain a sufficiently large training dataset. Therefore, the key to success in practical FGVC systems is to efficiently exploit discriminative features from a limited number of training examples. In this paper, we propose an efficient two-step dimensionality compression method to derive compact middle-level part-based features. To do this, we compare space-first and feature-first convolution schemes and investigate their effectiveness. Our approach is based on simple linear algebra and analytic solutions, and is highly scalable compared with the current one-vs-one or one-vs-all approaches, making it possible to quickly train middle-level features from a number of pairwise part regions. We experimentally show the effectiveness of our method using the standard Caltech-Birds and Stanford-Cars datasets.

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.6 pp.1626-1634
Publication Date
2016/06/01
Publicized
2016/02/23
Online ISSN
1745-1361
DOI
10.1587/transinf.2015EDP7358
Type of Manuscript
PAPER
Category
Image Recognition, Computer Vision

Authors

Hideki NAKAYAMA
  University of Tokyo
Tomoya TSUDA
  University of Tokyo

Keyword