Selecting visually overlapping image pairs without any prior information is an essential task of large-scale structure from motion (SfM) pipelines. To address this problem, many state-of-the-art image retrieval systems adopt the idea of bag of visual words (BoVW) for computing image-pair similarity. In this paper, we present a method for improving the image pair selection using BoVW. Our method combines a conventional vector-based approach and a set-based approach. For the set similarity, we introduce a modified version of the Simpson (m-Simpson) coefficient. We show the advantage of this measure over three typical set similarity measures and demonstrate that the combination of vector similarity and the m-Simpson coefficient effectively reduces false positives and increases accuracy. To discuss the choice of vocabulary construction, we prepared both a sampled vocabulary on an evaluation dataset and a basic pre-trained vocabulary on a training dataset. In addition, we tested our method on vocabularies of different sizes. Our experimental results show that the proposed method dramatically improves precision scores especially on the sampled vocabulary and performs better than the state-of-the-art methods that use pre-trained vocabularies. We further introduce a method to determine the k value of top-k relevant searches for each image and show that it obtains higher precision at the same recall.
Takaharu KATO
Tokyo University of Agriculture and Technology
Ikuko SHIMIZU
Tokyo University of Agriculture and Technology
Tomas PAJDLA
Czech Technical University in Prague
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Takaharu KATO, Ikuko SHIMIZU, Tomas PAJDLA, "Improving Image Pair Selection for Large Scale Structure from Motion by Introducing Modified Simpson Coefficient" in IEICE TRANSACTIONS on Information,
vol. E105-D, no. 9, pp. 1590-1599, September 2022, doi: 10.1587/transinf.2021EDP7244.
Abstract: Selecting visually overlapping image pairs without any prior information is an essential task of large-scale structure from motion (SfM) pipelines. To address this problem, many state-of-the-art image retrieval systems adopt the idea of bag of visual words (BoVW) for computing image-pair similarity. In this paper, we present a method for improving the image pair selection using BoVW. Our method combines a conventional vector-based approach and a set-based approach. For the set similarity, we introduce a modified version of the Simpson (m-Simpson) coefficient. We show the advantage of this measure over three typical set similarity measures and demonstrate that the combination of vector similarity and the m-Simpson coefficient effectively reduces false positives and increases accuracy. To discuss the choice of vocabulary construction, we prepared both a sampled vocabulary on an evaluation dataset and a basic pre-trained vocabulary on a training dataset. In addition, we tested our method on vocabularies of different sizes. Our experimental results show that the proposed method dramatically improves precision scores especially on the sampled vocabulary and performs better than the state-of-the-art methods that use pre-trained vocabularies. We further introduce a method to determine the k value of top-k relevant searches for each image and show that it obtains higher precision at the same recall.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDP7244/_p
Copy
@ARTICLE{e105-d_9_1590,
author={Takaharu KATO, Ikuko SHIMIZU, Tomas PAJDLA, },
journal={IEICE TRANSACTIONS on Information},
title={Improving Image Pair Selection for Large Scale Structure from Motion by Introducing Modified Simpson Coefficient},
year={2022},
volume={E105-D},
number={9},
pages={1590-1599},
abstract={Selecting visually overlapping image pairs without any prior information is an essential task of large-scale structure from motion (SfM) pipelines. To address this problem, many state-of-the-art image retrieval systems adopt the idea of bag of visual words (BoVW) for computing image-pair similarity. In this paper, we present a method for improving the image pair selection using BoVW. Our method combines a conventional vector-based approach and a set-based approach. For the set similarity, we introduce a modified version of the Simpson (m-Simpson) coefficient. We show the advantage of this measure over three typical set similarity measures and demonstrate that the combination of vector similarity and the m-Simpson coefficient effectively reduces false positives and increases accuracy. To discuss the choice of vocabulary construction, we prepared both a sampled vocabulary on an evaluation dataset and a basic pre-trained vocabulary on a training dataset. In addition, we tested our method on vocabularies of different sizes. Our experimental results show that the proposed method dramatically improves precision scores especially on the sampled vocabulary and performs better than the state-of-the-art methods that use pre-trained vocabularies. We further introduce a method to determine the k value of top-k relevant searches for each image and show that it obtains higher precision at the same recall.},
keywords={},
doi={10.1587/transinf.2021EDP7244},
ISSN={1745-1361},
month={September},}
Copy
TY - JOUR
TI - Improving Image Pair Selection for Large Scale Structure from Motion by Introducing Modified Simpson Coefficient
T2 - IEICE TRANSACTIONS on Information
SP - 1590
EP - 1599
AU - Takaharu KATO
AU - Ikuko SHIMIZU
AU - Tomas PAJDLA
PY - 2022
DO - 10.1587/transinf.2021EDP7244
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E105-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2022
AB - Selecting visually overlapping image pairs without any prior information is an essential task of large-scale structure from motion (SfM) pipelines. To address this problem, many state-of-the-art image retrieval systems adopt the idea of bag of visual words (BoVW) for computing image-pair similarity. In this paper, we present a method for improving the image pair selection using BoVW. Our method combines a conventional vector-based approach and a set-based approach. For the set similarity, we introduce a modified version of the Simpson (m-Simpson) coefficient. We show the advantage of this measure over three typical set similarity measures and demonstrate that the combination of vector similarity and the m-Simpson coefficient effectively reduces false positives and increases accuracy. To discuss the choice of vocabulary construction, we prepared both a sampled vocabulary on an evaluation dataset and a basic pre-trained vocabulary on a training dataset. In addition, we tested our method on vocabularies of different sizes. Our experimental results show that the proposed method dramatically improves precision scores especially on the sampled vocabulary and performs better than the state-of-the-art methods that use pre-trained vocabularies. We further introduce a method to determine the k value of top-k relevant searches for each image and show that it obtains higher precision at the same recall.
ER -