We revisit the problem of generic object recognition from the point of view of human-computer interaction. While many existing algorithms for generic object recognition first try to detect target objects before features are extracted and classified, our work is motivated by the belief that automatic detection is not always necessary in practical situations, such as those involving mobile recognition systems with touch displays and cameras. It is natural for these systems to ask users to input segmentation data for targets through their touch displays. From the perspective of usability, such systems should accept rough segmentation to reduce the user workload. In this situation, different people would provide different segmentation data. Here, an interesting question arises: if multiple training samples are generated from a single image by using various segmentation data created by different people, what happens to classification accuracy? To answer this question, we created “20 wild bird datasets” containing a large number of rough segmentation datasets made by 383 people. Our experiments revealed two interesting facts: (i) generating multiple training samples from a single image had a positive effect on classification accuracy, especially when image features including spatial information were used, and (ii) augmenting training samples with artificial segmentation data synthesized with a morphing technique also had a slightly positive effect on classification accuracy.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Mitsuru AMBAI, Yuichi YOSHIDA, "Augmenting Training Samples with a Large Number of Rough Segmentation Datasets" in IEICE TRANSACTIONS on Information, vol. E94-D, no. 10, pp. 1880-1888, October 2011, doi: 10.1587/transinf.E94.D.1880.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E94.D.1880/_p
@ARTICLE{e94-d_10_1880,
author={Mitsuru AMBAI and Yuichi YOSHIDA},
journal={IEICE TRANSACTIONS on Information},
title={Augmenting Training Samples with a Large Number of Rough Segmentation Datasets},
year={2011},
volume={E94-D},
number={10},
pages={1880--1888},
abstract={We revisit the problem with generic object recognition from the point of view of human-computer interaction. While many existing algorithms for generic object recognition first try to detect target objects before features are extracted and classified in processing, our work is motivated by the belief that solving the task of detection by computer is not always necessary in many practical situations, such as those involving mobile recognition systems with touch displays and cameras. It is natural for these systems to ask users to input the segmentation data for targets through their touch displays. Speaking from the perspective of usability, such systems should involve rough segmentation to reduce the user workload. In this situation, different people would provide different segmentation data. Here, an interesting question arises – if multiple training samples are generated from a single image by using various segmentation data created by different people, what would happen to the accuracy of classification? We created “20 wild bird datasets” that had a large number of rough segmentation datasets made by 383 people in an attempt to answer this question. Our experiments revealed two interesting facts: (i) generating multiple training samples from a single image had positive effects on classification accuracies, especially when image features including spatial information were used and (ii) augmenting training samples with artificial segmentation data synthesized with a morphing technique also had slightly positive effects on classification accuracies.},
keywords={},
doi={10.1587/transinf.E94.D.1880},
ISSN={1745-1361},
month={October},}
TY - JOUR
TI - Augmenting Training Samples with a Large Number of Rough Segmentation Datasets
T2 - IEICE TRANSACTIONS on Information
SP - 1880
EP - 1888
AU - Mitsuru AMBAI
AU - Yuichi YOSHIDA
PY - 2011
DO - 10.1587/transinf.E94.D.1880
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E94-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2011
AB - We revisit the problem with generic object recognition from the point of view of human-computer interaction. While many existing algorithms for generic object recognition first try to detect target objects before features are extracted and classified in processing, our work is motivated by the belief that solving the task of detection by computer is not always necessary in many practical situations, such as those involving mobile recognition systems with touch displays and cameras. It is natural for these systems to ask users to input the segmentation data for targets through their touch displays. Speaking from the perspective of usability, such systems should involve rough segmentation to reduce the user workload. In this situation, different people would provide different segmentation data. Here, an interesting question arises – if multiple training samples are generated from a single image by using various segmentation data created by different people, what would happen to the accuracy of classification? We created “20 wild bird datasets” that had a large number of rough segmentation datasets made by 383 people in an attempt to answer this question. Our experiments revealed two interesting facts: (i) generating multiple training samples from a single image had positive effects on classification accuracies, especially when image features including spatial information were used and (ii) augmenting training samples with artificial segmentation data synthesized with a morphing technique also had slightly positive effects on classification accuracies.
ER -