We propose a system that enables us to gather hundreds of images related to one set of keywords provided by a user from the World Wide Web. The system is called Image Collector II. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements on our previous system in terms of the number of gathered images and their precision: (1) We extract words that appear with high frequency in all the HTML files in which the output images of an initial image gathering are embedded and, using them as keywords, carry out a second image gathering. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower the precision of the gathered images becomes. To improve the precision, we introduce word vectors of the HTML files embedding the images into the image selection process, in addition to image feature vectors.
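The keyword-expansion step in improvement (1) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `frequent_words`, the small stop-word list, the minimum word length, and the `top_n` cutoff are all assumptions made for illustration. It only shows the general idea of counting words across the HTML files that embed the initially gathered images and picking the most frequent ones as candidate keywords for a second gathering.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}


def frequent_words(html_pages, original_keywords, top_n=5):
    """Return the top_n most frequent words across all pages.

    Stop words, very short words, and the original query keywords are
    excluded, so the result contains only new candidate keywords.
    """
    exclude = STOP_WORDS | {k.lower() for k in original_keywords}
    counts = Counter()
    for html in html_pages:
        parser = _TextExtractor()
        parser.feed(html)
        parser.close()
        words = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())
        counts.update(w for w in words if w not in exclude and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]
```

In a pipeline of this kind, the returned words would be combined with the original keywords to form the queries for the second image gathering; how the paper weights or filters these words is not stated in the abstract.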
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Keiji YANAI, "Image Collector II: A System to Gather a Large Number of Images from the Web," in IEICE TRANSACTIONS on Information, vol. E88-D, no. 10, pp. 2432-2436, October 2005, doi: 10.1093/ietisy/e88-d.10.2432.
Abstract: We propose a system that enables us to gather hundreds of images related to one set of keywords provided by a user from the World Wide Web. The system is called Image Collector II. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements on our previous system in terms of the number of gathered images and their precision: (1) We extract words that appear with high frequency in all the HTML files in which the output images of an initial image gathering are embedded and, using them as keywords, carry out a second image gathering. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower the precision of the gathered images becomes. To improve the precision, we introduce word vectors of the HTML files embedding the images into the image selection process, in addition to image feature vectors.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.10.2432/_p
@ARTICLE{e88-d_10_2432,
author={Keiji YANAI},
journal={IEICE TRANSACTIONS on Information},
title={Image Collector II: A System to Gather a Large Number of Images from the Web},
year={2005},
volume={E88-D},
number={10},
pages={2432-2436},
abstract={We propose a system that enables us to gather hundreds of images related to one set of keywords provided by a user from the World Wide Web. The system is called Image Collector II. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements on our previous system in terms of the number of gathered images and their precision: (1) We extract words that appear with high frequency in all the HTML files in which the output images of an initial image gathering are embedded and, using them as keywords, carry out a second image gathering. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower the precision of the gathered images becomes. To improve the precision, we introduce word vectors of the HTML files embedding the images into the image selection process, in addition to image feature vectors.},
keywords={},
doi={10.1093/ietisy/e88-d.10.2432},
ISSN={},
month={October},}
TY - JOUR
TI - Image Collector II: A System to Gather a Large Number of Images from the Web
T2 - IEICE TRANSACTIONS on Information
SP - 2432
EP - 2436
AU - Keiji YANAI
PY - 2005
DO - 10.1093/ietisy/e88-d.10.2432
JO - IEICE TRANSACTIONS on Information
SN -
VL - E88-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2005
AB - We propose a system that enables us to gather hundreds of images related to one set of keywords provided by a user from the World Wide Web. The system is called Image Collector II. The Image Collector, which we proposed previously, can gather only one or two hundred images. We propose the following two improvements on our previous system in terms of the number of gathered images and their precision: (1) We extract words that appear with high frequency in all the HTML files in which the output images of an initial image gathering are embedded and, using them as keywords, carry out a second image gathering. Through this process, we can obtain hundreds of images for one set of keywords. (2) The more images we gather, the lower the precision of the gathered images becomes. To improve the precision, we introduce word vectors of the HTML files embedding the images into the image selection process, in addition to image feature vectors.
ER -