There are several methods for mining communities on the Web using hyperlinks. One of the well-known ones is a max-flow based method proposed by Flake et al. The method adopts a page-oriented framework, that is, it uses a page on the Web as a unit of information, like other methods including HITS and trawling. Recently, Asano et al. built a site-oriented framework which uses a site as a unit of information, and they experimentally showed that trawling on the site-oriented framework often outputs significantly better communities than trawling on the page-oriented framework. However, it has not been known whether the site-oriented framework is effective in mining communities through the max-flow based method. In this paper, we first point out several problems of the max-flow based method, mainly owing to the page-oriented framework, and then propose solutions to the problems by utilizing several advantages of the site-oriented framework. Computational experiments reveal that our max-flow based method on the site-oriented framework is very effective in mining communities, related to the topics of given pages, in comparison with the original max-flow based method on the page-oriented framework.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yasuhito ASANO, Takao NISHIZEKI, Masashi TOYODA, Masaru KITSUREGAWA, "Mining Communities on the Web Using a Max-Flow and a Site-Oriented Framework," in IEICE TRANSACTIONS on Information, vol. E89-D, no. 10, pp. 2606-2615, October 2006, doi: 10.1093/ietisy/e89-d.10.2606.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.10.2606/_p
@ARTICLE{e89-d_10_2606,
author={Yasuhito ASANO and Takao NISHIZEKI and Masashi TOYODA and Masaru KITSUREGAWA},
journal={IEICE TRANSACTIONS on Information},
title={Mining Communities on the Web Using a Max-Flow and a Site-Oriented Framework},
year={2006},
volume={E89-D},
number={10},
pages={2606--2615},
abstract={There are several methods for mining communities on the Web using hyperlinks. One of the well-known ones is a max-flow based method proposed by Flake et al. The method adopts a page-oriented framework, that is, it uses a page on the Web as a unit of information, like other methods including HITS and trawling. Recently, Asano et al. built a site-oriented framework which uses a site as a unit of information, and they experimentally showed that trawling on the site-oriented framework often outputs significantly better communities than trawling on the page-oriented framework. However, it has not been known whether the site-oriented framework is effective in mining communities through the max-flow based method. In this paper, we first point out several problems of the max-flow based method, mainly owing to the page-oriented framework, and then propose solutions to the problems by utilizing several advantages of the site-oriented framework. Computational experiments reveal that our max-flow based method on the site-oriented framework is very effective in mining communities, related to the topics of given pages, in comparison with the original max-flow based method on the page-oriented framework.},
keywords={},
doi={10.1093/ietisy/e89-d.10.2606},
ISSN={1745-1361},
month={October},
}
TY  - JOUR
TI  - Mining Communities on the Web Using a Max-Flow and a Site-Oriented Framework
T2  - IEICE TRANSACTIONS on Information
SP  - 2606
EP  - 2615
AU  - Yasuhito ASANO
AU  - Takao NISHIZEKI
AU  - Masashi TOYODA
AU  - Masaru KITSUREGAWA
PY  - 2006
DO  - 10.1093/ietisy/e89-d.10.2606
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E89-D
IS  - 10
JA  - IEICE TRANSACTIONS on Information
Y1  - October 2006
AB  - There are several methods for mining communities on the Web using hyperlinks. One of the well-known ones is a max-flow based method proposed by Flake et al. The method adopts a page-oriented framework, that is, it uses a page on the Web as a unit of information, like other methods including HITS and trawling. Recently, Asano et al. built a site-oriented framework which uses a site as a unit of information, and they experimentally showed that trawling on the site-oriented framework often outputs significantly better communities than trawling on the page-oriented framework. However, it has not been known whether the site-oriented framework is effective in mining communities through the max-flow based method. In this paper, we first point out several problems of the max-flow based method, mainly owing to the page-oriented framework, and then propose solutions to the problems by utilizing several advantages of the site-oriented framework. Computational experiments reveal that our max-flow based method on the site-oriented framework is very effective in mining communities, related to the topics of given pages, in comparison with the original max-flow based method on the page-oriented framework.
ER  - 