IEICE TRANSACTIONS on Information

A Web Page Segmentation Approach Using Visual Semantics

Jun ZENG, Brendan FLANAGAN, Sachio HIROKAWA, Eisuke ITO

Summary :

Web page segmentation offers a variety of benefits and has many potential web applications. Early web page segmentation techniques are mainly based on machine learning algorithms and rule-based heuristics, which cannot be used for large-scale page segmentation. In this paper, we propose a formulated page segmentation method using visual semantics. Instead of analyzing the visual cues of web pages, this method uses three measures to formulate the visual semantics: a layout tree is used to recognize visually similar blocks; seam degree describes how neatly the blocks are arranged; and content similarity describes the degree of content coherence between blocks. A comparison experiment was conducted using the VIPS algorithm as a baseline. Experimental results show that the proposed method can divide a web page into appropriate semantic segments.
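
The summary does not say how content similarity is computed. As an illustration only, the sketch below assumes one common choice: cosine similarity between the term-frequency vectors of two blocks' text. The function names `term_counts` and `cosine_similarity` are hypothetical, and the paper's actual measure may differ.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two term-frequency vectors (assumed measure,
    not necessarily the one defined in the paper)."""
    a, b = term_counts(text_a), term_counts(text_b)
    # Only terms present in both blocks contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty block is treated as unrelated to any other
    return dot / (norm_a * norm_b)
```

Under this assumption, two adjacent blocks that share most of their vocabulary score close to 1 and are candidates for merging into one semantic segment, while blocks with no terms in common score 0.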

Publication
IEICE TRANSACTIONS on Information Vol.E97-D No.2 pp.223-230
Publication Date
2014/02/01
Online ISSN
1745-1361
DOI
10.1587/transinf.E97.D.223
Type of Manuscript
PAPER
Category
Data Engineering, Web Information Systems

Authors

Jun ZENG
  Kyushu University
Brendan FLANAGAN
  Kyushu University
Sachio HIROKAWA
  Kyushu University
Eisuke ITO
  Kyushu University

Keyword