Adam JATOWT Yukiko KAWAI Katsumi TANAKA
Owing to increased preservation efforts, large amounts of past Web data have been stored in Web archives and other archival repositories. Utilizing this data can benefit users; for example, it can facilitate page understanding. In this paper, we propose a system for interactive exploration of page histories. We demonstrate an application called Page History Explorer (PHE) for summarizing and visualizing the histories of Web pages. PHE presents an overview of a page's evolution, characterizes its typical content over time, and lets users observe page histories from different viewpoints. In addition, it enables flexible comparison of the histories of different pages.
Hideki KAWAI Adam JATOWT Katsumi TANAKA Kazuo KUNIEDA Keiji YAMADA
This paper introduces ChronoSeeker, a search engine for future and past events that can help users develop long-term strategies for their organizations. To provide on-demand searches, we tackled two technical issues: (1) organizing efficient event searches and (2) filtering out noise from search results. For efficient event searches, our system employs query expansion with typical expressions related to event information, such as year expressions, temporal modifiers, and context terms. To filter noise, we use a machine-learning technique that classifies candidates into event information or non-event information, using heuristic features and lexical patterns derived from a text-mining approach. Our experiment revealed that filtering achieved an 85% F-measure, and that query expansion could collect dozens more events than searches without expansion.
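The query expansion step described above can be illustrated with a minimal sketch. The abstract does not list the actual expressions used by ChronoSeeker, so the function name `expand_query`, the query template, and the sample years, modifiers, and context terms below are all hypothetical and serve only to show the general idea of combining a topic with temporal expressions.

```python
from itertools import product


def expand_query(topic, years, modifiers, context_terms):
    """Build expanded queries from a topic and temporal expressions.

    Hypothetical illustration only: the template and the term lists
    are assumptions, not the expressions used in the paper.
    """
    queries = []
    # Combine every year expression, temporal modifier, and context term.
    for year, modifier, context in product(years, modifiers, context_terms):
        queries.append(f'{topic} "{modifier} {year}" {context}')
    return queries


# Assumed sample inputs for illustration.
queries = expand_query(
    topic="electric vehicles",
    years=[2030, 2035],
    modifiers=["by", "in"],
    context_terms=["will"],
)
for q in queries:
    print(q)
```

Each expanded query would then be sent to a search engine, and the returned snippets would become the candidates that the noise filter classifies.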