IEICE TRANSACTIONS on Information

Efficient Algorithm for Math Formula Semantic Search

Shunsuke OHASHI, Giovanni Yoko KRISTIANTO, Goran TOPIC, Akiko AIZAWA

Summary :

Mathematical formulae play an important role in many scientific domains. Despite the importance of mathematical formula search, conventional keyword-based retrieval methods are not sufficient for searching mathematical formulae, which are structured as trees. The growing number and structural complexity of mathematical formulae in scientific articles create a need for large-scale, structure-aware formula search techniques. In this paper, we formulate three types of measures that represent distinctive features of the semantic similarity of math formulae, and develop efficient hash-based algorithms for their approximate calculation. Our experiments using the NTCIR-11 Math-2 Task dataset, a large-scale test collection for math information retrieval with about 60 million formulae, show that the proposed method improves search precision while keeping scalability and runtime efficiency high.
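
The summary does not spell out the three measures or the hashing scheme, so the following is only a minimal sketch of the general idea of hash-based, structure-aware comparison of formula trees, not the authors' algorithm. It assumes a formula is given as a nested (label, children) tuple, hashes every subtree bottom-up, and compares two formulae by the Jaccard overlap of their subtree-hash sets. The tree encoding, the function names, and the choice of Jaccard similarity are all illustrative assumptions.

```python
import hashlib


def _hash_subtrees(node, out):
    """Hash the subtree rooted at `node` and record it in `out`.

    `node` is a (label, children) tuple; this encoding is an assumption
    made for illustration, not a format defined by the paper.
    """
    label, children = node
    # Hash children first so that a parent's hash depends on its whole subtree.
    child_digests = [_hash_subtrees(child, out) for child in children]
    payload = label + "(" + ",".join(child_digests) + ")"
    digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    out.append(digest)
    return digest


def subtree_hash_set(tree):
    """Return the set of hashes of all subtrees of a formula tree."""
    out = []
    _hash_subtrees(tree, out)
    return set(out)


def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical example formulae: x^2 + 1 and x^2 + y.
f1 = ("+", [("^", [("x", []), ("2", [])]), ("1", [])])
f2 = ("+", [("^", [("x", []), ("2", [])]), ("y", [])])

similarity = jaccard(subtree_hash_set(f1), subtree_hash_set(f2))
```

Here the two formulae share the subtrees x, 2, and x^2 but differ in one leaf and in the root, so the overlap is partial. Because each formula reduces to a set of fixed-size hashes, the sets can be stored in an inverted index and matched without traversing trees at query time, which is the kind of property that makes hash-based approaches attractive at large scale.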

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.4 pp.979-988
Publication Date
2016/04/01
Publicized
2016/01/14
Online ISSN
1745-1361
DOI
10.1587/transinf.2015DAP0023
Type of Manuscript
Special Section PAPER (Special Section on Data Engineering and Information Management)
Category

Authors

Shunsuke OHASHI
  The University of Tokyo
Giovanni Yoko KRISTIANTO
  The University of Tokyo
Goran TOPIC
  National Institute of Informatics
Akiko AIZAWA
  The University of Tokyo, National Institute of Informatics

Keyword