1-3hit |
As the data size of Web-related multi-label classification problems continues to increase, the label space has also grown extremely large. For example, the number of labels appearing in Web page tagging and E-commerce recommendation tasks reaches hundreds of thousands or even millions. In this paper, we propose a graph partitioning tree (GPT), which is a novel approach for extreme multi-label learning. At an internal node of the tree, the GPT learns a linear separator to partition a feature space, considering approximate k-nearest neighbor graph of the label vectors. We also developed a simple sequential optimization procedure for learning the linear binary classifiers. Extensive experiments on large-scale real-world data sets showed that our method achieves better prediction accuracy than state-of-the-art tree-based methods, while maintaining fast prediction.
Yukihiro TAGAMI Hayato KOBAYASHI Shingo ONO Akira TAJIMA
Modeling user activities on the Web is a key problem for various Web services, such as news article recommendation and ad click prediction. In our work-in-progress paper[1], we introduced an approach that summarizes each sequence of user Web page visits using Paragraph Vector[3], considering users and URLs as paragraphs and words, respectively. The learned user representations are used among the user-related prediction tasks in common. In this paper, on the basis of analysis of our Web page visit data, we propose Backward PV-DM, which is a modified version of Paragraph Vector. We show experimental results on two ad-related data sets based on logs from Web services of Yahoo! JAPAN. Our proposed method achieved better results than those of existing vector models.
Extreme multi-label classification methods have been widely used in Web-scale classification tasks such as Web page tagging and product recommendation. In this paper, we present a novel graph embedding method called “AnnexML”. At the training step, AnnexML constructs a k-nearest neighbor graph of label vectors and attempts to reproduce the graph structure in the embedding space. The prediction is efficiently performed by using an approximate nearest neighbor search method that efficiently explores the learned k-nearest neighbor graph in the embedding space. We conducted evaluations on several large-scale real-world data sets and compared our method with recent state-of-the-art methods. Experimental results show that our AnnexML can significantly improve prediction accuracy, especially on data sets that have a larger label space. In addition, AnnexML improves the trade-off between prediction time and accuracy. At the same level of accuracy, the prediction time of AnnexML was up to 58 times faster than that of SLEEC, a state-of-the-art embedding-based method.