The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] Paragraph Vector(2hit)

1-2hit
  • Representation Learning for Users' Web Browsing Sequences

    Yukihiro TAGAMI  Hayato KOBAYASHI  Shingo ONO  Akira TAJIMA  

     
    PAPER-Artificial Intelligence, Data Mining

      Pubricized:
    2018/04/20
      Vol:
    E101-D No:7
      Page(s):
    1870-1879

    Modeling user activities on the Web is a key problem for various Web services, such as news article recommendation and ad click prediction. In our work-in-progress paper[1], we introduced an approach that summarizes each sequence of user Web page visits using Paragraph Vector[3], considering users and URLs as paragraphs and words, respectively. The learned user representations are used among the user-related prediction tasks in common. In this paper, on the basis of analysis of our Web page visit data, we propose Backward PV-DM, which is a modified version of Paragraph Vector. We show experimental results on two ad-related data sets based on logs from Web services of Yahoo! JAPAN. Our proposed method achieved better results than those of existing vector models.

  • Semantically Readable Distributed Representation Learning and Its Expandability Using a Word Semantic Vector Dictionary

    Ikuo KESHI  Yu SUZUKI  Koichiro YOSHINO  Satoshi NAKAMURA  

     
    PAPER

      Pubricized:
    2018/01/18
      Vol:
    E101-D No:4
      Page(s):
    1066-1078

    The problem with distributed representations generated by neural networks is that the meaning of the features is difficult to understand. We propose a new method that gives a specific meaning to each node of a hidden layer by introducing a manually created word semantic vector dictionary into the initial weights and by using paragraph vector models. We conducted experiments to test the hypotheses using a single domain benchmark for Japanese Twitter sentiment analysis and then evaluated the expandability of the method using a diverse and large-scale benchmark. Moreover, we tested the domain-independence of the method using a Wikipedia corpus. Our experimental results demonstrated that the learned vector is better than the performance of the existing paragraph vector in the evaluation of the Twitter sentiment analysis task using the single domain benchmark. Also, we determined the readability of document embeddings, which means distributed representations of documents, in a user test. The definition of readability in this paper is that people can understand the meaning of large weighted features of distributed representations. A total of 52.4% of the top five weighted hidden nodes were related to tweets where one of the paragraph vector models learned the document embeddings. For the expandability evaluation of the method, we improved the dictionary based on the results of the hypothesis test and examined the relationship of the readability of learned word vectors and the task accuracy of Twitter sentiment analysis using the diverse and large-scale benchmark. We also conducted a word similarity task using the Wikipedia corpus to test the domain-independence of the method. We found the expandability results of the method are better than or comparable to the performance of the paragraph vector. Also, the objective and subjective evaluation support each hidden node maintaining a specific meaning. Thus, the proposed method succeeded in improving readability.