Author Search Result

[Author] Fumiyo FUKUMOTO (3 hits)

  • Generating Category Hierarchy for Classifying Large Corpora

    Fumiyo FUKUMOTO  Yoshimi SUZUKI  

     
    PAPER-Natural Language Processing

    Vol: E89-D  No: 4  Page(s): 1543-1554

    We address the problem of dealing with large collections of data, and investigate the automatic construction of domain-specific category hierarchies to improve text classification. We use two well-known techniques, the partitioning clustering method called k-means and a loss function, to create the category hierarchy. The k-means method involves iterating through the data that the system is permitted to classify during each iteration and constructing a hierarchical structure. In general, the number of clusters k is not given beforehand. Therefore, we used a loss function that measures the degree of disappointment in any differences between the true distribution over inputs and the learner's prediction to select the appropriate number of clusters k. Once the optimal value of k is selected, the procedure is repeated for each cluster. Our evaluation using the 1996 Reuters corpus, which consists of 806,791 documents, showed that automatically constructing hierarchies improves classification accuracy.
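
    The recursive procedure in this abstract (cluster with k-means, pick k by a loss, repeat inside each cluster) can be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not give the loss function, so a stand-in (within-cluster squared error plus a per-cluster penalty) is assumed here, and the names kmeans, loss, build_hierarchy, penalty and min_size are hypothetical. Documents are assumed to be already encoded as numeric tuples.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    clusters = [points]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
    return [c for c in clusters if c]

def loss(clusters, penalty):
    """ASSUMED stand-in loss: squared error plus a penalty per cluster."""
    sse = sum(dist2(p, mean(c)) for c in clusters for p in c)
    return sse + penalty * len(clusters)

def build_hierarchy(points, max_k=4, penalty=2.0, min_size=4):
    """Choose k by minimum loss, then repeat the procedure inside each cluster."""
    leaf = {"points": points, "children": []}
    if len(points) < min_size:
        return leaf
    # Candidate partitions for k = 1 (no split) up to max_k.
    candidates = [[points]]
    for k in range(2, min(max_k, len(points)) + 1):
        candidates.append(kmeans(points, k))
    best = min(candidates, key=lambda cl: loss(cl, penalty))
    if len(best) == 1:
        return leaf
    children = [build_hierarchy(c, max_k, penalty, min_size) for c in best]
    return {"points": points, "children": children}

if __name__ == "__main__":
    docs = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    tree = build_hierarchy(docs)
    for child in tree["children"]:
        print(child["points"])
```

    The penalty term is what stops the recursion: a split is accepted only when it reduces the squared error by more than the cost of the extra clusters. A loss based on held-out classification error, closer to what the abstract describes, could be substituted in loss without changing the rest.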

  • Link Analysis Based on Rhetorical Relations for Multi-Document Summarization

    Nik Adilah Hanin BINTI ZAHRI  Fumiyo FUKUMOTO  Suguru MATSUYOSHI  

     
    PAPER-Natural Language Processing

    Vol: E96-D  No: 5  Page(s): 1182-1191

    This paper presents link analysis based on rhetorical relations with the aim of performing extractive summarization for multiple documents. We first extracted sentences with salient terms from individual documents using a statistical model. We then ranked the extracted sentences by measuring their relative importance according to their connectivity among the sentences in the document set using PageRank based on the rhetorical relations. The rhetorical relations were examined beforehand to determine which relations are crucial to this task, and the relations among sentences from documents were automatically identified by SVMs. We used the relations to emphasize important sentences during sentence ranking by PageRank and to eliminate redundancy from the summary candidates. Our framework does not require sentences fully annotated by humans, and the evaluation results show that combining PageRank with rhetorical relations helps to improve the quality of extractive summarization.
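
    The ranking step in this abstract can be illustrated with a small weighted PageRank over a sentence graph. This is a sketch under stated assumptions, not the paper's system: the relation names, the weights in RELATION_WEIGHT, and the rule of skipping a sentence that has an "identity" link to an already selected one are all hypothetical, and the relations are given by hand here rather than identified by SVMs.

```python
# ASSUMED relation types and weights, for illustration only.
RELATION_WEIGHT = {"elaboration": 1.0, "overlap": 0.5, "identity": 1.0}

def build_weights(n, relations):
    """Turn (i, j, relation) triples into a symmetric weight matrix."""
    weights = [[0.0] * n for _ in range(n)]
    for i, j, rel in relations:
        weights[i][j] += RELATION_WEIGHT[rel]
        weights[j][i] += RELATION_WEIGHT[rel]
    return weights

def pagerank(weights, damping=0.85, iters=100):
    """Power iteration; weights[i][j] is the link weight from i to j."""
    n = len(weights)
    rank = [1.0 / n] * n
    out = [sum(row) for row in weights]
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            for j in range(n):
                if out[i] == 0:
                    # Sentence with no links: spread its rank evenly.
                    new[j] += damping * rank[i] / n
                else:
                    new[j] += damping * rank[i] * weights[i][j] / out[i]
        rank = new
    return rank

def summarize(n, relations, size):
    """Pick top-ranked sentences, skipping ones redundant with earlier picks."""
    rank = pagerank(build_weights(n, relations))
    redundant = {(i, j) for i, j, rel in relations if rel == "identity"}
    redundant |= {(j, i) for i, j in redundant}
    chosen = []
    for s in sorted(range(n), key=lambda i: -rank[i]):
        if any((s, c) in redundant for c in chosen):
            continue
        chosen.append(s)
        if len(chosen) == size:
            break
    return chosen

if __name__ == "__main__":
    relations = [(0, 1, "elaboration"), (0, 2, "elaboration"),
                 (0, 3, "elaboration"), (1, 2, "identity")]
    print(summarize(4, relations, size=3))
```

    Sentence indices stand in for the extracted sentences; in practice each index would map back to a sentence and its source document.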

  • Citation Count Prediction Based on Neural Hawkes Model

    Lisha LIU  Dongjin YU  Dongjing WANG  Fumiyo FUKUMOTO  

     
    PAPER-Biocybernetics, Neurocomputing

    Publicized: 2020/08/03  Vol: E103-D  No: 11  Page(s): 2379-2388

    With the rapid development of scientific research, the number of publications, such as scientific papers and patents, has grown rapidly. It has become increasingly important to identify those with high quality and great impact from such a large volume of publications. Citation count is one of the well-known indicators of the future impact of publications. However, how to interpret a large number of uncertain factors of publications as relevant features and utilize them to capture the impact of publications over time is still a challenging problem. This paper presents an approach that effectively leverages a variety of factors with a neural-based citation prediction model. Specifically, the proposed model is based on the Neural Hawkes Process (NHP) with the continuous-time Long Short-Term Memory (cLSTM), which can capture the aging effect and the "sleeping beauty" phenomenon more effectively from publication covariates as well as citation counts. The experimental results on two datasets show that the proposed approach outperforms the state-of-the-art baselines. In addition, the contribution of covariates to performance improvement is also verified.
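
    For readers unfamiliar with Hawkes processes, the sketch below shows the classical (non-neural) form with an exponential kernel: each past citation raises the rate of future citations, and that boost decays over time, which is one simple way to express an aging effect. It is not the paper's model; the Neural Hawkes Process replaces this fixed kernel with a state learned by a continuous-time LSTM. The function names and the parameter values mu, alpha and beta are assumptions chosen for illustration.

```python
import math

def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Citation rate at time t: base rate mu plus decaying boosts from past citations."""
    boost = sum(alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t)
    return mu + boost

def expected_citations(start, end, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Integral of the intensity over [start, end].

    Only citations at or before `start` are counted, i.e. this assumes
    no new citations arrive inside the window.
    """
    total = mu * (end - start)
    for ti in event_times:
        if ti <= start:
            # Closed-form integral of alpha * exp(-beta * (t - ti)) over the window.
            total += (alpha / beta) * (
                math.exp(-beta * (start - ti)) - math.exp(-beta * (end - ti))
            )
    return total

if __name__ == "__main__":
    citations = [0.0, 0.4, 0.5]  # hypothetical citation times, in years
    for t in (0.6, 1.0, 3.0):
        print(t, hawkes_intensity(t, citations))
    print(expected_citations(0.5, 1.5, citations))
```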