Author Search Result

[Author] Yasuhito ASANO(14hit)

  • Mining and Explaining Relationships in Wikipedia

    Xinpeng ZHANG  Yasuhito ASANO  Masatoshi YOSHIKAWA  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E95-D No:7
      Page(s):
    1918-1931

    Mining and explaining relationships between concepts are challenging tasks in the field of knowledge search. We propose a new approach for the tasks using disjoint paths formed by links in Wikipedia. Disjoint paths are easy to understand and do not contain redundant information. To achieve this approach, we propose a naive method, as well as a generalized flow based method, and a technique for mining more disjoint paths using the generalized flow based method. We also apply the approach to classification of relationships. Our experiments reveal that the generalized flow based method can mine many disjoint paths important for understanding a relationship, and the classification is effective for explaining relationships.

  • Detecting Anomalous Reviewers and Estimating Summaries from Early Reviews Considering Heterogeneity

    Yasuhito ASANO  Junpei KAWAMOTO  

     
    PAPER

      Publicized:
    2018/01/18
      Vol:
    E101-D No:4
      Page(s):
    1003-1011

    Early reviews, posted on online review sites shortly after products enter the market, are useful for estimating long-term evaluations of those products and making decisions. However, such reviews can be influenced easily by anomalous reviewers, including malicious and fraudulent reviewers, because the number of early reviews is usually small. It is therefore challenging to detect anomalous reviewers from early reviews and estimate long-term evaluations by reducing their influences. We find that two characteristics of heterogeneity on actual review sites such as Amazon.com cause difficulty in detecting anomalous reviewers from early reviews. We propose ideas for consideration of heterogeneity, and a methodology for computing reviewers' degree of anomaly and estimating long-term evaluations simultaneously. Our experimental evaluations with actual reviews from Amazon.com revealed that our proposed method achieves the best performance in 19 of 20 tests compared to state-of-the-art methodologies.

  • Geo-Graph-Indistinguishability: Location Privacy on Road Networks with Differential Privacy

    Shun TAKAGI  Yang CAO  Yasuhito ASANO  Masatoshi YOSHIKAWA  

     
    PAPER

      Publicized:
    2023/01/16
      Vol:
    E106-D No:5
      Page(s):
    877-894

    In recent years, concerns about location privacy have been increasing with the spread of location-based services (LBSs). Many methods to protect location privacy have been proposed in the past decades. In particular, perturbation methods based on Geo-Indistinguishability (GeoI), which randomly perturb a true location to a pseudolocation, are attracting attention due to their strong privacy guarantee inherited from differential privacy. However, GeoI is based on the Euclidean plane even though many LBSs are based on road networks (e.g., ride-sharing services). This causes unnecessary noise and thus an insufficient tradeoff between utility and privacy for LBSs on road networks. To address this issue, we propose a new privacy notion, Geo-Graph-Indistinguishability (GeoGI), for locations on a road network to achieve a better tradeoff. We propose the Graph-Exponential Mechanism (GEM), which satisfies GeoGI. Moreover, we formalize the optimization problem to find the optimal GEM in terms of the tradeoff. However, the computational complexity of a naive method to find the optimal solution is prohibitive, so we propose a greedy algorithm to find an approximate solution in an acceptable amount of time. Finally, our experiments show that our proposed mechanism outperforms GeoI mechanisms, including the optimal GeoI mechanism, with respect to the tradeoff.
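As a rough illustration of perturbing a location on a road network, the sketch below samples an output node with probability proportional to exp(-epsilon * d / 2), where d is the shortest-path distance from the true node. This is a generic exponential mechanism over graph distance, not necessarily the GEM defined in the paper: the factor 1/2, the choice of all nodes as the output set, the undirected connected graph, and every function name here are assumptions made for the example.

```python
import heapq
import math
import random


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm on a dict-of-dicts graph: graph[u][v] = edge length."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def graph_exponential_distribution(graph, true_node, epsilon):
    """Output distribution with weight exp(-epsilon * d / 2) per node (assumed form)."""
    dist = shortest_path_lengths(graph, true_node)
    weights = {z: math.exp(-epsilon * d / 2.0) for z, d in dist.items()}
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}


def perturb(graph, true_node, epsilon, rng=random):
    """Sample one perturbed node from the distribution above."""
    probs = graph_exponential_distribution(graph, true_node, epsilon)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[z] for z in nodes], k=1)[0]


# Hypothetical three-node road segment: a -(1)- b -(2)- c
road = {"a": {"b": 1.0}, "b": {"a": 1.0, "c": 2.0}, "c": {"b": 2.0}}
print(graph_exponential_distribution(road, "a", epsilon=1.0))
print(perturb(road, "a", epsilon=1.0))
```

Because shortest-path distance on an undirected graph satisfies the triangle inequality, the output probabilities for two true nodes at distance d differ by a factor of at most exp(epsilon * d) under this form, which is the kind of distance-scaled guarantee the abstract refers to.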

  • Estimating Knowledge Category Coverage by Courses Based on Centrality in Taxonomy

    Yiling DAI  Masatoshi YOSHIKAWA  Yasuhito ASANO  

     
    PAPER

      Publicized:
    2019/12/26
      Vol:
    E103-D No:5
      Page(s):
    928-938

    The proliferation of Massive Open Online Courses has made it a challenge for the user to select a proper course. We assume a situation in which the user targets the knowledge defined by some knowledge categories. Then, knowing how much of the knowledge in a category is covered by the courses will be helpful in course selection. In this study, we define a concept of knowledge category coverage and aim to estimate it in a semi-automatic manner. We first model the knowledge category and the course as sets of concepts, and then utilize a taxonomy and the idea of centrality to differentiate the importance of concepts. Finally, we obtain the coverage value by calculating how many of the concepts required in a knowledge category are also taught in a course. Compared with treating all concepts as equally important, we found that our proposed method generates coverage values closer to the ground truth assigned by domain experts.
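The coverage idea in this abstract can be illustrated with a small weighted set overlap. The sketch below is only an assumption about the general shape of such a measure: the function name, the concept sets, and the weights (which stand in for centrality values derived from a taxonomy) are all hypothetical, and the paper's actual formula may differ.

```python
def weighted_coverage(category_concepts, course_concepts, weight=None):
    """Fraction of a category's total concept weight that a course also teaches.

    weight maps concept -> importance (a stand-in for a centrality score).
    If weight is None, every concept counts equally (the uniform baseline).
    """
    def w(concept):
        return 1.0 if weight is None else weight.get(concept, 0.0)

    total = sum(w(c) for c in category_concepts)
    if total == 0:
        return 0.0  # nothing to cover
    covered = sum(w(c) for c in category_concepts if c in course_concepts)
    return covered / total


# Hypothetical category, course, and centrality-like weights.
category = {"sorting", "graphs", "hashing"}
course = {"sorting", "hashing", "parsing"}
centrality = {"sorting": 0.5, "graphs": 0.25, "hashing": 0.25}

print(weighted_coverage(category, course))              # uniform importance
print(weighted_coverage(category, course, centrality))  # weighted importance
```

With uniform importance the course covers two of three concepts; with the hypothetical weights it covers three quarters of the category's weight, which shows how differentiating concept importance changes the estimate.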

  • Improvements of HITS Algorithms for Spam Links

    Yasuhito ASANO  Yu TEZUKA  Takao NISHIZEKI  

     
    PAPER-Scoring Algorithms

      Vol:
    E91-D No:2
      Page(s):
    200-208

    The HITS algorithm proposed by Kleinberg is one of the representative methods of scoring Web pages by using hyperlinks. When the algorithm was proposed, most of the pages given high scores by the algorithm were really related to a given topic, and hence the algorithm could be used to find related pages. However, the algorithm and its variants, including Bharat's improved HITS (abbreviated to BHITS) proposed by Bharat and Henzinger, can no longer be used to find related pages on today's Web, owing to an increase in spam links. In this paper, we first propose three methods to find "linkfarms," that is, sets of spam links forming a densely connected subgraph of a Web graph. We then present an algorithm, called a trust-score algorithm, to give high scores to pages which are not spam pages with a high probability. Combining the three methods and the trust-score algorithm with BHITS, we obtain several variants of the HITS algorithm. We ascertain by experiments that one of them, named TaN+BHITS using the trust-score algorithm and the method of finding linkfarms by employing name servers, is most suitable for finding related pages on today's Web. Our algorithms take time and memory no more than those required by the original HITS algorithm, and can be executed on a PC with a small amount of main memory.
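For readers unfamiliar with the baseline, the following is a minimal sketch of the basic HITS iteration (mutually reinforcing hub and authority scores with L2 normalization). It does not implement BHITS, the linkfarm detection methods, or the trust-score algorithm from the paper; the function names and the toy graph are assumptions for illustration only.

```python
from math import sqrt


def _normalize(scores):
    """Scale scores to unit L2 norm; leave an all-zero vector unchanged."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {p: v / norm for p, v in scores.items()}


def hits(links, iterations=50):
    """Basic HITS. links maps each page to the list of pages it links to."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking to the page.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += hub[p]
        # Hub score: sum of the updated authority scores of the linked pages.
        new_hub = {p: sum(new_auth[q] for q in links.get(p, [])) for p in pages}
        auth = _normalize(new_auth)
        hub = _normalize(new_hub)
    return hub, auth


# Toy graph: pages "a" and "b" both link to "c".
hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
print(sorted(auth.items(), key=lambda kv: -kv[1]))
```

A densely interlinked set of spam pages can inflate each other's hub and authority scores under this iteration, which is the weakness the abstract describes.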

  • Purpose-Feature Relationship Mining from Online Reviews towards Purpose-Oriented Recommendation

    Sopheaktra YONG  Yasuhito ASANO  

     
    PAPER

      Publicized:
    2018/01/18
      Vol:
    E101-D No:4
      Page(s):
    1021-1029

    To help with decision making, online shoppers tend to go through both a list of a product's features and functionality provided by the vendor, as well as a list of reviews written by other users. Unfortunately, this process is ineffective when the buyer is confronted with large amounts of information, particularly when the buyer has limited experience with and knowledge of the product. In order to avoid this problem, we propose a framework of purpose-oriented recommendation that presents a ranked list of products suitable for a designated user purpose by identifying important product features to fulfill the purpose from online reviews. As a technical foundation for realizing the framework, we propose several methods to mine relations between user purposes and product features from consumer reviews. Using digital camera reviews on Amazon.com, the experimental results show that our proposed method is both effective and stable, with an acceptable rate of precision and recall.

  • Finding Neighbor Communities in the Web Using an Inter-Site Graph

    Yasuhito ASANO  Hiroshi IMAI  Masashi TOYODA  Masaru KITSUREGAWA  

     
    PAPER-Database

      Vol:
    E87-D No:9
      Page(s):
    2163-2170

    In this paper, we present Neighbor Community Finder (NCF, for short), a tool for finding Web communities related to given URLs. While existing link-based methods of finding communities, such as HITS, trawling, and Companion, use algorithms running on a Web graph whose vertices are pages and edges are links on the Web, NCF uses an algorithm running on an inter-site graph whose vertices are sites and edges are global-links (links between sites). Since the phrase "Web site" is used ambiguously in our daily life and has no unique definition, NCF uses directory-based sites proposed by the authors as a model of Web sites. NCF receives URLs of interest to a user and constructs an inter-site graph containing neighbor sites of the given URLs by using a method of identifying directory-based sites from URL and link data obtained from the actual Web on demand. By computational experiments, we show that NCF achieves higher quality than Google's "Similar Pages" service for finding pages related to given URLs corresponding to various topics selected from among the directories of Yahoo! Japan.

  • Compact Encoding of the Web Graph Exploiting Various Power Distributions

    Yasuhito ASANO  Tsuyoshi ITO  Hiroshi IMAI  Masashi TOYODA  Masaru KITSUREGAWA  

     
    LETTER

      Vol:
    E87-A No:5
      Page(s):
    1183-1184

    Compact encodings of the web graph are required in order to keep the graph on the main memory and to perform operations on the graph efficiently. In this paper, we propose a new compact encoding of the web graph. It is 10% more compact than Link2 used in the Connectivity Server of Altavista and 20% more compact than the encoding proposed by Guillaume et al. in 2002 and is comparable to it in terms of extraction time.

  • Composition Proposal Generation for Manga Creation Support

    Hironori ITO  Yasuhito ASANO  

     
    PAPER

      Publicized:
    2019/12/27
      Vol:
    E103-D No:5
      Page(s):
    949-957

    In recent years, awareness and use of manga have become pervasive, and more and more people use manga for purposes such as entertainment, study, and marketing. However, when non-specialists create manga for these purposes, they can write plots expressing what they want to convey, but the technique of composition, which arranges elements such as characters and balloons according to the plot, becomes an obstacle to exploiting manga's merit of comprehensibility, which stems from the high flexibility of its expression. Therefore, we consider that support for this composition technique is necessary for amateurs to use manga while taking advantage of its benefits. We propose a method of generating composition proposals to support manga creation by amateurs. For the method, we also define a new manga metadata model that summarizes and extends the metadata models of earlier studies. It represents the composition and the plot in manga. We apply a neural machine translation mechanism to learn the relation between the composition and the plot. It treats the plot annotation as the source and the composition annotation as the target, and learns from an annotation dataset based on the metadata model. We conducted experiments to evaluate how the composition proposals generated by our method help amateur manga creation, and demonstrated that they are useful.

  • Adaptive Balanced Allocation for Peer Assessments

    Hideaki OHASHI  Yasuhito ASANO  Toshiyuki SHIMIZU  Masatoshi YOSHIKAWA  

     
    PAPER

      Publicized:
    2019/12/26
      Vol:
    E103-D No:5
      Page(s):
    939-948

    Peer assessments, in which people review the works of peers and have their own works reviewed by peers, are useful for assessing homework. In conventional peer assessment systems, works are usually allocated to people before the assessment begins; therefore, if people drop out (abandoning reviews) during an assessment period, an imbalance occurs between the number of works a person reviews and that of peers who have reviewed the work. When the total imbalance increases, some people who diligently complete reviews may suffer from a lack of reviews and be discouraged from participating in future peer assessments. Therefore, in this study, we adopt a new adaptive allocation approach in which people are allocated review works only when requested and propose an algorithm for allocating works to people, which reduces the total imbalance. To show the effectiveness of the proposed algorithm, we provide an upper bound of the total imbalance that the proposed algorithm yields. In addition, we extend the above algorithm to consider reviewing ability. The extended algorithm avoids the problem that only unskilled (or skilled) reviewers are allocated to a given work. We show the effectiveness of the two proposed algorithms compared to the existing algorithms through experiments using simulation data.

  • Efficient Compression of Web Graphs

    Yasuhito ASANO  Yuya MIYAWAKI  Takao NISHIZEKI  

     
    PAPER-Data Compression

      Vol:
    E92-A No:10
      Page(s):
    2454-2462

    Several methods have been proposed for compressing the linkage data of a Web graph. Among them, the method proposed by Boldi and Vigna is known as the most efficient one. In this paper, we propose a new method to compress a Web graph. Our method is more efficient than theirs with respect to the size of the compressed data. For example, our method needs only 1.99 bits per link to compress a Web graph containing 3,216,152 links connecting 325,557 pages, while the method of Boldi and Vigna needs 2.84 bits per link to compress the same Web graph.

  • Mining Communities on the Web Using a Max-Flow and a Site-Oriented Framework

    Yasuhito ASANO  Takao NISHIZEKI  Masashi TOYODA  Masaru KITSUREGAWA  

     
    PAPER-Data Mining

      Vol:
    E89-D No:10
      Page(s):
    2606-2615

    There are several methods for mining communities on the Web using hyperlinks. One of the well-known ones is a max-flow based method proposed by Flake et al. The method adopts a page-oriented framework, that is, it uses a page on the Web as a unit of information, like other methods including HITS and trawling. Recently, Asano et al. built a site-oriented framework which uses a site as a unit of information, and they experimentally showed that trawling on the site-oriented framework often outputs significantly better communities than trawling on the page-oriented framework. However, it has not been known whether the site-oriented framework is effective in mining communities through the max-flow based method. In this paper, we first point out several problems of the max-flow based method, mainly owing to the page-oriented framework, and then propose solutions to the problems by utilizing several advantages of the site-oriented framework. Computational experiments reveal that our max-flow based method on the site-oriented framework is very effective in mining communities, related to the topics of given pages, in comparison with the original max-flow based method on the page-oriented framework.

  • Mining Knowledge on Relationships between Objects from the Web

    Xinpeng ZHANG  Yasuhito ASANO  Masatoshi YOSHIKAWA  

     
    PAPER-Artificial Intelligence, Data Mining

      Vol:
    E97-D No:1
      Page(s):
    77-88

    How do global warming and agriculture influence each other? It is possible to answer the question by searching knowledge about the relationship between global warming and agriculture. As exemplified by this question, strong demands exist for searching relationships between objects. Mining knowledge about relationships on Wikipedia has been studied. However, it is desired to search more diverse knowledge about relationships on the Web. By utilizing the objects constituting relationships mined from Wikipedia, we propose a new method to search images with surrounding text that include knowledge about relationships on the Web. Experimental results show that our method is effective and applicable in searching knowledge about relationships. We also construct a relationship search system named “Enishi” based on the proposed new method. Enishi supplies a wealth of diverse knowledge including images with surrounding text to help users to understand relationships deeply, by complementarily utilizing knowledge from Wikipedia and the Web.

  • Time Graph Pattern Mining for Network Analysis and Information Retrieval (Open Access)

    Yasuhito ASANO  Taihei OSHINO  Masatoshi YOSHIKAWA  

     
    PAPER

      Vol:
    E97-D No:4
      Page(s):
    733-742

    Graph pattern mining has played important roles in network analysis and information retrieval. However, temporal characteristics of networks have not been estimated sufficiently. We propose time graph pattern mining as a new concept of graph mining reflecting the temporal information of a network. We conduct two case studies of time graph pattern mining: extensively discussed topics on blog sites and a book recommendation network. Through examination of case studies, we ascertain that time graph pattern mining has numerous possibilities as a novel means for information retrieval and network analysis reflecting both structural and temporal characteristics.