
Keyword Search Result

[Keyword] Web pages (2 hits)

  • Improvements of HITS Algorithms for Spam Links

    Yasuhito ASANO  Yu TEZUKA  Takao NISHIZEKI  

     
    PAPER-Scoring Algorithms

    Vol: E91-D  No: 2  Page(s): 200-208

    The HITS algorithm proposed by Kleinberg is one of the representative methods of scoring Web pages by using hyperlinks. When the algorithm was proposed, most of the pages it scored highly were genuinely related to the given topic, so it could be used to find related pages. However, because of the increase in spam links, neither the algorithm nor its variants, including the improved HITS proposed by Bharat and Henzinger (abbreviated BHITS), can be used to find related pages on today's Web any longer. In this paper, we first propose three methods to find "linkfarms," that is, sets of spam links forming a densely connected subgraph of a Web graph. We then present an algorithm, called the trust-score algorithm, that gives high scores to pages that are, with high probability, not spam pages. Combining the three methods and the trust-score algorithm with BHITS, we obtain several variants of the HITS algorithm. Our experiments show that one of them, TaN+BHITS, which uses the trust-score algorithm and the method of finding linkfarms by employing name servers, is the most suitable for finding related pages on today's Web. Our algorithms require no more time and memory than the original HITS algorithm and can be executed on a PC with a small amount of main memory.
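
For orientation, the baseline that this abstract builds on can be sketched as follows. This is a minimal illustration of the basic HITS hub/authority iteration only, not the BHITS, trust-score, or TaN+BHITS variants described in the paper. The function name `hits`, the dictionary-based graph representation, and the fixed iteration count are assumptions made for this example.

```python
from math import sqrt

def hits(links, iterations=50):
    """Basic HITS iteration (illustrative sketch).

    links: dict mapping a page to the list of pages it links to.
    Returns (hub, authority) score dicts, each L2-normalized.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}

        # Hub score: sum of authority scores of the pages it links to.
        new_hub = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_hub[src] += auth[dst]
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}

    return hub, auth
```

On a toy graph such as `{"a": ["c"], "b": ["c"], "c": ["d"]}`, page `c` ends up with the highest authority score because two pages link to it. A densely connected linkfarm exploits exactly this mutual reinforcement, which is the weakness the paper targets.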

  • Distributing Requests by (around k)-Bounded Load-Balancing in Web Server Cluster with High Scalability

    MinHwan OK  Myong-soon PARK  

     
    PAPER-Parallel/Distributed Algorithms

    Vol: E89-D  No: 2  Page(s): 663-672

    Popular Web sites organize their Web servers into Web server clusters. A Web server cluster uses a load-balancing algorithm to distribute Web requests evenly among its Web servers. Load-balancing algorithms based on the conventional periodic load-information update mechanism are not scalable because the load-information updates are synchronized. We propose a load-balancing algorithm in which the load-information update is not synchronized, exploiting the varying execution times of the scripts in dynamic Web pages. The load information of each server is updated 'individually' by a new load-information update mechanism, and the proposed algorithm supports high scalability based on this individual update. Simulation results show an improvement in system performance through another aspect of high scalability. Furthermore, the proposed algorithm guarantees some level of QoS for Web clients by distributing requests fairly. A fundamental merit of the proposed algorithm is its simplicity, which supports higher throughput of the Web switch.
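
The idea of individual, unsynchronized load updates can be illustrated with a small sketch. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: the class name `BoundedDispatcher`, the interpretation of the bound as "at most k outstanding requests per server", and the round-robin scan. It is not the authors' (around k)-bounded algorithm.

```python
class BoundedDispatcher:
    """Hypothetical dispatcher that caps outstanding requests per server.

    Each server's load entry changes only when that server receives or
    finishes a request, so there is no periodic, synchronized update
    of all load information at once.
    """

    def __init__(self, n_servers, k):
        self.k = k                           # assumed per-server bound
        self.outstanding = [0] * n_servers   # per-server load information
        self.next = 0                        # round-robin starting point

    def dispatch(self):
        """Return the index of a server below the bound, or None."""
        n = len(self.outstanding)
        for offset in range(n):
            i = (self.next + offset) % n
            if self.outstanding[i] < self.k:
                self.outstanding[i] += 1
                self.next = (i + 1) % n
                return i
        return None  # every server is at the bound

    def complete(self, server):
        """Individual update: only this server's entry is touched."""
        self.outstanding[server] -= 1
```

With two servers and `k = 1`, two calls to `dispatch()` fill both servers, a third returns `None`, and a server becomes eligible again as soon as `complete()` is called for it. Because script execution times vary, these completions arrive at different moments for different servers, which is the property the abstract says the proposed mechanism exploits.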