
Keyword Search Result

[Keyword] Web (221 hits)

Results 121-140 of 221

  • An Unsupervised Model of Redundancy for Answer Validation

    Youzheng WU  Hideki KASHIOKA  Satoshi NAKAMURA

    PAPER-Natural Language Processing
    Vol: E93-D No:3  Page(s): 624-634

    Given a question and a set of its candidate answers, the task of answer validation (AV) aims to return a Boolean value indicating whether a given candidate answer is the correct answer to the question. Unlike previous work, this paper presents an unsupervised model, called the U-model, for AV. This approach regards AV as a classification task and investigates how to effectively incorporate the redundancy of the Web into the proposed architecture. Experimental results with TREC factoid test sets and Chinese test sets indicate that the proposed U-model with redundancy information is very effective for AV. For example, the top@1/mrr@5 scores on the TREC05 and TREC06 tracks are 40.1/51.5% and 35.8/47.3%, respectively. Furthermore, a cross-model comparison experiment demonstrates that the U-model is the best among the redundancy-based models considered. Even compared with a syntax-based approach, a supervised machine learning approach and a pattern-based approach, the U-model performs much better.
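
    As a toy illustration of the redundancy idea (not the authors' exact U-model), the sketch below scores a candidate answer by how often it co-occurs with the question keywords in Web snippets; the snippets and the scoring rule are hypothetical.

      # Hypothetical redundancy-based scoring: a candidate is scored by how
      # often it co-occurs with question keywords in retrieved snippets.
      def redundancy_score(candidate, question_keywords, snippets):
          hits = 0
          for s in snippets:
              s = s.lower()
              if candidate.lower() in s and any(k.lower() in s for k in question_keywords):
                  hits += 1
          return hits / len(snippets) if snippets else 0.0

      snippets = [
          "Mount Everest, at 8,848 m, is the highest mountain on Earth.",
          "The highest mountain in the world is Mount Everest.",
          "K2 is the second-highest mountain.",
      ]
      keywords = ["highest", "mountain"]
      for cand in ["Mount Everest", "K2"]:
          print(cand, redundancy_score(cand, keywords, snippets))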

  • Service Independent Access Control Architecture for User Generated Content (UGC) and Its Implementation

    Akira YAMADA  Ayumu KUBOTA  Yutaka MIYAKE  Kazuo HASHIMOTO

    PAPER-DRM and Security
    Vol: E92-D No:10  Page(s): 1961-1970

    Using Web-based content management systems such as Blogs, an end user can easily publish User Generated Content (UGC). Although publishing UGC is easy, controlling access to it is a difficult problem for end users. Currently, most Blog sites offer no access control mechanism, and even when one is available, it is not sufficient to control users who do not have an account at the site, not to mention that it cannot control access to content hosted by other UGC sites. In this paper, we propose a new access control architecture for UGC, in which third-party entities can offer an access control mechanism to users independently of the UGC hosting sites. With this architecture, a user can control access to his content even when it is spread over many different UGC sites, regardless of whether those sites have their own access control mechanisms. The key idea for separating the access control mechanism from UGC sites is to apply cryptographic access control, and we implemented the idea in such a way that it requires no modification to UGC sites or Web browsers. Our prototype implementation shows that the proposed access control architecture can be easily deployed in the current Web-based communication environment and works quite well with popular Blog sites.
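
    A minimal sketch of the cryptographic access control idea, assuming the following shape: content is published in encrypted form on any UGC site, and a third-party server releases the decryption key only to authorized readers. The AccessControlServer class and its policy store are hypothetical; only the Fernet primitives come from the cryptography package.

      from cryptography.fernet import Fernet

      class AccessControlServer:
          # hypothetical third-party key service, independent of UGC sites
          def __init__(self):
              self.keys = {}   # content_id -> key
              self.acl = {}    # content_id -> set of allowed users

          def register(self, content_id, allowed_users):
              key = Fernet.generate_key()
              self.keys[content_id] = key
              self.acl[content_id] = set(allowed_users)
              return key

          def get_key(self, content_id, user):
              if user in self.acl.get(content_id, set()):
                  return self.keys[content_id]
              raise PermissionError("access denied")

      server = AccessControlServer()
      key = server.register("post-42", allowed_users={"alice"})
      ciphertext = Fernet(key).encrypt(b"my private blog post")  # published on the Blog site
      reader_key = server.get_key("post-42", "alice")            # authorized reader fetches key
      print(Fernet(reader_key).decrypt(ciphertext))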

  • Efficient Compression of Web Graphs

    Yasuhito ASANO  Yuya MIYAWAKI  Takao NISHIZEKI

    PAPER-Data Compression
    Vol: E92-A No:10  Page(s): 2454-2462

    Several methods have been proposed for compressing the linkage data of a Web graph. Among them, the method proposed by Boldi and Vigna is known as the most efficient one. In this paper, we propose a new method to compress a Web graph. Our method is more efficient than theirs with respect to the size of the compressed data. For example, our method needs only 1.99 bits per link to compress a Web graph containing 3,216,152 links connecting 325,557 pages, while the method of Boldi and Vigna needs 2.84 bits per link to compress the same Web graph.
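
    The paper's own encoding is not reproduced here, but the sketch below illustrates a standard ingredient such compressors share: sorting each page's link targets and storing the gaps with variable-length integers, which exploits the locality of links.

      def varint(n):
          # 7 bits per byte; the high bit marks a continuation
          out = bytearray()
          while True:
              b = n & 0x7F
              n >>= 7
              out.append(b | (0x80 if n else 0))
              if not n:
                  return bytes(out)

      def encode_adjacency(targets):
          # sort targets and store gaps, which are small for local links
          out = bytearray()
          prev = 0
          for t in sorted(targets):
              out += varint(t - prev)
              prev = t
          return bytes(out)

      links = [12, 13, 15, 900, 325556]
      print(len(encode_adjacency(links)), "bytes for", len(links), "links")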

  • A Novel Video Retrieval Method Based on Web Community Extraction Using Features of Video Materials

    Yasutaka HATAKEYAMA  Takahiro OGAWA  Satoshi ASAMIZU  Miki HASEYAMA

    PAPER-Image
    Vol: E92-A No:8  Page(s): 1961-1969

    A novel video retrieval method based on Web community extraction using audio, visual and textual features of video materials is proposed in this paper. In the proposed method, canonical correlation analysis is applied to these three features calculated from video materials and their Web pages, making it possible to transform each feature into a common variate space. The transformed variates are based on the relationships between the visual, audio and textual features of video materials, and the similarity between video materials can be calculated in this common space for each feature. Next, the proposed method introduces the obtained similarities of video materials into the link relationship between their Web pages. Furthermore, by performing link analysis of the obtained weighted link relationship, this approach extracts Web communities including similar topics and provides the degree of attribution of video materials in each Web community for each feature. By calculating similarities of the degrees of attribution between the Web communities extracted from the three kinds of features, the desired communities are automatically selected. Consequently, by monitoring the degrees of attribution in the obtained Web communities, the proposed method can perform effective video retrieval. Experimental results obtained by applying the proposed method to video materials from actual Web pages verify the effectiveness of the proposed method.
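
    The core CCA step can be sketched as follows, with random arrays standing in for real visual and textual features; scikit-learn's CCA is used purely for illustration of the shared variate space.

      import numpy as np
      from sklearn.cross_decomposition import CCA

      rng = np.random.default_rng(0)
      visual = rng.normal(size=(50, 40))    # 50 videos, 40-dim visual features
      textual = rng.normal(size=(50, 30))   # same videos, 30-dim textual features

      cca = CCA(n_components=5)
      v_c, t_c = cca.fit_transform(visual, textual)  # common 5-dim variate space

      def cosine(a, b):
          return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

      # similarity between video 0 and video 1 in the shared space
      print(cosine(v_c[0], v_c[1]))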

  • Information-Flow-Based Access Control for Web Browsers

    Sachiko YOSHIHAMA  Takaaki TATEISHI  Naoshi TABUCHI  Tsutomu MATSUMOTO

    PAPER-Authentication and Authorization Techniques
    Vol: E92-D No:5  Page(s): 836-850

    The emergence of Web 2.0 technologies such as Ajax and Mashup has revealed the weakness of the same-origin policy [1], the current de facto standard for the Web browser security model. We propose a new browser security model, based on information-flow-based access control (IBAC), that allows fine-grained access control in client-side Web applications for secure mashups and user-generated content, copes with the dynamic nature of client-side Web applications, and accurately determines the privilege of scripts in the event-driven programming model.
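
    The following toy sketch shows the flavor of information-flow-based access control: each value carries a label recording its origins, labels join when values are combined, and a flow to a sink is allowed only if policy permits every origin. The classes and policy format are hypothetical, not the paper's design.

      class Labeled:
          # a value tagged with the set of origins it depends on
          def __init__(self, value, origins):
              self.value = value
              self.origins = frozenset(origins)

          def combine(self, other, op):
              # combining values joins their labels
              return Labeled(op(self.value, other.value),
                             self.origins | other.origins)

      def send_to(sink, data, policy):
          # allow the flow only if every origin may flow to the sink
          if not all((o, sink) in policy for o in data.origins):
              raise PermissionError(f"flow {set(data.origins)} -> {sink} denied")
          print("sent", data.value, "to", sink)

      policy = {("maps.example", "maps.example"),
                ("mail.example", "mail.example")}
      addr = Labeled("221B Baker St", {"mail.example"})
      pin = Labeled("51.52,-0.15", {"maps.example"})
      loc = addr.combine(pin, lambda a, b: f"{a} @ {b}")
      try:
          send_to("maps.example", loc, policy)
      except PermissionError as e:
          print(e)   # mail-origin data must not leak to maps.example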

  • Enriching OSGi Service Composition with Web Services

    Choonhwa LEE  Sunghoon KO  Eunsam KIM  Wonjun LEE

    LETTER-System Programs
    Vol: E92-D No:5  Page(s): 1177-1180

    This letter describes combining OSGi and Web Services in service composition. In our approach, a composite service is described in WS-BPEL, and each component service in the description may be resolved to either an OSGi service or a Web Service at runtime. The proposal overcomes current limitations of OSGi technology in terms of geographical coverage and the population of candidate services available for composition.

  • Development of NETCONF-Based Network Management Systems in Web Services Framework

    Tomoyuki IIJIMA  Hiroyasu KIMURA  Makoto KITANI  Yoshifumi ATARASHI

    PAPER
    Vol: E92-B No:4  Page(s): 1104-1111

    To develop a network management system (NMS) more easily, the authors developed an application programming interface (API) for configuring network devices. Because this API is used in a Java development environment, an NMS can be developed by combining the API with other commonly available Java libraries, making it easy to build an NMS that is highly compatible with other IT systems. The operations generated by the API and exchanged between the NMS and network devices are based on NETCONF, which is standardized by the Internet Engineering Task Force (IETF) as a next-generation network-configuration protocol. Adopting a standardized technology ensures that an NMS developed with the API can manage network devices from multiple vendors in a unified manner. Furthermore, the configuration items exchanged over NETCONF are specified in an object-oriented design and are therefore easier to manage than such items in the Management Information Base (MIB), which is defined as the data to be managed by the Simple Network Management Protocol (SNMP). We actually developed several NMSs by using the API. Evaluation of these NMSs showed that, in terms of configuration time and development time, an NMS developed with the API performs as well as NMSs developed with a command line interface (CLI) and SNMP. The NMS developed with the API also showed the feasibility of achieving "autonomic network management" and "high interoperability with IT systems."
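
    The authors' API is Java-based and not reproduced here; as a rough analogue, the sketch below pushes a NETCONF <edit-config> with the Python ncclient library. The device address, credentials, and interface data model are hypothetical, and the device is assumed to support the candidate datastore.

      from ncclient import manager

      CONFIG = """
      <config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
        <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
          <interface>
            <name>eth0</name>
            <description>configured over NETCONF</description>
          </interface>
        </interfaces>
      </config>
      """

      # hypothetical device and credentials
      with manager.connect(host="192.0.2.1", port=830,
                           username="admin", password="secret",
                           hostkey_verify=False) as m:
          m.edit_config(target="candidate", config=CONFIG)
          m.commit()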

  • An Effective Self-Adaptive Admission Control Algorithm for Large Web Caches

    Chul-Woong YANG  Ki Yong LEE  Yon Dohn CHUNG  Myoung Ho KIM  Yoon-Joon LEE

    LETTER-Contents Technology and Web Information Systems
    Vol: E92-D No:4  Page(s): 732-735

    In this paper, we propose an effective Web cache admission control algorithm. By selectively admitting objects into the cache, the proposed algorithm can significantly reduce the amount of disk I/O on a Web cache while maintaining a high hit ratio. The proposed algorithm adaptively adjusts its own admission control parameter, requiring no user-supplied parameters. Through extensive experiments, we show the effectiveness of the proposed algorithm.
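
    The paper's adaptation rule is not reproduced here; the sketch below is a generic self-adaptive admission policy in the same spirit: an object is admitted only after enough requests, and the admission threshold adapts to keep the disk-write ratio near a target. All parameters are hypothetical.

      from collections import defaultdict

      class AdmissionController:
          # hypothetical self-adaptive policy, not the paper's exact rule
          def __init__(self, target_write_ratio=0.3):
              self.freq = defaultdict(int)   # request count per object
              self.threshold = 2             # admit after this many requests
              self.requests = 0
              self.writes = 0
              self.target = target_write_ratio

          def should_admit(self, obj):
              self.requests += 1
              self.freq[obj] += 1
              admit = self.freq[obj] >= self.threshold
              self.writes += admit
              if self.requests % 100 == 0:   # periodic self-adaptation
                  if self.writes / self.requests > self.target:
                      self.threshold += 1    # too many disk writes: stricter
                  elif self.threshold > 1:
                      self.threshold -= 1
              return admit

      ac = AdmissionController()
      print([ac.should_admit(o) for o in ["a", "b", "a", "a", "b"]])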

  • RDFacl: A Secure Access Control Model Based on RDF Triple

    Jaehoon KIM  Seog PARK

    PAPER-Application Information Security
    Vol: E92-D No:1  Page(s): 41-50

    The expectation of a more intelligent Web has recently been reflected in the new research field called the Semantic Web. In this paper, in the context of Semantic Web security, we introduce an RDF-triple-based access control model having explicit authorization propagation by inheritance and implicit authorization propagation by inference. In particular, we explain an authorization conflict problem between explicit and implicit authorization propagation, which is an important concept in access control for the Semantic Web. We also propose a novel conflict detection algorithm using graph labeling techniques to efficiently find authorization conflicts. Experimental results show that the proposed detection algorithm performs much better than the existing detection algorithm as the data size and the number of specified authorizations grow.

  • Monotone Increasing Binary Similarity and Its Application to Automatic Document-Acquisition of a Category

    Izumi SUZUKI  Yoshiki MIKAMI  Ario OHSATO

    PAPER-Knowledge Acquisition
    Vol: E91-D No:11  Page(s): 2545-2551

    A technique that acquires documents in the same category as a given short text is introduced. Regarding the given text as a training document, the system marks the most similar document, or sufficiently similar documents, from the document domain (or the entire Web). The system then adds the marked documents to the training set and learns from the set, repeating this process until no more documents are marked. Imposing a monotone increasing property on the similarity as the system learns enables it to 1) detect the correct point at which no more documents remain to be marked and 2) decide the threshold value that the classifier uses. In addition, under the condition that normalization is limited to dividing term weights by a p-norm of the weights, the linear classifier in which training documents are indexed in a binary manner is the only instance that satisfies the monotone increasing property. The feasibility of the proposed technique was confirmed through an examination of binary similarity using English and German documents randomly selected from the Web.
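
    Under those conditions the similarity takes a simple closed form: the overlap between the training term set and the document's term set, divided by the p-norm of the document's binary term vector (which is k**(1/p) for a document with k distinct terms). A small sketch with hypothetical term sets:

      def binary_similarity(train_terms, doc_terms, p=2.0):
          # overlap with the training term set, normalized by the p-norm
          # of the document's 0/1 term vector (k distinct terms -> k**(1/p))
          if not doc_terms:
              return 0.0
          overlap = len(train_terms & doc_terms)
          return overlap / len(doc_terms) ** (1.0 / p)

      train = {"semantic", "web", "ontology"}
      print(binary_similarity(train, {"semantic", "web", "metadata", "rdf"}))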

  • Novel Topic Maps to RDF/RDF Schema Translation Method

    Shinae SHIN  Dongwon JEONG  Doo-Kwon BAIK

    PAPER-Knowledge Representation
    Vol: E91-D No:11  Page(s): 2626-2637

    We propose an enhanced method for translating Topic Maps to RDF/RDF Schema in order to realize the Semantic Web. A critical issue for the Semantic Web is to efficiently and precisely describe Web information resources, i.e., Web metadata. Two representative standards, Topic Maps and RDF, have been used for Web metadata, and RDF-based standardization and implementation of the Semantic Web have been actively pursued. Since the Semantic Web must accept and understand Web information resources represented with either method, Topic Maps-to-RDF translation has become an issue. Even though many Topic Maps-to-RDF translation methods have been devised, they still have several problems (e.g., semantic loss, complex expressions). Our translation method provides an improved solution to these problems. It shows lower semantic loss than previous methods because it extracts both explicit and implicit semantics, and it reduces the encoding complexity of the resulting RDF compared to previous methods. In addition, in terms of reversibility, the proposed method regenerates all Topic Maps constructs in the original source when it is reverse-translated.

  • combiSQORE: A Combinative-Ontology Retrieval System for Next Generation Semantic Web Applications

    Rachanee UNGRANGSI  Chutiporn ANUTARIYA  Vilas WUWONGSE

    PAPER-Knowledge Representation
    Vol: E91-D No:11  Page(s): 2616-2625

    In order to respond to user queries in a timely manner at run-time, next-generation Semantic Web applications demand a robust mechanism to dynamically select one or more existing ontologies available on the Web and combine them automatically if needed. Although existing ontology retrieval systems return a lengthy list of resultant ontologies, they can neither identify which ones completely meet the query requirements nor determine a minimal set of ontologies that jointly satisfy the requirements when no single ontology can. Therefore, this paper presents an ontology retrieval system, namely combiSQORE, which can return single or combinative ontologies that completely satisfy a submitted query whenever the available ontology database is adequate to answer it. In addition, the proposed system ranks the returned results based on their semantic similarity to the given query and their modification (integration) costs. Experimental results show that the combiSQORE system yields practical combinative ontologies and useful rankings.
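
    The combinative selection can be pictured as a set-cover-style search; the greedy sketch below is a hypothetical stand-in for combiSQORE's actual ranking, which also weighs semantic similarity and integration cost.

      def select_ontologies(query_terms, ontologies):
          # greedily pick ontologies covering the most remaining query terms
          remaining = set(query_terms)
          chosen = []
          while remaining:
              best = max(ontologies, key=lambda o: len(ontologies[o] & remaining))
              if not ontologies[best] & remaining:
                  break                      # query cannot be fully satisfied
              chosen.append(best)
              remaining -= ontologies[best]
          return chosen, remaining

      ontologies = {
          "onto-travel": {"hotel", "flight"},
          "onto-geo": {"city", "country"},
          "onto-food": {"restaurant"},
      }
      print(select_ontologies({"hotel", "city", "restaurant"}, ontologies))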

  • Distributed Computing Software Building-Blocks for Ubiquitous Computing Societies

    K.H. (Kane) KIM

    INVITED PAPER
    Vol: E91-D No:9  Page(s): 2233-2242

    The steady approach of advanced nations toward the realization of ubiquitous computing societies has given birth to rapidly growing demands for new-generation distributed computing (DC) applications. Consequently, the economic and reliable construction of new-generation DC applications is currently a major issue faced by the software technology research community. What is needed is a new-generation DC software engineering technology that is at least several times more effective in constructing new-generation DC applications than the currently practiced technologies. In particular, this author believes that a new-generation building-block (BB) is needed, one much more advanced than the current-generation DC object, which is a small extension of the object model embedded in the languages C++, Java, and C#. Such a BB should enable the systematic and economic construction of DC applications that are capable of taking critical actions with 100-microsecond-level or even 10-microsecond-level timing accuracy, fault tolerance, and security enforcement, while being easily expandable and taking advantage of all sorts of network connectivity. Some directions considered worth pursuing for finding such BBs are discussed.

  • An Effective GML Documents Compressor

    Jihong GUAN  Shuigeng ZHOU  Yan CHEN

    PAPER-Database
    Vol: E91-D No:7  Page(s): 1982-1990

    As GML is becoming the de facto standard for geographic data storage, transmission and exchange, more and more geographic data exists in GML format. In applications, GML documents are usually very large because they contain a large number of verbose markup tags and a large amount of spatial coordinate data. In order to speed up data transmission and reduce network cost, it is essential to develop effective and efficient GML compression tools. Although GML is a special case of XML, current XML compressors are not effective when applied directly to GML, because they were designed for general XML data. In this paper, we propose GPress, a compressor for effectively compressing GML documents. To the best of our knowledge, GPress is the first compressor designed specifically for GML documents. GPress exploits the unique characteristics of GML documents to achieve good performance. Extensive experiments on real-world GML documents show that GPress evidently outperforms XMill (one of the best existing XML compressors) in compression ratio, while its compression efficiency is comparable to that of existing XML compressors.
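
    GPress's internals are not reproduced here, but the toy comparison below shows why treating coordinate streams specially pays off: delta-encoded coordinates compress far better under a general-purpose compressor (zlib) than the raw text does.

      import zlib

      coords = [135.0 + 0.001 * i for i in range(10000)]   # a synthetic coordinate stream
      deltas = [coords[0]] + [round(b - a, 6) for a, b in zip(coords, coords[1:])]

      raw = ",".join(map(str, coords)).encode()
      delta_enc = ",".join(map(str, deltas)).encode()
      print("raw:", len(zlib.compress(raw)), "bytes,",
            "delta:", len(zlib.compress(delta_enc)), "bytes")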

  • Extending LogicWeb via Hereditary Harrop Formulas

    Keehang KWON  Dae-Seong KANG

    LETTER-Fundamentals of Software and Theory of Programs
    Vol: E91-D No:6  Page(s): 1827-1829

    We propose HHWeb, an extension of LogicWeb with hereditary Harrop formulas. HHWeb extends the LogicWeb of Loke and Davison by allowing goals of the form (∃x1 ... ∃xn D) ⊃ G (or, equivalently, ∀x1 ... ∀xn (D ⊃ G)), where D is a web page and G is a goal. Such a goal is intended to be solved by instantiating x1,...,xn in D with new names and then solving the resulting goal. The existential quantifications at the head of web pages are particularly flexible in controlling the visibility of names. For example, they can provide scope to functions and constants as well as to predicates. In addition, their semantics is simple enough that implementation becomes more efficient. Finally, they provide a client-side interface that is useful for customizing web pages.

  • An Unsupervised Opinion Mining Approach for Japanese Weblog Reputation Information Using an Improved SO-PMI Algorithm

    Guangwei WANG  Kenji ARAKI

    PAPER-Data Mining
    Vol: E91-D No:4  Page(s): 1032-1041

    In this paper, we propose an improved SO-PMI (Semantic Orientation Using Pointwise Mutual Information) algorithm for use in Japanese Weblog opinion mining. SO-PMI is an unsupervised approach proposed by Turney that has been shown to work well for English. When this algorithm was naively adapted to Japanese, most phrases, whether positive or negative in meaning, received a negative SO. To deal with this skew, we propose three improvements: expanding the reference words to sets of words, introducing a balancing factor, and detecting neutral expressions. In our experiments, the proposed improvements yielded a well-balanced result: both positive and negative accuracy exceeded 62% when evaluated on 1,200 opinion sentences sampled from three different domains (reviews of Electronic Products, Cars and Travels from Kakaku.com). In a comparative experiment on the same corpus, a supervised approach (SA-Demo) achieved accuracy very similar to our method's. This shows that our proposed approach effectively adapts SO-PMI to Japanese, and it also shows the generality of SO-PMI.
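
    Turney's base score can be sketched as below; the additive balance term is only an illustrative stand-in for the paper's balancing factor, and the hit counts would come from a search engine.

      import math

      def so_pmi(hits_p_pos, hits_p_neg, hits_pos, hits_neg, balance=0.0):
          # SO = log2( hits(p near POS) * hits(NEG) / (hits(p near NEG) * hits(POS)) )
          eps = 0.01   # smoothing against zero counts
          so = math.log2(((hits_p_pos + eps) * (hits_neg + eps)) /
                         ((hits_p_neg + eps) * (hits_pos + eps)))
          return so + balance   # illustrative additive correction for the skew

      # a phrase co-occurring mostly with the positive reference set
      print(so_pmi(hits_p_pos=800, hits_p_neg=200,
                   hits_pos=10000, hits_neg=12000, balance=0.5))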

  • An Informative DOM Subtree Identification Method from Web Pages in Unfamiliar Web Sites

    Masanobu TSURUTA  Hiroyuki SAKAI  Shigeru MASUYAMA

    LETTER
    Vol: E91-D No:4  Page(s): 986-989

    We propose a method for identifying informative DOM subtrees in Web pages from unfamiliar Web sites. Our method uses the layout data of DOM nodes generated by a generic Web browser. Experimental results show that our method outperforms a baseline method and identifies informative DOM subtrees robustly.

  • Accelerating Web Content Filtering by the Early Decision Algorithm

    Po-Ching LIN  Ming-Dao LIU  Ying-Dar LIN  Yuan-Cheng LAI

    PAPER-Contents Technology and Web Information Systems
    Vol: E91-D No:2  Page(s): 251-257

    Real-time content analysis is typically a bottleneck in Web filtering. To accelerate the filtering process, this work presents a simple but effective early decision algorithm that analyzes only part of the Web content. The algorithm makes the filtering decision, either to block or to pass the Web content, as soon as it is confident with high probability that the content really belongs to a banned or an allowed category. Experiments show that the algorithm needs to examine only around one-fourth of the Web content on average, while the accuracy remains fairly good: 89% for banned content and 93% for allowed content. This algorithm can complement other Web filtering approaches, such as URL blocking, to filter Web content with high accuracy and efficiency. Text classification algorithms in other applications can also follow the early decision principle to gain speed.
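
    The principle can be sketched with a toy log-odds text classifier that stops scanning as soon as its running score clears a confidence threshold; the word weights and threshold below are made up.

      def early_decision(words, log_odds, threshold=5.0):
          # log_odds maps a word to log P(w | banned) / P(w | allowed)
          score = 0.0
          for i, w in enumerate(words, 1):
              score += log_odds.get(w, 0.0)
              if abs(score) > threshold:                 # confident enough: stop early
                  return ("block" if score > 0 else "pass"), i
          return ("block" if score > 0 else "pass"), len(words)

      log_odds = {"casino": 2.0, "poker": 1.8, "news": -1.5, "weather": -1.2}
      page = ["casino", "poker", "casino", "bonus", "poker", "casino"] * 10
      print(early_decision(page, log_odds))   # decides after a few words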

  • Improvements of HITS Algorithms for Spam Links

    Yasuhito ASANO  Yu TEZUKA  Takao NISHIZEKI

    PAPER-Scoring Algorithms
    Vol: E91-D No:2  Page(s): 200-208

    The HITS algorithm proposed by Kleinberg is one of the representative methods of scoring Web pages by using hyperlinks. In the days when the algorithm was proposed, most of the pages given high scores by the algorithm were really related to a given topic, and hence the algorithm could be used to find related pages. However, on today's Web, neither the algorithm nor its variants, including the improved HITS algorithm (BHITS) proposed by Bharat and Henzinger, can be used to find related pages any more, due to the increase in spam links. In this paper, we first propose three methods to find "linkfarms," that is, sets of spam links forming a densely connected subgraph of a Web graph. We then present an algorithm, called a trust-score algorithm, that gives high scores to pages which, with high probability, are not spam pages. Combining the three methods and the trust-score algorithm with BHITS, we obtain several variants of the HITS algorithm. We ascertain by experiments that one of them, named TaN+BHITS, which uses the trust-score algorithm and finds linkfarms by employing name servers, is the most suitable for finding related pages on today's Web. Our algorithms require no more time and memory than the original HITS algorithm, and can be executed on a PC with a small amount of main memory.
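
    One way to picture the combination is a HITS iteration over a weighted adjacency matrix in which links flagged as spam carry weight close to zero; the sketch below is a generic weighted HITS, not the paper's TaN+BHITS.

      import numpy as np

      def weighted_hits(adj, weights, iters=50):
          # adj: n x n 0/1 link matrix; weights: per-link trust in [0, 1]
          W = adj * weights
          n = W.shape[0]
          hub = np.ones(n)
          auth = np.ones(n)
          for _ in range(iters):
              auth = W.T @ hub
              auth /= np.linalg.norm(auth) or 1.0
              hub = W @ auth
              hub /= np.linalg.norm(hub) or 1.0
          return auth, hub

      adj = np.array([[0, 1, 1],
                      [1, 0, 1],
                      [0, 0, 0]], dtype=float)
      weights = np.ones_like(adj)
      weights[0, 1] = weights[1, 0] = 0.01   # suspected linkfarm pair, down-weighted
      print(weighted_hits(adj, weights))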

  • Web Structure Mining by Isolated Cliques

    Yushi UNO  Yoshinobu OTA  Akio UEMICHI

    PAPER-Data Mining
    Vol: E90-D No:12  Page(s): 1998-2006

    The link structure of the Web is generally viewed as the webgraph. Web structure mining is a research area that mainly aims to find hidden communities by focusing on the webgraph, where communities or their cores are supposed to constitute dense subgraphs. Structure mining can therefore be realized by enumerating such substructures; among the candidate substructures, Kleinberg's biclique model is well known. In this paper, we examine several candidate substructures, including conventional bicliques, and attempt to find useful information in real Web data. In particular, we newly exploit isolated cliques in our structure mining experiments. As a result, we discovered that isolated cliques that span multiple domains can stand for useful communities, which supports the validity of isolated cliques as a candidate substructure for structure mining. On the other hand, we also observed that most isolated cliques on the Web correspond to menu structures and are confined to single domains, and that isolated cliques can be quite useful for detecting harmful link farms.
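
    A clique is commonly called isolated (in the sense of Ito and Iwama) when the number of edges leaving it is smaller than its size; whether this matches the paper's exact definition is an assumption. A networkx-based sketch:

      import networkx as nx

      def isolated_cliques(G):
          # a clique of k vertices is isolated if fewer than k edges leave it
          for clique in nx.find_cliques(G):        # maximal cliques
              members = set(clique)
              outgoing = sum(1 for v in clique
                             for u in G.neighbors(v) if u not in members)
              if outgoing < len(clique):
                  yield clique

      G = nx.Graph()
      G.add_edges_from([("a", "b"), ("b", "c"), ("a", "c"),   # a 3-clique (a menu?)
                        ("c", "d"), ("d", "e")])
      print(list(isolated_cliques(G)))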
