Jihong GUAN Shuigeng ZHOU Yan CHEN
As GML is becoming the de facto standard for geographic data storage, transmission and exchange, more and more geographic data exists in GML format. In practice, GML documents are usually very large because they contain many verbose markup tags and a large amount of spatial coordinate data. To speed up data transmission and reduce network cost, it is essential to develop effective and efficient GML compression tools. Although GML is a special case of XML, current XML compressors are not effective when applied directly to GML, because they were designed for general XML data. In this paper, we propose GPress, a compressor for effectively compressing GML documents. To the best of our knowledge, GPress is the first compressor designed specifically for GML document compression. GPress exploits the unique characteristics of GML documents to achieve good performance. Extensive experiments over real-world GML documents show that GPress clearly outperforms XMill (one of the best existing XML compressors) in compression ratio, while its compression efficiency is comparable to that of existing XML compressors.
The software requirements specification process consists of three steps: requirements capture and analysis, requirements definition and specification, and requirements validation. By the beginning of the second step, on which this paper focuses, several types of massive documents have been generated in the first step. Since the developers and the clients/users of the new software system may not share common knowledge of the field with which the system deals, it is difficult for the developers to produce a correct requirements specification from these documents. Little research has addressed this problem. The authors have developed a support tool that helps produce a correct requirements specification by arranging and restructuring those documents into clearly understandable forms. In the second step, the developers must specify the functions of the new system and their constraints from those documents. Having analyzed the developers' actual activities in order to design the support tool, the authors model this step as the following four activities. To specify the functions of the new system, the developers must collect the sentences, scattered across those documents, that may suggest the functions. To define the details of each function, the developers must gather the paragraphs that include descriptions of the function. To verify the correctness of each function, the developers must survey all related documents. To perform the above activities successfully, the developers must manage the various versions of those documents correctly. For these four types of activities, the authors propose effective ways to support the developers by arranging those documents. This paper presents algorithms based on this model that use the structures of the documents and keywords that may suggest functions or constraints. To examine the feasibility of their proposal, the authors implemented a prototype tool. The tool extracts complete information scattered across those documents. Experiments demonstrate the effectiveness of the proposal.
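The first activity in the abstract above, collecting sentences that may suggest functions by means of keywords, can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not detail. The cue-word list, the function names and the naive sentence splitter are all assumptions made for illustration only.

```python
import re

# Hypothetical cue words that may signal a function or a constraint.
# The abstract does not say which keywords the authors' tool uses.
FUNCTION_CUES = ("shall", "must", "should", "can", "allow")


def split_sentences(text):
    """Naively split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def collect_candidate_sentences(documents, cues=FUNCTION_CUES):
    """Return (document name, sentence index, sentence) for each sentence
    that contains at least one cue word.

    `documents` maps a document name to its plain text.
    """
    cue_set = set(cues)
    hits = []
    for name, text in documents.items():
        for index, sentence in enumerate(split_sentences(text)):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            if words & cue_set:
                hits.append((name, index, sentence))
    return hits


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    sample = {
        "interview.txt": "The clerk opens the ledger. "
                         "The system must print a receipt. "
                         "Users can cancel an order.",
        "memo.txt": "Lunch was at noon.",
    }
    for hit in collect_candidate_sentences(sample):
        print(hit)
```

Keeping the document name and sentence index with each hit is a design choice in this sketch: it lets a developer trace every candidate function back to its source document, which the verification activity described above would need.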
Ron SACKS-DAVIS Timothy ARNOLD-MOORE Justin ZOBEL
Documents stored in a database system can have complex internal structure described by languages such as SGML. Taking advantage of this structure presents challenges for database system implementors. We classify the types of queries that need to be supported by SGML-conformant database systems. We then describe several data models that have been proposed for representing documents in a database system and discuss the support these models provide for SGML. Finally, we consider query evaluation.