Structured documents often contain character strings whose semantics can be naturally stored as database values or correspond directly to database values. By building bidirectional logical links between character strings in documents and the corresponding database values, semantically rich queries become expressible. We introduce a new abstract data type (ADT), named "paratext," to model text that has links to database values. A paratext is logically viewed as two parallel layers: the "appearance" layer holds ordinary text (i.e., a linear sequence of character strings), while the "reference" layer holds an array of OIDs and literals. Each OID or literal on the reference layer is associated with a contiguous substring of the appearance-layer text and represents the semantics of that substring. We have also designed domain-specific functions for this document model; with these functions, we can express queries that go back and forth between the two layers. In structured documents, such character strings can appear as the whole content of a logical element or as phrases inside logical elements. We also present frameworks for implementing the paratext ADT and discuss how traditional full-text indexing techniques can be extended to support paratext.
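The two-layer view described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' design or implementation: the names `OID`, `Link`, `Paratext`, `make_link`, `refs_for`, and `substrings_for`, the sample sentence, and the identifier `city:kyoto` are all hypothetical, chosen only to show how a reference-layer entry might be tied to a contiguous substring of the appearance layer and queried in both directions.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class OID:
    """Hypothetical stand-in for a database object identifier."""
    value: str


# A reference-layer entry is either an OID or a literal (assumed str/int here).
Ref = Union[OID, str, int]


@dataclass(frozen=True)
class Link:
    """Associates one reference-layer entry with appearance[start:end]."""
    start: int
    end: int
    ref: Ref


@dataclass
class Paratext:
    appearance: str        # ordinary text
    reference: List[Link]  # OIDs and literals tied to substrings

    def __post_init__(self) -> None:
        # Each link must cover a non-empty, in-bounds contiguous substring.
        for link in self.reference:
            if not (0 <= link.start < link.end <= len(self.appearance)):
                raise ValueError(f"link out of bounds: {link}")

    def refs_for(self, phrase: str) -> List[Ref]:
        """Appearance -> reference: entries whose substring equals phrase."""
        return [l.ref for l in self.reference
                if self.appearance[l.start:l.end] == phrase]

    def substrings_for(self, ref: Ref) -> List[str]:
        """Reference -> appearance: substrings linked to the given entry."""
        return [self.appearance[l.start:l.end] for l in self.reference
                if l.ref == ref]


def make_link(text: str, phrase: str, ref: Ref) -> Link:
    """Helper (hypothetical): link the first occurrence of phrase to ref."""
    start = text.index(phrase)
    return Link(start, start + len(phrase), ref)


text = "Kyoto hosted the meeting in 1997."
doc = Paratext(
    appearance=text,
    reference=[
        make_link(text, "Kyoto", OID("city:kyoto")),  # substring -> OID
        make_link(text, "1997", 1997),                # substring -> literal
    ],
)
```

With this sketch, `doc.refs_for("Kyoto")` moves from the appearance layer to the reference layer, and `doc.substrings_for(1997)` moves in the opposite direction. A real system would presumably resolve OIDs against the database and index both layers, which this sketch does not attempt.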
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Masatoshi YOSHIKAWA, Hiroyuki KATO, Hiroko KINUTANI, "Design Framework of a Database for Structured Documents with Object Links," in IEICE TRANSACTIONS on Information, vol. E82-D, no. 1, pp. 147-155, January 1999.
URL: https://global.ieice.org/en_transactions/information/10.1587/e82-d_1_147/_p
@ARTICLE{e82-d_1_147,
author={Masatoshi YOSHIKAWA and Hiroyuki KATO and Hiroko KINUTANI},
journal={IEICE TRANSACTIONS on Information},
title={Design Framework of a Database for Structured Documents with Object Links},
year={1999},
volume={E82-D},
number={1},
pages={147-155},
abstract={Structured documents often contain character strings whose semantics can be naturally stored as database values or correspond directly to database values. By building bidirectional logical links between character strings in documents and the corresponding database values, semantically rich queries become expressible. We introduce a new abstract data type (ADT), named "paratext," to model text that has links to database values. A paratext is logically viewed as two parallel layers: the "appearance" layer holds ordinary text (i.e., a linear sequence of character strings), while the "reference" layer holds an array of OIDs and literals. Each OID or literal on the reference layer is associated with a contiguous substring of the appearance-layer text and represents the semantics of that substring. We have also designed domain-specific functions for this document model; with these functions, we can express queries that go back and forth between the two layers. In structured documents, such character strings can appear as the whole content of a logical element or as phrases inside logical elements. We also present frameworks for implementing the paratext ADT and discuss how traditional full-text indexing techniques can be extended to support paratext.},
month={January},
}
TY  - JOUR
TI  - Design Framework of a Database for Structured Documents with Object Links
T2  - IEICE TRANSACTIONS on Information
SP  - 147
EP  - 155
AU  - Masatoshi YOSHIKAWA
AU  - Hiroyuki KATO
AU  - Hiroko KINUTANI
PY  - 1999
JO  - IEICE TRANSACTIONS on Information
VL  - E82-D
IS  - 1
JA  - IEICE TRANSACTIONS on Information
Y1  - January 1999
AB  - Structured documents often contain character strings whose semantics can be naturally stored as database values or correspond directly to database values. By building bidirectional logical links between character strings in documents and the corresponding database values, semantically rich queries become expressible. We introduce a new abstract data type (ADT), named "paratext," to model text that has links to database values. A paratext is logically viewed as two parallel layers: the "appearance" layer holds ordinary text (i.e., a linear sequence of character strings), while the "reference" layer holds an array of OIDs and literals. Each OID or literal on the reference layer is associated with a contiguous substring of the appearance-layer text and represents the semantics of that substring. We have also designed domain-specific functions for this document model; with these functions, we can express queries that go back and forth between the two layers. In structured documents, such character strings can appear as the whole content of a logical element or as phrases inside logical elements. We also present frameworks for implementing the paratext ADT and discuss how traditional full-text indexing techniques can be extended to support paratext.
ER  - 