The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] molecular biology database(1hit)

1-1hit
  • Querying Molecular Biology Databases by Integration Using Multiagents

    Hideo MATSUDA  Takashi IMAI  Michio NAKANISHI  Akihiro HASHIMOTO  

     
    PAPER-Distributed and Heterogeneous Databases

      Vol:
    E82-D No:1
      Page(s):
    199-207

    In this paper, we propose a method for querying heterogeneous molecular biology databases. Since molecular biology data are distributed into multiple databases that represent different biological domains, it is highly desirable to integrate data together with the correlations between the domains. However, since the total amount of such databases is very large and the data contained are frequently updated, it is difficult to maintain the integration of the entire contents of the databases. Thus, we propose a method for dynamic integration based on user demand, which is expressed with an OQL-based query language. By restricting search space according to user demand, the cost of integration can be reduced considerably. Multiple databases also exhibit much heterogeneity, such as semantic mismatching between the database schemas. For example, many databases employ their own independent terminology. For this reason, it is usually required that the task for integrating data based on a user demand should be carried out transitively; first search each database for data that satisfy the demand, then repeatedly retrieve other data that match the previously found data across every database. To cope with this issue, we introduce two types of agents; a database agent and a user agent, which reside at each database and at a user, respectively. The integration task is performed by the agents; user agents generate demands for retrieving data based on the previous search results by database agents, and database agents search their databases for data that satisfy the demands received from the user agents. We have developed a prototype system on a network of workstations. The system integrates four databases; GenBank (a DNA nucleotide database), SWISS-PROT, PIR (protein amino-acid sequence databases), and PDB (a protein three-dimensional structure database). Although the sizes of GenBank and PDB are each over one billion bytes, the system achieved good performance in searching such very large heterogeneous databases.