Title: Distributed Database Management for Scientific Data Analysis

Author(s): Nabil I. Hachem, Michael A. Gennert, and Matthew O. Ward, Computer Science Department, Worcester Polytechnic Institute, Worcester, MA 01609

Source: Int. Workshop on Global GIS, Tokyo, Aug. 1993.

Abstract: Scientific databases have recently become a challenging research area for a number of reasons: 1) the amount of data stored in scientific databases is rapidly increasing, with orders-of-magnitude increases on the horizon, 2) the data are becoming increasingly complex, as more complicated data structures and data relationships must be captured, 3) there is a need to integrate incompatible data formats, commercial databases, and analysis tools into a seamless environment, and 4) scientific databases are becoming distributed, i.e., no single site can archive all the data potentially required to conduct some experiments. Unless these challenges can be met, the scientific researcher will spend an inordinate amount of time manipulating bits and bytes, instead of focusing on the scientific problems of most interest.

In this paper we discuss these database issues in more depth and then describe the Gaea system, a spatio-temporal database management system under development at Worcester Polytechnic Institute. Gaea's main objectives include:

  1. providing DBMS support to all phases of scientific investigations by management of scientific data and meta-data,
  2. extending database technology with an intrinsic class of operators that is extensible and responds to the growing needs of scientific research,
  3. integrating spatial and temporal domains involving very large amounts of data, and
  4. allowing a clean extension to a distributed computing environment containing heterogeneous, specialized database and computing resources.

Matthew O. Ward (matt@cs.wpi.edu)