Title: Managing Derived Data in the Gaea Scientific DBMS

Author(s): Nabil I. Hachem, Ke Qiu, Michael Gennert, and Matthew O. Ward, Computer Science Department, Worcester Polytechnic Institute, Worcester, MA 01609

Source: 1993 VLDB Conference, August, 1993.

Abstract: One important aspect of scientific data management is metadata management. Metadata is information about data (e.g., content, source, processing applied, precision). One kind of metadata that needs special attention is data derivation information, i.e., how data are generated.

In our application domain of geographical information systems (GIS) and global change research, we view scientific objects according to three different extents: spatial, temporal, and derivation. While the spatial and temporal extents have been studied and formal semantics for them have been proposed, derivation semantics have been ignored.

This paper presents a framework for capturing and managing scientific data derivation histories as implemented in the Gaea scientific database management system. We focus on how Gaea handles metadata and propose to extend current semantic modeling and object-oriented technology with special constructs: concepts, processes, and tasks. Concepts are used to capture entity sets with imprecise definitions. A process captures the derivation procedure of a specific scientific object class, while a task is the instance representing the derivation of a scientific data object. We believe that this framework, useful for GIS and global change studies, generalizes well to other scientific fields.
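The distinction between a process (derivation at the class level) and a task (one recorded derivation of a particular data object) can be illustrated with a small Python sketch. This is not Gaea's interface, which the abstract does not describe. Every name here (`Process`, `Task`, `DataObject`, `DerivationLog`, `run`, `history`) is hypothetical, as are the vegetation-index example and its values. Concepts are left out because the abstract does not say how they are represented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Process:
    """Class-level derivation: how objects of one class derive from others.

    Hypothetical structure, not taken from the Gaea system.
    """
    name: str
    input_classes: Tuple[str, ...]
    output_class: str
    function: Callable[..., object]


@dataclass
class DataObject:
    """A scientific data object tagged with the name of its class."""
    object_id: str
    class_name: str
    value: object


@dataclass
class Task:
    """Instance-level derivation: one application of a Process."""
    process: Process
    inputs: Tuple[DataObject, ...]
    output: DataObject


class DerivationLog:
    """Runs processes and remembers which task produced each object."""

    def __init__(self) -> None:
        self._task_by_output: Dict[str, Task] = {}

    def run(self, process: Process, inputs: Tuple[DataObject, ...],
            output_id: str) -> DataObject:
        # Check that the inputs match the classes the process expects.
        classes = tuple(obj.class_name for obj in inputs)
        if classes != process.input_classes:
            raise TypeError(
                f"{process.name} expects {process.input_classes}, got {classes}"
            )
        value = process.function(*(obj.value for obj in inputs))
        output = DataObject(output_id, process.output_class, value)
        # Record the task so the derivation can be traced later.
        self._task_by_output[output_id] = Task(process, tuple(inputs), output)
        return output

    def history(self, object_id: str) -> List[str]:
        """Process names behind an object, depth-first; [] for base data."""
        task = self._task_by_output.get(object_id)
        if task is None:
            return []
        names = [task.process.name]
        for obj in task.inputs:
            names.extend(self.history(obj.object_id))
        return names


# Illustrative use with made-up band values.
nir = DataObject("nir-1", "NirBand", 0.6)
red = DataObject("red-1", "RedBand", 0.2)
ndvi = Process(
    name="compute_ndvi",
    input_classes=("NirBand", "RedBand"),
    output_class="VegetationIndex",
    function=lambda n, r: (n - r) / (n + r),
)
log = DerivationLog()
index = log.run(ndvi, (nir, red), "ndvi-1")
```

In this sketch the `Process` is defined once for a class of objects, while each call to `run` creates a separate `Task`, so two vegetation-index objects derived from different inputs share a process but have distinct derivation records.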

Matthew O. Ward (matt@cs.wpi.edu)