About

This project will create software to support research on Earth's deep-time history. The co-evolution of the geosphere and biosphere is one of the fundamental questions of 21st-century Earth science. The multi-disciplinary nature of co-evolution research is reflected in the wide range of datasets that need to be integrated. Over the past decades, many open data facilities have been built with support from NSF and other sources. However, the shortage of efficient methods for accessing and synthesizing multi-source datasets hampers data-intensive co-evolution research. Geologic time is an essential topic in the study of the co-evolving geosphere and biosphere, and it can serve as a common reference to connect parameters across data silos. This project will improve the machine readability and alignment of global, regional and local geologic time standards, and will build a knowledge base of deep time together with Web services for it. All deliverables will be well documented and offered under open access to promote a national cyberinfrastructure ecosystem. The planned tasks and activities will leverage existing data facilities, facilitate executable and reproducible workflows, generate best practices for cross-disciplinary data science, provide state-of-the-art materials for education programs, and engage women and underrepresented groups. Shared through the national cyberinfrastructure, the knowledge base built in this project will support a broad range of research, education and outreach programs, benefiting not only science and engineering but also society at large.

The research question to be addressed is the heterogeneity of geologic time concepts, which hampers data synthesis across multiple data facilities. Accordingly, the objective of this project is to build a knowledge base of deep time that automates geoscience data access and integration in the open data environment and supports data synthesis in executable workflows for data-intensive scientific discovery. The development approach will include both top-down and bottom-up tracks, leveraging previous work on geologic time ontologies and addressing end-user needs through use case analyses. With carefully designed activities and a work plan, deliverables from this project will include a machine-readable knowledge base of aligned geologic time standards, services and packages for accessing and querying the knowledge base, and best practices for data synthesis in workflow platforms for studying co-evolution. The knowledge base of deep time will help co-evolution researchers tackle data heterogeneity issues. Robust services for the knowledge base will be built to support automated data synthesis in workflow platforms and to advance co-evolution research. The source code and metadata of the knowledge base will be released on GitHub and registered in community repositories to enable reuse and adaptation.