Welcome to the Interactive Oceans Data Portal

February 18, 2020

The interactive map interface provides a landing page that shows all OOI Regional Cabled Array assets at a glance with intuitive navigation tools.

The Interactive Oceans Data Portal is an interface designed to give scientists, educators, and students easy access to Ocean Observatories Initiative (OOI) data produced by the Regional Cabled Array (RCA). The goals of this cloud-based portal are to: 1) increase active use of RCA data by scientists and, in the future, support educators and public exploration of the data; 2) provide additional data search capabilities and visualization tools so that all users can easily discover and explore RCA data sets suited to addressing specific science hypotheses; 3) provide users with an intuitive, well-integrated, and easily accessible data download mechanism; and 4) facilitate scientific research output, accelerate the pace of scientific discovery, and engage a broader user base for OOI data.

The plotting interface is an easy-to-use application that allows new users to quickly explore OOI Regional Cabled Array data in different ways, including multiple-parameter time series (left) and depth profile plots compared with shipboard discrete samples (right).

The entry point for the Data Portal is a user-friendly, highly interactive map with high-resolution bathymetry that also highlights the RCA infrastructure. Icons on the map indicate specific cabled assets and are linked to instrument descriptions with operational status and images or video georeferenced to those sites. Data product details and data availability timelines are also included. A sidebar menu provides advanced search capabilities and data visualization tools for quickly creating plots of various types from cabled streaming data. Demonstrations in progress include prototype real-time plotting and echogram generation from the seafloor bioacoustic sonar installations.

Initially, the interface provided descriptions and data from a diverse suite of instruments hosted on the cabled Shallow Profiler moorings (including CTD, oxygen, pCO2, pH, nitrate, fluorometer, and PAR sensors). More recently, the underlying database has been expanded to include entries for multiple seafloor nodes and instrumentation, and the interface will soon include pathways to access data from all cabled instrumentation. The Data Portal provides Python-based libraries and an API (Yodapy) that allow users to more readily access and download OOI data through the OOI machine-to-machine (M2M) interface, i.e., with simpler syntax and request construction.
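To illustrate what "request construction" for a machine-to-machine interface involves, here is a minimal sketch that assembles a data request URL from instrument identifiers and a time range. It is not Yodapy's API. The base URL, the path layout, the query parameter names, and the example identifiers are all assumptions made for illustration and should be checked against current OOI documentation.

```python
from urllib.parse import urlencode

# ASSUMPTION: base URL and path layout of the M2M data endpoint.
# Verify against current OOI documentation before use.
M2M_BASE = "https://ooinet.oceanobservatories.org/api/m2m/12576/sensor/inv"


def build_m2m_url(site, node, sensor, method, stream, begin, end):
    """Assemble a request URL from instrument identifiers and a time range.

    This helper is hypothetical; it only shows the bookkeeping that a
    wrapper library can hide from the user.
    """
    path = "/".join([M2M_BASE, site, node, sensor, method, stream])
    # ASSUMPTION: the time window is passed as beginDT/endDT query parameters.
    query = urlencode({"beginDT": begin, "endDT": end})
    return f"{path}?{query}"


# Illustrative identifiers for a Shallow Profiler CTD (assumed, not verified).
url = build_m2m_url(
    site="RS01SBPS",
    node="SF01A",
    sensor="2A-CTDPFA102",
    method="streamed",
    stream="ctdpf_sbe43_sample",
    begin="2019-01-01T00:00:00.000Z",
    end="2019-01-02T00:00:00.000Z",
)
print(url)
```

A wrapper library can reduce all of this to a short call that names an instrument and a date range, which is the kind of simplification the paragraph above describes.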

The multi-instrument RCA datasets that the Interactive Oceans Data Portal makes available to users are pulled from the OOI uFrame system using M2M and subsequently stored in the cloud. The required computing machines and networking components are deployed on Amazon Web Services (AWS), with a Kubernetes cluster hosting the various services for the data portal. The architecture is designed to be scalable and automatically adjusts the number of servers to accommodate the number of users. The cloud-based storage uses both a PostgreSQL database and S3 buckets, with data stored in both NetCDF and Zarr formats. Recently, “smart data decimation” has been implemented in the portal’s data visualization interface, and continued development and refinement of such data exploration tools will improve the user experience and ease of access to RCA data.
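The article does not describe how the portal's "smart data decimation" works. As a rough illustration of the general idea, the sketch below uses min-max decimation, one common technique for thinning a dense time series before plotting while keeping its peaks and troughs visible. The function name, the binning scheme, and the sample data are assumptions and do not represent the portal's implementation.

```python
def minmax_decimate(times, values, n_bins):
    """Keep only the minimum and maximum sample in each of n_bins index bins.

    Hypothetical helper: returns at most 2 * n_bins samples in time order.
    """
    if n_bins <= 0:
        raise ValueError("n_bins must be positive")
    n = len(values)
    if n <= 2 * n_bins:
        # Already small enough; nothing to thin out.
        return list(times), list(values)

    out_t, out_v = [], []
    for b in range(n_bins):
        # Split the index range into n_bins nearly equal, non-empty bins.
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        idx = range(lo, hi)
        i_min = min(idx, key=values.__getitem__)
        i_max = max(idx, key=values.__getitem__)
        # Emit the extremes in time order, once if they coincide.
        for i in sorted({i_min, i_max}):
            out_t.append(times[i])
            out_v.append(values[i])
    return out_t, out_v


# Synthetic example: a flat signal with a single short spike.
times = list(range(100))
values = [0.0] * 100
values[37] = 9.0

dec_t, dec_v = minmax_decimate(times, values, n_bins=5)
print(len(values), "->", len(dec_v), "samples; spike kept:", 9.0 in dec_v)
```

Plain subsampling (keeping every k-th point) can skip a short-lived spike entirely, whereas keeping per-bin extremes preserves it. That property matters when a plot is meant to reveal brief events in high-rate sensor data.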

Future plans include providing quick-access preview plots, full access to real-time data plotting tools, visualization of data quality control results from the OOI system, and the addition of model and shipboard data to the plotting interface for use in cross-comparisons and data validation. Plotting and data visualization tools for multi-dimensional data (e.g., bioacoustic sonar, broadband hydrophone, and ADCP data) are also planned. Additional value-added products will include Python notebooks for education and outreach on topics such as whale vocalization and the incorporation of satellite data. These notebooks will be hosted on GitHub, will run on the scalable, open-source JupyterHub tools, and will democratize data access and processing in a reproducible way.