Traditionally, coastal ocean model data flows from High Performance Computing (HPC) centers to the local desktop, or to a file server from which only the data needed can be extracted via services such as OPeNDAP. Analysis and visualization are then conducted using local hardware and software. This requires moving large amounts of data across the internet, as well as acquiring and maintaining local hardware, software, and support personnel. Further, as data sets increase in size, this workflow may not scale. Recent advances offer an alternative: moving data from HPC to the Cloud and performing interactive, scalable, data-proximate analysis and visualization with only a web browser as the user interface. We use the framework advanced by the NSF-funded Pangeo project, a free, open-source Python system that provides multi-user login via JupyterHub and parallel analysis via Dask, both running in Docker containers orchestrated by Kubernetes. Data is stored in Zarr, a Cloud-friendly ndarray format that allows performant extraction of data by anyone without relying on data services like OPeNDAP. Interactive visual exploration of data on massive model grids is made possible by new tools in the Python PyViz ecosystem, which can render maps at screen resolution and update them dynamically on pan and zoom operations. Two examples are given: (1) calculating the maximum water level at each grid cell from a 53 GB, 720 time step, 9 million node triangular mesh ADCIRC simulation of Hurricane Ike; (2) creating a dashboard for visualizing data from the curvilinear orthogonal COAWST/ROMS forecast model.
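The first example, a maximum water level at each node across all time steps, is a reduction that can be computed one block of time steps at a time, which is what makes it suitable for chunked storage and parallel execution. The following is a minimal sketch of that idea only, not the authors' code: it uses plain NumPy on a tiny synthetic array rather than Dask on the Zarr-stored ADCIRC output, and the names `max_over_time`, `zeta`, and `chunk` are hypothetical.

```python
import numpy as np


def max_over_time(blocks):
    """Running element-wise maximum over an iterable of (time, node) blocks."""
    running = None
    for block in blocks:
        # Reduce this block along its time axis, then merge with the result so far.
        block_max = block.max(axis=0)
        running = block_max if running is None else np.maximum(running, block_max)
    return running


# Synthetic stand-in for water level with shape (time, node).
# The real simulation has 720 time steps and about 9 million nodes.
rng = np.random.default_rng(0)
n_time, n_node, chunk = 12, 5, 4
zeta = rng.normal(size=(n_time, n_node))

# Feed the data in blocks of `chunk` time steps, as a chunked store would.
blocks = (zeta[i:i + chunk] for i in range(0, n_time, chunk))
max_level = max_over_time(blocks)
```

Because each block is reduced independently before the partial results are merged, the per-block step can be distributed across workers, and only one block needs to be in memory at a time.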
|Title||Analysis and visualization of coastal ocean model data in the cloud|
|Authors||Richard P. Signell, Dharhas Pothina|
|Publication Subtype||Journal Article|
|Series Title||Journal of Marine Science and Engineering|
|Record Source||USGS Publications Warehouse|
|USGS Organization||Woods Hole Coastal and Marine Science Center|