Dr. Jacob Zwart (he/him) is a data scientist for the USGS Water Resources Mission Area.
Jacob Zwart works within the Data Science Branch of the Water Resources Mission Area to develop aquatic ecosystem modeling techniques that provide timely information to stakeholders about important water resources across the nation. He uses his expertise in computational modeling, data assimilation, and limnology to help produce short-term forecasts of water quality at regional scales to aid in water resources decision making. Jacob’s research themes are: 1) improve understanding of aquatic biogeochemical processes and predict how these processes may respond to future global change, 2) develop techniques to inject scientific knowledge into machine learning models to make accurate predictions of environmental variables (also known as “knowledge-guided machine learning”), and 3) advance methods for assimilating real-time observations into knowledge-guided machine learning models to improve near-term forecasts of water quality. Jacob also serves as a Peer Support Worker at USGS, promoting awareness of and education on USGS policies covering anti-harassment, discrimination, bias, and scientific integrity, and providing peer-to-peer support for USGS employees.
Professional Experience
2021 – present: Data Scientist, Integrated Information Dissemination Division
2019 – 2021: Mendenhall Postdoctoral Fellow, Integrated Information Dissemination Division
2017 – 2019: National Science Foundation Earth Sciences Postdoctoral Fellow, Integrated Information Dissemination Division
2014 – 2017: National Science Foundation Graduate Research Fellow, University of Notre Dame
2012 – 2014: Research and Teaching Assistant, University of Notre Dame
Education and Certifications
Ph.D., Biological Sciences, University of Notre Dame, 2017
B.S., Biology, Calvin College, 2012
Honors and Awards
U.S. Geological Survey Mendenhall Postdoctoral Fellowship, 2019 – 2021
National Science Foundation Earth Sciences Postdoctoral Fellowship, 2017 – 2019
National Science Foundation Graduate Research Fellowship, 2014 – 2017
University of Notre Dame Linked Experimental Ecosystem Facility Research Grant, 2017
Exceptional Promise in Graduate Research Award, Ecological Society of America Aquatic Ecology Section, 2015
University of Notre Dame Center for Aquatic Conservation Graduate Fellow, 2014
University of Notre Dame Environmental Research Center Graduate Research Fellowship, 2013 – 2015
University of Notre Dame Environmental Research Center Graduate Mentoring Fellowship, 2012
Science and Products
- Data
Data to support near-term forecasts of stream temperature using process-guided deep learning and data assimilation
This data release contains the forcings and outputs of 7-day ahead maximum water temperature forecasting models that made real-time predictions in the Delaware River Basin during 2021. The model is driven by weather forecasts and observed reservoir releases and produces maximum water temperature forecasts for the issue day (day 0) and 7 days into the future (days 1-7) at five sites. This data rele...
Multi-task Deep Learning for Water Temperature and Streamflow Prediction (ver. 1.1, June 2022)
This item contains data and code used in experiments that produced the results for Sadler et al. (2022) (see below for full reference). We ran five experiments for the analysis: Experiment A, Experiment B, Experiment C, Experiment D, and Experiment AuxIn. Experiment A tested multi-task learning for predicting streamflow with 25 years of training data and using a different model for each of 101 sit...
Data to support water quality modeling efforts in the Delaware River Basin
This data release contains information to support water quality modeling in the Delaware River Basin (DRB). These data support both process-based and machine learning approaches to water quality modeling, including the prediction of stream temperature. Reservoirs in the DRB serve an important role as a source of drinking water, but also affect downstream water quality. Therefore, this data release...
Predicting water temperature in the Delaware River Basin
Daily temperature predictions in the Delaware River Basin (DRB) can inform decision makers who can use cold-water reservoir releases to maintain thermal habitat for sensitive fish and mussel species. This data release supports a variety of flow and water temperature modeling efforts and provides the inputs and outputs of both machine learning and process-based modeling methods across 456 river rea...
Data release: Process-based predictions of lake water temperature in the Midwest US
Climate change has been shown to influence lake temperatures in different ways. To better understand the diversity of lake responses to climate change and give managers tools to manage individual lakes, we focused on improving prediction accuracy for daily water temperature profiles in 7,150 lakes in Minnesota and Wisconsin during 1980-2019. The data are organized into these items: Spatial data...
- Publications
Machine learning for understanding inland water quantity, quality, and ecology
This chapter provides an overview of machine learning models and their applications to the science of inland waters. Such models serve a wide range of purposes for science and management: predicting water quality, quantity, or ecological dynamics across space, time, or hypothetical scenarios; vetting and distilling raw data for further modeling or analysis; generating and exploring hypotheses; est...
Physics-guided recurrent neural networks for predicting lake water temperature
This chapter presents a physics-guided recurrent neural network model (PGRNN) for predicting water temperature in lake systems. Standard machine learning (ML) methods, especially deep learning models, often require a large amount of labeled training samples, which are often not available in scientific problems due to the substantial human labor and material costs associated with data collection. M...
Can machine learning accelerate process understanding and decision-relevant predictions of river water quality?
The global decline of water quality in rivers and streams has resulted in a pressing need to design new watershed management strategies. Water quality can be affected by multiple stressors including population growth, land use change, global warming, and extreme events, with repercussions on human and ecosystem health. A scientific understanding of factors affecting riverine water quality and pred...
Multi-task deep learning of daily streamflow and water temperature
Deep learning (DL) models can accurately predict many hydrologic variables including streamflow and water temperature; however, these models have typically predicted hydrologic variables independently. This study explored the benefits of modeling two interdependent variables, daily average streamflow and daily average stream water temperature, together using multi-task DL. A multi-task scaling fac...
Physics-guided machine learning from simulation data: An application in modeling lake and river systems
This paper proposes a new physics-guided machine learning approach that incorporates the scientific knowledge in physics-based models into machine learning models. Physics-based models are widely used to study dynamical systems in a variety of scientific and engineering problems. Although they are built based on general physical laws that govern the relations from input to output variables, these...
The AEMON-J “Hacking Limnology” workshop series & virtual summit: Incorporating data science and open science in aquatic research
Following the 2020 “Virtual Summit: Incorporating Data Science and Open Science in Aquatic Research” (DSOS; Meyer and Zwart 2020), a grassroots group of scientists convened the 2nd Virtual DSOS Summit on 22–23 July 2021. DSOS combined forces with the Aquatic Ecosystem MOdeling Network - Junior (AEMON-J; https://github.com/aemon-j) to host a 4-d “Hacking Limnology” Workshop Series prior to the summ...
Physics-guided machine learning for scientific discovery: An application in simulating lake temperature profiles
Physics-based models are often used to study engineering and environmental systems. The ability to model these systems is the key to achieving our future environmental sustainability and improving the quality of human life. This article focuses on simulating lake water temperature, which is critical for understanding the impact of changing climate on aquatic ecosystems and assisting in aquatic res...
Graph-based reinforcement learning for active learning in real time: An application in modeling river networks
Effective training of advanced ML models requires large amounts of labeled data, which is often scarce in scientific problems given the substantial human labor and material cost to collect labeled data. This poses a challenge on determining when and where we should deploy measuring instruments (e.g., in-situ sensors) to collect labeled data efficiently. This problem differs from traditional pool-b...
Heterogeneous stream-reservoir graph networks with data assimilation
Accurate prediction of water temperature in streams is critical for monitoring and understanding biogeochemical and ecological processes in streams. Stream temperature is affected by weather patterns (such as solar radiation) and water flowing through the stream network. Additionally, stream temperature can be substantially affected by water releases from man-made reservoirs to downstream segments...
Process-guided deep learning predictions of lake water temperature
The rapid growth of data in water resources has created new opportunities to accelerate knowledge discovery with the use of advanced deep learning tools. Hybrid models that integrate theory with state-of-the-art empirical techniques have the potential to improve predictions while remaining true to physical laws. This paper evaluates the Process-Guided Deep Learning (PGDL) hybrid modeling framework...
Cross-scale interactions dictate regional lake carbon flux and productivity response to future climate
Lakes support globally important food webs through algal productivity and contribute significantly to the global carbon cycle. However, predictions of how broad-scale lake carbon flux and productivity may respond to future climate are extremely limited. Here, we used an integrated modeling framework to project changes in lake-specific and regional primary productivity and carbon fluxes under 21st...
Improving estimates and forecasts of lake carbon dynamics using data assimilation
Lakes are biogeochemical hotspots on the landscape, contributing significantly to the global carbon cycle despite their small areal coverage. Observations and models of lake carbon pools and fluxes are rarely explicitly combined through data assimilation despite successful use of this technique in other fields. Data assimilation adds value to both observations and models by constraining models wit...