Quantifying watershed controls on fine sediment particles and nutrient loading to Lake Tahoe using data mining and machine learning

Since the late 1980s, the USGS has collected discharge, sediment, and water-quality data at seven major drainages under the Lake Tahoe Interagency Monitoring Program (LTIMP). Recently, continuous, real-time measurements of turbidity were added to the LTIMP. These data can be combined with in situ measurements, model simulations, and remotely sensed datasets available from the USGS, National Aeronautics and Space Administration (NASA), Natural Resources Conservation Service (NRCS), and the U.S. Forest Service (USFS). Together, these data will be analyzed using machine learning techniques to determine the key factors controlling fine sediment particle (FSP) and nutrient loads in LTIMP streams draining to Lake Tahoe.


The transport of fine sediment particles (FSP, < 16 µm in diameter) from upland drainage basins to Lake Tahoe has a profound impact on lake clarity. Persistent drought and decreased streamflow from 2012 to 2016 resulted in lower FSP and nutrient loads, significantly improving lake clarity. However, clarity was significantly reduced by above-average precipitation during water years 2017 (wettest year on record) and 2019 (4th wettest year on record). The Lake Tahoe total maximum daily load has targeted reduction efforts for FSP (Coats, 2010; Roberts and others, 2018). In addition, the overall trend in summer clarity has been decreasing, with large annual variations likely caused by extremes in climate conditions (TERC, 2019). The impacts of climate and watershed dynamics on lake clarity remain poorly constrained.

Lake Tahoe is exhibiting a higher than average rate of warming at +0.015 °C per year (Coats and others, 2006). This decadal warming trend is further exacerbated by more frequent snow accumulation and melt cycles that occur earlier in spring, promoting thermal stratification, reducing deep mixing, and promoting biological processes that further impact lake clarity (Coats, 2010). The timing and magnitude of streamflow are largely driven by snow accumulation and melt. During heavy snowpack years, runoff occurs later in the year, and streams carry substantial sediment loads to Lake Tahoe well into the summer months. During snow-drought years, peak discharge and sediment load from streams are much lower and occur earlier in spring. Increases in global temperatures over the last century have resulted in changes to air temperatures, snow accumulation, snow-to-precipitation ratio, snowmelt timing, and runoff (Cayan and others, 2001; Coats, 2010; Dettinger, 2013).

Stream discharge, real-time turbidity, and fine sediment particle (FSP) load for five USGS-LTIMP stations in the Lake Tahoe Basin. The project goal is to interpret the drivers of FSP flux within each watershed using machine learning.

LTIMP watersheds with real-time turbidity data. Each 1-km grid cell denotes a modeling cell where soil moisture, temperature, and snow-water equivalent will be simulated and used as training data for statistical and machine learning methods such as random forest, empirical orthogonal functions, principal component regression, and stepwise regression.


The following research questions will be addressed using historical LTIMP data and real-time monitoring:

  1. How do watershed processes related to snowmelt and runoff timing influence FSP and nutrients delivered to Lake Tahoe?
  2. What hydrologic drivers control FSP and nutrient flux at daily, monthly, and seasonal time scales at LTIMP sites?

This project aims to exploit the abundant scientific datasets available for Lake Tahoe and its major tributaries by compiling these data into a web-based extraction and visualization tool (Lake Tahoe HydroMapper, LTH) and then using a machine learning approach to identify watershed conditions controlling loads entering the lake. We further propose to test the following hypotheses: 

  • H1 – Individual storm events with higher ratios of rain to snowfall within a given watershed result in larger pulses of FSP; 
  • H2 – Antecedent watershed conditions (e.g., soil moisture and temperature) in the fall months directly regulate seasonal FSP by controlling the pathway (infiltration versus overland flow) of seasonal runoff waters; and
  • H3 – Annual variations in runoff timing and volume control total seasonal sediment flux and nutrient concentrations, which in turn impact summer lake clarity.
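As a rough illustration of how H1 might be screened, the sketch below computes the rain fraction of each storm event and its correlation with the size of the FSP pulse. The function names, the event values, and the units are all hypothetical and are not LTIMP observations; a real analysis would need many more events and a defensible way of separating rain from snowfall.

```python
from math import sqrt


def rain_fraction(rain_mm, snow_mm):
    """Share of event precipitation that fell as rain (0 = all snow, 1 = all rain)."""
    total = rain_mm + snow_mm
    if total <= 0:
        raise ValueError("event has no precipitation")
    return rain_mm / total


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up storm events for illustration only:
# (rain in mm, snowfall in mm water equivalent, peak FSP load in arbitrary units)
events = [
    (2.0, 18.0, 1.1),
    (10.0, 10.0, 2.4),
    (15.0, 5.0, 3.0),
    (20.0, 0.0, 4.2),
]

fractions = [rain_fraction(rain, snow) for rain, snow, _ in events]
pulses = [pulse for _, _, pulse in events]
r = pearson(fractions, pulses)
print(f"rain fractions: {fractions}")
print(f"correlation with FSP pulse: {r:.2f}")
```

A positive correlation in data like these would be consistent with H1 but would not confirm it, since storm size and antecedent conditions vary together with rain fraction.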

This project will review and incorporate non-USGS data, as identified and prioritized by partners, for display on the Lake Tahoe HydroMapper (LTH) web application developed as part of this project and on the USFS Lake Tahoe Basin Management Unit (LTBMU) website. Online display of compiled data on the LTH and LTBMU websites will highlight interagency federal partnerships, data collection, and scientific research in the Tahoe Basin.



The USGS will compile and analyze hydrological data hypothesized to affect nutrient, turbidity, and suspended sediment levels in Lake Tahoe for the period of record covered by the LTIMP monitoring (approximately 1980 to present). Data analysis will use statistical approaches to assess the importance of watershed factors contributing to the more recent continuous instream turbidity measurements.

In particular, the hydrologic function of the seven watersheds listed in Table 1 will be assessed by determining the key hydrologic conditions that drive variability of instream water quality at daily, monthly, and annual time intervals. The drivers may be a combination of dynamic (e.g., snowpack level, antecedent soil moisture or temperature, meteorological conditions) and static (e.g., terrain, geology, soil/vegetation) conditions within each watershed, as well as various urban impacts and best management practices.
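Comparing drivers across daily, monthly, and annual intervals requires the daily record to be aggregated consistently. The sketch below shows one possible approach, grouping a daily series by calendar month and by water year (1 October through 30 September, named for the year in which it ends). The helper names and the turbidity values are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def water_year(d):
    """Water year runs 1 October through 30 September, named for its ending year."""
    return d.year + 1 if d.month >= 10 else d.year


def aggregate(daily, key):
    """Average daily (date, value) pairs within groups defined by key(date)."""
    groups = defaultdict(list)
    for day, value in daily:
        groups[key(day)].append(value)
    return {group: mean(values) for group, values in groups.items()}


# Hypothetical daily mean turbidity values spanning a water-year boundary
start = date(2016, 9, 29)
values = [2.0, 4.0, 6.0, 10.0]
daily = [(start + timedelta(days=i), v) for i, v in enumerate(values)]

monthly = aggregate(daily, lambda d: (d.year, d.month))
annual = aggregate(daily, water_year)
print(monthly)
print(annual)
```

Grouping by water year rather than calendar year keeps a single snow accumulation and melt season within one annual value, which matters for the snow-driven streams described here.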


Table 1. LTIMP monitoring locations for discharge, sediment and nutrient collection, and real-time temperature and turbidity.

                                                                   Year Data Collection Started
Site ID   USGS Site Name                            Drainage Area  Discharge  Stream Temperature  Turbidity  Sediment and nutrient monitoring
10336610  UPPER TRUCKEE RB AT SOUTH LAKE TAHOE, CA  54.9           1971       1981                2014       1980
10336645  GENERAL C NR MEEKS BAY, CA                7.44           1980       1980                2016       1980
10336660  BLACKWOOD C NR TAHOE CITY, CA             11.2           1960       1980                2016       1980
10336676  WARD C AT HWY 89 NR TAHOE PINES, CA       9.70           1972       1980                2016       1980
10336780  TROUT CK NR TAHOE VALLEY, CA              36.7           1960       1981                2016       1980
10336698  THIRD CK NR CRYSTAL BAY, NV               6.05           1969       1980                2019       1980
10336700  INCLINE CK NR CRYSTAL BAY, NV             6.741          1969       2019                2019       2019


View of near shore Incline Creek, NV.

(Public domain.)


Using statistical models, each individual dataset will be assessed for its ability to predict stream turbidity (i.e., FSP) at intra-annual time scales, and LTIMP nutrient and FSP loads at interannual time scales. A machine learning approach, such as random forests (Breiman, 2001; Clewley and others, 2017), will be used to assess the relative importance of each input dataset to the statistical model. Random forest is an extension of the classification and regression tree algorithm that uses multiple decision trees. Each decision tree is constructed from a bootstrap resample of the training data, using randomly selected subsets of the input datasets. This method allows the average ‘importance’ of each input dataset to be determined for a target variable (e.g., continuous FSP, annual nutrient load).
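To make the idea of variable importance concrete, the following is a minimal, standard-library sketch of a small regression forest with out-of-bag permutation importance. It is not the project's implementation: the predictor names, the synthetic data, and all parameter values are assumptions, and a production analysis would more likely rely on an established random forest library.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, rng, depth=0, max_depth=3, min_leaf=5, mtry=2):
    """Grow one regression tree; each split tries a random subset of features."""
    if depth == max_depth or len(y) < 2 * min_leaf:
        return mean(y)  # leaf: predict the mean target
    best, best_score = None, sse(y)
    for f in rng.sample(range(len(X[0])), mtry):
        # Try up to 10 observed values of feature f as candidate thresholds
        for thr in rng.sample([row[f] for row in X], min(10, len(X))):
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (f, thr), score
    if best is None:
        return mean(y)
    f, thr = best
    lo = [i for i, row in enumerate(X) if row[f] <= thr]
    hi = [i for i, row in enumerate(X) if row[f] > thr]
    grow = lambda idx: build_tree([X[i] for i in idx], [y[i] for i in idx],
                                  rng, depth + 1, max_depth, min_leaf, mtry)
    return (f, thr, grow(lo), grow(hi))


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree


def mse(tree, X, y):
    return mean((predict(tree, row) - t) ** 2 for row, t in zip(X, y))


def forest_importance(X, y, n_trees=30, seed=0):
    """Mean increase in out-of-bag error when each feature is permuted."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    increases = [[] for _ in range(p)]
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]  # rows the tree never saw
        if not oob:
            continue
        tree = build_tree([X[i] for i in boot], [y[i] for i in boot], rng)
        X_oob, y_oob = [X[i] for i in oob], [y[i] for i in oob]
        base = mse(tree, X_oob, y_oob)
        for j in range(p):
            col = [row[j] for row in X_oob]
            rng.shuffle(col)  # break the link between feature j and the target
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_oob, col)]
            increases[j].append(mse(tree, permuted, y_oob) - base)
    return [mean(v) for v in increases]


# Synthetic data: the target depends strongly on the first predictor,
# weakly on the second, and not at all on the third (all names hypothetical).
data_rng = random.Random(42)
names = ["snowmelt_rate", "soil_moisture", "unrelated_noise"]
X = [[data_rng.random() for _ in names] for _ in range(150)]
y = [5.0 * melt + 1.0 * soil + data_rng.gauss(0, 0.1) for melt, soil, _ in X]

importance = forest_importance(X, y)
for name, score in sorted(zip(names, importance), key=lambda t: -t[1]):
    print(f"{name:16s} {score:.3f}")
```

Permuting a predictor that the trees rely on degrades out-of-bag predictions, so the strongly weighted predictor should rank first, while a predictor unrelated to the target should score near zero.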



Breiman, L., 2001, Random forests: Machine Learning, v. 45, no. 1, p. 5-32.

Cayan, D.R., Dettinger, M.D., Kammerdiener, S.A., Caprio, J.M., and Peterson, D.H., 2001, Changes in the Onset of Spring in the Western United States: Bulletin of the American Meteorological Society, v. 82, no. 3, p. 399-415.

Clewley, D., Whitcomb, J.B., Akbar, R., Silva, A.R., Berg, A., Adams, J.R., Caldwell, T., Entekhabi, D., and Moghaddam, M., 2017, A method for upscaling in situ soil moisture measurements to satellite footprint scale using random forests: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, v. 10, no. 6, p. 2663-2673.

Coats, R., 2010, Climate change in the Tahoe basin: regional trends, impacts and drivers: Climatic Change, v. 102, no. 3-4, p. 435-466.

Coats, R., Perez-Losada, J., Schladow, G., Richards, R., and Goldman, C., 2006, The warming of Lake Tahoe: Climatic Change, v. 76, no. 1-2, p. 121-148.

Dettinger, M.D., 2013, Projections and downscaling of 21st century temperatures, precipitation, radiative fluxes and winds for the Southwestern US, with focus on Lake Tahoe: Climatic Change, v. 116, no. 1, p. 17-33.

Roberts, D.C., Forrest, A.L., Sahoo, G.B., Hook, S.J., and Schladow, S.G., 2018, Snowmelt Timing as a Determinant of Lake Inflow Mixing: Water Resources Research, v. 54, no. 2, p. 1237-1251, doi:10.1002/2017wr021977.

TERC, 2019, Tahoe State of the Lake Report 2019: Tahoe Environmental Research Center, 97 p.