Model Estimates of Chlorophyll a and CyanoHABs Occurrence for Select New York State Lakes
This data release contains modeled estimates of chlorophyll a concentration and cyanoHAB occurrence for a subset of lakes in New York State. Estimates of chlorophyll a concentration were generated using a random forest model. Estimates of cyanoHAB occurrence were generated based on thresholds derived from bootstrapped logistic regression. All analysis was done in R 4.4.1 (R Core Team, 2024) and the full methods are described in Savoy et al. (2025).
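As a rough illustration of how a threshold can be derived from bootstrapped logistic regression, the sketch below refits a one-predictor logistic model on resampled data and takes the median predictor value at which the fitted probability reaches 0.5. It is not the authors' implementation: the release was produced in R and the actual procedure is described in Savoy et al. (2025). The 0.5 probability cutoff, the median summary, the gradient-ascent fit, the function names, and the sample values are all assumptions made for this example.

```python
import math
import random
import statistics


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def half_probability_point(x, y, steps=1000, lr=0.5):
    """Fit p = sigmoid(b0 + b1 * (x - mean(x))) by plain gradient ascent
    and return the x value at which the fitted probability equals 0.5."""
    mu = statistics.fmean(x)
    xc = [xi - mu for xi in x]  # centering improves conditioning
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(xc, y):
            resid = yi - sigmoid(b0 + b1 * xi)
            g0 += resid
            g1 += resid * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    if abs(b1) < 1e-12:
        return None  # no usable slope, so no crossing point
    return mu - b0 / b1


def bootstrap_threshold(x, y, n_boot=100, seed=1):
    """Median 0.5-probability point across bootstrap resamples
    (the median is an assumed summary, chosen for robustness)."""
    rng = random.Random(seed)
    idx = range(len(x))
    points = []
    for _ in range(n_boot):
        pick = rng.choices(idx, k=len(x))  # sample rows with replacement
        ys = [y[i] for i in pick]
        if len(set(ys)) < 2:
            continue  # only one class in this resample; cannot fit
        point = half_probability_point([x[i] for i in pick], ys)
        if point is not None:
            points.append(point)
    return statistics.median(points)


# Made-up predictor values and occurrence flags, for illustration only.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

threshold = bootstrap_threshold(x, y)
predicted = [int(xi >= threshold) for xi in x]  # 1 = cyanoHAB predicted
print(round(threshold, 2), predicted)
```

In this sketch, any observation whose predictor value meets or exceeds the threshold is classified as a predicted occurrence, and the same pattern would apply to each predictor (chlorophyll a, dominant wavelength, and so on) in turn.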
This data release includes the following data files, each bundled as a .zip file with its associated metadata:
random forest inputs and output.txt: Contains observed chlorophyll a concentrations, modeled estimates of chlorophyll a from a random forest model, and all necessary inputs to the model. The file format is tab-delimited with one observation per row.
chla_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a chlorophyll a concentration threshold, along with observed cyanoHAB occurrence and chlorophyll a concentrations. The file format is tab-delimited with one observation per row.
dwl_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a dominant wavelength threshold, along with observed cyanoHAB occurrence and dominant wavelength. The file format is tab-delimited with one observation per row.
kd_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on an irradiance attenuation coefficient threshold, along with observed cyanoHAB occurrence and the irradiance attenuation coefficient. The file format is tab-delimited with one observation per row.
tn_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a total nitrogen threshold, along with observed cyanoHAB occurrence and total nitrogen. The file format is tab-delimited with one observation per row.
tp_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a total phosphorus threshold, along with observed cyanoHAB occurrence and total phosphorus. The file format is tab-delimited with one observation per row.
The associated metadata for each data file follows the same naming convention, with a .xml extension.
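Because every data file is tab-delimited with one observation per row, a generic reader is enough to load any of them once unzipped. The sketch below uses Python's standard csv module. The helper name read_tab_delimited and the sample column names are hypothetical; the real column names are documented in each file's .xml metadata.

```python
import csv
import io


def read_tab_delimited(handle):
    """Return one dict per observation (row), keyed by the header names."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Hypothetical two-row stand-in for a file such as tp_logistic.txt.
# These column names are illustrative assumptions, not the real headers.
sample = io.StringIO(
    "lake_id\tvalue\tobserved\n"
    "A\t12.5\t1\n"
    "B\t3.0\t0\n"
)

rows = read_tab_delimited(sample)
print(len(rows), "observations; columns:", list(rows[0].keys()))
```

For a file on disk, the same helper can be called inside `with open("tp_logistic.txt", newline="") as f:`. All values are returned as strings, so numeric columns need explicit conversion.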
Citation Information
Publication Year | 2025 |
---|---|
Title | Model Estimates of Chlorophyll a and CyanoHABs Occurrence for Select New York State Lakes |
DOI | 10.5066/P13NI4NL |
Authors | Philip R. Savoy, Rebecca M. Gorney, Jennifer L. Graham |
Product Type | Data Release |
Record Source | USGS Asset Identifier Service (AIS) |
USGS Organization | New York Water Science Center |
Rights | This work is marked with CC0 1.0 Universal |