Predicted pH at the domestic and public supply drinking water depths, Central Valley, California

March 8, 2017

This scientific investigations map is a product of the U.S. Geological Survey (USGS) National Water-Quality Assessment (NAWQA) project modeling and mapping team. The prediction grids depicted in this map show continuous pH and are intended to provide an understanding of groundwater-quality conditions in the domestic and public supply drinking water zones of the Central Valley of California. In all aquifers, pH often influences the chemical quality of groundwater and the fate of many contaminants. These grids are of interest to water-resource managers, water-quality researchers, and groundwater modelers concerned with the occurrence of natural and anthropogenic contaminants related to pH. In this work, the median depth of wells categorized as domestic supply was 30 meters below land surface, and the median depth of wells categorized as public supply was 100 meters below land surface. Prediction grids were created using boosted regression trees (BRT) with a Gaussian error distribution, applied within a statistical learning framework in the R computing environment. The statistical learning framework seeks to maximize the predictive performance of machine learning methods through model tuning by cross validation. The response variable was measured pH from 1,337 wells, compiled from two sources: the USGS National Water Information System (NWIS) database (all data are publicly available from the USGS) and the California State Water Resources Control Board Division of Drinking Water (SWRCB-DDW) database (water-quality data are publicly available from the SWRCB). Only wells with measured pH and well depth data were selected, and for wells with multiple records, only the most recent sample in the period 1993–2014 was used. A total of 1,003 wells (training dataset) were used to train the BRT model, and 334 wells (hold-out dataset) were used to validate the prediction model.
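The record selection and the training/hold-out partition described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R workflow: the field names (`well_id`, `sample_date`, `ph`, `depth_m`), the sample records, and the seeded random split are all assumptions. The 25 percent hold-out fraction is inferred from the reported counts (334 of 1,337 wells); the source does not say how wells were assigned to each dataset.

```python
import random
from datetime import date


def latest_sample_per_well(records, start_year=1993, end_year=2014):
    """Keep one record per well: the most recent sample inside the period.

    Records lacking a pH value or a well depth are dropped first.
    """
    latest = {}
    for rec in records:
        if rec["ph"] is None or rec["depth_m"] is None:
            continue
        if not (start_year <= rec["sample_date"].year <= end_year):
            continue
        current = latest.get(rec["well_id"])
        if current is None or rec["sample_date"] > current["sample_date"]:
            latest[rec["well_id"]] = rec
    return list(latest.values())


def split_train_holdout(wells, holdout_fraction=0.25, seed=0):
    """Randomly partition wells into (training, hold-out) lists.

    The random assignment and the fraction are assumptions for illustration.
    """
    shuffled = wells[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Hypothetical records, invented for illustration (not USGS or SWRCB data).
records = [
    {"well_id": "A", "sample_date": date(1995, 6, 1), "ph": 7.4, "depth_m": 28.0},
    {"well_id": "A", "sample_date": date(2010, 3, 9), "ph": 7.6, "depth_m": 28.0},
    {"well_id": "B", "sample_date": date(1990, 1, 5), "ph": 8.1, "depth_m": 95.0},
    {"well_id": "C", "sample_date": date(2001, 8, 2), "ph": None, "depth_m": 40.0},
    {"well_id": "D", "sample_date": date(2014, 12, 31), "ph": 7.9, "depth_m": 110.0},
]
wells = latest_sample_per_well(records)
train, holdout = split_train_holdout(wells)
```

Applied to a list of 1,337 wells, `split_train_holdout` with the default fraction returns 1,003 training wells and 334 hold-out wells, matching the counts reported above.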
The training r-squared was 0.70, and the root-mean-square error (RMSE) in standard pH units was 0.26. The hold-out r-squared was 0.43, and the RMSE in standard pH units was 0.37. More than 60 predictor variables from 7 sources were assembled to develop a model that incorporates regional-scale soil properties, soil chemistry, land use, aquifer textures, and aquifer hydrology. Previously developed Central Valley model outputs of textures (Central Valley Textural Model, CVTM; Faunt and others, 2010) and MODFLOW-simulated vertical water fluxes and predicted depth to water table (Central Valley Hydrologic Model, CVHM; Faunt, 2009) were used to represent aquifer textures and groundwater hydraulics, respectively. In this work, predictor variable values were attributed to wells in ArcGIS using a 500-meter buffer.
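To make the reported fit statistics concrete, the sketch below fits a small boosted ensemble of one-split regression trees (stumps) under squared-error loss, which is the loss that corresponds to a Gaussian error distribution, and then computes RMSE and r-squared on training and hold-out data. It is a simplified stand-in, not the BRT implementation used for the map. The single predictor (well depth), the data values, the number of trees, and the learning rate are all hypothetical, and r-squared is assumed here to be 1 minus the ratio of residual to total sum of squares, which the source does not specify.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the units of the response (pH units here)."""
    sq = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sq / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total (assumed form)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_stump(x, residuals):
    """Find the one split on x that minimizes squared error of the residuals."""
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted_stumps(x, y, n_trees=200, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits current residuals.

    n_trees and learning_rate are hypothetical; in practice they would be
    tuned, for example by cross validation.
    """
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if xi <= threshold else right_mean)
                 for xi, p in zip(x, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the shrunken stump outputs on top of the mean-only baseline."""
    base, learning_rate, stumps = model
    return [base + learning_rate * sum(l if xi <= t else r for t, l, r in stumps)
            for xi in x]


# Hypothetical well depth (m) and pH pairs, invented to exercise the code.
train_x = [12.0, 25.0, 30.0, 48.0, 60.0, 85.0, 100.0, 140.0]
train_y = [7.1, 7.2, 7.4, 7.5, 7.7, 7.9, 8.0, 8.2]
hold_x = [20.0, 55.0, 120.0]
hold_y = [7.3, 7.6, 8.0]

model = fit_boosted_stumps(train_x, train_y)
train_pred = predict(model, train_x)
hold_pred = predict(model, hold_x)
print("training:", rmse(train_y, train_pred), r_squared(train_y, train_pred))
print("hold-out:", rmse(hold_y, hold_pred), r_squared(hold_y, hold_pred))
```

Because the hold-out wells play no part in fitting, their statistics indicate how the model performs on unseen wells, which is why both sets of statistics are reported.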

Faunt, C.C., ed., 2009, Groundwater availability in the Central Valley aquifer, California: U.S. Geological Survey Professional Paper 1776, 225 p., accessed at

Faunt, C.C., Belitz, K., and Hanson, R.T., 2010, Development of a three-dimensional model of sedimentary texture in valley-fill deposits of Central Valley, California, USA: Hydrogeology Journal, v. 18, no. 3, p. 625–649,

Publication Year: 2017
Title: Predicted pH at the domestic and public supply drinking water depths, Central Valley, California
DOI: 10.3133/sim3377
Authors: Celia Z. Rosecrans, Bernard T. Nolan, Jo Ann M. Gronberg
Publication Type: Report
Publication Subtype: USGS Numbered Series
Series Title: Scientific Investigations Map
Series Number: 3377
Index ID: sim3377
Record Source: USGS Publications Warehouse
USGS Organization: California Water Science Center; National Water Quality Assessment Program