Methods for estimating magnitude and frequency of floods in Arizona, developed with unregulated and rural peak-flow data through water year 2010

December 10, 2014

Flooding is among the worst natural disasters responsible for loss of life and property in Arizona, underscoring the importance of accurate estimation of flood magnitude for proper structural design and floodplain mapping. Twenty-four years of additional peak-flow data have been recorded since the last comprehensive regional flood frequency analysis conducted in Arizona. Periodically, flood frequency estimates and regional regression equations must be revised to maintain the accurate estimation of flood frequency and magnitude.


Annual peak-flow data collected through water year 2010 were compiled from 448 unregulated streamflow-gaging stations, hereafter referred to as streamgages, in Arizona having a minimum of 10 years of record. Flood frequency estimates were first computed with station (or at-site) skew by fitting a Pearson Type III distribution using the Expected Moments Algorithm, with a multiple Grubbs-Beck test to identify multiple potentially influential low flows. Next, a multistep Bayesian least-squares-regression approach was used to determine a new statewide regional skew of −0.09. No basin characteristics analyzed were statistically significant in explaining the variation in skew and, as a result, the constant model was chosen as the best regional skew model for the Arizona study area. The mean square error used in Bulletin 17B (B17B) of the Interagency Advisory Committee on Water Data is used to describe the precision of the regional skew. The constant model had a mean square error equal to 0.08, which corresponds to an effective record length of 85 years. This is a marked improvement over a previous Arizona regional skew analysis, with a reported mean square error of 0.31, for a corresponding effective record length of around 17 years. Thus the new regional model had almost five times the information content (as measured by effective record length) of the regional skew calculated in USGS Water Supply Paper 2433, published in 1997, or of the B17B generalized skew map, which has a reported mean square error of 0.302. The flood frequency estimates were then recalculated using a weighted skew that combines the station and regional skews. Station flood frequency estimates for each streamgage are presented for the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities.
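
As a rough illustration of the skew-weighting step, the sketch below assumes the B17B-style weighting, in which the station and regional skews are combined with weights inversely proportional to their mean square errors. Only the regional skew (−0.09) and its mean square error (0.08) come from the text above; the station skew and station mean square error in the example are hypothetical values chosen for illustration, and the function names are not from the report. A small helper also converts an annual exceedance probability in percent to the equivalent recurrence interval in years (for example, 1 percent corresponds to 100 years).

```python
# Regional values reported in the abstract above.
REGIONAL_SKEW = -0.09
REGIONAL_MSE = 0.08

# Annual exceedance probabilities (percent) listed in the abstract.
AEPS_PERCENT = [50, 20, 10, 4, 2, 1, 0.5, 0.2]


def weighted_skew(station_skew, station_mse,
                  regional_skew=REGIONAL_SKEW, regional_mse=REGIONAL_MSE):
    """Combine station and regional skew, B17B style (assumed here).

    Each skew is weighted by the *other* estimate's mean square error,
    so the more precise estimate (smaller MSE) gets the larger weight.
    """
    total = regional_mse + station_mse
    if total <= 0:
        raise ValueError("mean square errors must sum to a positive value")
    return (regional_mse * station_skew + station_mse * regional_skew) / total


def recurrence_interval_years(aep_percent):
    """Convert an annual exceedance probability in percent to years."""
    if not 0 < aep_percent <= 100:
        raise ValueError("AEP must be in the interval (0, 100] percent")
    return 100.0 / aep_percent


if __name__ == "__main__":
    # Hypothetical station: skew 0.40 with a mean square error of 0.24.
    print("weighted skew:", round(weighted_skew(0.40, 0.24), 4))
    for aep in AEPS_PERCENT:
        years = recurrence_interval_years(aep)
        print(f"{aep}-percent AEP ~ {years:g}-year recurrence interval")
```

Because the regional mean square error here is small relative to that of a typical short-record station, the weighted value is pulled strongly toward the regional skew.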


Geographic information systems were used to compute basin characteristic information for each streamgage for the purpose of developing regional equations to estimate flood statistics at ungaged basins. Five hydrologic flood regions in Arizona were defined in a multivariate regionalization process based on mean basin elevation, mean annual precipitation, and soil permeability. A regional generalized least-squares-regression analysis was used to develop five sets of equations, one for each region, from 344 nonredundant streamgages for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities at ungaged basins in Arizona. The regression equations developed for these five regions were based on one or more of the statistically significant explanatory variables: drainage area, mean basin elevation, and mean annual precipitation. Average standard errors of prediction for the five regions ranged from 27 to 122 percent, and the pseudo-coefficients of determination (pseudo-R²), a measure of the proportion of peak-flow variation that is explained by the basin characteristics, ranged from 68 to 98 percent. Regression equations for Central Highlands (region 4) had the lowest model error and the greatest pseudo-R² metrics. The regression equations for Colorado Plateau (region 2) generally had greater model error and lower pseudo-R² metrics. Lower model error and higher pseudo-R² metrics were related to higher numbers of streamgages, longer periods of record, and even spatial coverage within a region.
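
To show how a regional regression equation of this kind is typically applied at an ungaged site, the sketch below assumes a simple log-linear (power-law) form in drainage area alone, Q = 10^a × A^b. Both the functional form and the coefficients are assumptions made for illustration; they are not the published Arizona equations, which may also include mean basin elevation and mean annual precipitation and should be taken from the report itself.

```python
def peak_flow_estimate(drainage_area_sq_mi, intercept, area_exponent):
    """Estimate peak flow from an assumed power-law regression.

    Assumed form: Q = 10**intercept * A**area_exponent, equivalent to a
    linear regression of log10(Q) on log10(A). The coefficients are
    placeholders supplied by the caller, not published values.
    """
    if drainage_area_sq_mi <= 0:
        raise ValueError("drainage area must be positive")
    return (10.0 ** intercept) * (drainage_area_sq_mi ** area_exponent)


if __name__ == "__main__":
    # Hypothetical coefficients for a single annual exceedance probability.
    q = peak_flow_estimate(drainage_area_sq_mi=100.0,
                           intercept=2.0, area_exponent=0.5)
    print(f"estimated peak flow: {q:.0f} (hypothetical units and coefficients)")
```

In practice one such equation exists per region and per annual exceedance probability, so an application would look up the coefficient set for the region containing the site before evaluating it.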


The regional regression equations were integrated into the U.S. Geological Survey’s StreamStats program. StreamStats is a national map-based web application that allows the public to easily access published flood frequency and basin characteristic statistics. The interactive web application allows a user to select a point within a watershed (gaged or ungaged) and retrieve flood frequency estimates derived from the current regional regression equations and geographic information system data within the selected basin. StreamStats provides users with an efficient and accurate means for retrieving the most up-to-date flood frequency and basin characteristic data. StreamStats is intended to provide consistent statistics, minimize user error, and reduce the need for large datasets and costly geographic information system software.