Mean squared error, deconstructed
As science becomes increasingly cross-disciplinary and scientific models become increasingly cross-coupled, standardized practices of model evaluation are more important than ever. For normally distributed data, mean squared error (MSE) is ideal as an objective measure of model performance, but it gives little insight into what aspects of model performance are “good” or “bad.” This apparent weakness has led to a myriad of specialized error metrics, which are sometimes aggregated to form a composite score. Such scores are inherently subjective, however, and while their components may be interpretable, the composite itself is not. We contend that a better approach to model benchmarking and interpretation is to decompose MSE into interpretable components. To demonstrate the versatility of this approach, we outline some fundamental types of decomposition and apply them to predictions at 1,021 streamgages across the conterminous United States from three streamflow models. Through this demonstration, we hope to show that each component in a decomposition represents a distinct concept, like “season” or “variability,” and that simple decompositions can be combined to represent more complex concepts, like “seasonal variability,” creating an expressive language through which to interrogate models and data.
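To make the idea of decomposing MSE concrete, the sketch below illustrates one standard identity: MSE equals the squared mean error (bias) plus the variance of the errors. The abstract does not say which decompositions the article uses, so this is only an illustrative example, not the authors' method. The function name `decompose_mse` and the sample values are hypothetical.

```python
def decompose_mse(obs, pred):
    """Split MSE into squared bias and error variance (illustrative sketch).

    Assumes obs and pred are equal-length sequences of numbers.
    Uses the identity: mean(e^2) = mean(e)^2 + var(e), with var taken
    as the population variance (divide by n, not n - 1).
    """
    errors = [p - o for o, p in zip(obs, pred)]
    n = len(errors)
    mean_error = sum(errors) / n

    mse = sum(e * e for e in errors) / n
    bias_sq = mean_error ** 2
    variance = sum((e - mean_error) ** 2 for e in errors) / n
    return {"mse": mse, "bias_sq": bias_sq, "variance": variance}


# Hypothetical observations and predictions, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 2.0, 5.0]
parts = decompose_mse(obs, pred)

# The two components sum to the total MSE (up to floating-point rounding).
print(parts)
```

Each component answers a different question: the squared bias reflects a systematic over- or under-prediction, while the variance term reflects how much the errors scatter around that offset.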
Citation Information
Publication Year | 2021 |
---|---|
Title | Mean squared error, deconstructed |
DOI | 10.1029/2021MS002681 |
Authors | Timothy O. Hodson, Thomas M. Over, Sydney Foks |
Publication Type | Article |
Publication Subtype | Journal Article |
Series Title | Journal of Advances in Modeling Earth Systems |
Index ID | 70226884 |
Record Source | USGS Publications Warehouse |
USGS Organization | Illinois Water Science Center; Central Midwest Water Science Center |