ARCHI: A new R package for automated imputation of regionally correlated hydrologic records

February 28, 2025

Missing data in hydrological records can limit resource assessment, process understanding, and predictive modeling. Here, we present ARCHI (Automated Regional Correlation Analysis for Hydrologic Record Imputation), a new, open-source software package in R designed to aggregate, impute, cluster, and visualize regionally correlated hydrologic records. ARCHI imputes missing data in “target” records by linear regression using more complete “reference” records as predictors. Automated imputation is implemented using a novel, iterative algorithm that allows each site to be considered a target or reference for regression, growing the pool of complete references with each imputed record until viable gap-filling ceases. Users can limit artifacts from spurious correlations by specifying model-acceptance criteria and applying geospatial, correlation, and group-based filters to control reference selection. ARCHI provides additional functions for visualizing results, clustering records with similar correlation structures, evaluating holdout data, and interactive parameterization with an accessible and intuitive graphical user interface (GUI). This methods brief provides an overview of the ARCHI package, modeling guidelines, and benchmarking on two regional groundwater-level datasets from the Central Valley, CA, and Long Island, NY. We evaluate ARCHI alongside widely used multivariate imputation software to highlight and contextualize its computational efficiency, imputation accuracy, and model transparency when applied to large groundwater-level datasets.
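
The iterative target/reference idea described above can be illustrated with a minimal sketch. This is not ARCHI's code or API (ARCHI is an R package, and its actual algorithm, filters, and acceptance criteria are described in the article); it is a simplified Python illustration under stated assumptions. The function names `fit_line` and `iterative_impute`, the parameters `min_r2` and `min_overlap`, and the choice of a single best reference per target are all hypothetical. Each record is a list of equal length with `None` marking a gap; a target is regressed on one reference using only the target's originally observed values, and an imputed record can then serve as a reference in later passes until no further gaps can be filled.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = a + b*x. Returns (a, b, r2) or None."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None  # a constant series cannot be used for regression
    b = sxy / sxx
    a = my - b * mx
    r2 = (sxy * sxy) / (sxx * syy)
    return a, b, r2


def iterative_impute(records, min_r2=0.8, min_overlap=3):
    """Hypothetical sketch: fill gaps by single-reference linear regression.

    records: dict mapping site name -> list of floats, with None for gaps.
    min_r2, min_overlap: assumed model-acceptance criteria (illustrative).
    Returns a new dict; the input is not modified.
    """
    data = {site: list(vals) for site, vals in records.items()}
    # Remember which values were measured, so fits never train on imputed targets.
    observed = {s: [v is not None for v in vals] for s, vals in records.items()}

    progress = True
    while progress:  # stop when a full pass fills no gaps
        progress = False
        for target, tvals in data.items():
            gaps = [i for i, v in enumerate(tvals) if v is None]
            if not gaps:
                continue
            best = None  # (r2, a, b, reference values)
            for ref, rvals in data.items():
                if ref == target:
                    continue
                # The reference must have a value at one or more target gaps.
                if not any(rvals[i] is not None for i in gaps):
                    continue
                idx = [i for i in range(len(tvals))
                       if observed[target][i] and rvals[i] is not None]
                if len(idx) < min_overlap:
                    continue
                fit = fit_line([rvals[i] for i in idx], [tvals[i] for i in idx])
                if fit is None or fit[2] < min_r2:
                    continue  # model rejected by the acceptance criteria
                if best is None or fit[2] > best[0]:
                    best = (fit[2], fit[0], fit[1], rvals)
            if best is None:
                continue
            _, a, b, rvals = best
            for i in gaps:
                if rvals[i] is not None:
                    tvals[i] = a + b * rvals[i]
                    progress = True  # this record can now act as a reference
    return data


# Illustrative data only (not from the article).
records = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],      # complete reference
    "B": [2.0, 4.0, 6.0, None, 10.0, 12.0],   # strongly correlated with A
    "N": [5.0, 1.0, 4.0, None, 2.0, 3.0],     # weakly correlated; stays unfilled
}
result = iterative_impute(records)
```

In this toy input, the gap in `B` is filled from `A` with a value close to 8, while the gap in `N` is left as `None` because no candidate model meets the assumed `min_r2` threshold. Leaving such gaps unfilled rather than forcing a weak model mirrors the abstract's point about limiting artifacts from spurious correlations.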

Publication Year: 2025
Title: ARCHI: A new R package for automated imputation of regionally correlated hydrologic records
DOI: 10.1111/gwat.13474
Authors: Zeno Levy, Robin Glas, Timothy Stagnitta, Neil Terry
Publication Type: Article
Publication Subtype: Journal Article
Series Title: Groundwater
Index ID: 70269357
Record Source: USGS Publications Warehouse
USGS Organization: California Water Science Center