Comparison of algorithms for replacing missing data in discriminant analysis

January 1, 1992

We examined the impact of different methods for replacing missing data in discriminant analyses conducted on randomly generated samples from multivariate normal and non-normal distributions. The probabilities of correct classification were obtained for these discriminant analyses before and after randomly deleting data as well as after deleted data were replaced using: (1) variable means, (2) principal component projections, and (3) the EM algorithm. Populations compared were: (1) multivariate normal with covariance matrices Σ1 = Σ2, (2) multivariate normal with Σ1 ≠ Σ2, and (3) multivariate non-normal with Σ1 = Σ2. Differences in the probabilities of correct classification were most evident for populations with small Mahalanobis distances or high proportions of missing data. The three replacement methods performed similarly, but all were better than non-replacement.
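As an illustration of the simplest of the three replacement methods, variable-mean replacement might be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name, the use of `None` to mark a deleted value, and the choice to compute each mean over all observed rows (rather than within each group) are assumptions made for this example.

```python
from statistics import mean


def replace_with_variable_means(rows):
    """Return a copy of rows with each missing entry (None) replaced
    by the mean of the observed values of the same variable (column).

    Hypothetical helper for illustration; means are pooled over all rows,
    which is an assumption and may differ from the study's procedure.
    """
    n_vars = len(rows[0])
    col_means = []
    for j in range(n_vars):
        # Collect only the values that were actually observed for variable j.
        observed = [row[j] for row in rows if row[j] is not None]
        if not observed:
            raise ValueError(f"variable {j} has no observed values")
        col_means.append(mean(observed))
    # Build new rows, leaving observed values untouched.
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Three observations on two variables, with two values deleted.
sample = [
    [1.0, 2.0],
    [None, 4.0],
    [3.0, None],
]
completed = replace_with_variable_means(sample)
```

The completed sample can then be passed to whatever discriminant analysis routine is in use. Principal component projection and the EM algorithm would instead use the relationships among variables to estimate each missing value, which this sketch does not attempt.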

Publication Year: 1992
Title: Comparison of algorithms for replacing missing data in discriminant analysis
DOI: 10.1080/03610929208830864
Authors: Daniel J. Twedt, D.S. Gill
Publication Type: Article
Publication Subtype: Journal Article
Series Title: Communications in Statistics - Theory and Methods
Index ID: 70202999
Record Source: USGS Publications Warehouse
USGS Organization: National Wetlands Research Center