Multimetric indices, such as the Index of Biological Integrity (IBI), are increasingly used by management agencies to determine whether surface water quality is impaired. However, important questions about the variability of these indices have not been thoroughly addressed in the scientific literature. In this study, we used a bootstrap approach to quantify variability associated with fish IBIs developed for streams in two Minnesota river basins. We further placed this variability into a management context by comparing it to impairment thresholds currently used in water quality determinations for Minnesota streams. We found that 95% confidence intervals were as wide as 40 points for IBIs scored on a 0–100 point scale. However, on average, 90% of IBI scores calculated from bootstrap replicate samples for a given stream site yielded the same impairment status as the original IBI score. We suggest that sampling variability in IBI scores is related to both the number of fish and the number of rare taxa in a field collection. A comparison of the effects of different scoring methods on IBI variability indicates that a continuous scoring method may reduce bias in IBI scores.
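The bootstrap procedure summarized above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the threshold value, and the example collection are hypothetical, and `toy_ibi` is a deliberately simplified stand-in (taxa richness rescaled to 0–100) for a real multimetric IBI. The sketch resamples individual fish with replacement from one field collection, rescores each replicate, and reports a percentile 95% confidence interval together with the fraction of replicates that share the impairment status of the original score.

```python
import random


def toy_ibi(sample, reference_richness=10):
    """Hypothetical stand-in for an IBI: taxa richness rescaled to 0-100.

    A real IBI combines several metrics; this is for illustration only.
    """
    richness = len(set(sample))
    return min(100.0, 100.0 * richness / reference_richness)


def bootstrap_ibi(sample, threshold, n_boot=1000, seed=0, score=toy_ibi):
    """Bootstrap one field collection (a list of taxon labels, one per fish)."""
    rng = random.Random(seed)
    original = score(sample)
    # Each replicate draws len(sample) fish with replacement and is rescored.
    reps = sorted(
        score(rng.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    # Percentile 95% confidence interval from the sorted replicate scores.
    lower = reps[int(0.025 * (n_boot - 1))]
    upper = reps[int(0.975 * (n_boot - 1))]
    # A site counts as impaired when its score falls below the threshold.
    impaired = original < threshold
    agreement = sum((r < threshold) == impaired for r in reps) / n_boot
    return {"original": original, "ci95": (lower, upper), "agreement": agreement}


# Hypothetical collection: 50 fish in 6 taxa, two taxa seen only once.
collection = ["a"] * 30 + ["b"] * 10 + ["c"] * 5 + ["d"] * 3 + ["e", "f"]
result = bootstrap_ibi(collection, threshold=50.0)
print(result)
```

In this toy metric a replicate can lose taxa but never gain them, so taxa represented by a single fish account for much of the spread among replicate scores. That is one simple way to see why the number of rare taxa in a collection could matter for sampling variability.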