
Advanced Techniques for Handling Missing Data in Analysis and Prediction Workflows
Participants will learn how to form imputation models, how to combine data sets, how to model non-response, how to use diagnostics to inspect the imputed values, how to obtain valid inference on incomplete data, and how to avoid many of the pitfalls associated with real-life missing data problems. There will be plenty of opportunity to ask the instructors for help and advice throughout the course, and the final session is reserved for consulting us on your own specific missing data problem.
Most researchers need to deal with incomplete data. Missing data complicate statistical analysis. Simply removing the incomplete cases is not a good strategy: it discards information and can bias the results. Multiple imputation is a general and statistically valid technique for analyzing incomplete data, and it has rapidly become the standard in social and behavioural science research.
This hybrid course explains modern and flexible imputation techniques that preserve salient data features. It deepens participants’ knowledge of imputation principles and provides flexible hands-on solutions to incomplete data problems using R. The course discusses the principles of missing data theory, outlines a step-by-step approach to creating high-quality imputations, and provides guidelines on how to report the results. The course uses the authors’ mice package in R.
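To give a flavour of the hands-on part, here is a minimal sketch of the impute, analyse and pool workflow in mice. It is an illustration under stated assumptions rather than course material: the nhanes example data shipped with the package, five imputations, a fixed seed, and the linear model are all chosen here for demonstration only.

```r
# Minimal sketch of a multiple imputation workflow with mice.
# Assumptions (not from the course itself): the nhanes example data,
# m = 5 imputations, seed = 123, and an illustrative linear model.
library(mice)

# 1. Inspect the missing data pattern (rows = patterns, columns = variables)
md.pattern(nhanes, plot = FALSE)

# 2. Create m = 5 imputed data sets using the default imputation methods
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# 3. Fit the same analysis model on each imputed data set
fit <- with(imp, lm(bmi ~ age + chl))

# 4. Pool the five sets of estimates with Rubin's rules
pooled <- pool(fit)
summary(pooled)
```

The pooled standard errors reflect both sampling variability and the extra uncertainty caused by the missing values, which is what makes inference from the imputed data valid.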
The lectures will follow the book “Flexible Imputation of Missing Data” by Stef van Buuren (2nd edition, Chapman & Hall, 2018). The book can be read online for free at https://stefvanbuuren.name/fimd/.