Data Science: Multiple Imputation in Practice (Hybrid)

  • Location: Home, Utrecht
  • Duration: 1 week
  • Starting moment: 8 July 2024
  • Language: English
  • Teaching method: Online, At location
  • Certification: Certificate
  • Price: 730

Participants will learn how to form imputation models, how to combine data sets, how to model non-response, how to use diagnostics to inspect the imputed values, how to obtain valid inference on incomplete data and how to avoid many of the pitfalls associated with real-life missing data problems. While there will be plenty of opportunity to ask the experts for help and advice throughout the course, we end the course with the opportunity to consult us on your own specific missing data problem.

Most researchers need to deal with incomplete data, which complicate statistical analysis. Simply removing the incomplete cases is not a good strategy and can bias the results. Multiple imputation is a general and statistically valid technique for analyzing incomplete data, and it has rapidly become the standard in social and behavioural science research.
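To give a flavour of how multiple imputation leads to a single inference, the sketch below illustrates the pooling step commonly known as Rubin's rules: the analysis is run on each of the m imputed data sets, and the m estimates and their variances are then combined. This is only an illustration in Python under stated assumptions; the course itself works in R, and the function name `pool_rubin` and the numbers are hypothetical, not taken from the course material.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)    # pooled point estimate
    u_bar = mean(variances)    # within-imputation variance
    b = variance(estimates)    # between-imputation variance (divides by m - 1)
    t = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, t


# Hypothetical results from m = 3 imputed data sets (made-up numbers).
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what a single imputation or a simple deletion approach ignores: it carries the extra uncertainty caused by the missing values into the final standard error.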

This hybrid course explains modern and flexible imputation techniques that preserve salient data features. The course enhances participants’ knowledge of imputation principles and provides flexible hands-on solutions to incomplete data problems using R. The course discusses principles of missing data theory, outlines a step-by-step approach toward creating high-quality imputations, and provides guidelines on how to report the results. The course will use the authors’ MICE package in R.

The lectures will follow the book “Flexible Imputation of Missing Data” by Stef van Buuren (2nd edition, Chapman & Hall, 2018). The book can be read online for free at https://stefvanbuuren.name/fimd/.

Target audience

This course is relevant for applied researchers or statistical researchers who would like to get acquainted with incomplete data theory and the practice of multiple imputation. Participants should have a basic understanding of statistical techniques (such as analysis of variance and (non)linear regression) and the concept of statistical inference. This course is suitable for students at master, advanced master, and PhD level.

For an overview of all our summer courses offered by the Department of Methodology and Statistics please click here.

Aim of the course

  • To enhance participants’ knowledge of imputation methodology;
  • To get comfortable with flexible solutions to deal with incomplete data using R.

Learning goals

  • Participants will learn to make informed decisions on how to handle incomplete data in a scientifically valid way;
  • Participants will be able to implement the approach taken using state-of-the-art R technology.
