
Move missing data functions to their own package

Open Deleetdk opened this issue 7 years ago • 0 comments

There are by now a number of functions related to missing data. I have already given them a common prefix, miss_. Together they constitute a small but very useful set of functions:

  • miss_amount, for overall counts of missing data.
  • miss_by_var, missing data for each variable.
  • miss_by_case, missing data for each case.
  • miss_plot, for visualizing patterns of missing data in multiple ways.
  • miss_pattern, calculates missing data patterns.
  • miss_analyze, for analyzing whether missingness is related to variables ('missing at random').
  • miss_complexity, returns complexity metrics for the missing data.
  • miss_matrix, returns the missing data matrix used for the above, but which could be used for other purposes.
  • miss_add_random, adds random missing data to a data frame. I'm considering renaming this to miss_add and adding functionality for non-random missing data. An alternative is to add other functions, e.g. miss_add_nonrandom, in which one could indicate how missingness is to be induced, perhaps via formulas or functions.
  • miss_impute, wrapper function to impute missing data using VIM::irmi. It can handle some practical problems, such as choosing which subsets of the data to impute.
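
For readers unfamiliar with these helpers, here is a rough base R sketch of the kind of computation the counting functions and miss_add_random perform. It is not the package's actual code: the example data frame, the variable names and the helper add_random_na are hypothetical, written only to illustrate the idea.

```r
# Hypothetical illustration in base R; not the kirkegaard implementation.
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, NA, "x", "y"),
  c = c(TRUE, FALSE, TRUE, NA)
)

m <- is.na(df)            # logical missing-data matrix, TRUE where a cell is NA
by_var  <- colMeans(m)    # proportion missing per variable
by_case <- rowSums(m)     # number of missing values per case
overall <- mean(m)        # overall proportion of missing cells

# Hypothetical helper: set a given proportion of cells to NA,
# chosen completely at random and without replacement.
add_random_na <- function(data, prop = 0.1) {
  n_cells <- nrow(data) * ncol(data)
  idx <- sample.int(n_cells, size = round(prop * n_cells))
  rows <- (idx - 1) %% nrow(data) + 1    # cell index -> row (column-major)
  cols <- (idx - 1) %/% nrow(data) + 1   # cell index -> column
  for (i in seq_along(idx)) data[rows[i], cols[i]] <- NA
  data
}
```

Inducing non-random missingness, as discussed for miss_add_nonrandom, would replace the uniform sample.int draw with cell probabilities that depend on the data.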

In general, I am pretty happy with the current selection of functions; not much functionality is missing.

Deleetdk • Feb 04 '17 01:02