kirkegaard
Move missing data functions to their own package
There are by now a number of these missing data related functions. I have already given them a prefix, `miss_`. As such, they constitute a small but very useful set of functions:
- `miss_amount`, for overall counts of missing data.
- `miss_by_var`, missing data for each variable.
- `miss_by_case`, missing data for each case.
- `miss_plot`, for visualizing patterns of missing data in multiple ways.
- `miss_pattern`, calculates missing data patterns.
- `miss_analyze`, for analyzing whether missingness is related to variables ('missing at random').
- `miss_complexity`, returns complexity metrics for the missing data.
- `miss_matrix`, returns the missing data matrix used for the above, but which could be used for other purposes.
- `miss_add_random`, adds random missing data to a data frame. I'm considering renaming this to `miss_add` and adding functionality for non-random missing data. An alternative is to add other functions, e.g. `miss_add_nonrandom`, to which one could somehow indicate how missingness is to be induced, perhaps via formulas or functions.
- `miss_impute`, wrapper function to impute missing data using `VIM::irmi`. It can handle some practical issues, such as choosing which subsets of the data to impute.
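To make the `miss_add_nonrandom` idea more concrete, here is a minimal base R sketch of the "via functions" variant. Everything in it is an assumption for discussion: the name `miss_add_nonrandom_sketch`, the arguments `target` and `prob_fun`, and the convention that `prob_fun` returns one missingness probability per row. None of this exists in the package yet.

```r
# Hypothetical sketch, not the package's implementation.
# `prob_fun` (assumed interface) takes the data frame and returns one
# probability of missingness per row for the `target` column.
miss_add_nonrandom_sketch <- function(data, target, prob_fun) {
  stopifnot(is.data.frame(data), target %in% names(data))
  p <- prob_fun(data)
  stopifnot(length(p) == nrow(data), all(p >= 0 & p <= 1))
  # Draw one uniform number per row; a row loses its value when the draw
  # falls below that row's probability.
  drop <- runif(nrow(data)) < p
  data[[target]][drop] <- NA
  data
}

df <- data.frame(
  age    = c(20, 30, 40, 50, 60, 70),
  income = c(10, 20, 30, 40, 50, 60)
)

# Missingness in income depends on age: always missing above 45, never below.
out <- miss_add_nonrandom_sketch(df, "income", function(d) as.numeric(d$age > 45))

# Base R counts of missing values per variable and per case.
na_by_var  <- colSums(is.na(out))
na_by_case <- rowSums(is.na(out))
```

Passing a function keeps the interface small: any rule that maps rows to probabilities works, including a logistic function of several variables. A formula interface could later be layered on top by turning the formula into such a function.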
In general, I am pretty happy with the current selection of functions; not much functionality is missing.