NAs in metadata$Corr_matrix

Open hopkinsjj9 opened this issue 4 years ago • 8 comments

Thank you for putting together a great package!

I'm getting "infinite or missing values in 'x'" errors when I run the following data through the process: https://www.kaggle.com/pradeeptripathi/predicting-house-prices-using-r/data

train <- data.frame(readr::read_csv('../data/train.csv'))
str(train)
train <- train %>% mutate_if(is.character, as.factor)
str(train)

cleaned <- missCompare::clean(train, var_removal_threshold = 0.5, ind_removal_threshold = 0.8, missingness_coding = -9)

# make sure
cleaned <- missCompare::clean(cleaned, var_removal_threshold = 0.5, ind_removal_threshold = 0.8, missingness_coding = -9)

metadata <- missCompare::get_data(cleaned, matrixplot_sort = T, plot_transform = T)
Warning message:
In stats::cor(X, use = "pairwise.complete.obs", method = "pearson") :
  the standard deviation is zero

simulated <- missCompare::simulate(rownum = metadata$Rows, colnum = metadata$Columns, cormat = metadata$Corr_matrix, meanval = 0, sdval = 1)
Error in eigen(if (doDykstra) R else Y, symmetric = TRUE) :
  infinite or missing values in 'x'
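
For what it's worth, here is a minimal sketch of what I think is going on, using made-up toy data (not the Kaggle set) and base R only. The assumption is that one variable is constant on the rows where another variable is observed, so the pairwise correlation is undefined and comes back as NA; eigen() then refuses the matrix. The column names a, b, c are purely illustrative.

```r
# Hypothetical toy data: 'b' is constant (5) on every row where 'a' is observed.
X <- data.frame(
  a = c(1, 2, 3, NA, NA),
  b = c(5, 5, 5, 6, 7),
  c = c(2, 1, 4, 3, 5)
)

# Same cor() call as in the warning above. The a/b pair has zero standard
# deviation on its complete rows, so that entry is NA (warning suppressed here).
cm <- suppressWarnings(
  stats::cor(X, use = "pairwise.complete.obs", method = "pearson")
)
print(cm)

# The NA shows up twice because the matrix is symmetric.
n_na <- sum(is.na(cm))

# eigen() rejects any matrix containing NA; capture the message instead of stopping.
err <- tryCatch(
  eigen(cm, symmetric = TRUE),
  error = function(e) conditionMessage(e)
)
print(err)
```

If this is the same mechanism, it would explain why a single undefined pair is enough to stop the simulate step.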

I found two NAs in metadata$Corr_matrix, in the Utilities and LotFrontage columns. Not knowing exactly how to handle this, I just set them to zero (a hack):

colnames(metadata$Corr_matrix)[colSums(is.na(metadata$Corr_matrix)) > 0]
metadata$Corr_matrix[is.na(metadata$Corr_matrix)] <- 0

I can now restart at the simulate step, but there's got to be a better way. Shouldn't clean or get_data take care of this somehow?
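
One alternative to zero-filling that I could imagine, sketched on made-up toy data with base R only: list the variable pairs whose correlation is NA, then drop one variable from each pair and recompute. Whether dropping a variable is acceptable for the later steps is an assumption on my part, and the names a, b, c, na_pairs and drop_vars are hypothetical. This is not something missCompare does, as far as I can tell.

```r
# Hypothetical toy data: 'b' is constant wherever 'a' is observed.
X <- data.frame(
  a = c(1, 2, 3, NA, NA),
  b = c(5, 5, 5, 6, 7),
  c = c(2, 1, 4, 3, 5)
)
cm <- suppressWarnings(
  stats::cor(X, use = "pairwise.complete.obs", method = "pearson")
)

# Each NA appears twice (symmetric matrix), so look at the upper triangle only.
na_idx <- which(is.na(cm) & upper.tri(cm), arr.ind = TRUE)
na_pairs <- data.frame(
  var1 = rownames(cm)[na_idx[, "row"]],
  var2 = colnames(cm)[na_idx[, "col"]],
  stringsAsFactors = FALSE
)
print(na_pairs)

# Assumption: it is acceptable to drop the second variable of each NA pair.
drop_vars <- unique(na_pairs$var2)
X2 <- X[, !(names(X) %in% drop_vars), drop = FALSE]

# Recompute; the remaining pairs all have non-zero variance on their complete rows.
cm2 <- stats::cor(X2, use = "pairwise.complete.obs", method = "pearson")
print(cm2)
```

Unlike setting the entries to zero, this does not assert that the two variables are uncorrelated; it just leaves the undefined pair out.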

Thanks again,
Jack Hopkins

hopkinsjj9, Sep 20 '19 15:09