Modify calc_mean_dissimilarities to resolve factor-mismatch bug
I was getting the following error when using the calc_mean_dissimilarities function:
Error in Ops.factor(dm_clmns_wCat[, 4], dm_clmns_wCat[, 5]) : level sets of factors are different
After some sleuthing, I figured out that this bug appears after the add_metadata_to_dm_clmns function turns the summarize_by_factor category into two factor columns. If that category contains a level with only one sample, and that sample happens to be listed first in the distance matrix, then because there are no "self" comparisons, the 4th and 5th columns of dm_clmns_wCat end up with different sets of factor levels. Comparing the two columns then fails. This is easily solved by converting those factors to character vectors before reducing the data frame. Something like this:
    dm_clmns = convert_dm_to_3_column(dm)
    dm_clmns_wCat = add_metadata_to_dm_clmns(dm_clmns, metadata_map, summarize_by_factor)

    # convert the category columns from factor to character
    # so that I don't get the factor mis-align error
    dm_clmns_wCat[, 4] <- as.character(dm_clmns_wCat[, 4])
    dm_clmns_wCat[, 5] <- as.character(dm_clmns_wCat[, 5])

    # drop rows with a missing category, then keep only between-category comparisons
    dm_clmns_wCat = dm_clmns_wCat[!is.na(dm_clmns_wCat[, 4]) & !is.na(dm_clmns_wCat[, 5]), ]
    dm_clmns_wCat_reduced = dm_clmns_wCat[dm_clmns_wCat[, 4] != dm_clmns_wCat[, 5], ]
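
For anyone who wants to see the mismatch in isolation, here is a minimal sketch that does not use the package at all. The vectors group_1 and group_2 are hypothetical stand-ins for columns 4 and 5 of dm_clmns_wCat, and I am assuming each column ends up as its own factor with only the levels it actually contains. The toy data mimics three samples where category "A" has a single sample, so "A" shows up in the first column but never in the second.

```r
# Hypothetical toy data standing in for columns 4 and 5 of dm_clmns_wCat.
# Pairs: (s1, s2), (s1, s3), (s2, s3), where s1 is the only "A" sample.
group_1 <- factor(c("A", "A", "B"))  # levels: "A", "B"
group_2 <- factor(c("B", "B", "B"))  # levels: "B" only

# Comparing the two factors directly raises an error because their
# level sets differ; tryCatch captures the message instead of stopping.
result <- tryCatch(
  group_1 != group_2,
  error = function(e) conditionMessage(e)
)
print(result)

# Comparing the same values as character vectors ignores the levels.
between_group <- as.character(group_1) != as.character(group_2)
print(between_group)
```

The character comparison flags the first two pairs as between-category and the last one as within-category, which is the filter the reduction step needs.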