`compare_groups`: allow comparisons with more than one grouping variable

Open zachary-foster opened this issue 6 years ago • 5 comments

Instead of:

compare_groups(obj, dataset = "tax_abund",
               cols = hmp_samples$sample_id,
               groups = paste(hmp_samples$sex, hmp_samples$body_site))

allow

compare_groups(obj, dataset = "tax_abund",
               cols = hmp_samples$sample_id,
               groups = list(sex = hmp_samples$sex, 
                             site = hmp_samples$body_site))

or

compare_groups(obj, dataset = "tax_abund",
               cols = hmp_samples$sample_id,
               groups = hmp_samples[, c("sex", "body_site")])

This would compare every unique combination of sex and body site. Instead of the "treatment_1" and "treatment_2" columns, the output in this example would have "sex_1", "site_1", "sex_2", and "site_2" columns.
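To make the proposed output concrete, here is a rough base-R sketch of how the comparison table could be laid out. It is only an illustration of the idea, not how metacoder implements or would implement it. The `samples` data frame and its values are made up to stand in for `hmp_samples`.

```r
# Hypothetical sample metadata standing in for hmp_samples
samples <- data.frame(
  sample_id = paste0("s", 1:6),
  sex       = c("female", "male", "female", "male", "female", "male"),
  body_site = c("Nose", "Nose", "Saliva", "Saliva", "Stool", "Stool"),
  stringsAsFactors = FALSE
)

# Output name -> metadata column, mirroring list(sex = ..., site = ...)
group_vars <- c(sex = "sex", site = "body_site")

# Unique combinations of the grouping variables that occur in the data
combos <- unique(samples[, unname(group_vars)])
names(combos) <- names(group_vars)
rownames(combos) <- NULL

# Every unordered pair of combinations, as row indexes into `combos`
pairs <- combn(nrow(combos), 2)

# One row per comparison, with "<variable>_1" and "<variable>_2" columns
side_1 <- combos[pairs[1, ], , drop = FALSE]
side_2 <- combos[pairs[2, ], , drop = FALSE]
names(side_1) <- paste0(names(combos), "_1")
names(side_2) <- paste0(names(combos), "_2")
rownames(side_1) <- NULL
rownames(side_2) <- NULL
comparisons <- cbind(side_1, side_2)
print(comparisons)
```

Each row of `comparisons` identifies one pair of sex/site combinations. In the real output these columns would sit alongside the per-taxon statistics.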

This was requested by a user and is similar to something @grunwald wanted as well.

zachary-foster avatar Aug 10 '18 12:08 zachary-foster

Is there a way to list the combinations of which specific groups we want to compare instead of comparing everything against each other, or just one pair at a time?

For example, if I have 4 groups:

  1. Control
  2. Disease
  3. Treatment
  4. Disease + treatment

I only want to compare:

  • control vs disease
  • control vs treatment
  • disease vs disease + treatment

catherineel avatar Oct 26 '21 02:10 catherineel

To only calculate differences between certain groups, you can use the combinations option of compare_groups. To plot only some comparisons one at a time, you can use code like this:

https://grunwaldlab.github.io/metacoder_documentation/workshop--07--diversity_stats.html#comparing-taxon-abundance-in-two-groups

Does that do what you want?

zachary-foster avatar Oct 28 '21 19:10 zachary-foster

Not exactly. Is there a way to list specific combinations in one line?

For example, I used this code to compare control vs disease and then I repeated the same code with control vs treatment.

obj$data$diff_table <- compare_groups(obj, "tax_abund", cols = obj_samples$sample_id, groups = obj_samples$treatment, combinations = list(c("control", "disease")))

But is there a way to write down the different combinations in one line as I want to only display certain comparisons in my heat tree matrix?

obj$data$diff_table <- compare_groups(obj, "tax_abund", cols = obj_samples$sample_id, groups = obj_samples$treatment, combinations = list(c("control", "disease" + "control", "treatment" + "disease", "disease+treatment" ))) <- I know this isn't correct

At the moment my heat tree matrix compares all groups against each other,

obj$data$diff_table <- compare_groups(obj, dataset = "tax_abund",
                                      cols = obj_samples$sample_id, 
                                      groups = obj_samples$treatment) 

but there are some comparisons that I am not interested in, e.g. treatment alone vs disease alone. I want to make the heat tree matrix neater by only showing certain comparisons rather than everything.

Sorry if I am not making any sense

catherineel avatar Oct 31 '21 12:10 catherineel

If you want to plot the results with heat_tree_matrix, you can plot all pairwise combinations of a subset of treatments, but not an arbitrary subset of combinations, because that would leave missing plots in the triangular matrix the plots are arranged in. In the future there might be a helper function like heat_tree_matrix designed to plot a custom list of specific comparisons, but currently the way I would do it is to make each comparison plot individually (using the method in the link above) and combine them with a tool like cowplot. Does that make sense?
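On the earlier question of computing several specific comparisons in one call: as far as I understand it, the combinations option takes a list with one length-2 character vector per comparison. Below is a minimal sketch under that assumption. The `treatment` vector is made up to stand in for `obj_samples$treatment`, and the `compare_groups` call is left commented out because it needs metacoder and a real `obj`.

```r
# Hypothetical group labels standing in for obj_samples$treatment
treatment <- c("control", "control", "disease", "disease",
               "treatment", "treatment",
               "disease+treatment", "disease+treatment")

# One element per comparison; each element is a pair of group names
wanted <- list(
  c("control", "disease"),
  c("control", "treatment"),
  c("disease", "disease+treatment")
)

# Sanity checks: every element is a pair, and every name is an existing group
stopifnot(all(lengths(wanted) == 2))
stopifnot(all(unlist(wanted) %in% unique(treatment)))

# With metacoder loaded and `obj` / `obj_samples` defined as in this thread,
# the call would then look like this (not run here):
# obj$data$diff_table <- compare_groups(obj, "tax_abund",
#                                       cols = obj_samples$sample_id,
#                                       groups = obj_samples$treatment,
#                                       combinations = wanted)
```

The resulting table should then contain only those three comparisons, which can be plotted one at a time as described above.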

zachary-foster avatar Nov 02 '21 07:11 zachary-foster

Yes that does, thanks a lot!

catherineel avatar Nov 02 '21 23:11 catherineel