Exclusive vs inclusive intersections option

Open diplodata opened this issue 6 years ago • 6 comments

If I'm not mistaken, there's no setting to specify whether the intersections in a plot should be inclusive or exclusive, i.e. whether the value for AB counts all observations that are in both A and B, or excludes those that are also in ABC, ABD, etc., which have their own bars. UpSetR implements the latter, e.g. AB = 1 in the following plot:

library(UpSetR)
# three observations: row 1 is in A only, row 2 in A and B, row 3 in A, B and C
x = data.frame(A=1, B=c(0,1,1), C=c(0,0,1))
upset(x, sets = names(x))

image

Am I wrong and there is a way to get a value of 2 for AB with the above dataset (and 3 for A)?

If not, this seems to me a problem for two reasons. First, there's the obvious risk that some readers will interpret AB as inclusive. Second, for datasets with a large number of categories, or where the long tail (..AX) is cut off by the nintersects argument, it's very difficult to get from the plot even an approximate sense of the total intersection between A and B, let alone a figure. Doesn't that rather defeat the point of the whole approach?
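
For reference, the two readings of AB can be computed directly in base R without any plotting package. This is only a minimal sketch: `inclusive_count` and `exclusive_count` are illustrative helper names, not UpSetR functions, and both assume every column is a 0/1 membership indicator.

```r
# Toy data from the post above: three observations, sets A, B, C
x <- data.frame(A = 1, B = c(0, 1, 1), C = c(0, 0, 1))

# Inclusive count: rows that belong to every listed set,
# whatever their membership in the remaining sets
# (hypothetical helper, assumes 0/1 columns)
inclusive_count <- function(df, sets) {
  sum(rowSums(df[, sets, drop = FALSE]) == length(sets))
}

# Exclusive count: rows that belong to exactly the listed sets
# and to none of the others (what the UpSetR bars show)
exclusive_count <- function(df, sets) {
  others <- setdiff(names(df), sets)
  in_all <- rowSums(df[, sets, drop = FALSE]) == length(sets)
  in_no_other <- rowSums(df[, others, drop = FALSE]) == 0
  sum(in_all & in_no_other)
}

inclusive_count(x, c("A", "B"))  # 2
exclusive_count(x, c("A", "B"))  # 1
inclusive_count(x, "A")          # 3
```

The exclusive value for AB matches the bar in the plot, while the inclusive values are the 2 and 3 asked about above.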

diplodata, May 17 '18 13:05

For an example of a long dataset:

x = read.csv('https://gist.githubusercontent.com/diplodata/b2d6610a6e43fddfccea9a8a4821a154/raw/5d795f82f59e6571b78e99f8612edcbb7e5886b8/test.csv')

Take AD as an example. It's almost impossible to tell the size of the intersection from the plot: sum(x$D & x$A) reports 21, but there's no way you could work that out by looking at the graph.

upset(x, sets = names(x))

image
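
The inclusive figure for every pair of sets can be obtained in one step with a cross-product of the membership matrix. The sketch below uses a small simulated data frame called `demo` as a stand-in for the linked CSV (the simulated values and the name `demo` are assumptions for illustration only), and it assumes all columns are 0/1 indicators.

```r
# Hypothetical 0/1 membership data standing in for the linked CSV
set.seed(1)
demo <- as.data.frame(
  matrix(rbinom(200 * 5, 1, 0.3), ncol = 5,
         dimnames = list(NULL, LETTERS[1:5]))
)

# crossprod(m) is t(m) %*% m: entry [i, j] counts the rows where
# both set i and set j are 1; the diagonal holds the set sizes
m <- as.matrix(demo)
pairwise_inclusive <- crossprod(m)

# Same value as sum(demo$A & demo$D)
pairwise_inclusive["A", "D"]
```

This only covers pairs; intersections of three or more sets need an explicit row-wise check.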

diplodata, May 17 '18 14:05

In my opinion, this is a major problem. The true functionality might be explained somewhere in the paper or elsewhere, but quickly looking at these graphs, I feel most people will expect the plot to be "inclusive". I would like to use these graphs to get away from having to do any mental math (which is the problem with Venn diagrams in the first place). I would prefer to look at the graph and immediately determine the inclusive overlap between any number of groups. I'm not sure it's possible, but I would like to request an option to make the graphs display inclusive overlaps. Thanks for a very cool approach!
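
Until such an option exists, one possible workaround is to tabulate the inclusive counts for every combination of sets and plot those. The following is a rough base-R sketch using the toy data from the opening post; it is not an UpSetR feature, it assumes 0/1 columns, and the number of combinations grows as 2^n - 1, so it is only practical for a modest number of sets.

```r
# Toy data from the opening post
x <- data.frame(A = 1, B = c(0, 1, 1), C = c(0, 0, 1))
sets <- names(x)

# Every non-empty combination of set names: A, B, C, AB, AC, BC, ABC
combos <- unlist(
  lapply(seq_along(sets), function(k) combn(sets, k, simplify = FALSE)),
  recursive = FALSE
)

# Inclusive count per combination: rows belonging to all sets in it
counts <- vapply(
  combos,
  function(s) sum(rowSums(x[, s, drop = FALSE]) == length(s)),
  numeric(1)
)
names(counts) <- vapply(combos, paste, character(1), collapse = "")

counts
# barplot(sort(counts, decreasing = TRUE))  # quick base-graphics view
```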

toddknutson, Apr 30 '19 20:04

Nice to see I'm not alone! Alas it's starting to look like UpSetR may be a dead project - a real shame if so.

diplodata, Apr 30 '19 23:04

Hi!

Is there any update on this issue? I am also interested in applying the "inclusive" version and would like to know if you found a way to do it.

AlvaroRodriguezDelRio, Dec 02 '19 14:12

I transitioned to using a different package, called ggupset. I asked how to solve this problem on their GitHub issues page and the developer provided a nice reproducible example. This might be helpful to someone: https://github.com/const-ae/ggupset/issues/3

toddknutson, Dec 02 '19 15:12

Great, this is really helpful! Thanks a lot!!

AlvaroRodriguezDelRio, Dec 02 '19 15:12