UpSetPlot
min_subset_size expressed as a percentage / fraction
Hi, thank you for your support.
I tried to filter values with min_subset_size=0.1 when using show_percentages, but it did not filter (it does work if I set both show_percentages and show_counts to True).
Another question (please let me know if I need to move it to a separate issue):
When using upset.add_stacked_bars and another method, e.g. upset.add_catplot, how can I control the legend location for each plot? I used fig.legend(loc=(1, .7)), but it works for only one of them, not both.
Thanks, Medhat
Hi @MeHelmy, thanks for raising the issue. To help investigate, would you be able to provide complete, runnable code snippets that reproduce the issues?
Thank you!
Here is an example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import cm
from upsetplot import UpSet

# 1000 rows with four random 0/1 membership columns and a random country
mydf = pd.DataFrame(np.random.randint(2, size=(1000, 4)), columns=list('ABCD'))
mydf['Country'] = np.random.choice(['Ge', 'Eg', 'Ni', 'UN'], size=len(mydf))

# Move the membership columns into a boolean MultiIndex, as UpSet expects
first = True
for i in mydf.columns.tolist()[:-1]:
    if first:
        first = False
        mydf = mydf.set_index(mydf[i] == 1)
    else:
        mydf = mydf.set_index(mydf[i] == 1, append=True)

fig = plt.figure(figsize=(20, 10))
upset = UpSet(mydf,
              intersection_plot_elements=0,
              show_percentages=True,
              orientation='vertical',
              show_counts=False,
              min_subset_size=5,
              sort_by='degree',
              )
upset.add_stacked_bars(by="Country", colors=cm.Pastel1,
                       title="Count by country",
                       elements=20,
                       )
fig.legend(loc=(1, 1))
upset.plot()
That results in this plot:
So, I used min_subset_size=5, so I think I should not see 4.6%.
I tried to control the legend position, but it looks like I am doing it wrong. Lastly, the annotations on the vertical bars are not clear; can I change them to be horizontal? (I know we can change the font size with mpl.rcParams['font.size'] = 10 or so, but that is not the issue here.)
Thanks, Medhat
Are you saying that you want min_subset_size to also operate in percentages? Personally I would find it very surprising behaviour if I had code
UpSet(mydf, min_subset_size=20)
and then added a parameter related to display
UpSet(mydf, min_subset_size=20, show_percentages=True)
and got an altogether different plot.
Are you able to share code for your other issue with legends?
> Are you saying that you want min_subset_size to also operate in percentages? Personally I would find it very surprising behaviour if I had code
> UpSet(mydf, min_subset_size=20)
> and then added a parameter related to display
> UpSet(mydf, min_subset_size=20, show_percentages=True)
> and got an altogether different plot.
I understand your point. Let me put it this way: can I filter intersections to include only those representing more than 2% (just a random number here)?
> Are you able to share code for your other issue with legends?
It is in the same mentioned plot.
We could support something like min_subset_frac=.2, or min_subset_size="20%", but we do not support these at present.
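In the meantime, one possible workaround is to convert the fraction into an absolute count before passing it to min_subset_size. The sketch below is only an illustration: fraction_to_min_size is a hypothetical helper, not part of upsetplot, and it assumes every row counts as one element and that the displayed percentages are relative to the total number of rows. With the 1000-row example above, 5% corresponds to 50 rows, which is why min_subset_size=5 still lets a 4.6% subset (46 rows) through.

```python
import math


def fraction_to_min_size(fraction, total):
    """Return the smallest count that is at least `fraction` of `total`.

    Hypothetical helper (not part of upsetplot): `fraction` is in [0, 1],
    `total` is the number of rows in the data frame.
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    # round() guards against float noise such as 0.1 * 3 == 0.30000000000000004
    return math.ceil(round(fraction * total, 9))


# Assumed usage with the example data frame from this thread:
#   UpSet(mydf, min_subset_size=fraction_to_min_size(0.05, len(mydf)), ...)
print(fraction_to_min_size(0.05, 1000))  # 50
```

Because min_subset_size keeps subsets whose size is greater than or equal to the threshold, rounding up gives "at least this fraction"; add 1 if a strict "more than" is wanted.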
> We could support something like min_subset_frac=.2, or min_subset_size="20%", but we do not support these at present.
Sure! I was suggesting an idea :)
Best, Medhat
I've renamed the issue in accordance with this.
Thank you,
Do you have any suggestions for fig.legend(loc=(1, 1)), and for the annotations on the vertical bars? Can I change the annotations to be horizontal?
Thanks! Medhat
I will admit legend placement is not within my expertise... Not sure what you're asking about annotations; label your picture?
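For what it's worth, one thing that might be worth trying is to work on the axes that plot() returns rather than on the figure, and to re-create each legend where you want it. This is an untested sketch, not an upsetplot feature: move_legends is a hypothetical helper, it relies only on the standard matplotlib Axes methods get_legend, get_legend_handles_labels and legend, and re-creating a legend this way may drop its title or change the entry order.

```python
def move_legends(axes, loc="upper left", bbox_to_anchor=(1.02, 1.0)):
    """Re-create the legend of every axes that already has one.

    Hypothetical helper: `axes` is any iterable of matplotlib-style Axes.
    Returns the number of legends that were re-created.
    """
    moved = 0
    for ax in axes:
        if ax.get_legend() is None:
            continue  # this subplot has no legend to move
        handles, labels = ax.get_legend_handles_labels()
        ax.legend(handles, labels, loc=loc, bbox_to_anchor=bbox_to_anchor)
        moved += 1
    return moved


# Assumed usage with the `upset` object from this thread:
#   axes = upset.plot()                      # dict of matplotlib Axes
#   move_legends(axes.values())              # same placement for all
#   move_legends([axes[key]], loc="center left", bbox_to_anchor=(1, 0.5))
# where `key` is whichever entry of `axes` holds the plot to adjust.
```

Calling it once per axes with different arguments gives each plot its own legend position, which a single fig.legend call cannot do.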
Thank you,
The annotation I mean is the one in the upper left corner (red arrow).
You want that text rotated? Currently you would have to get the axes returned by the plot() method, find the annotations and modify them...
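Roughly, that could look like the sketch below. It is only an illustration under a few assumptions: rotate_annotations is a hypothetical helper, the size labels are assumed to be ordinary matplotlib Text artists listed in each axes' .texts, and the labels may need repositioning after rotation.

```python
def rotate_annotations(axes, rotation=0):
    """Set the rotation of every text annotation on the given axes.

    Hypothetical helper: `axes` is any iterable of objects exposing a
    matplotlib-style `.texts` list. Returns the number of texts changed.
    """
    changed = 0
    for ax in axes:
        for text in ax.texts:
            text.set_rotation(rotation)
            changed += 1
    return changed


# Assumed usage with the `upset` object from this thread:
#   axes = upset.plot()                # dict of matplotlib Axes
#   rotate_annotations(axes.values())  # make all size labels horizontal
```

To change only one subplot, pass just that axes, for example rotate_annotations([axes["totals"]]).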
Thank you!
I'm reopening this until we have a solution for min_subset_size expressed as a fraction
I'm actually going to open a new issue so I don't need to keep reading this thread when I get occasion to work on this package.