
Changing the y-axis scale for intersection_size from ComplexUpset package

Open arunimgarg opened this issue 1 year ago • 1 comments

Objective

I want to normalize the intersection_size data to the range 0 to 1. I have created four different UpSet plots using the ComplexUpset package in R. The four plots have different intersection sizes because the data frames range in length from 300 to 12000. I was hoping to use the same y-axis scale across all plots for clarity and ease of discussion.

I have attached two of the four UpSet plots that I need to compare (the labels are redacted because I'm working on a project on a VM at a protected institution). As can be seen, the y-axes of the plots are on different scales.

After reading the UpSet and ComplexUpset documentation, I see that the intersections are calculated internally and cannot really be extracted. I see that you can still manipulate the intersections like:

'Intersection size'=intersection_size(text_mapping=aes(label=paste0(round(
            !!get_size_mode('exclusive_intersection')/!!get_size_mode('inclusive_union') * 100
        ), '%')))

but I couldn't do a normalization like

'Intersection size'=intersection_size(text_mapping=aes(label=paste0(round(
            !!get_size_mode('exclusive_intersection')/max(!!get_size_mode('inclusive_union'))
        ))))

I saw @krassowski's solution to "How to to assign logarithmic scale to “Intersection size” using ComplexUpset library?" and I'm hoping to do something similar, using geom_bar to normalize the values instead of applying a log scale.

Screenshot or illustration

[image: attached UpSet plots]

Context (required)

ComplexUpset version: 1.3.3

arunimgarg, Nov 29 '23 21:11

I have done the following to normalize the intersection size (y = y/max(y)):

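library(ggplot2)
library(ComplexUpset)

# `movies` and `genres` are assumed to be prepared as in the ComplexUpset documentation examples.

# Column marking membership in the exclusive intersection (internal helper, hence `:::`)
presence = ComplexUpset:::get_mode_presence('exclusive_intersection')

# Sum the membership column per intersection to get each intersection's size
summarise_values = function(df) {
    aggregate(
        as.formula(paste0(presence, '~intersection')),
        df,
        FUN=sum
    )
}

upset(
    movies,
    genres,
    base_annotations=list(
        'Normalized intersection size'=(
            ggplot()
            + geom_bar(
                data=summarise_values,
                stat='identity',
                # Divide each size by the largest one so the bars span 0 to 1
                aes(y=!!presence / max(!!presence))
            )
        )
    ),
    width_ratio=0.1
)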
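As a sanity check on the arithmetic, here is a minimal base R sketch of the same y/max(y) normalization on a toy membership table. The data frame `membership` and the helper `intersection_label` are hypothetical names made up for this illustration, not part of ComplexUpset, and the sketch only mimics what the exclusive intersection sizes should look like after scaling.

```r
# Toy membership table (hypothetical data): one row per element,
# one logical column per set.
membership <- data.frame(
    A = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE),
    B = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),
    C = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# Hypothetical helper: label each row with the exact combination of sets
# it belongs to, which is what an exclusive intersection counts.
intersection_label <- function(df) {
    apply(df, 1, function(row) paste(names(df)[row], collapse = "-"))
}

labels <- intersection_label(membership)

# Exclusive intersection sizes: number of rows per combination.
sizes <- table(labels)

# Normalize by the largest intersection so the tallest bar equals 1.
normalized <- setNames(as.numeric(sizes) / max(sizes), names(sizes))
print(normalized)
```

With this scaling the tallest bar in every plot is exactly 1, so bar heights are relative to the largest intersection within each plot. Dividing by `sum(sizes)` instead would express each bar as a fraction of all rows, which is a different choice and may or may not be what is wanted when comparing data frames of different lengths.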

I think the results make sense as I'm seeing them, but if anyone sees any logical mistake, let me know. Otherwise, we're good to close this issue.

arunimgarg, Nov 30 '23 18:11