complex-upset
Memory error when trying to manually sort intersections but not when intersection order is specified with sort_intersections_by
I need the intersections sorted in a particular way. This worked with fewer intersections, but when I added more I received a memory error. However, when the intersection order isn't specified, or is specified with sort_intersections_by, the code works.
intersections = list('RaCa2', 'P14K','t1_18','C24S','I4K','S15K', 'A19K', c('C18S','t1_18'),c('C18S','C24S'),c('I4K','A19K'), c('S15K','A19K','T21K'),c('I4K','S15K','t1_18'),c('I4K','A19K','T21K'), c('P3R','I4K','A19K'),c('P3R','S15K','t1_18'), c('P3R','I4K','S15K','t1_18'), c('P3R','I4K','S15K','C18S','t1_18'), c('A19K','t15_23'), c('A19Dab','K22Dab','K23Dab','t15_23','NH2','_CONH2'), c('A19Dap','t15_23','NH2','_CONH2'), c('A19Dap','K22Dap','K23Dap','t15_23','NH2','_CONH2'), c('A19K','d_23','t15_23','NH2','_CONH2'), c('A19K','d_22','t15_23','NH2','_CONH2'), c('A19K','d_19','t15_23','NH2','_CONH2'), c('A19K','d_19','d_22','d_23','t15_23','NH2','_CONH2'), c('A19K','t15_23','NH2','_CONH2'), c('A19K','K22Dap','t15_23','NH3','_CONH2'), c('A19K','K23Dap','t15_23','NH3','_CONH2'), c('A19Dab','t15_23','NH3','_CONH2'), c('A19K','K22Dab','t15_23','NH3','_CONH2'), c('A19K','K23Dab','t15_23','NH3','_CONH2'), c('C18S','A19K','d_19','d_22','d_23','t15_23','NH3','_CONH2'), c('C18S','A19Dap','K22Dap','K23Dap','t15_23','NH3','_CONH2'), c('C18S','A19Dab','K22Dab','K23Dab','t15_23','NH3','_CONH2') )
This produces the error:
Error: cannot allocate vector of size 524287.9 Gb
Looking at the codebase, yes, this is unfortunately the case: when using user-provided intersections we compute all intersections and then subset, rather than computing only the intersections of interest directly:
https://github.com/krassowski/complex-upset/blob/e3b51dce5022972d782b57f0086ac6b7e025c08b/R/data.R#L514-L515
Ideally we would instead create an equivalent to observed_intersections_matrix and just multiply it as in:
https://github.com/krassowski/complex-upset/blob/e3b51dce5022972d782b57f0086ac6b7e025c08b/R/data.R#L534-L537
But if I recall correctly this was not trivial for some reason...
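To illustrate the idea of working only with the requested intersections, here is a minimal base-R sketch. It is not ComplexUpset's internal code: the names `observations`, `requested`, `requested_matrix` and `row_keys` are made up for this example, and it assumes exclusive intersections (an observation belongs to an intersection only if it is in exactly those sets).

```r
# Hypothetical toy data: three sets, three observations.
sets <- c("A", "B", "C")
observations <- data.frame(
    A = c(TRUE, TRUE, FALSE),
    B = c(TRUE, FALSE, FALSE),
    C = c(FALSE, FALSE, TRUE)
)

# Only the intersections the user asked for, in the desired order.
requested <- list(c("A", "B"), "C")

# One row per requested intersection, one column per set.
requested_matrix <- t(sapply(requested, function(members) sets %in% members))
colnames(requested_matrix) <- sets

# Collapse each logical row into a string key such as "110".
row_keys <- function(m) {
    apply(m, 1, function(row) paste(as.integer(row), collapse = ""))
}

# Index of the requested intersection for each observation (NA = not requested).
assigned <- match(
    row_keys(as.matrix(observations[, sets])),
    row_keys(requested_matrix)
)
print(assigned)  # 1 NA 2
```

In this sketch the work grows with the number of observations and the number of requested intersections, not with the number of all possible combinations of sets.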
Without changing the logic too much, we could try changing the third and fourth arguments in the all_intersections_matrix(intersect, NULL, 0, Inf) call to min(sapply(intersections, length)) and max(sapply(intersections, length)), but I am not sure whether that would work for all cases right now.
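A rough way to see why restricting the degree range could help is to count the candidate combinations. The sketch below uses only base R; `n_sets` and `requested` are illustrative values, not taken from the package.

```r
# Illustrative numbers: 26 sets, and two requested intersections of degree 2 and 3.
n_sets <- 26
requested <- list(c("A", "B"), c("A", "B", "C"))
degrees <- sapply(requested, length)

# Candidates when every degree from 0 to n_sets is enumerated (equals 2^n_sets).
all_candidates <- sum(choose(n_sets, 0:n_sets))

# Candidates when only degrees between the smallest and largest requested are enumerated.
restricted_candidates <- sum(choose(n_sets, min(degrees):max(degrees)))

print(all_candidates)         # 67108864
print(restricted_candidates)  # 2925
```

The saving depends on the requested degrees: if they span a wide range around half the number of sets, most combinations still fall inside the range, which may be one reason the restriction would not help in all cases.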
Hi Michal,
I'm having the same issue as lrichter53 (Error: vector memory exhausted (limit reached?)) and I've tried applying the patch mentioned here: https://stackoverflow.com/questions/72820148/complexupset-how-can-i-plot-selected-intersections. But it seems to have made the issue worse as I receive the memory error a lot quicker. Is there any way to fix the error, please? I'm using ComplexUpset 1.3.3 and have a list of 92 intersections that I need to apply, so it would be great if there was a fix.
Just adding my commands below (it's pretty long):
gene_list = c("unknown", "AmpC1", "BBB_1D_v1", "AAA_1_gene", "AAA_2_genes", "AAA_3_genes", "AAA_4_genes", "BBB_KKK_1_gene", "BBB_KKK_2_genes", "BBB_KKK_3_genes", "BIL_CMY_LAP_SCO_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_2_genes", "UUU_GGG_1_gene", "UUU_GGG_2_genes", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "KKK_GGG_2_genes", "KPC_KKK_MMM_1_gene", "NDM_KKK_MMM_1_gene", "NDM_KKK_MMM_2_genes", "OXA_KKK_MMM_1_gene", "VIM_KKK_MMM_1_gene", "VIM_KKK_MMM_2_genes", "UUU_49_KKK_InhR")
upset(df, gene_list, annotations = list(
'Infection_status'=(
ggplot(mapping=aes(fill=Infection_status))
+ ggtitle("alpha") + theme(plot.title = element_text(size = 60, face = "bold"))
+ geom_bar(stat='count', position='fill')
+ scale_y_continuous(labels=scales::percent_format())
+ scale_fill_manual(values=c('Dead'='#ebb860', 'Alive'='#57109e', 'Unknown'='#468a37')) + ylab('Infection_status')),
'Phenotype'=(
ggplot(mapping=aes(fill=Phenotype))
+ geom_bar(stat='count', position='fill')
+ scale_y_continuous(labels=scales::percent_format())
+ scale_fill_manual(values=c('grey'='#36e345','intermediate'='#eb5278','sleep'='#1a47db')) + ylab('Phenotype')),
'Phenotype (mm)'=ggplot(mapping=aes(x=intersection, y=Disk_measurement)) + ggtitle("Phenotype vs cell (DDD) mutations") + geom_hline(yintercept=18, color="pink", size=1, linetype = 'dashed') + annotate("text",x=50, y =17, label = "ECO_O = 18mm",color = "pink",size = 12) + geom_violin(width=1.1, alpha=1.5) + ggbeeswarm::geom_quasirandom(aes(color=DDD_mutations, size = 1)) + guides(color = guide_legend(override.aes = list(size=7))),
'Phenotypes (mm)'=ggplot(mapping=aes(x=intersection, y=Disk_measurement)) + ggtitle("Phenotype vs Phenotype") + geom_hline(yintercept=18, color="pink", size=1, linetype = 'dashed') + annotate("text",x=50, y =17, label = "ECO_O = 18mm",color = "pink",size = 12) + geom_violin(width=1.1, alpha=1.5) + ggbeeswarm::geom_quasirandom(aes(color=Phenotype, size = 1)) + guides(color = guide_legend(override.aes = list(size=7)))),
sort_intersections=FALSE, intersections=list(c("AAA_1_gene"), c("AAA_2_genes"), c("AAA_3_genes"), c("AAA_4_genes"), c("UUU_49_KKK_InhR"), c("UUU_GGG_1_gene"), c("UUU_GGG_2_genes"), c("BBB_1D_v1", "AAA_1_gene"),
c("AAA_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene"), c("AAA_1_gene", "BBB_KKK_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene"), c("AAA_1_gene", "KKK_GGG_1_gene"), c("AAA_1_gene", "UUU_GGG_1_gene"),
c("AAA_1_gene", "UUU_GGG_2_genes"), c("BBB_1D_v1", "AAA_1_gene", "KKK_GGG_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_GGG_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene"), c("AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene"),
c("AAA_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "KKK_GGG_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "KKK_GGG_1_gene"), c("AAA_1_gene", "NDM_KKK_MMM_1_gene"), c("AAA_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "NDM_KKK_MMM_1_gene"), c("AAA_1_gene", "UUU_GGG_1_gene", "OXA_KKK_MMM_1_gene"),
c("AAA_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), c("AAA_1_gene", "UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), c("AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("AAA_1_gene", "BBB_KKK_2_genes", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), c("AAA_1_gene", "BBB_KKK_2_genes", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"),
c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_2_genes", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_2_genes", "NDM_KKK_MMM_1_gene"), c("AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"),
c("AmpC1", "BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_2_genes", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), c("AAA_1_gene", "UUU_49_KKK_InhR"), c("AAA_2_genes", "BIL_CMY_LAP_SCO_KKK_1_gene"), c("AAA_2_genes", "BBB_KKK_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene"), c("AAA_2_genes", "UUU_GGG_1_gene"), c("AAA_2_genes", "KKK_GGG_1_gene"), c("AAA_2_genes", "NDM_KKK_MMM_1_gene"), c("AAA_2_genes", "BBB_KKK_1_gene", "KPC_KKK_MMM_1_gene"), c("AAA_2_genes", "BIL_CMY_LAP_SCO_KKK_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "KKK_GGG_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KKK_GGG_1_gene"),
c("AAA_2_genes", "BBB_KKK_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene"), c("AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene"), c("AAA_2_genes", "BBB_KKK_2_genes", "KKK_GGG_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "KKK_GGG_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_2_genes"), c("AAA_2_genes", "BBB_KKK_1_gene", "BBB_GGG_1_gene", "KKK_GGG_2_genes", "OXA_KKK_MMM_1_gene"), c("AAA_2_genes", "BBB_KKK_2_genes", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), c("AAA_2_genes", "BBB_KKK_2_genes", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"),
c("AAA_2_genes", "BBB_KKK_2_genes", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_2_genes"), c("AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_2_genes", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"),
c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KKK_GGG_2_genes", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), c("AAA_2_genes", "BBB_KKK_2_genes", "BIL_CMY_LAP_SCO_KKK_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "UUU_49_KKK_InhR"), c("AAA_3_genes", "UUU_GGG_1_gene"), c("AAA_3_genes", "UUU_49_KKK_InhR"), c("BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene"), c("BBB_1D_v1", "BBB_KKK_1_gene", "UUU_GGG_1_gene"), c("BBB_1D_v1", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene"), c("BBB_KKK_3_genes", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene"), c("BBB_1D_v1", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "KKK_GGG_1_gene"),
c("BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_KKK_1_gene", "UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("UUU_GGG_1_gene", "VIM_KKK_MMM_1_gene"), c("UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("BBB_KKK_1_gene", "UUU_GGG_2_genes", "KPC_KKK_MMM_1_gene"), c("BBB_1D_v1", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), c("BBB_1D_v1", "UUU_GGG_2_genes", "BBB_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("BBB_KKK_2_genes", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), c("BBB_KKK_2_genes", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KPC_KKK_MMM_1_gene"), c("BBB_1D_v1", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "VIM_KKK_MMM_1_gene"), c("BBB_1D_v1", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"),
c("BBB_1D_v1", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_2_genes", "NDM_KKK_MMM_1_gene"), c("BBB_1D_v1", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_2_genes"), c("UUU_GGG_1_gene", "UUU_49_KKK_InhR")),
queries=list(
upset_query(intersect=c("unknown"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_3_genes"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_4_genes"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("UUU_49_KKK_InhR"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("UUU_GGG_1_gene"), color='blue', fill='blue', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("UUU_GGG_2_genes"), color='blue', fill='blue', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene"), color='orange', fill='orange', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BBB_KKK_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "KKK_GGG_1_gene"), color='yellow', fill='yellow', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "UUU_GGG_1_gene"), color='yellow', fill='yellow', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "UUU_GGG_2_genes"), color='yellow', fill='yellow', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "KKK_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "KKK_GGG_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_1_gene"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "KKK_GGG_1_gene"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "NDM_KKK_MMM_1_gene"), color='orange', fill='orange', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "NDM_KKK_MMM_1_gene"), color='blue', fill='blue', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "UUU_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), color='green', fill='green', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='green', fill='green', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BBB_KKK_2_genes", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), color='green', fill='green', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BBB_KKK_2_genes", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='green', fill='green', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_2_genes", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "UUU_GGG_2_genes", "NDM_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AmpC1", "BBB_1D_v1", "AAA_1_gene", "BBB_KKK_1_gene", "KKK_GGG_2_genes", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_1_gene", "UUU_49_KKK_InhR"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BIL_CMY_LAP_SCO_KKK_1_gene"), color='yellow', fill='yellow', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_1_gene"), color='yellow', fill='yellow', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene"), color='yellow', fill='yellow', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "UUU_GGG_1_gene"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "KKK_GGG_1_gene"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "NDM_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_1_gene", "KPC_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BIL_CMY_LAP_SCO_KKK_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "KKK_GGG_1_gene"), color='orange', fill='orange', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KKK_GGG_1_gene"), color='orange', fill='orange', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene"), color='darkgreen', fill='darkgreen', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene"), color='darkgreen', fill='darkgreen', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_2_genes", "KKK_GGG_1_gene"), color='darkgreen', fill='darkgreen', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "KKK_GGG_1_gene"), color='darkorchid1', fill='darkorchid1', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene"), color='darkorchid1', fill='darkorchid1', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_2_genes"), color='darkorchid1', fill='darkorchid1', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_1_gene", "BBB_GGG_1_gene", "KKK_GGG_2_genes", "OXA_KKK_MMM_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_2_genes", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_2_genes", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_2_genes", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_2_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='coral', fill='coral', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), color='coral', fill='coral', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_2_genes", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_GGG_1_gene", "KKK_GGG_2_genes", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "AAA_2_genes", "BBB_KKK_1_gene", "KKK_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='purple', fill='purple', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_2_genes", "BBB_KKK_2_genes", "BIL_CMY_LAP_SCO_KKK_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "UUU_49_KKK_InhR"), color='yellow', fill='yellow', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_3_genes", "UUU_GGG_1_gene"), color='blue', fill='blue', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("AAA_3_genes", "UUU_49_KKK_InhR"), color='green', fill='green', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "BBB_KKK_1_gene", "UUU_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_KKK_3_genes", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "KKK_GGG_1_gene"), color='KKKck', fill='KKKck', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene", "OXA_KKK_MMM_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_KKK_1_gene", "UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("UUU_GGG_1_gene", "VIM_KKK_MMM_1_gene"), color='orange', fill='orange', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("UUU_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='orange', fill='orange', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_KKK_1_gene", "UUU_GGG_2_genes", "KPC_KKK_MMM_1_gene"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene", "OXA_KKK_MMM_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "UUU_GGG_2_genes", "BBB_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_KKK_2_genes", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_KKK_2_genes", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KPC_KKK_MMM_1_gene"), color='pink', fill='pink', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "BBB_KKK_1_gene", "UUU_GGG_1_gene", "VIM_KKK_MMM_1_gene"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene", "BBB_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_1_gene"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_2_genes", "NDM_KKK_MMM_1_gene"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("BBB_1D_v1", "BBB_KKK_1_gene", "BIL_CMY_LAP_SCO_KKK_1_gene", "UUU_GGG_1_gene", "KKK_GGG_1_gene", "NDM_KKK_MMM_2_genes"), color='red', fill='red', only_components=c("intersections_matrix", "Intersection size")),
upset_query(intersect=c("UUU_GGG_1_gene", "UUU_49_KKK_InhR"), color='grey', fill='grey', only_components=c("intersections_matrix", "Intersection size")))) + patchwork::plot_layout(heights=c(0.1, 1.0, 0.3, 0.5)) + labs(title = "Co-occurence of alpha genes", caption = "Data: alpha")
Hi @krassowski, do you know if the memory error when manually sorting intersections will be fixed soon? It would be great to continue using ComplexUpset, as it's such a great package.
But it seems to have made the issue worse as I receive the memory error a lot quicker.
I don't believe that patch could ever make things worse performance-wise. Instead, I suspect that the problem you see is not related to this issue but to the number of upset_query calls in your example.
Each upset_query creates a new layer, which may cause ggplot2 to run out of memory in the plotting phase. This would be consistent with your observation that the patch from the faster-specific branch makes the memory error appear sooner, but this is because the performance has improved (you reached the plotting phase sooner). If you just want to give each bar a different fill, you should use aesthetics (as in the 3.2 Fill the bars example), not queries.
Do you still run out of memory if you remove the queries? What is the minimum reproducible example of the problem you are facing (i.e. after removing every single line of code which does not make a difference for the problem at hand)?
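As a sketch of the aesthetics-based approach, the helper below wraps the pattern from the documentation's fill example, where bars are filled by the `mpaa` column of the movies data. The wrapper name `plot_with_fill` is hypothetical, and the call assumes that ComplexUpset, ggplot2 and ggplot2movies are installed; it fills by a per-observation column, so it is not a drop-in replacement for per-intersection colours.

```r
# Hypothetical wrapper: one fill mapping instead of one upset_query per bar.
plot_with_fill <- function(data, sets) {
    ComplexUpset::upset(
        data,
        sets,
        base_annotations = list(
            "Intersection size" = ComplexUpset::intersection_size(
                counts = FALSE,
                # Fill each bar segment by the `mpaa` column of `data`.
                mapping = ggplot2::aes(fill = mpaa)
            )
        )
    )
}

# Usage with the movies data (requires the ggplot2movies package):
# movies <- as.data.frame(ggplot2movies::movies)
# genres <- colnames(movies)[18:24]
# plot_with_fill(movies, genres)
```

This adds a single layer to the intersection size panel regardless of how many intersections are shown.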
do you know if the memory error when manually sorting intersections will be fixed soon?
I don't have bandwidth to work on it this month.
I don't have bandwidth to work on it this month.
But contributions are welcome if anyone has some time to spare!
If you just want to give each bar a different fill, you should use aesthetics (as in the 3.2 Fill the bars example), not queries.
This may not be straightforward with the current public API for the provided example code. I guess this is a separate issue:
- ComplexUpset should combine upset_query layers for the same colour if they are not conflicting
- ComplexUpset should provide an easy way to fill bars based on intersection/intersection cardinality (in addition to mapping based on individual observations); this is already possible with encode_sets=FALSE + aes() mapping but not documented.
Hi @krassowski
I've removed the queries but still have the same memory issue:
gene_list = c("SIV_49_Cla_InhR", "TIM_Cla_Darb_2_genes", "TIM_Cla_Darb_1_gene", "LXA_Cla_Darb_1_gene", "VMM_Cla_Darb_2_genes", "VMM_Cla_Darb_1_gene", "KPC_Cla_Darb_1_gene", "JTX_M_Cla_TRMA_2_genes", "JTX_M_Cla_TRMA_1_gene", "CEM_Cla_TRMA_1_gene", "SIV_Cla_TRMA_2_genes", "SIV_Cla_TRMA_1_gene", "JIL_TMY_LAP_TAA_Cla_2_genes", "JIL_TMY_LAP_TAA_Cla_1_gene", "CEM_Cla_3_genes", "CEM_Cla_2_genes", "CEM_Cla_1_gene", "SIV_Cla_Chr_4_genes", "SIV_Cla_Chr_3_genes", "SIV_Cla_Chr_2_genes", "SIV_Cla_Chr_1_gene", "CEM_3D_v1", "TyoMA", "undefined")
upset(df, gene_list, annotations = list( 'Types (mm)'=ggplot(mapping=aes(x=intersection, y=Size)) + ggtitle("Type vs Phenotype") + geom_violin(width=0.8, alpha=1.5) + ggbeeswarm::geom_quasirandom(aes(color=Phenotype, alpha = I(1/2))) + guides(color = guide_legend(override.aes = list(size=5)))), sort_sets=FALSE, sort_intersections=FALSE, intersections=list(c("CEM_3D_v1", "SIV_Cla_Chr_1_gene", "CEM_Cla_1_gene", "SIV_Cla_TRMA_1_gene"), c("CEM_3D_v1", "SIV_Cla_Chr_1_gene", "CEM_Cla_1_gene", "JIL_TMY_LAP_TAA_Cla_1_gene", "JTX_M_Cla_TRMA_1_gene")))
Error: vector memory exhausted (limit reached?)
Any chance you've found a way to solve this memory problem please?
How big is your data frame? Could you possibly prepare a reproducer using the movies dataset (by duplicating rows as many times as needed and adding as many random group (TRUE/FALSE) columns as needed)? This would help me to look into this locally.
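A reproducer along these lines could also be built without any real data. The sketch below uses only base R; the column names (`set_1`, `set_2`, ...), the sizes and the 20% membership rate are arbitrary choices for illustration.

```r
set.seed(1)  # make the random data reproducible

n_rows <- 2000
n_sets <- 26
set_names <- paste0("set_", seq_len(n_sets))

# Random TRUE/FALSE membership: each observation is in each set with probability 0.2.
membership <- matrix(
    runif(n_rows * n_sets) < 0.2,
    nrow = n_rows,
    dimnames = list(NULL, set_names)
)

# One numeric column to plot in an annotation, plus the membership columns.
synthetic <- data.frame(length = rpois(n_rows, lambda = 90), membership)

# Hypothetical call to try against the synthetic data (requires ComplexUpset):
# ComplexUpset::upset(synthetic, set_names, intersections = list(set_names[1:2]))
```

Changing `n_rows` and `n_sets` independently makes it possible to check whether the failure depends on the number of rows or on the number of sets.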
My personal dataset has 1896 rows with 25 columns, each column representing an intersection. I wasn’t able to replicate this problem using the Movies dataset with the original 58789 rows and 7 genre columns. Even when I increased the number of rows to around 170,000 rows, I still wasn’t able to replicate the problem. I was however able to replicate the problem by changing the Movies dataset, so that it now only has 2000 rows but 26 columns (movie genres or intersections). I’ve attached the data set below, with the minimal Complex-Upset commands required to replicate the problem (listing just two intersections).
Modified Movies dataset:
Complex-Upset commands:
library(ggplot2)
library(ComplexUpset)
my_data4 <- read.csv("movies4.csv", header = TRUE)
df4_movies <- data.frame(my_data4)
genres4 = colnames(df4_movies)[18:43]
upset(df4_movies, genres4, annotations = list( 'Types (mm)'=ggplot(mapping=aes(x=intersection, y=length)) + ggtitle("Type vs Phenotype") + geom_violin(width=0.8, alpha=1.5)), sort_sets=FALSE, sort_intersections=FALSE, intersections=list(c("Action", "Animation", "Comedy", "Drama", "Long", "Musical", "Silent", "Fantasy", "Opera", "Historical", "Detective", "Emmy_winning", "Animals", "Sci_fi"), c("Action", "Animation", "Comedy", "Drama", "Thriller", "Horror", "Long", "Musical", "Silent", "Western", "Fantasy", "Adventure", "New", "Old", "Opera", "Historical", "Detective", "Science_fiction", "Emmy_winning", "Highly_rated", "Cooking", "Animals", "Sci_fi")))
Hi @krassowski, do you know if there is any update on the memory error please? I'm really looking forward to using ComplexUpset with my data set
I would also appreciate this functionality - thank you.