UpSetR
UpSetR copied to clipboard
Set order not maintained when intersection list provided
Dear authors,
I am having issues with the ordering of the upset plot. Here, I provide a reproducible example.
Read the usual movies
data set.
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
header = T, sep = ";")
Generate upset plot with certain ordering (say, alphabetical order) of the sets and use keep.order = T
to retain the specified ordering:
movies %>% upset(., keep.order = T,
sets = c("Action", "Comedy", "Drama", "Romance", "Thriller")
)
As you can see, everything works swimmingly. Now, I attempt to display only certain intersections that I am specifically interested in.
movies %>% upset(., keep.order = T,
sets = c("Action", "Comedy", "Drama", "Romance", "Thriller"),
intersections = list(
list("Action", "Comedy", "Romance"),
list("Action", "Drama", "Thriller"),
list("Action", "Drama", "Romance"),
list("Action", "Thriller"),
list("Drama", "Thriller")
)
)
However, the specified set ordering is not retained despite keep.order = T
executed.
Perhaps, I am making a mistake with the command and parameters. I was also unable to find any issues that resemble this. Perhaps, I missed them. Could you please let me know how I could fix this?
Looking forward to your response.
Cheers, Shaman
Just to mention that I am also running into this issue.
I have also run into the same issue. Hope to know if anyone has been able to solve it.
Same here, any updates?
Writing here that I also have this issue
I also have this issue.
I need this too - I came up with the following temporary hack. I re-defined and reassigned to namespace the function specific_intersections
as below in order to avoid re-reordering the intersections if order.by = NULL
.
Test on OP's example:
upset(movies, keep.order = T,
order.by= NULL,
sets = c("Action", "Comedy", "Drama", "Romance", "Thriller"),
intersections = list(
list("Action", "Comedy", "Romance"),
list("Action", "Drama", "Thriller"),
list("Action", "Drama", "Romance"),
list("Action", "Thriller"),
list("Drama", "Thriller")
)
)
specific_intersections <- function(data, first.col, last.col, intersections, order_mat,
aggregate, decrease, cut, mbar_color, set_names){
data <- as.data.frame(data)
sets <- names(data[c(first.col:last.col)])
keep <- unique(unlist(intersections))
remove <- sets[which(!sets %in% keep)]
remove <- which(names(data) %in% remove)
if(length(remove) != 0){
data <- data[-remove]
}
data <- plyr::count(data[keep])
sets <- names(data[1:length(keep)])
data <- lapply(intersections, function(x){
temp_sets <- unlist(x)
x <- data[which(rowSums(data[1:length(keep)]) == length(temp_sets)), ]
x <- x[which(rowSums(x[temp_sets]) == length(temp_sets)), ]
if(nrow(x) == 0){
names <- names(x[1:length(keep)])
x <- rbind(x, rep(0, ncol(x)))
colnames(x) <- c(names, "freq")
x[ ,which(names %in% temp_sets)] <- 1
}
x <- x
})
Freqs <- data.frame()
for(i in seq(length(data))){
Freqs <- rbind(Freqs, data[[i]])
}
Freqs <- Freqs[c(set_names, "freq")]
num_sets <- length(keep)
if(tolower(aggregate) == "degree" | is.null(order_mat) == TRUE){
for(i in 1:nrow(Freqs)){
Freqs$degree[i] <- rowSums(Freqs[ i ,1:num_sets])
}
if(is.null(order_mat) == FALSE) {
order_cols <- c()
for(i in 1:length(order_mat)){
order_cols[i] <- match(order_mat[i], colnames(Freqs))
}
for(i in 1:length(order_cols)){
logic <- decrease[i]
Freqs <- Freqs[order(Freqs[ , order_cols[i]], decreasing = logic), ]
}
}
} else if(tolower(aggregate) == "sets" & is.null(order_mat) == FALSE) {
Freqs <- Get_aggregates(Freqs, num_sets, order_mat, cut)
} else {
stop('Not implemented yet')
}
#delete rows used to order data correctly. Not needed to set up bars.
delete_row <- (num_sets + 2)
Freqs <- Freqs[ , -delete_row]
for( i in 1:nrow(Freqs)){
Freqs$x[i] <- i
Freqs$color <- mbar_color
}
Freqs <- na.omit(Freqs)
return(Freqs)
}
assignInNamespace('specific_intersections', specific_intersections, ns= 'UpSetR')
I had this issue too. @dariober's solution worked well for me, thanks!
I have the same issue too, and checking @dariober solution I saw that in the code you tried to order the sets, but again in the plot the sets are ordered in other way, while the intersections are correctly ordered... so the problem remains...
Hi, where do I add specific_intersections <- function(data, first.col, last.col, intersections, order_mat, aggregate, decrease, cut, mbar_color, set_names){..}
?
@mictadlo Just execute the code in your interactive session or add it to your script like any other function. Take care to execute also the line assignInNamespace('specific_intersections', specific_intersections, ns= 'UpSetR')
. Perhaps best is to put all that (function definition and assignInNamespace
) in a file and then do source("specific_intersections.R")
at the start of your session.
Thank you, but unfortunately, it does not work with my data. What did I do wrong?
> library("UpSetR")
> orthogroups_df<- read.table("orthogroups.GeneCount.tsv", header=T, stringsAsFactors = F)
> #All species
> selected_species <- colnames(orthogroups_df)[2:(ncol(orthogroups_df) -1)]
> selected_species
[1] "Atha" "Cann" "NQLD" "Natt" "Ngla" "Nlab" "Nsyl" "Ntab" "Ntom" "Slyc" "Stub" "Vvin"
> head(orthogroups_df)
Orthogroup Atha Cann NQLD Natt Ngla Nlab Nsyl Ntab Ntom Slyc Stub Vvin Total
1 OG0000000 0 0 965 0 0 3 0 0 0 0 0 0 968
2 OG0000001 0 1 3 0 0 448 0 0 0 0 0 0 452
3 OG0000002 0 1 313 0 0 120 1 0 1 0 0 0 436
4 OG0000003 0 93 15 21 46 16 33 63 36 25 39 26 413
5 OG0000004 1 42 2 34 109 6 8 154 11 9 4 0 380
6 OG0000005 0 2 61 1 34 44 91 70 43 20 1 0 367
> ncol(orthogroups_df)
[1] 14
> orthogroups_df[orthogroups_df > 0] <- 1
> # we only show intersections of interest ,
> intersections=list(list(selected_species),
+ list("NQLD", "Ngla", "Natt", "Nlab", "Nsyl", "Ntab", "Ntom"),
+ list("Stub", "Slyc"),
+ list("Atha", "Vvin"),
+ list("Ntab", "Nsyl", "Ntom"),
+ list("Nlab", "NQLD", "Ngla"),
+ list("Nlab", "NQLD", "Nsyl"),
+ list("Nlab", "Ngla", "Nsyl"),
+ list("NQLD", "Nsyl", "Ngla"))
> specific_intersections <- function(data, first.col, last.col, intersections, order_mat,
+ aggregate, decrease, cut, mbar_color, set_names){
+ data <- as.data.frame(data)
+ sets <- names(data[c(first.col:last.col)])
+ keep <- unique(unlist(intersections))
+ remove <- sets[which(!sets %in% keep)]
+ remove <- which(names(data) %in% remove)
+ if(length(remove) != 0){
+ data <- data[-remove]
+ }
+
+ data <- plyr::count(data[keep])
+ sets <- names(data[1:length(keep)])
+ data <- lapply(intersections, function(x){
+ temp_sets <- unlist(x)
+ x <- data[which(rowSums(data[1:length(keep)]) == length(temp_sets)), ]
+ x <- x[which(rowSums(x[temp_sets]) == length(temp_sets)), ]
+ if(nrow(x) == 0){
+ names <- names(x[1:length(keep)])
+ x <- rbind(x, rep(0, ncol(x)))
+ colnames(x) <- c(names, "freq")
+ x[ ,which(names %in% temp_sets)] <- 1
+ }
+ x <- x
+ })
+
+ Freqs <- data.frame()
+
+ for(i in seq(length(data))){
+ Freqs <- rbind(Freqs, data[[i]])
+ }
+
+ Freqs <- Freqs[c(set_names, "freq")]
+
+ num_sets <- length(keep)
+
+ if(tolower(aggregate) == "degree" | is.null(order_mat) == TRUE){
+ for(i in 1:nrow(Freqs)){
+ Freqs$degree[i] <- rowSums(Freqs[ i ,1:num_sets])
+ }
+ if(is.null(order_mat) == FALSE) {
+ order_cols <- c()
+ for(i in 1:length(order_mat)){
+ order_cols[i] <- match(order_mat[i], colnames(Freqs))
+ }
+
+ for(i in 1:length(order_cols)){
+ logic <- decrease[i]
+ Freqs <- Freqs[order(Freqs[ , order_cols[i]], decreasing = logic), ]
+ }
+ }
+ } else if(tolower(aggregate) == "sets" & is.null(order_mat) == FALSE) {
+ Freqs <- Get_aggregates(Freqs, num_sets, order_mat, cut)
+ } else {
+ stop('Not implemented yet')
+ }
+ #delete rows used to order data correctly. Not needed to set up bars.
+ delete_row <- (num_sets + 2)
+ Freqs <- Freqs[ , -delete_row]
+ for( i in 1:nrow(Freqs)){
+ Freqs$x[i] <- i
+ Freqs$color <- mbar_color
+ }
+ Freqs <- na.omit(Freqs)
+ return(Freqs)
+ }
> assignInNamespace('specific_intersections', specific_intersections, ns= 'UpSetR')
> upset(orthogroups_df,
+ text.scale = c(1.4),
+ sets=rev(selected_species),
+ nsets = ncol(orthogroups_df),
+ #keep.order=T,
+ #mb.ratio=c(0.5,0.5),
+ #order.by='degree',
+ order.by='freq',
+ intersections = intersections,
+ sets.x.label="Total number of orthogroups",
+ mainbar.y.label = "Number of orthogroups")
I am not able to maintain set order in any situation. Tried different numbers and orders (degree and freq) of intersects.
I have also been having this problem. My current workaround is to specify my desired order as the first intersection in the list. I am grouping by degree so can easily just crop it off the final image since it will be at the end. Not ideal, but it works.