plotly.R
Error bars displayed in incorrect order with color attribute assigned
When error_x is used together with a factor variable mapped to the color attribute in plot_ly, the resulting plot draws the error bars around the wrong points. Below is a reproducible example using source code from the plotly for R book. You can see in the plot that the standard errors are mapped to the wrong coefficients.
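library(plotly)
library(dplyr)

m <- lm(Sepal.Length ~ Sepal.Width * Petal.Length * Petal.Width, data = iris)

# Tidy the coefficients, order terms by estimate, and bin estimates into three levels
d <- broom::tidy(m) %>%
  arrange(desc(estimate)) %>%
  mutate(
    term = factor(term, levels = term),
    one_col = cut(estimate, 3, labels = c("Low", "Medium", "High"))
  )

# Mapping color to one_col is what triggers the misplaced error bars
plot_ly(d, x = ~estimate, y = ~term, color = ~one_col) %>%
  add_markers(error_x = ~list(array = std.error, type = "array")) %>%
  layout(margin = list(l = 200))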
I came across this too. One workaround is to add each sample group via add_trace. The code below demonstrates this. Error bars in the first plot (p1) are incorrectly assigned whereas error bars in the second (p2) are correct.
library(plotly)
library(dplyr)
library(tidyr)
## Raw data
df <- data.frame(sample = rep(paste0('sample ', 1:5), 4),
                 x = rnorm(20),
                 group = rep(paste0('group ', 1:2), each = 10),
                 stringsAsFactors = FALSE
)
## Stats table
df2 <- df %>%
  group_by(sample, group) %>%
  summarise(avg = mean(x), sd = sd(x)) %>%
  ungroup()
## Plotly barchart with error bars. Error bars are incorrectly assigned
p1 <- plot_ly(df2, x = ~sample, y = ~avg, color = ~group, type = 'bar', error_y = list(array = ~df2$sd))
p1
## Create individual columns for group data and errors
df3 <- df2 %>%
  gather(key, value, -c(sample, group)) %>%
  mutate(ref = paste0(group, ifelse(key == 'sd', '_sd', ''))) %>%
  select(-group, -key) %>%
  spread(ref, value)
## Plotly barchart displays error bars correctly
p2 <- plot_ly(df3, type = 'bar')
for (g in unique(df2$group)) {
  p2 <- add_trace(p2, x = df3[['sample']], y = df3[[g]], name = g,
                  error_y = list(array = df3[[paste0(g, '_sd')]]))
}
p2
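A slightly shorter variant of the same idea, offered as a sketch rather than a confirmed fix: keep the stats table in long format and add one bar trace per group from a row subset, so each trace only ever sees its own error values. The names `stats`, `sub` and `p3` are made up for this example, and base R `aggregate` stands in for the dplyr summary so the block has no dependency beyond plotly.

```r
library(plotly)

set.seed(1)

## Same shape as the raw data above
df <- data.frame(sample = rep(paste0("sample ", 1:5), 4),
                 x = rnorm(20),
                 group = rep(paste0("group ", 1:2), each = 10),
                 stringsAsFactors = FALSE)

## Stats table in long format: one row per sample/group pair
avg <- aggregate(x ~ sample + group, data = df, FUN = mean)
sdv <- aggregate(x ~ sample + group, data = df, FUN = sd)
stats <- data.frame(sample = avg$sample, group = avg$group,
                    avg = avg$x, sd = sdv$x,
                    stringsAsFactors = FALSE)

## One trace per group; each trace gets only the rows (and errors) of that group
p3 <- plot_ly()
for (g in unique(stats$group)) {
  sub <- stats[stats$group == g, ]
  p3 <- add_bars(p3, x = sub$sample, y = sub$avg, name = g,
                 error_y = list(type = "data", array = sub$sd))
}
p3
```

This avoids the gather/spread step, at the cost of still building the traces by hand instead of using the color attribute.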
Is this still the most viable solution for getting error bars to show correctly? I have run into a similar situation with grouped time-series data. The y-error bars are not being associated with the correct bars.
Attached is the data and below is the code I'm running:
# The attached XLSX needs to be converted to CSV first
nymphs <- read.csv("nymphs.csv", header = TRUE)

ptest <- plot_ly(nymphs, x = ~week, y = ~nymphs, type = "bar", color = ~treat,
                 error_y = ~list(array = se, type = "array", color = "#000000")) %>%
  layout(xaxis = list(title = "WAT"),
         yaxis = list(title = "Means nymphs per plant"))
Below is a screenshot of what I'm getting - the large error bars should be on "WAT 10", where the bars are much larger as well. Any help would be greatly appreciated - I can't find documentation addressing this issue anywhere.
I was running into this issue when I wanted to filter production data with crosstalk on manufacturing parameters/dates and label the scatterplot by one main parameter. To get the position of the error bars fixed I modified the workaround by harveyl888 (thanks for the input). My modified version for crosstalk with model-data (I have chosen ordered instead of randomized data to get an idea of the structure behind the issue):
library(crosstalk)
library(plotly)
TestDF <- data.frame(a = c(1:9), b = c(rep(c(1:3), 3)), c = c(1:9/10), d = LETTERS[rep(c(1:3),3)])
d_list <- unique(TestDF$d)
shared_Test_all <- SharedData$new(TestDF, group = "Test")
shared_Test <- list()
p1 <- plot_ly(type = "scatter", mode = "markers")
for (i in seq_along(d_list)) {
  shared_Test[[i]] <- SharedData$new(TestDF[TestDF$d == d_list[i],], group = "Test")
  p1 <- add_trace(p1, data = shared_Test[[i]], x = ~a, y = ~b, name = ~d, error_y = ~list(array = c))
}
bscols(list(filter_select("d", "filter by d", shared_Test_all, ~d, multiple = TRUE),
            bscols(plot_ly(data = shared_Test_all, x = ~a, y = ~b,
                           type = "scatter", mode = "markers",
                           error_y = ~list(array = c)),
                   plot_ly(data = shared_Test_all, x = ~a, y = ~b, name = ~d,
                           type = "scatter", mode = "markers",
                           error_y = ~list(array = c)),
                   p1)))
The resulting plots:
The 1st plot has no labels and serves as a reference: the error bars are in the right position. The 2nd plot shows the bug for the model data: the error bars first increase with the parameter "a" within each group, instead of being a/10 for every point. It looks like the error bar values stay in their original (ungrouped) row order, while they are matched to the scatterplot points following the order within the groups. My workaround (3rd plot): I used a list of SharedData environments, one list element per label group, and gave all of them the same group name so they can be filtered together.
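To make the suspected mismatch concrete, here is a small base R sketch using the same TestDF. It only imitates the behaviour described above (points reordered by group, error values left in row order) and is not taken from plotly's internals; the column names `err_shown` and `err_expected` are made up for illustration.

```r
TestDF <- data.frame(a = 1:9, b = rep(1:3, 3), c = (1:9) / 10,
                     d = LETTERS[rep(1:3, 3)])

# Points in the order they would have after splitting into one trace per level of d
idx <- order(TestDF$d)
by_group <- TestDF[idx, c("d", "a")]

# Suspected behaviour: error values are taken in original row order
by_group$err_shown <- TestDF$c

# Intended behaviour: each point keeps the error value from its own row
by_group$err_expected <- TestDF$c[idx]

print(by_group, row.names = FALSE)
```

Under this assumption only the points whose position does not change after grouping (a = 1, 5 and 9) would end up with the right error bar, and every other point would get a neighbour's value.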
Bump. Please fix this
Bump again - this is a very cumbersome workaround.