DataExplorer
                        Group-wise color in scatterplot
Hello. First, thank you for your awesome product. I really appreciate the data exploration tool you've put together.
I'm trying to figure out whether a certain piece of functionality exists. I'm running DataExplorer v0.8.0.
I have the following dataframe:
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	32 obs. of  7 variables:
 $ group          : Factor w/ 2 levels "amber","blue": 1 2 2 1 2 1 1 2 2 2 ...
 $ ess_score      : num  13 14 9 12 1 3 11 8 5 8 ...
 $ rpcsq_rpq3     : num  2 4 2 3 5 0 6 4 0 0 ...
 $ rpcsq_rpq13    : num  4 16 19 2 16 4 15 23 12 16 ...
 $ rpcsq_cognitive: num  0 4 8 0 6 2 2 10 6 7 ...
 $ rpcsq_somatic  : num  6 14 13 3 15 2 12 10 6 6 ...
 $ rpcsq_emotional: num  0 2 0 2 0 0 7 7 0 3 ...
I'd like to produce ess_score x rpcsq scatterplots (5 scatterplots, one per rpcsq variable) with the points colored by group. I've tried the following:
> plot_scatterplot(tmp, by = "ess_score")
This works, but it doesn't color the points. The following attempts, however, all fail to produce the plots:
> plot_scatterplot(tmp, by = "ess_score", geom_point_args = list(col = "group"))
Error in grDevices::col2rgb(colour, TRUE) : invalid color name 'group'
> plot_scatterplot(tmp, by = "ess_score", geom_point_args = list(col = group))
Error in do.call("geom_point", geom_point_args) : 
  object 'group' not found
> plot_scatterplot(tmp, by = "ess_score", geom_point_args = list(group = "group", col = "group"))
Error in grDevices::col2rgb(colour, TRUE) : invalid color name 'group'
Does the functionality I'm looking for exist in the current version of DataExplorer? Thanks for any help you can offer.
Thanks for using DataExplorer. For what you need, you will have to tweak the source code a little. Copy and paste the following function, and you should be able to pass the grouping column through the new group argument.
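library(data.table)
library(ggplot2)
library(DataExplorer)

# Variant of plot_scatterplot() with an extra `group` argument:
# the name of the column used to color the points.
# It calls unexported DataExplorer helpers via `:::`.
plot_scatterplot2 <- function(data, by, group, sampled_rows = nrow(data), geom_point_args = list(), title = NULL, ggtheme = theme_gray(), theme_config = list(), nrow = 3L, ncol = 3L, parallel = FALSE) {
  # Declare `variable` up front; it is used as a column name inside dt[...] below
  variable <- NULL
  # Convert the input to a data.table if needed
  if (!is.data.table(data)) data <- data.table(data)
  # Optionally sample rows
  if (sampled_rows < nrow(data)) data <- data[sample.int(nrow(data), sampled_rows)]
  # Reshape to long format, keeping both `by` and `group` as id columns
  dt <- suppressWarnings(melt.data.table(data, id.vars = c(by, group), variable.factor = FALSE))
  feature_names <- unique(dt[["variable"]])
  # Split the features across pages of nrow x ncol panels
  layout <- DataExplorer:::.getPageLayout(nrow, ncol, length(feature_names))
  plot_list <- DataExplorer:::.lapply(
    parallel = parallel,
    X = layout,
    FUN = function(x) {
      # Map `group` to color inside aes_string() so it is looked up as a column
      ggplot(dt[variable %in% feature_names[x]], aes_string(x = by, y = "value", color = group)) +
        do.call("geom_point", geom_point_args) +
        coord_flip() +
        xlab(by)
    }
  )
  class(plot_list) <- c("multiple", class(plot_list))
  # Facet by variable and apply the title and theme settings
  plotDataExplorer(
    plot_obj = plot_list,
    page_layout = layout,
    title = title,
    ggtheme = ggtheme,
    theme_config = theme_config,
    facet_wrap_args = list(
      "facet" = ~ variable,
      "nrow" = nrow,
      "ncol" = ncol,
      "scales" = "free_x",
      "shrink" = FALSE
    )
  )
}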
Then call it like this:
plot_scatterplot2(tmp, by = "ess_score", group = "group")
I tested it on iris and it works fine:
plot_scatterplot2(iris, by = "Sepal.Length", group = "Species")
Please also keep this issue open. I might be able to add this in future versions, but I can't promise which one.
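For context on why the earlier attempts failed: geom_point_args is handed to geom_point() through do.call(), so col = "group" is read as a literal color name rather than a column. A column has to be mapped inside aes(), and it has to survive the reshape as an id variable. Below is a minimal sketch of the same idea in plain data.table and ggplot2, without the unexported DataExplorer helpers. It uses iris as a stand-in dataset, and the object names dt and p are only illustrative. The layout is a plain facet_wrap(), so it will not look identical to plot_scatterplot2() output.

```r
library(data.table)
library(ggplot2)

# Reshape to long format, keeping the x variable and the grouping column as ids,
# so that both are still available as columns when plotting.
dt <- melt(
  as.data.table(iris),
  id.vars = c("Sepal.Length", "Species"),
  variable.factor = FALSE
)

# Color is mapped inside aes(), so ggplot2 looks up Species as a column
# instead of treating it as a fixed color name.
p <- ggplot(dt, aes(x = Sepal.Length, y = value, colour = Species)) +
  geom_point() +
  facet_wrap(~ variable, scales = "free_y")

print(p)
```

For the dataframe in the question, the equivalent would presumably use id.vars = c("ess_score", "group") and colour = group.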
Thanks @boxuancui. I'll give it a shot.
In general, the ability to color points (or assign other ggplot2 aesthetics) based on groups defined in a column would be quite useful for many of the plotting functions (plot_density, etc.).
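As a rough illustration of what group-wise aesthetics could look like for densities, here is a sketch in plain data.table and ggplot2. This is not DataExplorer functionality and does not use plot_density(); iris stands in for real data, and the names long and p_density are made up for the example.

```r
library(data.table)
library(ggplot2)

# Long format: one row per (observation, continuous variable), with the group kept as an id
long <- melt(as.data.table(iris), id.vars = "Species", variable.factor = FALSE)

# One density panel per variable, with one curve per group
p_density <- ggplot(long, aes(x = value, colour = Species, fill = Species)) +
  geom_density(alpha = 0.3) +
  facet_wrap(~ variable, scales = "free")

print(p_density)
```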