
Create similar plots for all variables in a data frame using a function?

Open · jaydoc opened this issue 3 years ago · 1 comment

Below is the code for a plot I want to make for all numeric variables of a data frame.

# Libraries
library(tidyverse)
library(rstatix)
library(ggpubr)
# Example data (the actual data has around 60 variables)
data_set <-
  data.frame(
    var1 = rep(c("N", "N", "Y", "Y"),4),
    var2 = c(rep("type1",8), rep("type2", 8)),
    var3 = c(rep("type1",4),rep("type2",8),rep("type1",4)),
    x = rnorm(16),
    y = rnorm(16),
    z = rnorm(16)
    )
# Stat tests
stat.test <- data_set %>%
  group_by(var2, var1) %>%
  t_test( x ~ var3) %>%
  adjust_pvalue(method = "bonferroni") %>%
  add_significance("p.adj") %>%
  add_xy_position(x = "var2", dodge = 0.8)

stat.test.1 <- data_set %>%
  group_by(var3, var1) %>%
  t_test( x ~ var2) %>%
  adjust_pvalue(method = "bonferroni") %>%
  add_significance("p.adj") %>%
  add_xy_position(x = "var3", dodge = 0.8) %>%
  mutate(
    xmin = xmin + c(0, 0, -0.6, -0.6),
    xmax = xmax + c(0.6, 0.6, 0, 0),
    y.position = y.position + c(1, 1, 2, 2)
  )
# Plot
ggboxplot(
  data_set,
  x = "var2",
  add = "mean_sd",
  y = "x",
  color = "var3",
  facet.by = "var1"
) +
  stat_pvalue_manual(stat.test,
                     label = "p.adj",
                     tip.length = 0.01,
                     hide.ns = FALSE) +
  stat_pvalue_manual(
    stat.test.1,
    label = "p.adj",
    tip.length = 0.01,
    hide.ns = FALSE
  ) +
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.1)))
[Figure: the resulting plot]

What I want to do is create a function or loop that draws a similar plot for each of the roughly 60 numeric variables in my original data. I have tried various ways to create the stats data frames using tidy evaluation, but have had no luck.

Is this possible? How can I do this? Thanks.

jaydoc · Mar 30 '21 01:03

Hi @jaydoc

This is not an issue with the functionality of rstatix or ggpubr. What you are looking for is iteration, and there are excellent resources for learning it. A basic example for your dataset could look like the following code: it reshapes the data to long format, nests it by variable, and maps a plotting helper over each nested data frame. Subfigures A, B, and C correspond to variables x, y, and z.

library(tidyverse)
library(rstatix)
library(patchwork)


# Helper function for plotting data
plot_data <- function(data) {
  data %>% 
    ggplot(aes(var2, value, color = var3)) + 
    geom_boxplot() + 
    facet_wrap(~ var1)
}


# Example dataset
data_set <- data.frame(
    var1 = rep(c("N", "N", "Y", "Y"),4),
    var2 = c(rep("type1",8), rep("type2", 8)),
    var3 = c(rep("type1",4),rep("type2",8),rep("type1",4)),
    x = rnorm(16),
    y = rnorm(16),
    z = rnorm(16)
  )


# Plot for all variables
data_set %>% 
  pivot_longer(
    cols = x:z,
    names_to = "variable"
  ) %>% 
  group_by(variable) %>% 
  nest() %>% 
  mutate(
    plots = map(data, plot_data)
  ) %>% 
  pull(plots) %>% 
  wrap_plots() + 
  plot_layout(guides = "collect", ncol = 2) + 
  plot_annotation(tag_levels = "A")

Created on 2021-10-06 by the reprex package (v2.0.1)
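
The example above iterates over the plots only. If the rstatix tests from the question should be included as well, one possible approach is to wrap the test and the plot in a function that takes the outcome column name as a string and builds the formula with `reformulate()`. The following is a minimal sketch under stated assumptions, not a definitive solution: `plot_with_tests` and `numeric_vars` are hypothetical names, the seed is added only for reproducibility, and the second test from the question (`stat.test.1`, with its manually tuned bracket offsets) is left out because those offsets depend on the data.

```r
library(tidyverse)
library(rstatix)
library(ggpubr)

set.seed(1)  # assumption: fixed seed so the example is reproducible

# Example dataset (same structure as in the question)
data_set <- data.frame(
  var1 = rep(c("N", "N", "Y", "Y"), 4),
  var2 = c(rep("type1", 8), rep("type2", 8)),
  var3 = c(rep("type1", 4), rep("type2", 8), rep("type1", 4)),
  x = rnorm(16),
  y = rnorm(16),
  z = rnorm(16)
)

# Hypothetical helper: runs the grouped t-test and draws the box plot
# for a single outcome column, given its name as a string
plot_with_tests <- function(data, outcome) {
  # reformulate("var3", response = "x") builds the formula x ~ var3
  stat_test <- data %>%
    group_by(var2, var1) %>%
    t_test(reformulate("var3", response = outcome)) %>%
    adjust_pvalue(method = "bonferroni") %>%
    add_significance("p.adj") %>%
    add_xy_position(x = "var2", dodge = 0.8)

  ggboxplot(
    data,
    x = "var2",
    y = outcome,
    color = "var3",
    add = "mean_sd",
    facet.by = "var1"
  ) +
    stat_pvalue_manual(
      stat_test,
      label = "p.adj",
      tip.length = 0.01,
      hide.ns = FALSE
    ) +
    scale_y_continuous(expand = expansion(mult = c(0.01, 0.1)))
}

# Names of all numeric columns, so nothing has to be listed by hand
numeric_vars <- data_set %>%
  select(where(is.numeric)) %>%
  names()

# One plot per numeric variable, stored in a list named by variable
plots <- map(numeric_vars, ~ plot_with_tests(data_set, .x)) %>%
  set_names(numeric_vars)
```

Individual plots can then be shown with `plots$x`, or the whole list can be passed to `patchwork::wrap_plots()` as in the example above. With around 60 variables, saving each plot to a file with `ggsave()` is probably more practical than combining them into one figure.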

benediktclaus · Oct 06 '21 09:10