Feature Request: fct_reorder_within() for Group-Specific Reordering in Faceted Plots

Open abiyug opened this issue 10 months ago • 2 comments

Problem

When creating faceted plots with ggplot2 (e.g., facet_wrap()), I often need to reorder a factor (like "country") within each facet group (like "category") based on a numeric value (like "value"). The current forcats::fct_reorder() applies a global order across the entire dataset, ignoring the facet-specific context. This makes it hard to visually compare values within each group without workarounds.

What I’ve Tested

fct_reorder(country, value): Reorders "country" globally (e.g., by mean value), but the order doesn’t adjust per facet, misrepresenting within-group rankings.
tidytext::reorder_within(country, value, category): Works perfectly—reorders "country" within each "category" and pairs with scale_x_reordered() for clean labels—but it’s in tidytext, not forcats, which feels off for a general-purpose factor task.
Base R reorder(interaction(country, category), value): Also works but leaves messy x-axis labels (e.g., "country.category"), requiring manual cleanup.
Other forcats options (fct_infreq(), fct_inorder() after pre-sorting): Either global or too manual, not integrating smoothly with faceting.
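
For reference, the base R route from the list above can be sketched without any packages. This is a minimal sketch using the sample values from the reprex in this issue; `strip_group_suffix()` is a hypothetical helper name, and it assumes the category names contain no dots.

```r
country  <- c("Sudan", "South Africa", "Zambia", "South Africa", "Egypt", "Morocco")
category <- c("biofuel", "biofuel", "biofuel", "coal", "coal", "coal")
value    <- c(0.94, 0.4, 0.88, 192.73, 179.22, 33.12)

# One level per country/category pair, ordered by value
# (reorder() summarises with the mean per level by default)
combined <- reorder(interaction(country, category, drop = TRUE), value)
levels(combined)

# Hypothetical helper: drop the trailing ".category" suffix for display
strip_group_suffix <- function(labels) sub("\\.[^.]*$", "", labels)
clean <- strip_group_suffix(levels(combined))
clean
```

The cleaned labels repeat "South Africa", which is fine for axis text but is exactly why the combined names, not the cleaned ones, have to serve as the factor levels.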

Desired Outcome

A new function in forcats, like fct_reorder_within(factor, value, group), that:

Reorders a factor within each level of a grouping variable (e.g., for faceting).
Plays nicely with ggplot2’s facet_wrap() or facet_grid() with scales = "free".
Ideally pairs with a label-cleaning helper (like tidytext::scale_x_reordered()) to avoid suffixes in the plot.

This would centralize group-specific reordering in forcats, reducing reliance on tidytext or base R hacks for a common visualization need.

Reproducible Example (Reprex)

Here’s a minimal example showing the issue and desired behavior:

library(tidyverse)

# Sample data: countries with electricity values across categories
data <- tibble(
  country = c("Sudan", "South Africa", "Zambia", "South Africa", "Egypt", "Morocco"),
  category = c("biofuel", "biofuel", "biofuel", "coal", "coal", "coal"),
  value = c(0.94, 0.4, 0.88, 192.73, 179.22, 33.12)
)

# Current fct_reorder: global order, wrong within facets
ggplot(data, aes(x = fct_reorder(country, value), y = value)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ category, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# South Africa ranks low in "biofuel" but high globally due to "coal" value

# Desired with tidytext::reorder_within (what I want in forcats)
library(tidytext)
ggplot(data, aes(x = reorder_within(country, value, category), y = value)) +
  geom_bar(stat = "identity") +
  scale_x_reordered() +
  facet_wrap(~ category, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
# Correct: Sudan > Zambia > South Africa in "biofuel"; South Africa > Egypt > Morocco in "coal"

# Proposed forcats function (hypothetical)
# ggplot(data, aes(x = fct_reorder_within(country, value, category), y = value)) +
#   geom_bar(stat = "identity") +
#   scale_x_forcats_reordered() +  # Or similar
#   facet_wrap(~ category, scales = "free")

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Big Sur 11.2

packageVersion("forcats") # 1.0.0
[1] ‘1.0.0’
packageVersion("ggplot2") # 3.5.1
[1] ‘3.5.1’
packageVersion("tidyverse") # 2.0.0
[1] ‘2.0.0’
packageVersion("tidytext") # 0.4.2
[1] ‘0.4.2’

abiyug avatar Feb 26 '25 08:02 abiyug

I'm not sure what the forcats team's perspective on this question is, but since there's a scale_* suggestion as well, bringing in ggplot2, I thought I'd offer the following -- the ggplot within-facet ranked bar chart question is very interesting!

If you take a two-variable approach to creating y, interacting the target y label with the rank of x (I switched x and y from your example), you can accomplish the goal in a way that is easier to think about and does not reach beyond the core tidyverse. The faceting step stays the same. The suggested adjustment to the scale labels with str_remove, however, does look a little hairy...

library(tidyverse)

# Without faceting: create a factor that interacts the display variable (country) with the rank of x (pop)
gapminder::gapminder |>
  filter(continent == "Americas") |>
  filter(year > 2000) |>
  ggplot(
    aes(x = pop,
        y = interaction(country, rank(-pop)))
  ) +
  geom_col()

Image

# faceting with "free_y", we see that we are close, but an adjustment to y scale labels is needed
last_plot() +
  facet_wrap(~year, scales = "free_y")

Image

# using a function on the y scale labels to remove the absolute (w/o faceting) rankings
last_plot() + 
  scale_y_discrete(labels = function(x) str_remove(x, "\\.\\d+?$")) 

Image

To reverse the y axis for a discrete variable, so that longer bars appear at the top, I believe coord_trans(y = "reverse") would be the way.

EvaMaeRey avatar Mar 03 '25 16:03 EvaMaeRey

Thanks Gina. Your suggestion is helpful and works! I liked the tidytext::reorder_within(country, value, category) option; however, I think adding this functionality is a natural extension for forcats. I need to order within facets fairly often myself.

Here is a custom function I put together with AI assist.

# Custom function: fct_reorder_within
# Purpose: Reorder a factor within groups based on a numeric value, for faceted plots
# Arguments:
#   x: The factor to reorder (e.g., "country")
#   y: Numeric values to order by (e.g., "value")
#   group: Grouping variable (e.g., "name" or "category")
#   decreasing: Logical, FALSE for ascending order (default), TRUE for descending
fct_reorder_within <- function(x, y, group, decreasing = FALSE) {
  # Input validation
  if (!is.factor(x)) x <- as.factor(x)  # Ensure x is a factor
  if (length(x) != length(y) || length(x) != length(group)) {
    stop("All arguments (x, y, group) must have the same length")
  }
  
  # Create a data frame (base R)
  df <- data.frame(x = x, y = y, group = group, stringsAsFactors = FALSE)
  
  # Create a unique identifier combining x and group
  # Rationale: Ensures uniqueness across groups
  df$combined <- paste(df$x, df$group, sep = "__")
  
  # Order the data by group and value
  # Rationale: order() with decreasing = FALSE puts smallest first, so highest is last
  ord <- order(df$group, df$y, decreasing = decreasing)
  df <- df[ord, ]
  
  # Create a factor with ordered levels based on the combined identifier
  # Rationale: Levels go smallest to largest, flipping to largest at top with coord_flip
  result <- factor(df$combined, levels = unique(df$combined))
  
  # Reorder back to original row order
  # Rationale: Matches input row order for plotting
  result <- result[order(ord)]
  
  return(result)
}
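
The function above returns levels of the form "country__category", so the axis text still needs cleaning before plotting. A minimal base R sketch of such a cleaner follows; `strip_within_suffix()` is a hypothetical name, and it assumes the separator does not occur inside the original labels.

```r
# Hypothetical label cleaner for levels built as paste(x, group, sep = "__")
strip_within_suffix <- function(labels, sep = "__") {
  # Position of the first literal occurrence of sep (-1 when absent)
  pos <- as.integer(regexpr(sep, labels, fixed = TRUE))
  # Keep everything before the separator; leave labels without one untouched
  ifelse(pos > 0, substr(labels, 1, pos - 1), labels)
}

cleaned <- strip_within_suffix(c("Sudan__biofuel", "South Africa__coal", "Egypt"))
cleaned
```

A function like this could be passed as the labels argument of ggplot2's scale_x_discrete() or scale_y_discrete(), in the same way str_remove is used earlier in this thread.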

abiyug avatar Mar 04 '25 07:03 abiyug

I don't think it's possible to implement this in forcats, because (and please correct me if I'm wrong) you can't do this with a factor — you need some more complex data type. I think that makes it out of scope for forcats.

hadley avatar Aug 19 '25 04:08 hadley

Hi Hadley, Thanks for your comment! I see your point about factors potentially needing a complex data type, but I believe fct_reorder_within() fits forcats' scope as a natural extension for factor manipulation in tidyverse workflows, especially for ggplot2 faceted plots where group-specific reordering is common. Solutions like tidytext::reorder_within() show this can work with standard factors (e.g., via level name tweaks). Could you elaborate on why you think this exceeds factor capabilities or forcats' focus? I’d appreciate your thoughts!
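
Both points can be seen in a small base R sketch. The data here is made up for illustration: label "A" ranks below "B" in one group and above it in the other, and the "___" separator simply mirrors the tidytext output shown later in this thread.

```r
label <- c("A", "B", "A", "B")
group <- c("g1", "g1", "g2", "g2")
value <- c(1, 2, 2, 1)   # A < B in g1, but A > B in g2

# A plain factor stores one level order shared by every row,
# so "A" cannot sit before "B" in g1 and after "B" in g2
f <- factor(label)
levels(f)

# Encoding the group into the level names gives each label/group pair
# its own position, at the cost of labels that need cleaning for display
combined <- paste(label, group, sep = "___")
g <- factor(combined, levels = combined[order(group, value)])
levels(g)
```

The second factor orders correctly within each group, but it only becomes plot-ready once a scale strips the suffix, which is the extra ggplot2 step discussed in the replies.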

abiyug avatar Aug 19 '25 08:08 abiyug

Given that it requires some special ggplot2 manipulation after creating the variable, I think it would be better to find a home where the two pieces can live together.

hadley avatar Aug 19 '25 16:08 hadley

@hadley Thanks for your response and insight on the ggplot2 dependency! Thanks also to @EvaMaeRey for her earlier input in the discussion. I understand your point about needing a unifying home for both the factor reordering and the plotting components. I’ll work on a helper package, in case others find it useful, but I still believe group-specific reordering could enhance forcats for tidyverse faceting workflows. Appreciate your time!

abiyug avatar Aug 19 '25 17:08 abiyug

Interesting discussion! It might be worth a chat with ggcharts author @thomas-neitmann about ggcharts::bar_chart(): how he arrived at that (popular!) solution and the decision to wrap up the data manipulation, reorder_within, and scaling steps. It does feel like a bit of an involved train of thought to do with base ggplot2.

fct_cross seems like the nearest kin to tidytext::reorder_within. To get tidytext::reorder_within-type ordering, you could apply fct_reorder twice to the fct_cross result: first on the value, then on the category.

library(tidyverse)
data <- tibble(
  country = c("Sudan", "South Africa", "Zambia", "South Africa", "Egypt", "Morocco"),
  category = c("biofuel", "biofuel", "biofuel", "coal", "coal", "coal"),
  value = c(0.94, 0.4, 0.88, 192.73, 179.22, 33.12)
)

tidytext::reorder_within(x = data$country, 
                         by = data$value,
                         within = data$category) |>
  tibble(cat = _, data$value) |> 
  arrange(cat)
#> # A tibble: 6 × 2
#>   cat                    `data$value`
#>   <fct>                         <dbl>
#> 1 South Africa___biofuel         0.4 
#> 2 Zambia___biofuel               0.88
#> 3 Sudan___biofuel                0.94
#> 4 Morocco___coal                33.1 
#> 5 Egypt___coal                 179.  
#> 6 South Africa___coal          193.


fct_cross(data$country, data$category, sep = "____") |> # create new levels
  fct_reorder(data$value) |>               # reorder on value (`by` argument)
  fct_reorder(data$category) |>            # reorder on cat (`within` argument) 
  tibble(cat = _, value = data$value) |> 
  arrange(cat)
#> # A tibble: 6 × 2
#>   cat                      value
#>   <fct>                    <dbl>
#> 1 South Africa____biofuel   0.4 
#> 2 Zambia____biofuel         0.88
#> 3 Sudan____biofuel          0.94
#> 4 Morocco____coal          33.1 
#> 5 Egypt____coal           179.  
#> 6 South Africa____coal    193.

Created on 2025-08-19 with reprex v2.1.1

EvaMaeRey avatar Aug 19 '25 19:08 EvaMaeRey

You might find the discussion here useful: https://github.com/ggplot2-extenders/ggplot-extension-club/discussions/41 Some of the standard extension points that feel like good fits for this problem (new scale, stat) don't help because of early y scale training - so you are stuck with the factors as is when you declare aes(y = country).

Maybe something like aes_y_by_x could be a helper to think about...

library(tidyverse)

aes_y_by_x <- function(x, y) {
  aes(x = {{ x }},
      y = interaction(rank({{ x }}), {{ y }}, lex.order = TRUE))
}



gapminder::gapminder |>
  filter(year %in% c(1967, 2007)) |> 
  filter(continent == "Americas") |>
  ggplot() + 
  aes_y_by_x(x = pop, y = country) +
  geom_col() + 
  facet_wrap(facets = vars(year), 
             scales = "free_y") 


last_plot() +
  scale_y_discrete(labels = function(x) str_remove(x, "^\\d+?\\."))


last_plot() +
  ggplyr::data_slice_max(pop, by = year)

Created on 2025-08-23 with reprex v2.1.1
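
Why lex.order = TRUE matters in aes_y_by_x can be checked in base R on a toy vector. This is a sketch with made-up values; drop = TRUE and ties.method = "first" are additions for the illustration and are not part of the helper above.

```r
x <- c(30, 10, 20)      # numeric variable that drives the order
y <- c("a", "b", "c")   # labels to display

# With lex.order = TRUE the first argument (the rank) varies slowest,
# so the levels sort by rank before they sort by label
f <- interaction(rank(x, ties.method = "first"), y,
                 lex.order = TRUE, drop = TRUE)
levels(f)

# Strip the leading "rank." prefix to recover the display labels
display <- sub("^\\d+\\.", "", levels(f))
display
```

One caveat: with the default ties.method, rank() can return fractional ranks such as 1.5 for ties, and a prefix pattern like the one above would then strip only part of the rank. Using ties.method = "first" is one way to avoid that.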

EvaMaeRey avatar Aug 23 '25 14:08 EvaMaeRey

@EvaMaeRey I was unfamiliar with the interaction approach, but your suggestion rendered the desired grouped ordered plots within a ggplot facet!

Initially, I approached this problem using a tidyverse workflow, expecting ggplot2 to handle grouped factor ordering seamlessly. However, I discovered that ggplot2, by default, does not automatically preserve the order of the factor column (e.g., country) when using facet_wrap. This limitation prompted my initial attempt to address the issue by proposing a new fct_reorder_within/scale function in forcats to adapt to grouped ordering. However, my true goal was to create a dataset with factors reordered within each group (e.g., by category), enabling not only ggplot2 visualizations but also tables and other downstream analyses.

To achieve this, I used a wrangling approach that involves:

Adding a rank column using mutate to determine the order based on a metric (e.g., value).
Reordering the factor levels with fct_reorder to reflect the grouped ranking.

This pre-processed data works with ggplot2::facet_wrap for this example and provides a reusable approach for various applications. Below are the relevant reproducible scripts that implement this solution:

#data
df <- tibble(
  country = c("Sudan", "South Africa", "Zambia", "South Africa", "Egypt", "Morocco"),
  category = c("biofuel", "biofuel", "biofuel", "coal", "coal", "coal"),
  value = c(0.94, 0.4, 0.88, 192.73, 179.22, 33.12)
)

# wrangle: rank within each group, then reorder the factor by that rank
df_grp_fct_ordr <-
df %>%
     group_by(category) %>%
     mutate(
       rank_value = rank(-value, ties.method = "first"),
       country = fct_reorder(country, rank_value)
     ) %>%
     ungroup() %>%
     select(-rank_value)

View the countries within each category, sorted by value

df_grp_fct_ordr %>%
     group_by(category) %>%
     arrange(value) %>%
     summarise(levels = list(as.character(country))) %>%
     deframe()

$biofuel
[1] "South Africa" "Zambia"       "Sudan"       

$coal
[1] "Morocco"      "Egypt"        "South Africa"

Viz

# Viz the group ordered facet plots
df_grp_fct_ordr %>% 
ggplot(aes(x = value, y = country)) +
     geom_col() +
     facet_wrap(~ category, scales = "free") +
     scale_y_discrete(limits = rev)
Image

Here is a generalized function that can be used to wrangle and prepare data for ggplot2 facet_wrap:

fct_grouped_order <- function(data, 
                              group_col = "category_var", 
                              order_col = "value_var", 
                              label_col = "label_var") {
  data %>%
    group_by(!!sym(group_col)) %>%
    mutate(
      rank_value = -rank(!!sym(order_col), ties.method = "first"),  # Negate the rank values
      !!sym(label_col) := fct_reorder(!!sym(label_col), rank_value)
    ) %>%
    ungroup() %>%
    select(-rank_value)
}


# Test the function to generate a group-ordered factor
df_ordered <- fct_grouped_order(df, 
                                group_col = "category", 
                                order_col = "value", 
                                label_col = "country")

Given that reordering within facets is a common requirement, a dedicated ggplot2 feature to handle this natively could significantly reduce cognitive load. Such a feature might allow users to specify an ordering variable directly within facet_wrap() or a new scale function, eliminating the need for pre-wrangle steps. This could enhance usability, especially for users unfamiliar with data manipulation, though it would need to balance flexibility with ggplot2’s philosophy of keeping core functionality minimal. What are your thoughts?

abiyug avatar Aug 24 '25 17:08 abiyug

I think it's probably better to move the discussion to a ggplot2 venue like https://github.com/ggplot2-extenders/ggplot-extension-club/discussions/41 (more open-ended discussion) or the ggplot2 repo (a specific request), since this is outside the scope of forcats?

EvaMaeRey avatar Aug 24 '25 17:08 EvaMaeRey

I agree, this ended up being more of a ggplot2 limitation than a forcats one.

abiyug avatar Aug 24 '25 17:08 abiyug