geom_bar()/geom_col() erroneously warn that they ignore width aesthetic
geom_bar() and geom_col() let you specify a width aesthetic to control the width of the bars.
The behavior is as expected, but it generates an erroneous warning "Warning: Ignoring unknown aesthetics: width".
width isn't listed in the aesthetics section of ?geom_bar, so it appears that this is an unofficial behavior.
library(ggplot2)
suppressPackageStartupMessages(library(dplyr))
mtcars_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarize(
    mean_wt = mean(wt),
    n = n()
  ) %>%
  mutate(prop = n / sum(n))
ggplot(mtcars_by_cyl) +
  geom_col(aes(cyl, mean_wt, width = prop))
#> Warning: Ignoring unknown aesthetics: width

Created on 2019-02-13 by the reprex package (v0.2.0).
Similar closed issues.
- https://github.com/tidyverse/ggplot2/issues/1904
- https://github.com/tidyverse/ggplot2/issues/2473
Currently, width is recognized as a parameter by a "hack". Here's the comment, written 4 years ago. Maybe it's worth trying to make width a proper aes?
https://github.com/tidyverse/ggplot2/blob/43dcd632fe96d412c13689454ffee366aaa39ce3/R/geom-bar.r#L130
Elsewhere (e.g. boxplot), width is added to the list of extra_params:
https://github.com/tidyverse/ggplot2/blob/03bd9461fd0ae236d15be6d215a42911518b18ee/R/geom-boxplot.r#L162
width works just fine as a parameter in the way the code is currently written, and the "hack" is fine also. The question is whether width should be an aesthetic. I'm skeptical, because bars with varying widths are not normally meaningful. It's not that different a case from bars that start from a base value other than zero, which we also don't support. If people really want to do something like this, they can use geom_rect() or geom_tile() instead.
The question is whether width should be an aesthetic
Isn't width already an aes? At least, the plot above seems to have varying widths of bars.
Sorry, I was confused. I now think the varying widths in geom_col() are just a mistake. It uses data$width, but that should really be "ignored", as the warning says.
https://github.com/tidyverse/ggplot2/blob/43dcd632fe96d412c13689454ffee366aaa39ce3/R/geom-col.r#L40-L46
In geom_bar()'s case, stat_count() provides the width, so it should be used. But geom_col() uses stat_identity(), from which we should not expect a width.
But, in terms of the interface (I don't mean the current behaviour is semantically correct), width is provided by a Stat via data. So, it is virtually an aes.
I'm wondering why width is not passed via param...
Oh, this last example reminds me of the need for varying width.
# You can specify a function for calculating binwidth,
# particularly useful when faceting along variables with
# different ranges
https://ggplot2.tidyverse.org/reference/geom_histogram.html
Here's my understanding. Is this correct?
- We want to enforce a constant bar width within a panel, so width cannot be an aes.
- Yet, the width can vary among panels, so we need to pass widths per bar via data, not a single value via param.
- data$width should be used only when the Stat provides it. But geom_col() is not the case, it should ignore data$width.
Just a comment in passing: If width were to be passed as an aes to capture the relative amounts of some variable in the dataset, the bar-chart would become a sort of rectangular-shaped pie-chart, where the area --- not the length --- becomes the relevant metric (I don't think that's "meaningless", but would suffer from most of the problems that pie-charts have). As far as I can tell, Hadley (among others) is not fond of pie-charts.
For the standard bar-chart with "meaningless" width, I would argue that the current default width of geom_bar is too wide: narrower bars would help the eye focus on the important metric --- height. Excel and LibreOffice/Calc seem to go for a default of 100%, i.e. the space between bars = width of the bars. geom_bar is wider than that. Does anyone else think it ought to be narrower?
library("reprex")
library("ggplot2")
ggplot(mtcars, aes(x = gear)) + geom_bar()

ggplot(mtcars, aes(x = gear)) + geom_bar(width = 0.5)

ggplot(mtcars, aes(x = gear)) + geom_bar(width = 0.25)

Created on 2019-02-18 by the reprex package (v0.2.1)
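For reference, the width that geom_bar() ends up drawing in plots like the ones above can be read back from the built layer data. This is only a minimal sketch: it assumes layer_data() returns xmin/xmax columns for bars, and the object names (p, bars, drawn_width) are mine, not from the thread.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = gear)) + geom_bar()

# layer_data() returns the data frame the geom draws from
bars <- layer_data(p)

# Drawn width of each bar, in units of the x scale
drawn_width <- as.numeric(bars$xmax) - as.numeric(bars$xmin)
drawn_width
```

The documented default is 90% of the resolution of the data, so with integer gear values the bars leave a gap of about a tenth of a unit between neighbours.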
Just to comment here that allowing width as an aesthetic can be used to have different "sized" pies in pie charts, which is quite useful (I mean, as useful as a pie chart can be...):
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
d <- mtcars %>%
  group_by(am) %>%
  count(cyl) %>%
  mutate(total = sum(n),
         norm_n = n / total)
(p <- ggplot(d, aes(0, norm_n, fill = factor(cyl))) +
    facet_grid(cols = vars(am)) +
    geom_col(aes(width = total), position = position_stack()))
#> Warning: Ignoring unknown aesthetics: width

p + aes(x = total/2) + coord_polar("y")

Created on 2021-08-12 by the reprex package (v2.0.0)
Variable width is useful for bar charts by month, to prevent the bars from overlapping. Especially if you want no gaps between the bars, but also because you'll get inconsistent gaps otherwise.
You can hack it by making the dates a factor instead, but then you need to do much more work to get a nice date axis.
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
library(ggplot2, warn.conflicts = FALSE)
set.seed(1)
df <- tibble(
  date = seq.Date(ymd("2020-01-01"), ymd("2020-12-01"), by = "1 month"),
  quantity = sample(20:100, 12),
  ndays = days_in_month(date) # width for different months
) |>
  mutate(date = date + ndays / 2) # reposition to fix the overlaps
df
#> # A tibble: 12 × 3
#> date quantity ndays
#> <date> <int> <int>
#> 1 2020-01-16 87 31
#> 2 2020-02-15 58 29
#> 3 2020-03-16 20 31
#> 4 2020-04-16 53 30
#> 5 2020-05-16 62 31
#> 6 2020-06-16 33 30
#> 7 2020-07-16 78 31
#> 8 2020-08-16 70 31
#> 9 2020-09-16 40 30
#> 10 2020-10-16 73 31
#> 11 2020-11-16 26 30
#> 12 2020-12-16 56 31
df |>
  ggplot() +
  geom_col(aes(date, quantity, width = ndays), alpha = 0.7)
#> Warning in geom_col(aes(date, quantity, width = ndays), alpha = 0.7): Ignoring
#> unknown aesthetics: width

Created on 2023-11-30 with reprex v2.0.2
I understand that we do not want to encourage bar charts with variable widths, however I do think enforcing this is causing us more pain than gain. I'd like to challenge some points in favour of not recognising width as an aesthetic.
width works just fine as a parameter in the way the code is currently written,
Not really. It throws warnings about being ignored, while it is being used.
the "hack" is fine also
While the hack works to recognise the parameter, we wouldn't need the hack at all if it were a proper aesthetic.
If people really want to do something like this, they can use geom_rect() or geom_tile() instead.
- geom_tile() is not a good alternative, for two reasons. The height aesthetic is not a position aesthetic, so it does not respond to scale transformations. Scale-transformed bar charts are probably a bad idea anyway, but I don't think we should prohibit it. Secondly, you have to use y = after_stat(count / 2) when pairing a bar chart with a stat, which is clunky.
- geom_rect() is not a good alternative, also for two reasons. You have to specify ymin = 0, which is clunky. More importantly, when using a discrete x variable, the xmin and xmax are a pain to compute, because you'd have to manually convert the discrete variable into a continuous one.
- If you want to solve most of these issues, you'd want a geom that has x/width parametrisation for the horizontal direction, but ymin/ymax parametrisation for the vertical direction. This geom does not exist.
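To make the geom_rect() point concrete, here is a rough sketch of what the workaround looks like for the mtcars example from the top of this issue. It is not a recommended approach, just an illustration of the manual steps; the data frame d and the helper column x are names I made up for this example.

```r
library(ggplot2)

# One row per bar, built with base R so only ggplot2 is needed
d <- data.frame(
  cyl     = factor(c(4, 6, 8)),
  mean_wt = as.numeric(tapply(mtcars$wt, mtcars$cyl, mean)),
  prop    = as.numeric(table(mtcars$cyl)) / nrow(mtcars)
)

# Manual discrete-to-continuous conversion: factor levels sit at 1, 2, 3
d$x <- as.integer(d$cyl)

p <- ggplot(d) +
  geom_rect(aes(
    xmin = x - prop / 2, xmax = x + prop / 2,  # width has to be spelled out
    ymin = 0, ymax = mean_wt                   # ymin = 0 has to be spelled out
  )) +
  # Restore the discrete-looking axis by hand
  scale_x_continuous(breaks = d$x, labels = levels(d$cyl))
```

Compared with geom_col(aes(cyl, mean_wt, width = prop)), this needs an extra column, four position aesthetics instead of two, and a hand-built axis.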
data$width should be used only when the Stat provides it. But geom_col() is not the case, it should ignore data$width.
Ideally, the geom shouldn't care whence the width data came. Baking in prohibitions for specific geom/stat pairings hurts the flexibility of the API and should, in my opinion, only ever be used to enhance displays, not prohibit them.
I'd also like to re-iterate some points in favour of width as aesthetic.
- We already allow bars with varying width directly from the aesthetics. Sure, we throw a warning in protest, but then promptly display the bars as people intended anyway. We can even circumvent this warning by using ggplot(..., mapping = aes(..., width = var)) as it'll end up in the layer data even for layers that don't have width as an aesthetic or parameter.
- There are valid use-cases from a user perspective, as pointed out elsewhere in this issue.
- There are valid use-cases from a developer perspective, such as when width comes from a position adjustment, stat computation, or needs to vary between panels.
- Maintaining width as a proper aesthetic is easier than relying on the current hack.
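The circumvention mentioned in the first point can be sketched as follows. The toy data frame and its column names are made up for illustration, and whether the layer-level form still warns will depend on the ggplot2 version installed.

```r
library(ggplot2)

# Toy data, not from the thread
d <- data.frame(x = c(1, 2, 3), y = c(2, 5, 3), w = c(0.3, 0.6, 0.9))

# width mapped in the plot-level mapping rather than in geom_col()
p_global <- ggplot(d, mapping = aes(x, y, width = w)) +
  geom_col()

# The mapped widths reach the layer data and determine the drawn bars
bars <- layer_data(p_global)
as.numeric(bars$xmax) - as.numeric(bars$xmin)
```

The bars are drawn with the mapped widths either way; the only difference from geom_col(aes(x, y, width = w)) is where the mapping is declared.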
In summary, the main argument against width as an aesthetic is that it might possibly encourage some bad visualisation. However, we can't stop people from doing this anyway and having ggplot2 jump through hoops to discourage this is causing discomfort in the shape of hacks and spurious warnings. Therefore, I argue we should just let width be an aesthetic.
@teunbrand Let me go back on my argument from six years ago. While I still think one has to be careful with variable widths in a plot, I also these days believe plotting software should be maximally flexible and not impose specific design philosophies on their users. So unless there's a good technical reason not to have width as an aesthetic I don't see how we lose in any way by making it one.
Thanks Claus, it seems we are in alignment then over this. I didn't mean to single out your arguments (and I'm sorry if it appeared that way). I just felt that this issue was stuck in a weird place of being acknowledged and having proposed solutions, but being dormant for a while. My arguing hopefully would get folks on board with the 'width as aesthetic' approach, so we can move forward on this issue.
No worries, I didn't feel singled out. In fact, I was surprised by my own comment from 2019 as today I don't think I would write it. (I came here thinking: let me argue in favor of width as an aesthetic and let's see who the idiot was that argued against it. Well, it was me apparently. 🤣)
I appreciate @teunbrand's take on the subject, but -- and I may well have missed something -- I am not sure the proposed fix works as intended. See comment at #5807.
A typical example of valid use of bar width is for multiplicative units, such as average price x number of units to get a transaction volume displayed as area. Such graphs are quite common in physics / engineering / climate science, etc.
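As an illustration of that kind of chart, here is a minimal sketch with made-up numbers (the sales data frame and all its values are hypothetical). It uses geom_rect() so that it does not depend on how any particular ggplot2 version treats width in geom_col().

```r
library(ggplot2)

# Made-up illustrative data, not taken from the thread
sales <- data.frame(
  product = c("A", "B", "C"),
  price   = c(4, 10, 6),    # average price per unit
  units   = c(50, 20, 30)   # number of units sold
)

# Lay the bars side by side: each bar spans its own units along x
sales$xmax <- cumsum(sales$units)
sales$xmin <- sales$xmax - sales$units

# Area of each rectangle = price * units = transaction volume
sales$volume <- sales$price * sales$units

p <- ggplot(sales) +
  geom_rect(aes(
    xmin = xmin, xmax = xmax,
    ymin = 0, ymax = price,
    fill = product
  )) +
  labs(x = "Units sold (cumulative)", y = "Average price per unit")
```

Here both the height and the width of a bar carry information, and their product is the quantity of interest.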