`n` legend with single value `1` when using `geom_bar(stat = "sum")`
Using stat = "sum" in geom_bar() displays, by default, a count legend labelled "n" with a grey square as its key. It looks particularly out of place when the bars are filled: there appear to be two fill legends, and the grey one matches nothing on the plot.
The issue has been brought up on StackOverflow, with a workaround: https://stackoverflow.com/questions/50378718/what-is-the-n-1-box-in-my-r-geom-bar-legend-and-how-do-i-remove
library(ggplot2)
exp_df <- data.frame(x = c("A", "B", "B", "C"),
                     value = 1:4,
                     group = c("Z", "Z", "Y", "Y"))
ggplot(exp_df, aes(x, value)) +
  geom_bar(stat = "sum")

# with fill
ggplot(exp_df, aes(x, value, fill = group)) +
  geom_bar(stat = "sum")

Created on 2022-05-17 by the reprex package (v2.0.1)
Should geom_bar() not display this legend at all?
The reason is that after_stat(n) is mapped to size by default (probably because stat_sum() was primarily intended to be used with geom_point()). I'm not sure if we can remove n by default for this case, but I agree it would be nice. In the meantime, unmapping size removes the legend:
library(ggplot2)
exp_df <- data.frame(x = c("A", "B", "B", "C"),
                     value = 1:4,
                     group = c("Z", "Z", "Y", "Y"))
ggplot(exp_df, aes(x, value, size = NULL)) +
  geom_bar(stat = "sum")

Created on 2022-05-18 by the reprex package (v2.0.1)
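To see where the legend value comes from, the data computed for the layer can be inspected with layer_data(). This is only a minimal sketch, assuming the same exp_df as above; the names p and computed are introduced just for this example.

```r
library(ggplot2)

exp_df <- data.frame(x = c("A", "B", "B", "C"),
                     value = 1:4,
                     group = c("Z", "Z", "Y", "Y"))

p <- ggplot(exp_df, aes(x, value)) +
  geom_bar(stat = "sum")

# layer_data() returns the data frame the layer is drawn from,
# including the variables computed by the stat (here: n)
computed <- layer_data(p)
computed[, c("x", "n")]
```

Every pair of x and value in exp_df is unique, so n is 1 in every row, which matches the single value shown in the legend.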
Thanks, @yutannihilation. I now realise that creating a "summed" bar chart gives the illusion that it does what one might want it to do, i.e. pre-process the data by summing the values by group, and creating a col chart using those summed values as the y value.
In reality, it does nothing of the sort. It just creates a stacked col chart:
library(ggplot2)
exp_df <- data.frame(x = c("A", "B", "B", "C"),
                     value = 1:4,
                     group = c("Z", "Z", "Y", "Y"))
# "summed" bar chart is not actually summed...
ggplot(exp_df, aes(x, value)) +
  geom_bar(stat = "sum", colour = "red")

# ... it's just a stacked col chart (with thicker outlines)
ggplot(exp_df, aes(x, value)) +
  geom_col(colour = "red")

However, if a pair of x and y values is repeated, the size of the outline does change – as expected if one knows that the sum stat affects the size aesthetic according to repeats of x and y pairs:
exp_df2 <- data.frame(x = c("A", "B", "B", "C"),
                      value = c(1, 2, 2, 3),
                      group = c("Z", "Z", "Y", "Y"))
ggplot(exp_df2, aes(x, value)) +
  geom_bar(stat = "sum", colour = "red")
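
As a rough illustration of what the "sum" stat counts here, the number of rows per pair of x and value can be tallied in base R. This is only a sketch and not how ggplot2 implements it: it ignores grouping by any other aesthetic, and the helper column n and the name pair_counts are introduced just for this example.

```r
exp_df2 <- data.frame(x = c("A", "B", "B", "C"),
                      value = c(1, 2, 2, 3),
                      group = c("Z", "Z", "Y", "Y"))

# add a helper column of ones, then sum it per (x, value) pair
pair_counts <- aggregate(n ~ x + value,
                         data = transform(exp_df2, n = 1L),
                         FUN = sum)
pair_counts
```

Only the repeated pair (B, 2) gets a count above 1, which is why only that bar is drawn with a thicker outline.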

What I am actually after is using the stat_summary() function with fun = "sum":
ggplot(exp_df, aes(x, value)) +
  stat_summary(geom = "bar", fun = "sum", colour = "red")

Created on 2022-06-01 by the reprex package (v2.0.1)
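For comparison, the summed heights can also be computed before plotting and passed to geom_col(). A minimal sketch using base R's aggregate(), assuming the same exp_df as above; the names summed and p are just for this example.

```r
library(ggplot2)

exp_df <- data.frame(x = c("A", "B", "B", "C"),
                     value = 1:4,
                     group = c("Z", "Z", "Y", "Y"))

# sum value within each level of x
summed <- aggregate(value ~ x, data = exp_df, FUN = sum)

# geom_col() draws the bar heights exactly as given in the data
p <- ggplot(summed, aes(x, value)) +
  geom_col(colour = "red")
p
```

This keeps the summing step explicit in the data rather than inside the plotting call, at the cost of an extra intermediate data frame.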
My understanding now is that the current behaviour is expected, but the name of the stat is misleading. I assume this is unlikely to change (and I suspect the clash with the name of the count stat has been discussed before), so maybe some additions to the documentation could help?
Something like:
In the geom_count Description
(or in "Details")
[...] The "sum" stat computes an `n` variable equal to how many elements share the same values for all aesthetics other than `size`. `n` is by default mapped to the `size` aesthetic of the geometry. Especially when used with a geometry other than "point", for example when replacing `stat = "count"` with `stat = "sum"` in `geom_bar()`, this stat may be misunderstood as one computing sums of grouped `x` or `y` values.
In the geom_bar Details
Using the "sum" stat in a "bar" or "col" geometry will not sum the grouped `x` or `y` values. This stat is intended to be used with the "point" geometry, and varies the `size` aesthetic when there are exact overlaps. To create a bar chart of summed values, pre-process the data frame before feeding it to ggplot2 functions, or use `stat_summary()` with `fun = "sum"`.
What do you think?
Sorry for the late reply. I have no idea. I too think the documentation can be improved on this, but I'm not sure whether it belongs on the help page of each function. I feel it's a higher-level topic, but I can't think of the right place for it...