style icon indicating copy to clipboard operation
style copied to clipboard

Need clarity on named vs. unnamed arguments and line breaks

Open MichaelChirico opened this issue 1 year ago • 5 comments

In #39 we have this quote:

If no arguments are named, and it doesn't fit on one line.then each argument always gets its own line.

Downthread it's pointed out this is not very clear in the style guide as of now.

I'd also like to point out that I find it to be less-than-optimal advice. This came up in {lintr} here, where we have:

https://github.com/r-lib/lintr/pull/2543/files#diff-f32b4985bf0c6dd105640050567eb5e0d6d097c1e6e365a69a0e5a4c24dc2b13R40

conds <- c(conds,
  "self::OP-RIGHT-PAREN[preceding-sibling::*[not(self::COMMENT)][1][self::OP-LEFT-PAREN or self::OP-COMMA]]"
)

I think it's very good to put conds on the same line as c( here because it makes it very clear we're overwriting the same object in a way that's not as transparent if we force conds to come on the next line.

Our usage is also in line with the current style guide, but not the text of the issue linked, so I'm hoping this discussion can be revisited.

MichaelChirico avatar Apr 02 '24 01:04 MichaelChirico

The confusion likely stems from the brevity of the c() function's name. When functions have longer names, it might not seem as natural to place conds on the same line. Consider the clarity in the formatting of concatenate_elements_to_vector:

conds <- concatenate_elements_to_vector(
  conds,
  second_argument,
  third_argument
)

This format distinctly shows that conds is the first argument, separated on its own line, compared to placing it on the same line, which might make its role less clear:

conds <- concatenate_element_to_vector(conds,
  second_argument,
  third_argument
)

The separation emphasizes conds as the initial argument, improving readability.

luciorq avatar Apr 02 '24 13:04 luciorq

When functions have longer names, it might not seem as natural to place conds on the same line.

I don't disagree, though concatenate_element_to_vector is the other extreme of an unusually long variable name. What I am arguing for is flexibility for cases where keeping in one line improves readability, where in your case readability is probably best split on two lines.

But note that your case is far more exceptional -- at 29 characters it's at the limit of lintr::object_name_linter()'s limit of 30 characters. Just as a quick empirical note, if we look at {dplyr}'s R code, we see this distribution of nchar(CALL_NAME):

  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20 
291  23 208 545 670 679 437 237 644 571 277 219 126 163 129 179 143  77  50  27 
 21  22  23  24  25  26  27  28  29  30  32  33  34  35  36  37  39  42  43  46 
 56  46  13  11   7  21   4   9   5   1   3   3   5   2   3   3   1   1   2   1 

about 0.5% of calls have such a long name, while other common 1-char calls include I() and n().

all_call_names <- function(f) {
  f |>
    parse() |>
    xmlparsedata::xml_parse_data() |>
    xml2::read_xml() |>
    xml2::xml_find_all("//SYMBOL_FUNCTION_CALL") |>
    xml2::xml_text()
}
# inside dplyr/R
list.files() |>
  lapply(all_call_names) |>
  unlist() |>
  nchar() |>
  table()

MichaelChirico avatar Apr 03 '24 06:04 MichaelChirico

I agree that the current advice is suboptimal. There are three cases that I think would be good to consider.

(1) You have one "primary" argument, or you're mutating an object, as @MichaelChirico's cond example.

(2) Sometimes, particularly in paste() or c() you want to collect related arguments together, e.g.

paste(
  "Once upon a time, there was a", adjective, noun, 
  "who loved to", verb, adverb, ". ",
  "It was the funniest thing ever!"
)

(3) I also don't love the case where you have a short unnamed argument and then a couple of long named arguments, e.g.

x <- map(
  x,
  f,
  a_long_argument_name = 1
)

But I don't have a good sense of how to fix the advice to handle all these cases.

hadley avatar Aug 09 '24 12:08 hadley

See also #189

hadley avatar Sep 30 '24 15:09 hadley

(2) is already mentioned in the guide. And (3) is the topic of #189.

But I'm not sure I can figure out what the rule is for (1) because I'd never write this:

df <- mutate(df,
  y = y + a_very_very_very_very_very_very_very_very_very_very_long_function_name(z)
)

And looking again at the motivating example, I'd be tempted to rewrite like:

extra_cond <- "self::OP-RIGHT-PAREN[preceding-sibling::*[not(self::COMMENT)][1][self::OP-LEFT-PAREN or self::OP-COMMA]]"
conds <- c(conds, extra_cond)

So I don't think this is quite ready to become official style advice.

hadley avatar Sep 30 '24 16:09 hadley