vctrs icon indicating copy to clipboard operation
vctrs copied to clipboard

Proposal for `repair = "unique_quiet"` and `repair = "universal_quiet"`

Open jennybc opened this issue 1 year ago • 8 comments

I'd like to propose the addition of 2 name repair strings to the standard list of vec_as_names():

vec_as_names(
  ...,
  repair = c("minimal", "unique", "universal", "check_unique"),
  quiet = FALSE,
  ...
)

I'm proposing we add:

  • unique_quiet
  • universal_quiet

which would have the effect of requesting a specific level of name repair AND setting quiet = TRUE.

I know this might sound sort of gross, but hear me out :sweat_smile: The existing check_unique is already a mix of a repair strategy and how to report on repair.

I'm writing this as the maintainer of readr, vroom, readxl, and googlesheets4, all of which do name repair, with various degrees of historical baggage, but all of which do now expose a name_repair argument. Some of these packages call vctrs::vec_as_names() directly (e.g. vroom), but others do not (readxl and googlesheets4 currently access name repair via tibble).

The ability to suppress name repair messages is a common and understandable feature request from users.

These feel like my current options for addressing this:

  • Documentation: Show user how to write and use a custom name repair function that does not message.

    quiet_repair <- function(x) {
      vctrs::vec_as_names(x, repair = "unique", quiet = TRUE)
    }
    
    read_csv(I("x,x\n1,2\n3,4"), name_repair = quiet_repair)
    
  • Documentation: Show user how to use the rlib_name_repair_verbosity global option to suppress name repair messaging.

    withr::with_options(
      list(rlib_name_repair_verbosity = "quiet"),
      read_csv(I("x,x\n1,2\n3,4"))
    )
    
  • Feature: Actually change the signature of all relevant user-facing functions in readr, vroom, readxl, and googlesheets4 to expose a new, 2nd argument re: name repair, that controls verbosity. One example:

    read_excel <- function(...
      .name_repair = "unique",            # <-- this is already there
      .name_repair_verbosity = "verbose"  # <-- this would be new
    )
    
  • Feature: I could implement my proposal in each of these packages, by intercepting name_repair and, if I see minimal_quiet or universal_quiet, use whatever means I have available to make it so (implementation would differ across the packages).

I don't love any of these options. To quote @DavisVaughan:

I do want to make it somewhat hard to silence [name repair messages], but right now I think it is a little too hard

The first 2 solutions (documentation, user does extra work) feel too hard on the user.

The third solution (new argument everywhere) feels like a lot of work/change for such a small behavioural tweak.

The fourth solution (special handling for the new strings in certain packages) feels like it breaks a tidyverse rule re: consistency.

I think name_repair = "unique_quiet" and name_repair = "universal_quiet" could be really handy for giving users more control with the single name_repair argument commonly seen in user-facing functions. To be clear, I think we should never default to these.

jennybc avatar Aug 29 '22 19:08 jennybc

BTW I'd be happy to work on this, if the idea gets traction.

jennybc avatar Aug 29 '22 19:08 jennybc

One other thing that quickly came up (but hasn't been discussed much) is:

name_repair = list("unique", quiet = TRUE)

This gives a way to set quiet right in name_repair, which would override the argument set through vec_as_names(quiet = ) (or maybe we'd change that default to NULL as a way to say "quiet defaults to FALSE unless it is overridden by the special name_repair list syntax." Then explicitly setting the quiet argument to vec_as_names() would have priority over anything else)

This would be a more extensible escape hatch in case there are more name-repair-ish arguments that we might want the user to be able to set without adding more arguments

DavisVaughan avatar Aug 29 '22 20:08 DavisVaughan

Regarding passing a list of options, I think it's probably better to expose an option constructor with autocompletion of arguments and a dedicated help page.

I think if we go this route we might want to deprecate the quiet argument of vec_as_names() because it's a bit weird that you would be allowed to do this:

vec_as_names(x, repair = "unique_quiet", quiet = FALSE)
vec_as_names(x, repair = name_repair_opts("unique", quiet = TRUE), quiet = FALSE)

personally I've never loved these name repair messages and I wonder if we could do better by summarising name repair events at the end of a command (see discussion I've started on Slack).

lionel- avatar Aug 30 '22 07:08 lionel-

On further thought, it's simpler to just say that unique_quiet and universal_quiet have precedence over the quiet argument. And we don't really need to worry about potential new arguments since we can just pass a vec_as_names() wrapper function as repair argument with full flexibility. So @jennybc's proposal feels sufficient to me.

lionel- avatar Aug 30 '22 13:08 lionel-

That's fine by me, I don't feel strongly about the options-list

DavisVaughan avatar Aug 30 '22 13:08 DavisVaughan

Would you like me to try to implement this as offered or is it more trouble for you to get an external PR? You know better than I what this entails.

jennybc avatar Aug 30 '22 14:08 jennybc

You're welcome to send a PR! It's in the C function new_name_repair_opts() in names.c. Just needs some new cached strings for the pointer comparison (just grep strings_unique to see what this entails), and then you can adjust the options struct as needed.

lionel- avatar Aug 30 '22 14:08 lionel-

OK I will have a go at it, although probably not this week. I'll assign myself.

jennybc avatar Aug 30 '22 16:08 jennybc

I think this would be great to have. Is there any chance we could get it in the upcoming vctrs release?

hadley avatar Sep 24 '22 14:09 hadley

~What is the timeline for that? I remain very interested!~ PR incoming soon

jennybc avatar Sep 24 '22 14:09 jennybc