
Can we get mandatory argument support?

Open statquant opened this issue 8 years ago • 10 comments

If my understanding is correct, there is no way to mark an argument as required without also specifying a default value. I raised the question on Stack Overflow: http://stackoverflow.com/questions/35252547/can-i-specify-mandatory-arguments-with-optparse Am I wrong?

statquant avatar Feb 07 '16 18:02 statquant

This has been discussed before (#3). Note that this was never a feature in the Python optparse library this package is based on, nor will it be a priority in this package (see https://docs.python.org/2/library/optparse.html#what-are-options-for for the reasoning). Other R packages such as "argparse" ( https://github.com/trevorld/argparse ) do support explicitly specifying mandatory positional arguments and/or mandatory optional arguments.

If you want mandatory arguments with "optparse", you can always check manually whether a mandatory value was specified. For example, for a mandatory optional argument:

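  library("optparse")
  # Pass the description as help=; an unnamed third argument would be matched to action=
  option_list <- list(make_option(c("-t", "--test"), type="character", help="Mandatory option"))
  parser <- OptionParser(option_list=option_list)
  opt <- parse_args(parser)
  # No default was given, so opt$test is NULL when the option is absent
  if (is.null(opt$test)) {
        cat("Mandatory test argument not found\n")
        print_help(parser)
        quit(status=1)
  }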

The second example in the package vignette shows checking if a mandatory positional argument was reasonable.

Edit 2018-10-19: Comment previously claimed incorrectly that the argparse package doesn't support required optional arguments but actually it does.

trevorld avatar Feb 08 '16 00:02 trevorld

A simple R-style solution would be to lazily evaluate default only when needed and to allow R expressions as values, which would allow passing something like default=stop().

eantonya avatar Oct 17 '18 20:10 eantonya

@eantonya, the issue with your proposed stop() lazy evaluation approach is that we always need to evaluate default (if only to check whether it is NULL) in order to enable smarter type casting (i.e. grabbing the type of the option from the type of the default if not otherwise defined).

trevorld avatar Oct 17 '18 22:10 trevorld

Don't check for NULL, check for missing. And if type is unspecified and default is an expression - either complain and stop, or evaluate - I can see an argument made for either.
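To make the proposal concrete, here is a minimal base-R sketch of checking for missing rather than NULL. lazy_option and get_default are hypothetical names made up for illustration; this is not optparse's make_option() and it ignores type inference entirely.

```r
# Hypothetical stand-in for an option constructor; NOT optparse's make_option().
lazy_option <- function(name, default) {
    # missing() inspects the call without forcing the `default` promise
    has_default <- !missing(default)
    list(
        name = name,
        has_default = has_default,
        # `default` is only evaluated when get_default() is actually called
        get_default = function() {
            if (!has_default) stop("option '", name, "' is mandatory")
            default
        }
    )
}

# Creating the option does not raise: stop() is still an unevaluated promise
opt_country <- lazy_option("country", default = stop("--country is required"))
opt_start <- lazy_option("start", default = "2018-01-01")

# The error only surfaces once the default is actually needed
country_error <- tryCatch(opt_country$get_default(),
                          error = function(e) conditionMessage(e))
```

In this sketch the constructor never touches default, so it could not infer the option type from it, which is the trade-off raised in the previous comment.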

eantonya avatar Oct 18 '18 14:10 eantonya

There is a school of thought that specifying optional arguments in a function (such as default here) is more clearly understood by users if done with a NULL and should be checked with is.null: http://adv-r.had.co.nz/Functions.html

I'm not fully persuaded of the merits of having the default argument in make_option try to complete two different tasks:

  1. Set the default including possibility of the user explicitly setting it to NULL
  2. Assert if an optional argument was explicitly passed an argument on the command line

Note that if one sets an integer value for the positional_arguments argument of parse_args, then optparse will throw an error if too few or too many positional arguments are present.

With the lazy evaluation approach there is also the risk that a bunch of un-needed computations (possibly with other undesirable side-effects) occur before the stop() is triggered which could have been prevented if the user explicitly asserted that a reasonable value was set earlier in the Rscript.

I think it is much cleaner and safer to do Step 2 separately, either by using mandatory positional arguments instead or by checking whether the optional argument is present in commandArgs(TRUE):

 if (!any(grepl("^--mandatory_option", commandArgs(TRUE)))) stop("mandatory_option not set")

Or by doing something like checking that the parsed option is not NULL:

if(is.null(options[["mandatory_option"]])) stop("mandatory_option not set")
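The commandArgs(TRUE) check above can be wrapped so that it also covers a short flag and does not match longer option names sharing the same prefix. A rough sketch, assuming a hypothetical helper named flag_given (not part of optparse) and ignoring bundled short flags such as -vt:

```r
# Hypothetical helper (not part of optparse): was a flag typed on the command line?
flag_given <- function(long_flag, short_flag = NULL,
                       args = commandArgs(trailingOnly = TRUE)) {
    # Match "--flag" exactly or "--flag=value", but not "--flag_suffix"
    patterns <- paste0("^", long_flag, "(=|$)")
    if (!is.null(short_flag)) {
        # Match "-t" as well as "-tvalue"
        patterns <- c(patterns, paste0("^", short_flag))
    }
    any(vapply(patterns, function(p) any(grepl(p, args)), logical(1)))
}

# Explicit args vectors are used here so the behaviour can be tried interactively
flag_given("--mandatory_option", args = c("--mandatory_option=5", "file.txt"))
flag_given("--mandatory_option", args = c("--mandatory_option_extra"))
flag_given("--test", short_flag = "-t", args = c("-t", "value"))
```

In an Rscript one would leave args at its default and call stop() when flag_given() returns FALSE.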

If those are too verbose for you, you could always write a helper function:

# Assert mandatory option present
assert_mandatory_options <- function(options, mandatory_options=character()) {
    for (mo in mandatory_options) {
        if (is.null(options[[mo]])) {
            stop(paste("Forgot to set mandatory option", mo))
        }
    }
}

options <- parse_args(parser) # or options <- parse_args2(parser)$options
mandatory_options <-  c("mandatory_option1", "mandatory_option2")
assert_mandatory_options(options, mandatory_options)
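A possible variation on the helper above reports every missing option in one error instead of stopping at the first. This is only a sketch: find_missing_options and assert_all_mandatory are hypothetical names, and a plain list stands in for the result of parse_args(parser).

```r
# Hypothetical helper: names of mandatory options that are still NULL
find_missing_options <- function(options, mandatory_options = character()) {
    is_missing <- vapply(mandatory_options,
                         function(mo) is.null(options[[mo]]),
                         logical(1))
    mandatory_options[is_missing]
}

# Hypothetical helper: raise a single error listing all missing options
assert_all_mandatory <- function(options, mandatory_options = character()) {
    missing_opts <- find_missing_options(options, mandatory_options)
    if (length(missing_opts) > 0) {
        stop("Forgot to set mandatory option(s): ",
             paste(missing_opts, collapse = ", "), call. = FALSE)
    }
    invisible(options)
}

# A plain list stands in for what parse_args(parser) would return
parsed <- list(country = "US", help = FALSE)
still_missing <- find_missing_options(parsed, c("country", "start", "end"))
```

Listing everything at once saves the user from fixing one option per run.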

Or the functions themselves later in the Rscript can assert if they were fed reasonable arguments.

trevorld avatar Oct 18 '18 21:10 trevorld

The user can also forgo using this package altogether and do everything themselves - that's obviously not the point. Point is to improve this package to make it easier to use and more versatile. Crappy solutions to this outside of the package exist, but that's what they are - crappy.

If you simply must have a default value of NULL for default (which btw I can't ever imagine anyone specifying explicitly) for cultural reasons, that's fine too - you can still check if it's an expression before the null check. Or scratch all that and add a new bool argument.

Probably any solution you pick is faster to type out than all of this arguing, so maybe you just think that this should never be added, which is fine, but it makes this package less useful than it could be.

eantonya avatar Oct 19 '18 17:10 eantonya

Here's a real-world example btw of why I need this, and why positional arguments are not a good solution.

I have an R script that given a date range and a country, prints out the official holidays of that country. All 3 arguments are mandatory, and have no sensible defaults.

Maybe you could argue that start/end dates can be positional, but that would still leave country up in the air + surely after using R one can appreciate how much nicer it is not to worry about position of arguments and instead just specify them by name wherever you like.

eantonya avatar Oct 19 '18 17:10 eantonya

"Crappy solutions to this outside of the package exist, but that's what they are - crappy."

There is also the argparse package (which I wrote to handle more advanced command-line use cases than optparse).

> library("argparse")
> parser = ArgumentParser()
> parser$add_argument("--option", required=TRUE)
> parser$parse_args()
Error in .stop(output, "parse error:") : parse error:
usage: PROGRAM [-h] --option OPTION
PROGRAM: error: the following arguments are required: --option
> parser$parse_args("--option=foo")
$option
[1] "foo"

"Maybe you could argue that start/end dates can be positional, but that would still leave country up in the air + surely after using R one can appreciate how much nicer it is not to worry about position of arguments and instead just specify them by name wherever you like."

In your particular use case I'd argue that start, end, AND country need not be mandatory and can in fact all be given sensible defaults. For a typical user, one desirable behaviour would be to show all of their official holidays for the upcoming year after inferring their country:

  1. use as start today's date
  2. use as end a year from today's date
  3. for country, one of:
    a. default to a guess of the user's country (perhaps make inferences from Sys.getenv("LANG") which on my system would suggest I am from the "US")
    b. default to printing out all official holidays from all countries (or just the more common ones)
    c. default to your country
    d. default to another salient country
    e. Or make this the one required positional argument

As a user I appreciate it when developers take the time to pick out or infer reasonable defaults for me, and I usually try to do the same with my Rscripts. I fail to see in your particular use case why any of your options need to be mandatory. The user can always try the command a second time with explicit options passed in to tweak the output for their use case (or maybe create an alias with their preferred settings).
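As a rough illustration of the defaults listed above, the following base-R sketch computes a start date, an end date, and a country guess. guess_country is a hypothetical helper, and it assumes LANG looks like "en_US.UTF-8"; on systems where it does not, the fallback is used.

```r
# Hypothetical helper: guess a two-letter country code from a locale string
guess_country <- function(lang = Sys.getenv("LANG"), fallback = "US") {
    # Capture the two capital letters after the underscore, e.g. "US" in "en_US.UTF-8"
    m <- regmatches(lang, regexec("^[A-Za-z]+_([A-Z]{2})", lang))[[1]]
    if (length(m) == 2) m[2] else fallback
}

# Default date range: today through one year from today
default_start <- Sys.Date()
default_end <- seq(default_start, by = "1 year", length.out = 2)[2]

# Explicit locale strings are passed here so the results do not depend on the machine
guess_country("de_DE.UTF-8")
guess_country("")
```

These values could then be passed as default= to the corresponding options so that none of them needs to be mandatory.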

"If you simply must have a default value of NULL for default (which btw I can't ever imagine anyone specifying explicitly)"

Someone could want to set that explicitly if they then pass the option into a function that interprets NULL as meaning the function should calculate a reasonable default, i.e. one could have:

 print_public_holidays <- function(country = NULL, start = NULL, end = NULL) {
      # contents of function here
 }

which is then called in an Rscript by

print_public_holidays(options$country, options$start, options$end)

trevorld avatar Oct 19 '18 20:10 trevorld