argufy icon indicating copy to clipboard operation
argufy copied to clipboard

WIP: Argufy Return

Open jimhester opened this issue 8 years ago • 14 comments

This works for the following functions using the trailing format. It still needs to have the code cleaned up and made more readable, but thought I would show you the current state and see what you thought.

funs <- list(
  function(x) x ? is.numeric,
  function(x) { x } ? is.numeric,
  function(x) { return(x) } ? is.numeric,
  function(x) { if (x == "a") return(x) else 1 } ? is.numeric
)

sapply(funs, function(f) f(x = 1))
#> [1] 1 1 1 1

sapply(funs, function(f) f(x = "a"))
#> [1] "a" "a" "a" "a"

sapply(lapply(funs, argufy), function(f) try(f(x = 1)))
#> [1] 1 1 1 1

sapply(lapply(funs, argufy), function(f) try(f(x = "a")))
#> Error : is.numeric(`_result_`) is not TRUE
#> Error : is.numeric(`_result_`) is not TRUE
#> Error : is.numeric(`_result_`) is not TRUE
#> Error : is.numeric(`_result_`) is not TRUE

sapply(lapply(funs, argufy), function(f) print(f, useSource = FALSE))
#> function (x)
#> {
#>     `_result_` <- local({
#>         x
#>     })
#>     stopifnot(is.numeric(`_result_`))
#>     `_result_`
#> }
#> function (x)
#> {
#>     `_result_` <- local({
#>         {
#>             x
#>         }
#>     })
#>     stopifnot(is.numeric(`_result_`))
#>     `_result_`
#> }
#> function (x)
#> {
#>     `_result_` <- local({
#>         {
#>             return(x)
#>         }
#>     })
#>     stopifnot(is.numeric(`_result_`))
#>     `_result_`
#> }
#> function (x)
#> {
#>     `_result_` <- local({
#>         {
#>             if (x == "a")
#>                 return(x)
#>             else 1
#>         }
#>     })
#>     stopifnot(is.numeric(`_result_`))
#>     `_result_`
#> }

jimhester avatar Sep 30 '15 21:09 jimhester

I know I am stubborn, but I just don't like putting it after the function body.

I think this is where Stefan's approach in ensurer is better. If you have a separate function_ function, then, you can just write things like:

f <- function_(
  this = ? is.character,
  that = ?~ as.numeric,
  returns = ? is.character
) { ... }

Maybe we should switch to this. We could keep the argufy function to convert an already existing function, and also argufy_package to convert a whole package.

gaborcsardi avatar Oct 01 '15 11:10 gaborcsardi

This implementation above will work putting the return check right after the function arguments with a slight tweak. (swapping the bod indexes)

However there isn't any way I can find to write a replacement function for ? that will work correctly. We need to return 'e1' if ? is used within the argument list and 'e2' if in the function body.

Using sys.nframe() works on the top level, but won't work if the functions are used within other calls.

# sys.nframe is only true in the body of the function call, it is 3 when evaluating arguments
`?` <- function(e1, e2) { if (sys.nframe() == 2) e2 else e1 }

I strongly dislike using a new function_ call, I would rather forget this feature entirely than go to that approach.

I do think keeping the check at the end actually does have advantages.

  1. Simple implementation / overriding
  2. Exact same semantics as used for arguments (value ~ check)
  3. Even when you move the check near the arguments it still won't show up in \usage{} section on the man page (as that contains only arguments).
  4. When looking at the return value of a function interactively you are already looking near the end of the function, so it is actually nice to have the check in the same place.

The main argument for having the return checks near the declaration is because that is function declarations in C-like languages use, but I am not sure that is a great argument. There is no need for pre-declaration of functions in R, nor header files containing only function declarations.

jimhester avatar Oct 01 '15 14:10 jimhester

Essentially my only argument for having it up front, is that then you can see all information about arguments and return values (essentially the whole API of the function) together, on the same screen.

Exactly as in the documentation, @param, @param, ..., @return all together.

With function_ you could have in the usage, btw. if we implement return values as just another argument to the function.

gaborcsardi avatar Oct 01 '15 15:10 gaborcsardi

Moving to function_ completely breaks the ability to not use argufy and have the code still work though (or turn off the checks if they are making your code too slow).

I don't see a way to make this work without putting the checks at the end, so I guess we can just forget this feature

jimhester avatar Oct 01 '15 15:10 jimhester

One last try. How about putting it among the function arguments, anyway:

f <- some_func(
  this = ? is.character,
  that = ?~ as.numeric,
  returns_ = ? is.character
) { ... }

You cannot use returns_ in the body of the function, obviously, but this is probably not a big problem. Everything would work as it is currently, but argufy() would check for returns_ and add the checks if needed.

gaborcsardi avatar Oct 01 '15 15:10 gaborcsardi

I don't hate it, but it feels a slightly janky to me.

It is also means you have to document the return twice (once in @param, once in @return).

But it is the most palatable of the options I guess...

jimhester avatar Oct 01 '15 17:10 jimhester

I think you only need to document the return twice if you don't run argufy on the function. If you run argufy, then @return is fine and argufy takes out the returns argument.

But I agree that it is not optimal. How about the

f <- function(x, y) ? is.character ? { }

form? This is essentially

f <- function(x, y) `?`(`?`(is.character), {})

so maybe you can check if the first argument of ? is another ? expression, and if it is, then return the second argument, the function body. Otherwise return the first argument.

Haven't actually tried it yet....

gaborcsardi avatar Oct 01 '15 19:10 gaborcsardi

I tried playing with that, the problem is the ? expressions get evaluated from the inside out, so if you return y from the second ? expression then the first ? has no way to know there is another expression. So you can detect in the inner ? that it is a nested ?, but not from the outer. I guess you could attach an attribute to the returned expression, but that seems kind of hacky (it is also much more complicated than ``? <- function(x, y) x

jimhester avatar Oct 01 '15 19:10 jimhester

How about capturing the arguments of ? as expressions, i.e. substitute() ? Then you can see what the arguments are, and choose accordingly. This does not sound too difficult, but I might be missing something (again).

gaborcsardi avatar Oct 01 '15 19:10 gaborcsardi

Ok using substitute() seems to work, not sure why I didn't try it before, thank you for the suggestion! https://github.com/gaborcsardi/argufy/commit/1459e85a9212b8854d24fc3a691129279e98a597 has a prototype implementation that supports both f <- function(x) is.numeric ? { x } and f <- function(x) ? is.numeric ? { x } syntax.

`?` <- function(x, y) { x <- substitute(x); if (is.call(x) && identical(x[[1]], as.symbol("?"))) y else x}

f1 <- function(x = ? is.numeric) ? is.integer ? { x }

f1('a')
#> [1] "a"
f1(1)
#> [1] 1
f1(1L)
#> [1] 1

argufy(f1)('a')
#> Error: is.numeric(x) is not TRUE
argufy(f1)(1)
#> Error: is.integer(`_result_`) is not TRUE
argufy(f1)(1L)
#> [1] 1

f2 <- function(x = ? is.numeric) is.integer ? { x }
argufy(f2)('a')
#> Error: is.numeric(x) is not TRUE
argufy(f2)(1)
#> Error: is.integer(`_result_`) is not TRUE
argufy(f2)(1L)
#> [1] 1

Actually just remembered that using the single ? form makes it ambigious as to if we are in an argument or not, so I will make that an error instead.

jimhester avatar Oct 06 '15 13:10 jimhester

Awesome! I think it is great! Can you just update the PR and merge it?

gaborcsardi avatar Oct 07 '15 08:10 gaborcsardi

Or the PR is probably OK, only this comment is not true any more? https://github.com/gaborcsardi/argufy/pull/20/files#diff-951791f1fb37d9e5b0f0cf852ce38d83R70

gaborcsardi avatar Oct 07 '15 08:10 gaborcsardi

I am going to add some tests for this as well and updated the documentation.

The current implementation does actually work with f <- function(x) is.character ? { x }, however as discussed I don't think you can unambiguously determine if a ? call is defined in an argument or the body from within the call. So in practice only the two ? form should be used. This form also makes more sense when using an assertion f <- function(x) ?~ as.character ? { x } rather than f <- function(x) as.character ?~ { x }

jimhester avatar Oct 07 '15 15:10 jimhester

Yeah, I agree, the two ? version is better.

gaborcsardi avatar Oct 07 '15 15:10 gaborcsardi