Use case: inline `::`

Open krlmlr opened this issue 5 years ago • 3 comments

:: is slow, and it's not going to get faster.

There's also very little point in substituting at the user level, see below.

What if `::` were a macro that is expanded at load time, but only when it is safe to do so (i.e. when there are no collisions for that particular symbol)?
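To make the idea concrete, here is a minimal base R sketch of what such an expansion could do to a function body. It does not use defmacro: `inline_ns_calls()` is a hypothetical helper, `stats` stands in for any package, and the collision check is not modelled at all.

```r
# Hypothetical helper (not part of defmacro): walk an expression and replace
# every `pkg::name` call with the object it resolves to right now.
inline_ns_calls <- function(expr) {
  if (!is.call(expr)) {
    return(expr)
  }
  if (identical(expr[[1]], as.name("::")) && length(expr) == 3) {
    # Resolve once, at rewrite time, instead of on every call.
    return(getExportedValue(as.character(expr[[2]]), as.character(expr[[3]])))
  }
  # Recurse into the function position and all arguments.
  # Note: calls with empty arguments, e.g. x[1, ], are not handled here.
  as.call(lapply(as.list(expr), inline_ns_calls))
}

f <- function(x) stats::median(x) + stats::sd(x)
g <- f
body(g) <- inline_ns_calls(body(f))

# `::` no longer appears in the rewritten body, and both versions agree.
"::" %in% all.names(body(g))
identical(f(c(1, 5, 9)), g(c(1, 5, 9)))
```

Because the binding is resolved once, later changes to the namespace would not be picked up, which is presumably part of why the "only if it's safe" condition matters.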

Substitution at user level

We could shovel all exports into an environment and look up exported functions there. However, that environment still has to be looked up on every function call, so we're not getting any faster than a direct namespace import.

library(rlang)
library(tidyverse)

rlang_ns <- asNamespace("rlang")
exports <- unlist(as.list(.getNamespaceInfo(rlang_ns, "exports")))

rlang. <- new.env(size = 1013)
for (name in exports) {
  rlang.[[name]] <- get(name, rlang_ns)
}

needle <- "bar"
haystack <- c("foo", "bar", "baz")

bench::mark(
  iterations = 1e5,
  arg_match0(needle, haystack),
  rlang::arg_match0(needle, haystack),
  rlang.$arg_match0(needle, haystack)
)
#> # A tibble: 3 x 6
#>   expression                               min   median `itr/sec` mem_alloc
#>   <bch:expr>                          <bch:tm> <bch:tm>     <dbl> <bch:byt>
#> 1 arg_match0(needle, haystack)        835.05ns 994.07ns   877271.        0B
#> 2 rlang::arg_match0(needle, haystack)   5.58µs   6.53µs   138727.        0B
#> 3 rlang.$arg_match0(needle, haystack)   1.02µs   1.24µs   700871.        0B
#> # … with 1 more variable: `gc/sec` <dbl>

Created on 2020-10-23 by the reprex package (v0.3.0)

krlmlr • Oct 23 '20 02:10

Interesting idea. I think this can only be implemented 100% safely, without collisions, in the absence of eval calls, and it would still require a bit of static analysis. Not sure a macro is the right vehicle for it, as it only operates locally. It would have to change code somewhere outside.

But a macro could change rlang::arg_match0 to get0("arg_match0", asNamespace("rlang")), which already seems to be a speedup of about 2x.

library(rlang)

rlang_ns <- asNamespace("rlang")
exports <- unlist(as.list(.getNamespaceInfo(rlang_ns, "exports")))

rlang. <- new.env(size = 1013)
for (name in exports) {
  rlang.[[name]] <- get(name, rlang_ns)
}

# test with a single element, without the hash overhead
rlangsubset <- new.env(hash = FALSE, size = 1)
rlangsubset[["arg_match0"]] <- get("arg_match0", rlang_ns)

needle <- "bar"
haystack <- c("foo", "bar", "baz")

bench::mark(
  iterations = 1e5,
  arg_match0(needle, haystack),
  rlang::arg_match0(needle, haystack),
  rlang.$arg_match0(needle, haystack),
  rlangsubset$arg_match0(needle, haystack),
  get("arg_match0", rlang_ns)(needle, haystack),
  get0("arg_match0", rlang_ns)(needle, haystack),
  get0("arg_match0", asNamespace("rlang"))(needle, haystack)
)
#> # A tibble: 7 x 6
#>   expression                                                    min  median
#>   <bch:expr>                                                 <bch:> <bch:t>
#> 1 arg_match0(needle, haystack)                               1.28µs  1.47µs
#> 2 rlang::arg_match0(needle, haystack)                        9.55µs 10.54µs
#> 3 rlang.$arg_match0(needle, haystack)                         1.8µs  2.16µs
#> 4 rlangsubset$arg_match0(needle, haystack)                   1.81µs  2.15µs
#> 5 get("arg_match0", rlang_ns)(needle, haystack)              2.57µs  2.89µs
#> 6 get0("arg_match0", rlang_ns)(needle, haystack)             2.49µs   2.8µs
#> 7 get0("arg_match0", asNamespace("rlang"))(needle, haystack) 4.84µs  5.41µs
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>

Created on 2020-10-23 by the reprex package (v0.3.0)

dirkschumacher • Oct 23 '20 19:10

Even more inlining 🤔

library(rlang)

rlang_ns <- asNamespace("rlang")
exports <- unlist(as.list(.getNamespaceInfo(rlang_ns, "exports")))

rlang. <- new.env(size = 1013)
for (name in exports) {
  rlang.[[name]] <- get(name, rlang_ns)
}

needle <- "bar"
haystack <- c("foo", "bar", "baz")

bench::mark(
  iterations = 1e5,
  a = arg_match0(needle, haystack),
  b = rlang::arg_match0(needle, haystack),
  c = get0("arg_match0", asNamespace("rlang"))(needle, haystack),
  d = getNamespace("rlang")[["arg_match0"]](needle, haystack),
  # unsafe
  d1 = .Internal(getRegisteredNamespace("rlang"))[["arg_match0"]](needle, haystack),
)
#> # A tibble: 5 x 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 a             1.3µs    1.5µs   493086.        0B     9.86
#> 2 b            9.15µs  10.14µs    77921.        0B    12.5 
#> 3 c             4.5µs   5.08µs   148416.        0B    13.4 
#> 4 d            2.02µs   2.33µs   359991.        0B    14.4 
#> 5 d1           1.64µs   1.92µs   453953.        0B    13.6

Created on 2020-10-23 by the reprex package (v0.3.0)

dirkschumacher • Oct 23 '20 19:10

`::` <- defmacro(function(lhs, rhs) {
  bquote(
    getNamespace(.(as.character(lhs)))[[.(as.character(rhs))]]
  )
})

But overwriting :: seems to cause some other issues, at least in RStudio :)
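For reference, the expansion can be inspected without overwriting `::`. The sketch below wraps the same `bquote()` template in an ordinary function; `expand_ns_call` is a hypothetical name and `stats::median` is just a stand-in.

```r
# Same template as the macro body, wrapped in an ordinary function
# (hypothetical name) so that `::` itself stays untouched.
expand_ns_call <- function(lhs, rhs) {
  bquote(
    getNamespace(.(as.character(lhs)))[[.(as.character(rhs))]]
  )
}

expansion <- expand_ns_call(quote(stats), quote(median))
print(expansion)
#> getNamespace("stats")[["median"]]

# Evaluating the expansion yields the same function as `::`
identical(eval(expansion), stats::median)
#> [1] TRUE
```

One difference to keep in mind: `[[` on the namespace environment also returns unexported objects and, as far as I can tell, returns `NULL` for re-exported objects that only live in the imports environment, whereas `::` resolves those.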

dirkschumacher • Oct 23 '20 19:10