Results seem biased towards first expression with very quick benchmarks

Open lionel- opened this issue 2 years ago • 2 comments

bench::mark(
  identity(NULL),
  identity(NA),
  check = FALSE,
  iterations = 100000
)
#> # A tibble: 2 × 13
#>   expression          min  median `itr/sec` mem_alloc `gc/sec`  n_itr
#>   <bch:expr>     <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>  <int>
#> 1 identity(NULL)        0     1ns 11819650.        0B      0   100000
#> 2 identity(NA)      125ns   251ns  3788145.        0B     37.9  99999
#> # … with 6 more variables: n_gc <dbl>, total_time <bch:tm>,

bench::mark(
  identity(NA),
  identity(NULL),
  check = FALSE,
  iterations = 100000
)
#> # A tibble: 2 × 13
#>   expression          min  median `itr/sec` mem_alloc `gc/sec`  n_itr
#>   <bch:expr>     <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>  <int>
#> 1 identity(NA)          0     1ns 34330721.        0B      0   100000
#> 2 identity(NULL)    166ns   250ns  3697761.        0B     37.0  99999
#> # … with 6 more variables: n_gc <dbl>, total_time <bch:tm>,

It looks like the problem is also there with fewer iterations:

bench::mark(
  identity(NULL),
  identity(NA),
  check = FALSE
)
#> # A tibble: 2 × 13
#>   expression          min   median `itr/sec` mem_alloc `gc/sec` n_itr
#>   <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int>
#> 1 identity(NULL)        0    167ns  1631474.        0B        0 10000
#> 2 identity(NA)      125ns    250ns  4030948.        0B        0 10000
#> # … with 6 more variables: n_gc <dbl>, total_time <bch:tm>,

bench::mark(
  identity(NA),
  identity(NULL),
  check = FALSE
)
#> # A tibble: 2 × 13
#>   expression          min   median `itr/sec` mem_alloc `gc/sec` n_itr
#>   <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int>
#> 1 identity(NA)          0    167ns  1743834.        0B        0 10000
#> 2 identity(NULL)    166ns    250ns  3458356.        0B        0 10000
#> # … with 6 more variables: n_gc <dbl>, total_time <bch:tm>,

This can be worked around by adding a dummy first expression:

bench::mark(
  NULL,
  identity(NA),
  identity(NULL),
  check = FALSE,
  iterations = 100000
)
#> # A tibble: 3 × 13
#>   expression          min  median  `itr/sec` mem_alloc `gc/sec` n_itr
#>   <bch:expr>     <bch:tm> <bch:t>      <dbl> <bch:byt>    <dbl> <int>
#> 1 NULL                  0     1ns 495429666.        0B        0   1e5
#> 2 identity(NA)      125ns   251ns   3729849.        0B        0   1e5
#> 3 identity(NULL)    126ns   251ns   3741279.        0B        0   1e5
#> # … with 6 more variables: n_gc <dbl>, total_time <bch:tm>,

lionel- avatar Aug 19 '21 07:08 lionel-

I actually can't reproduce this on my machine.

bench::mark(
  identity(NULL),
  identity(NA),
  check = FALSE,
  iterations = 100000
)
#> # A tibble: 2 × 6
#>   expression          min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 identity(NULL)    199ns    251ns  2684031.        0B      0  
#> 2 identity(NA)      222ns    270ns  2834638.        0B     28.3

bench::mark(
  identity(NA),
  identity(NULL),
  check = FALSE,
  iterations = 100000
)
#> # A tibble: 2 × 6
#>   expression          min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>     <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 identity(NA)      222ns    268ns  3204566.        0B      0  
#> 2 identity(NULL)    219ns    271ns  3190654.        0B     31.9

bench::mark(
  NULL,
  identity(NA),
  identity(NULL),
  check = FALSE,
  iterations = 100000
)
#> # A tibble: 3 × 6
#>   expression          min   median  `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>     <bch:tm> <bch:tm>      <dbl> <bch:byt>    <dbl>
#> 1 NULL                  0      1ns 401638626.        0B      0  
#> 2 identity(NA)      223ns    270ns   3230758.        0B     32.3
#> 3 identity(NULL)    213ns    269ns   3209396.        0B     32.1

Created on 2021-08-19 by the reprex package (v2.0.0)

It may be an interaction with the bytecode compiler. Could you try setting compiler::enableJIT(FALSE) and see what happens?

jimhester avatar Aug 19 '21 13:08 jimhester
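
For anyone retrying the suggestion above, here is a minimal sketch of turning the JIT off, re-running the first benchmark from this issue, and restoring the previous setting. It assumes the bench package is installed; `old_level` and `res` are names chosen for illustration. `compiler::enableJIT()` takes an integer level, and 0 disables the JIT.

```r
# Disable the JIT for this session. enableJIT() returns the previous level,
# which is kept so it can be restored afterwards.
old_level <- compiler::enableJIT(0)

# Re-run the comparison from the original report with the JIT off.
# Guarded so the snippet still runs when bench is not installed.
if (requireNamespace("bench", quietly = TRUE)) {
  res <- bench::mark(
    identity(NULL),
    identity(NA),
    check = FALSE,
    iterations = 100000
  )
  print(res)
}

# Restore the JIT level that was active before.
compiler::enableJIT(old_level)
```

If the first row still shows a `min` of 0 with the JIT disabled, the bytecode compiler is probably not the cause.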

I get the same results with uncompiled functions. This is strange.

lionel- avatar Aug 19 '21 14:08 lionel-

I can still reproduce this when running an x86 build of R on my M1 Mac. With an arm64 build of R, it doesn't reproduce.

lionel- avatar May 03 '23 07:05 lionel-
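
Since the behaviour depends on which build of R is running, it can help to check that before benchmarking. The sketch below reports the architecture R was built for and, on macOS, asks the system whether the current process is being translated. `is_translated()` is a hypothetical helper written for this note, not part of bench. It relies on the macOS `sysctl.proc_translated` key and returns NA wherever that key is unavailable.

```r
# Architecture this R build targets, e.g. "x86_64" or "aarch64".
r_arch <- R.version$arch
cat("R build architecture:", r_arch, "\n")

# Hypothetical helper: TRUE if this process runs translated on macOS,
# FALSE if it runs natively, NA if that cannot be determined.
is_translated <- function() {
  if (Sys.info()[["sysname"]] != "Darwin") {
    return(NA)
  }
  out <- tryCatch(
    suppressWarnings(
      system2("sysctl", c("-n", "sysctl.proc_translated"),
              stdout = TRUE, stderr = FALSE)
    ),
    error = function(e) character(0)
  )
  if (length(out) == 0) {
    return(NA)
  }
  isTRUE(trimws(out[1]) == "1")
}

cat("Running translated:", is_translated(), "\n")
```

An `x86_64` build reporting `TRUE` here would match the configuration in which the issue reproduces.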

Let's see if this issue resurfaces in the future.

lionel- avatar May 03 '23 10:05 lionel-

I didn't realize this only occurs when running x86 R on an M1 Mac. Most likely this is an artifact of x86 emulation on arm: the x86 calls have to be emulated, which probably inflates the overhead calculation.

jimhester avatar May 04 '23 19:05 jimhester
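
To illustrate the explanation above: if a fixed overhead estimate is subtracted from each raw timing, an inflated estimate drives the reported times to zero. The sketch below is not bench's actual implementation. `subtract_overhead()` is a hypothetical helper, and all numbers are made up purely to show the arithmetic.

```r
# Hypothetical helper: subtract a fixed per-measurement overhead (in ns)
# from raw timings, clamping at zero so no timing goes negative.
subtract_overhead <- function(raw_ns, overhead_ns) {
  pmax(raw_ns - overhead_ns, 0)
}

# Made-up raw timings in nanoseconds, for illustration only.
raw_ns <- c(250, 260, 255, 270)

# With a modest overhead estimate the timings stay meaningful.
subtract_overhead(raw_ns, overhead_ns = 40)
#> [1] 210 220 215 230

# With an inflated estimate every timing collapses to zero.
subtract_overhead(raw_ns, overhead_ns = 300)
#> [1] 0 0 0 0
```

This only shows how an overestimated overhead could produce a `min` of 0. It does not explain why only the first expression is affected.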