Documentation for running continuous benchmarks on GH Actions

Open Robinlovelace opened this issue 4 years ago • 5 comments

I'm looking at setting up benchmarks for the new package sfnetworks. I've followed the examples used in a couple of packages, but the CI is failing at the benchmarking stage due to GitHub permissions, as shown here: https://github.com/luukvdmeer/sfnetworks/runs/798548926#step:14:25

Is it possible to get the benchmarks working without requiring authentication? Happy to help with documentation efforts on this, great package.

I did also wonder if it would be useful to have a benchmark action in https://github.com/r-lib/actions/

I understand that this is in development, and hope this issue will help bring better continuous benchmarking support to R!

Robinlovelace avatar Jun 23 '20 11:06 Robinlovelace

If you are running on Linux you need to explicitly set the user name in git. Run these lines in your workflow, with whatever user name and email you would like, before trying to use any git operations:

git config --global user.email "you@example.com"
git config --global user.name "Your Name"

The macOS workers have a user name and email set automatically, so if you are using that OS you won't need to do this.
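
As an illustrative sketch only (not something posted in this thread), the same setup can be written so that it is harmless on runners that already have an identity: it sets `user.name` and `user.email` only when they are missing. The function name `ensure_git_identity` and the name and email passed to it are placeholders chosen for this example.

```shell
#!/bin/sh
# Sketch: configure a git identity only if none exists yet, so the same
# step can run unchanged on Linux and macOS runners.
# ensure_git_identity is a hypothetical helper name; the name and email
# passed at the bottom are placeholders, not values from this thread.
ensure_git_identity() {
  name="$1"
  email="$2"
  # `git config --global --get KEY` exits non-zero when KEY is unset.
  if ! git config --global --get user.name >/dev/null 2>&1; then
    git config --global user.name "$name"
  fi
  if ! git config --global --get user.email >/dev/null 2>&1; then
    git config --global user.email "$email"
  fi
}

ensure_git_identity "Your Name" "you@example.com"
```

Because existing values are left alone, running the step twice, or on a machine that already has a git identity, should change nothing.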

jimhester avatar Jun 23 '20 11:06 jimhester

Great, thanks for the quick reply, I'm giving that a go in the commit above, let's see if it works!

Robinlovelace avatar Jun 23 '20 12:06 Robinlovelace

The git config issue seems to be fixed :tada: I'm not sure what the next issue is though, as the benchmarks seem to be passing:


remotes::install_github("luukvdmeer/sfnetworks@develop")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'sfnetworks' from a github remote, the SHA1 (7baa168f) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(sfnetworks)

bench::press(n = seq(from = 1, to = nrow(roxel), length.out = 5),
             {
               bench::mark(
                 time_unit = "ms",
                 as_sfnetwork(roxel[1:n, ])
               )
             }
             )
#> Running with:
#>       n
#> 1    1
#> 2  214.
#> 3  426
#> 4  638.
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> 5  851
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 5 x 7
#>   expression                     n   min median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <dbl> <dbl>  <dbl>     <dbl> <bch:byt>    <dbl>
#> 1 as_sfnetwork(roxel[1:n, ])    1   5.32   5.41     180.     1.73MB     21.6
#> 2 as_sfnetwork(roxel[1:n, ])  214. 20.7   20.9       47.4  478.91KB     36.5
#> 3 as_sfnetwork(roxel[1:n, ])  426  33.6   34.0       29.5   873.6KB    108. 
#> 4 as_sfnetwork(roxel[1:n, ])  638. 48.9   50.3       19.9    1.21MB     21.8
#> 5 as_sfnetwork(roxel[1:n, ])  851  63.8   65.2       15.2    1.52MB     20.8

Created on 2020-06-23 by the reprex package (v0.3.0)

And ...

pkgload::load_all()
#> Error: No root directory found in /tmp/Rtmp17y1bE/reprex56026429baf7 or its parent directories. Root criterion: contains a file `DESCRIPTION`

library(sfnetworks)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
library(tidygraph)
#> 
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#> 
#>     filter

net = as_sfnetwork(roxel, directed = FALSE)

short_path_1_9 = net %>%
  activate("edges") %>%
  dplyr::mutate(weight = edge_length()) %>%
  tidygraph::convert(to_shortest_path, 1, 9)

plot(roxel$geometry)
plot(sf::st_as_sf(short_path_1_9)$geometry, lwd = 5, add = TRUE)


sp = function(net, from, to) {
  net %>%
    activate("edges") %>%
    dplyr::mutate(weight = edge_length()) %>%
    tidygraph::convert(to_shortest_path, from, to)
}
sp(net, 1, 9)
#> # An sfnetwork with 17 nodes and 16 edges
#> #
#> # CRS:  EPSG:4326 
#> #
#> # An unrooted tree with spatially explicit edges
#> #
#> # Edge Data:     16 x 7 (active)
#> # Geometry type: LINESTRING
#> # Dimension:     XY
#> # Bounding box:  xmin: 7.533722 ymin: 51.9475 xmax: 7.538665 ymax: 51.95556
#>    from    to name   type                     geometry   weight .tidygraph_edge…
#>   <int> <int> <fct>  <fct>            <LINESTRING [°]>      [m]            <int>
#> 1     2     5 Nottu… resi… (7.537673 51.9475, 7.53800… 84.4075…               94
#> 2     4     7 Im Se… resi… (7.535677 51.95453, 7.5356…  6.6361…              156
#> 3     7    15 <NA>   cycl… (7.535677 51.95453, 7.5354… 93.7893…              456
#> 4     4     6 Dorff… resi… (7.535656 51.95448, 7.5356… 49.3036…              486
#> 5    15    16 <NA>   cycl… (7.534421 51.95476, 7.5342… 66.7283…              576
#> 6     1    16 <NA>   cycl… (7.534074 51.95532, 7.5339… 36.4379…              577
#> # … with 10 more rows
#> #
#> # Node Data:     17 x 2
#> # Geometry type: POINT
#> # Dimension:     XY
#> # Bounding box:  xmin: 7.533722 ymin: 51.9475 xmax: 7.538665 ymax: 51.95556
#>              geometry .tidygraph_node_index
#>           <POINT [°]>                 <int>
#> 1 (7.533722 51.95556)                     1
#> 2  (7.537673 51.9475)                     9
#> 3 (7.538166 51.94902)                   111
#> # … with 14 more rows
bench::press(n = seq(from = 9, to = 99, length.out = 5),
             {
               bench::mark(
                 sp(net, 9, n)
               )
             }
             )
#> Running with:
#>       n
#> 1   9
#> 2  31.5
#> 3  54
#> 4  76.5
#> 5  99
#> # A tibble: 5 x 7
#>   expression        n      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>    <dbl> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 sp(net, 9, n)   9     42.8ms   44.6ms      21.5    1.78MB     43.0
#> 2 sp(net, 9, n)  31.5   43.4ms     45ms      22.0    1.75MB     18.3
#> 3 sp(net, 9, n)  54     43.7ms   44.3ms      22.5    1.72MB     18.8
#> 4 sp(net, 9, n)  76.5   44.4ms   48.2ms      20.9    1.73MB     31.3
#> 5 sp(net, 9, n)  99     43.6ms   44.4ms      22.3    1.72MB     18.6

# with stplanr # work in progress, not currently working
# library(stplanr)
# sln = SpatialLinesNetwork(roxel)
# p1 = net %>%
#   activate(nodes) %>%
#   st_as_sf() %>%
#   slice(1)
# p9 = net %>%
#   activate(nodes) %>%
#   st_as_sf() %>%
#   slice(1)
#
# stplanr::route_local(sln = sln, from = c(st_coordinates(p1)), to = c(st_coordinates(p9)))
#
# bench::press(n = seq(from = 9, to = 99, length.out = 5),
#              {
#                bench::mark(
#                  sp(net, 9, n)
#                )
#              }
# )

Created on 2020-06-23 by the reprex package (v0.3.0)

Robinlovelace avatar Jun 23 '20 13:06 Robinlovelace

I also just tested the benchmark code in the bench folder with the most recent version of this package and found no issues. This seems to be specific to continuous benchmarking, but I'm not sure how to debug it as I cannot reproduce the error message:

Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 1, 5, 4
Calls: <Anonymous> ... unique -> <Anonymous> -> cbind -> cbind -> data.frame

Happy to have a bash at documenting the need to set git config in the package in a PR btw.

Robinlovelace avatar Jun 23 '20 13:06 Robinlovelace

Will open a separate issue...

Robinlovelace avatar Jun 23 '20 22:06 Robinlovelace