
repr_geojson can be slow

Open karldw opened this issue 5 years ago • 2 comments

I noticed repr_geojson can be quite slow with larger datasets. I think that's because repr_geojson converts the entire object to GeoJSON, even when only part of it will be displayed. Would it be possible to subset the portion that will be displayed before converting to GeoJSON?

library(repr)

# Demo spatial data from sf
nc <- sf::read_sf(system.file("shape/nc.shp", package="sf"))
nc_big <- nc[rep(seq_len(nrow(nc)), 1000), ]  # copy 1000 times
nrow(nc_big)
#> 100000

nc_no_geom <- sf::st_drop_geometry(nc)
nc_big_no_geom <- sf::st_drop_geometry(nc_big)

# No difference in non-geo repr
bench::mark(repr(nc_no_geom), repr(nc_big_no_geom), check=FALSE)
#>   expression              min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
#>   <bch:expr>           <bch:> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>
#> 1 repr(nc_no_geom)      3.2ms 3.54ms      253.    98.6KB     6.28   121     3
#> 2 repr(nc_big_no_geom) 3.16ms 3.46ms      287.    98.6KB     6.32   136     3

# nc_big is about 1000x slower in repr_geojson
bench::mark(repr_geojson(nc), repr_geojson(nc_big), check=FALSE)
#>   expression              min median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc
#>   <bch:expr>           <bch:> <bch:>     <dbl> <bch:byt>    <dbl> <int> <dbl>
#> 1 repr_geojson(nc)     14.1ms   16ms   59.7       1.66KB     5.97    30     3
#> 2 repr_geojson(nc_big)  17.4s  17.4s    0.0575    1.53MB     4.03     1    70
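
A rough sketch of the subsetting idea, not repr's actual implementation: truncate the object first, then convert only what remains. The helpers `truncate_features` and `repr_geojson_truncated` and the `max_features` argument are hypothetical names made up for illustration. The default of 60 is an arbitrary placeholder.

```r
# Hypothetical helper (not part of repr): keep at most `max_features` rows.
# head() preserves the class of data-frame-like objects such as sf tables.
truncate_features <- function(obj, max_features = 60L) {
  utils::head(obj, max_features)
}

# Hypothetical wrapper: hand only the truncated object to repr_geojson(),
# so the conversion cost depends on max_features rather than nrow(obj).
repr_geojson_truncated <- function(obj, max_features = 60L) {
  repr::repr_geojson(truncate_features(obj, max_features))
}
```

With the data above, `repr_geojson_truncated(nc_big)` would convert only the first 60 features. This has not been benchmarked, and whether dropping features is acceptable for a map display is an open question.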

karldw, Jul 16 '20 03:07

Sure! Would you like to do a PR? I don’t really know much about the format.

flying-sheep, Jul 16 '20 13:07

I won't be able to write a PR soon, but I'll add it to my list to look at eventually. If anyone reading this wants to write a PR, please go ahead.

karldw, Jul 16 '20 17:07