Add code for #292

Open Robinlovelace opened this issue 3 years ago • 7 comments

Robinlovelace avatar Oct 25 '20 09:10 Robinlovelace

@Robinlovelace Nice addition; however, the code doesn't run for me. Also, what do you suggest about the timing for vroom and lazy loading?

csgillespie avatar Oct 25 '20 18:10 csgillespie

Was just a starter for 10. What do you mean by 'timing for lazy loading'? Happy to iterate, just trying to get things up-to-date. Another idea: should we print package versions? I think the current benchmark results are pretty out-of-date...

Robinlovelace avatar Oct 26 '20 07:10 Robinlovelace

The main reason vroom can be faster is because character data is read from the file lazily; you only pay for the data you use. This lazy access is done automatically, so no changes to your R data-manipulation code are needed.

Source: https://www.tidyverse.org/blog/2019/05/vroom-1-0-0/
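To make "you only pay for the data you use" concrete, here is a minimal sketch of the idea in Python. It is not vroom's implementation and not R code; `LazyTable` and its methods are hypothetical names made up for illustration. The table keeps the raw text and only builds a column's values the first time that column is requested.

```python
import csv
import io


class LazyTable:
    """Toy table (hypothetical) that builds a column only on first access."""

    def __init__(self, text):
        self._text = text    # raw CSV text, kept unparsed apart from the header
        self._cache = {}     # columns that have been materialized so far
        self.columns = next(csv.reader(io.StringIO(text)))

    def __getitem__(self, name):
        # The cost of building a column is paid here, once, on first use.
        if name not in self._cache:
            idx = self.columns.index(name)
            rows = csv.reader(io.StringIO(self._text))
            next(rows)  # skip the header row
            self._cache[name] = [row[idx] for row in rows]
        return self._cache[name]

    def materialized(self):
        """Names of the columns that have actually been built."""
        return sorted(self._cache)


data = "id,name,comment\n1,a,long text\n2,b,more text\n"
table = LazyTable(data)
print(table.materialized())  # [] : no column has been built yet
print(table["id"])           # ['1', '2']
print(table.materialized())  # ['id'] : the unused columns were never built
```

A benchmark that only creates the table would measure almost none of the column-building cost in a design like this, which is why timings for a lazy reader probably need a note on whether the columns were actually used.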

Cheers

csgillespie avatar Oct 26 '20 14:10 csgillespie

Makes sense. We could add that caveat to the text; that's a good starting point, especially as some character string variables are not used. Re the implementation, that's amazing. Does it mean that an object created by vroom knows the file that generated it and will convert the text to character representation only if that column is used?

Robinlovelace avatar Oct 26 '20 14:10 Robinlovelace

Worth documenting that and adding a link to the book, I think; a very interesting and fast implementation.

Robinlovelace avatar Oct 26 '20 14:10 Robinlovelace

I'm hoping to take a look at this a bit later in the week. I do agree with printing package versions though; I think it's a quick way to know if the benchmark is wildly out of date.

engineerchange avatar Oct 26 '20 15:10 engineerchange

Great, thanks @engineerchange. Any additions on top of this branch very welcome.

Robinlovelace avatar Oct 26 '20 15:10 Robinlovelace