blogdown Enable optional parallel site building

This PR proposes an option for build_site() to use multiple R processes running in parallel, relying only on the base package parallel. This can result in significant speed improvements on hardware common in 2019.

To enable parallelization, the user must set two options:

options(
  blogdown.use_parallel = TRUE,
  blogdown.use_parallel.cores = number_of_cores
)
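As a rough sketch, the core count could be derived from the machine instead of being hard-coded. The option names are the ones proposed above; parallel::detectCores() is part of base R, while leaving one core free is only an assumption of this example, not something the PR prescribes.

```r
# Sketch: derive the number of cores from the machine.
n_cores <- parallel::detectCores()
# detectCores() can return NA; fall back to 1 (no parallelization) in that case.
if (is.na(n_cores)) n_cores <- 1L

options(
  blogdown.use_parallel = TRUE,
  # Leave one core free for the rest of the system (an assumption of this sketch).
  blogdown.use_parallel.cores = max(1L, n_cores - 1L)
)
```

On a single-core machine this sets blogdown.use_parallel.cores to 1, which by the conditions listed in the PR would leave the build sequential.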

The functionality is only triggered if all of the following hold:

the option blogdown.use_parallel is TRUE
the option blogdown.use_parallel.cores is > 1
more than one file is to be built (the length of files is > 1)
the parallel package is available
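The conditions above could be expressed as a single guard, roughly as follows. The helper name should_build_in_parallel() is hypothetical and only illustrates the logic; the actual check in the PR may be written differently.

```r
# Hypothetical helper illustrating the four conditions; not the PR's actual code.
should_build_in_parallel <- function(files) {
  cores <- getOption("blogdown.use_parallel.cores", 1L)
  # 1. the feature is switched on explicitly
  isTRUE(getOption("blogdown.use_parallel", FALSE)) &&
    # 2. more than one core is requested
    is.numeric(cores) && length(cores) == 1L && !is.na(cores) && cores > 1 &&
    # 3. there is more than one file to build
    length(files) > 1 &&
    # 4. the parallel package can be loaded
    requireNamespace("parallel", quietly = TRUE)
}
```

With neither option set, should_build_in_parallel(c("a.Rmd", "b.Rmd")) returns FALSE, so existing users would see no change in behaviour.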

If this direction is accepted, we can make the implementation less conservative and easier to use.

May 15 '19 17:05 jozefhajnala

I would consider this ready for review; I am just not sure about the "UI" for enabling the feature. I am a fan of introducing new features conservatively, but I am happy to change it.

May 16 '19 17:05 jozefhajnala

@yihui, is there anything you are waiting on from my side?

May 30 '19 07:05 jozefhajnala

Just some thoughts about letting the user choose how to parallelize. 💭

The future package is really helpful for that. It allows a clean separation between what is parallelized and how it is run. However, it would mean adding it as a dependency, at least in Suggests, which could be too heavy 🤔
A vignette with examples of how to use this option could be interesting. It could show build_rmds_parallel and explain how it builds on the parallel package. The user could then either copy and paste from the vignette, or learn that a helper blogdown::build_rmds_parallel exists.

Thanks for this feature, by the way!

Jun 15 '19 10:06 cderv

Thanks for the feedback!

The currently proposed PR lets you choose your own way of parallelizing: you could define options("blogdown.build_rmds" = my_parallelization_fun), where my_parallelization_fun() could use the future package to parallelize, without introducing a new dependency for blogdown.
I will happily spend some time writing a vignette if we get the PR merged ;-)
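To make the idea concrete, here is a minimal sketch of what such a user-defined function might look like. It assumes the function stored in the option is called with a character vector of files; that contract, the render_one and cores arguments, and the use of rmarkdown::render as the default renderer are all assumptions of this example. It uses the base parallel package so that it runs without extra installs; a future-based version would have the same shape.

```r
# Hypothetical custom builder; the calling convention is assumed, not taken from the PR.
my_parallelization_fun <- function(files, render_one = rmarkdown::render, cores = 2L) {
  # Nothing to do for an empty input.
  if (length(files) == 0L) return(list())
  # Never start more worker processes than there are files.
  cl <- parallel::makeCluster(min(cores, length(files)))
  # Stop the workers even if rendering fails.
  on.exit(parallel::stopCluster(cl), add = TRUE)
  # Each file is passed to render_one in a separate R process.
  parallel::parLapply(cl, files, render_one)
}

# Register the function under the option name mentioned above.
options("blogdown.build_rmds" = my_parallelization_fun)
```

Because the function is supplied through an option, blogdown itself would not need to list future or any other parallelization backend as a dependency.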

Jun 18 '19 18:06 jozefhajnala

Hi @yihui, do we still want to move this forward?

Sep 25 '19 18:09 jozefhajnala
