
[FEATURE] Multiprocessing at `Step` level for `Pipeline`

Open gabrielmbmb opened this issue 10 months ago • 0 comments

Currently, each `Step` of the `Pipeline` is executed in a single process. It would be useful to also support parallelization at the `Step` level, i.e. a single step of the pipeline using multiple processes to produce its outputs faster. This would be especially beneficial for slow steps such as `GenerateEmbeddings`, which often become the bottleneck of a pipeline.
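
To make the idea concrete, here is a rough sketch of what step-level parallelism could look like, using only the Python standard library. It is not distilabel's API: `split_into_batches`, `fake_embed`, and `run_step_in_parallel` are hypothetical names made up for illustration, and `fake_embed` merely stands in for a slow step such as `GenerateEmbeddings`.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Any, Callable, Dict, Iterator, List

Row = Dict[str, Any]
Batch = List[Row]


def split_into_batches(rows: List[Row], batch_size: int) -> Iterator[Batch]:
    """Yield consecutive slices of `rows` with at most `batch_size` rows each."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def fake_embed(batch: Batch) -> Batch:
    """Hypothetical stand-in for a slow step (not a real embedding model)."""
    # The "embedding" is just the text length, so the output is deterministic.
    return [{**row, "embedding": [float(len(row["text"]))]} for row in batch]


def run_step_in_parallel(
    process_batch: Callable[[Batch], Batch],
    rows: List[Row],
    batch_size: int = 2,
    num_workers: int = 2,
) -> List[Row]:
    """Run one step over `rows`, spreading the batches across worker processes."""
    batches = list(split_into_batches(rows, batch_size))
    if num_workers <= 1:
        # Single-process fallback: same behaviour as the current situation.
        results = [process_batch(batch) for batch in batches]
    else:
        # `process_batch` must be picklable (e.g. a module-level function).
        with ProcessPoolExecutor(max_workers=num_workers) as pool:
            # `pool.map` returns results in the same order as the input batches.
            results = list(pool.map(process_batch, batches))
    # Flatten the per-batch outputs back into a single list of rows.
    return [row for batch in results for row in batch]
```

Something like `run_step_in_parallel(fake_embed, rows, batch_size=2, num_workers=4)` would then process several batches of the same step at once. When `num_workers` is greater than 1, the call should live under an `if __name__ == "__main__":` guard so that worker processes started with the spawn method do not re-run it on import. A real implementation would also need to decide how a step's resources (for example a loaded model) are created in each worker, which this sketch ignores.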

gabrielmbmb, Apr 10 '24 11:04