distilabel
[FEATURE] Multiprocessing at `Step` level for `Pipeline`
Currently, each `Step` of the `Pipeline` gets executed in a single process. It would be good to achieve parallelization at the `Step` level too, i.e. one step of the pipeline uses multiple processes to produce outputs faster. This could be really beneficial for steps that are slow, like `GenerateEmbeddings`, which usually causes a bottleneck in the pipelines.
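
To make the request concrete, here is a minimal sketch of what step-level parallelism could look like, using only the Python standard library. It does not use or reflect distilabel's actual API: `run_step_in_parallel`, `split_into_chunks`, and `fake_embed` are hypothetical names, and `fake_embed` is only a stand-in for a slow step such as `GenerateEmbeddings`. The idea is to split a step's input rows into chunks, process the chunks in several worker processes, and merge the outputs in the original order.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def split_into_chunks(rows: List[Row], chunk_size: int) -> List[List[Row]]:
    """Split the input rows into consecutive chunks of at most `chunk_size` rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    return [rows[i : i + chunk_size] for i in range(0, len(rows), chunk_size)]


def fake_embed(rows: List[Row]) -> List[Row]:
    """Hypothetical stand-in for a slow step (not the real GenerateEmbeddings)."""
    # The "embedding" is just the text length, so the output is easy to check.
    return [{**row, "embedding": [float(len(row["text"]))]} for row in rows]


def run_step_in_parallel(
    step_fn: Callable[[List[Row]], List[Row]],
    rows: List[Row],
    num_workers: int = 2,
    chunk_size: int = 2,
) -> List[Row]:
    """Run one step over `rows`, spreading the chunks across worker processes."""
    chunks = split_into_chunks(rows, chunk_size)
    if num_workers <= 1:
        # Sequential fallback: same result, no extra processes.
        results = [step_fn(chunk) for chunk in chunks]
    else:
        # `step_fn` must be picklable, i.e. defined at module level.
        with ProcessPoolExecutor(max_workers=num_workers) as pool:
            # `map` yields results in the same order as the input chunks.
            results = list(pool.map(step_fn, chunks))
    # Flatten the per-chunk outputs back into a single list of rows.
    return [row for chunk in results for row in chunk]
```

A call such as `run_step_in_parallel(fake_embed, rows, num_workers=4)` would typically sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. A real implementation would also need to decide how model-loading cost per process, batch ordering, and error handling fit into the existing `Pipeline` execution, which this sketch leaves open.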