[BUG]: inappropriate arguments for `map_batches` in ray mode

Open HYLcool opened this issue 1 year ago • 0 comments

Currently, running Data-Juicer on multiple nodes in "ray" mode, which uses `map_batches` to process datasets, might cause some subtle problems.

The `map_batches` method has two arguments, `num_gpus` and `concurrency`, which are effectively cluster-level arguments. However, they are currently calculated automatically from the hardware information of a single machine. So there might be resource utilization problems when running on multiple nodes for OPs whose `_accelerator` is "cuda".
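To illustrate the mismatch, here is a minimal sketch in plain Python. It is not Data-Juicer's actual code: `plan_from_resources` is a hypothetical helper, the resource dictionaries are made-up numbers, and it assumes `num_gpus` is the GPU share reserved per worker while `concurrency` is the number of workers across the whole cluster. In a real job, the cluster-wide dictionary could come from something like `ray.cluster_resources()`.

```python
import math


def plan_from_resources(resources, gpus_per_worker):
    """Hypothetical helper: derive map_batches-style arguments from a resource dict.

    `resources` mimics the shape of a Ray resource dict, e.g. {"CPU": 64, "GPU": 8}.
    """
    if gpus_per_worker <= 0:
        raise ValueError("gpus_per_worker must be positive")
    total_gpus = resources.get("GPU", 0)
    # Number of workers that fit if each one reserves `gpus_per_worker` GPUs.
    concurrency = math.floor(total_gpus / gpus_per_worker)
    return {"num_gpus": gpus_per_worker, "concurrency": concurrency}


# Made-up hardware: one machine with 8 GPUs, in a cluster of 4 such machines.
single_node = {"CPU": 64, "GPU": 8}
cluster = {"CPU": 256, "GPU": 32}

# Planning from one machine's hardware sizes the worker pool for 8 GPUs only.
local_plan = plan_from_resources(single_node, gpus_per_worker=0.5)

# Planning from cluster-wide resources sizes it for all 32 GPUs.
cluster_plan = plan_from_resources(cluster, gpus_per_worker=0.5)

print(local_plan)
print(cluster_plan)
```

Under these assumptions, the plan derived from a single machine requests only a quarter of the workers the cluster could host, which is the kind of resource utilization problem described above.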

HYLcool avatar Jan 08 '25 06:01 HYLcool