DHS-LLM-Workshop
Using device_map='auto' when launching with accelerate
In https://github.com/pacman100/DHS-LLM-Workshop/blob/main/chat_assistant/training/utils.py#L182C9-L182C19, what is the reason for setting device_map = 'auto'? When I run it with accelerate (with FSDP), I get the error:
ValueError: You can't train a model that has been loaded with `device_map='auto'` in any distributed mode. Please rerun your script specifying `--num_processes=1` or by launching with `python {{myscript.py}}`.
Also, this issue (https://github.com/huggingface/accelerate/issues/1840) states that it is not compatible with DistributedDataParallel.
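For what it's worth, a minimal sketch of a workaround: only enable `device_map='auto'` for single-process runs, and let FSDP/DDP handle placement otherwise. The `resolve_device_map` helper and the `WORLD_SIZE` check are my own illustration, not code from the repo.

```python
import os

def resolve_device_map(distributed: bool):
    """Pick a device_map for transformers' from_pretrained.

    device_map='auto' shards the model across devices within one
    process, which conflicts with FSDP/DDP, where each process must
    hold the (full or FSDP-sharded) model on its own device. So we
    disable it when running distributed.
    """
    return None if distributed else "auto"

# Hypothetical usage (model id and launcher env var assumed):
# from transformers import AutoModelForCausalLM
# world_size = int(os.environ.get("WORLD_SIZE", "1"))
# model = AutoModelForCausalLM.from_pretrained(
#     "some/model-id",
#     device_map=resolve_device_map(world_size > 1),
# )
```

With this, `accelerate launch --num_processes=1` (or plain `python script.py`) keeps the convenience of `device_map='auto'`, while multi-process FSDP runs avoid the `ValueError` above.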