stanford_alpaca
DeepSpeed Training vs FSDP?
Someone told me there is a DeepSpeed training option in the code. Can I ask why it's not the default? Do we know whether it's much faster than FSDP and, if so, how much faster it is at training LLaMA?