llama2.c
Is it possible to adapt this code from DDP to FSDP? If yes, what are the potential issues to look out for?
Hi,
Thank you for the fantastic repo. I recently became interested in FSDP. Is it possible to adapt this model to FSDP? If so, what should I look out for?
Thank you.
Best regards