FSDP with Mistral

Open SicariusSicariiStuff opened this issue 1 year ago • 3 comments

What piece of documentation is affected?

https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/examples/mistral

What part(s) of the article would you like to see updated?

There's an FSDP example for Mixtral (which is awesome!) but none for Mistral, which uses a different decoder layer.

Additional Information

Can you please include an example for Mistral with FSDP offload?

Acknowledgements

  • [X] My issue title is concise, descriptive, and in title casing.
  • [X] I have searched the existing issues to make sure this feature has not been requested yet.
  • [X] I have provided enough information for the maintainers to understand and evaluate this request.

SicariusSicariiStuff, Apr 06 '24 11:04

FSDP for Mistral uses fsdp_transformer_layer_cls_to_wrap: MistralDecoderLayer. If you would like to contribute an example, please feel free to open a PR.
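
For illustration, here is a minimal sketch of how that setting might sit in an axolotl config. Only fsdp_transformer_layer_cls_to_wrap: MistralDecoderLayer comes from this thread. The remaining keys and values are assumptions modeled on the existing Mixtral FSDP example, have not been tested with Mistral, and may need adjusting for your axolotl version.

```yaml
# Sketch only: FSDP section of a Mistral config, with parameter offload.
# Everything except fsdp_transformer_layer_cls_to_wrap is an assumption
# carried over from the Mixtral FSDP example, not a verified Mistral setup.
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_offload_params: true            # assumed: offload parameters to CPU
  fsdp_sync_module_states: true        # assumed
  fsdp_limit_all_gathers: true         # assumed
  fsdp_use_orig_params: false          # assumed
  fsdp_cpu_ram_efficient_loading: true # assumed
  fsdp_state_dict_type: FULL_STATE_DICT          # assumed
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP  # assumed
  # From this thread: wrap Mistral's decoder layer class.
  fsdp_transformer_layer_cls_to_wrap: MistralDecoderLayer
```

This fragment would be merged into an existing Mistral config from the examples directory rather than used on its own.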

NanoCode012, Apr 09 '24 10:04

It seems FSDP is still not working with QLoRA; I'm getting flat tensors...

SicariusSicariiStuff, Apr 09 '24 23:04

Is it due to a dtype mismatch?

NanoCode012, Apr 30 '24 16:04