LLaMA-Factory
Support for grok-1 and dbrx models
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
Models:
- grok-1: https://huggingface.co/xai-org/grok-1
- dbrx-base: https://huggingface.co/databricks/dbrx-base
- dbrx-instruct: https://huggingface.co/databricks/dbrx-instruct
Expected behavior
No response
System Info
No response
Others
I would like to contribute to bringing support for the grok-1 and dbrx models.
Thanks, hope to see dbrx support soon.
And Command R+.
Command R+ and DBRX are now supported.
Thanks. For dbrx, does it support fine-tuning, and how much memory is required for full/LoRA/QLoRA?
@mces89 Similar to Mixtral 8x22B.
@hiyouga From this link: https://huggingface.co/databricks/dbrx-instruct/discussions/18#660c2f4ee6569b99a2e03f63 it seems to require much more memory than the Mixtral 8x22B figures listed in this repo's README table?
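For anyone trying to estimate this, here is a minimal sketch of what a 4-bit QLoRA launch for dbrx-instruct could look like with LLaMA-Factory's train_bash.py. The flag names follow the project's README examples, but the template name, dataset, LoRA targets, and hyperparameters below are assumptions to adapt; the README table remains the authoritative reference for memory requirements.

```bash
# Minimal sketch (not a verified recipe) of a QLoRA fine-tune of dbrx-instruct
# with LLaMA-Factory. The template name, dataset, and --lora_target value are
# assumptions; check the current README for the supported values.
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path databricks/dbrx-instruct \
    --dataset alpaca_gpt4_en \
    --template dbrx \
    --finetuning_type lora \
    --lora_target all \
    --quantization_bit 4 \
    --output_dir saves/dbrx-instruct/qlora \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --learning_rate 5e-5 \
    --num_train_epochs 1.0 \
    --bf16
```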