LLaMA-Factory

Support for grok-1 and dbrx models

Open martinakaduc opened this issue 10 months ago • 2 comments

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

Models:

  • grok-1: https://huggingface.co/xai-org/grok-1
  • dbrx-base: https://huggingface.co/databricks/dbrx-base
  • dbrx-instruct: https://huggingface.co/databricks/dbrx-instruct

Expected behavior

No response

System Info

No response

Others

I would like to contribute support for the grok-1 and dbrx models.

martinakaduc avatar Mar 29 '24 03:03 martinakaduc

Thanks, hope to see dbrx support soon.

mces89 avatar Apr 02 '24 19:04 mces89

And Command R+ as well.

luo-li-ba-suo avatar Apr 12 '24 06:04 luo-li-ba-suo

Command R+ and DBRX are now supported.

hiyouga avatar Apr 25 '24 11:04 hiyouga

Thanks. For DBRX, is fine-tuning supported, and how much memory is required for full/LoRA/QLoRA?

mces89 avatar Apr 25 '24 14:04 mces89

@mces89 The requirements are similar to those for Mixtral 8x22B.

hiyouga avatar Apr 25 '24 15:04 hiyouga

@hiyouga According to this link: https://huggingface.co/databricks/dbrx-instruct/discussions/18#660c2f4ee6569b99a2e03f63 it seems DBRX requires much more memory than what is listed for Mixtral 8x22B in this repo's README table?

mces89 avatar Apr 25 '24 15:04 mces89