
Draft: feat: Support DBRX model in Llama

Open reneleonhardt opened this issue 1 year ago • 6 comments

The new open-source model DBRX sounds amazing. Is this change enough, and is it correct, to integrate it into Llama? https://github.com/ggerganov/llama.cpp/pull/6515 https://huggingface.co/collections/phymbert/dbrx-16x12b-instruct-gguf-6619a7a4b7c50831dd33c7c8 https://www.databricks.com/blog/announcing-dbrx-new-standard-efficient-open-source-customizable-llms https://github.com/databricks/dbrx https://huggingface.co/collections/databricks/

llama.cpp seems to support split/sharded files, but I suppose I would need to download all of them first... 😅

reneleonhardt avatar Apr 15 '24 11:04 reneleonhardt

Since the change was recent, we need to update the llama.cpp submodule as well

carlrobertoh avatar Apr 15 '24 11:04 carlrobertoh

Since the change was recent, we need to update the llama.cpp submodule as well

Done

reneleonhardt avatar Apr 15 '24 12:04 reneleonhardt

I'll try running the model locally soon and see if any other changes are necessary

carlrobertoh avatar Apr 15 '24 12:04 carlrobertoh

I'll try running the model locally soon and see if any other changes are necessary

Great! But I guess I have to implement downloading all 10 files in this PR first... 😅

reneleonhardt avatar Apr 15 '24 12:04 reneleonhardt
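For context on downloading all 10 files: llama.cpp's gguf-split tool names shards `<prefix>-00001-of-00010.gguf`, and Hugging Face serves repository files under `/resolve/<revision>/<filename>`. A minimal Python sketch of enumerating the shard URLs might look like the following. The helper names are hypothetical, and the file prefix is an assumption that should be checked against the actual file listing of the repository.

```python
from typing import List


def shard_names(prefix: str, count: int) -> List[str]:
    """Return shard file names in the gguf-split scheme: <prefix>-00001-of-0000N.gguf."""
    return [f"{prefix}-{i:05d}-of-{count:05d}.gguf" for i in range(1, count + 1)]


def shard_urls(repo: str, prefix: str, count: int, revision: str = "main") -> List[str]:
    """Build one Hugging Face download URL per shard (hypothetical helper)."""
    base = f"https://huggingface.co/{repo}/resolve/{revision}"
    return [f"{base}/{name}" for name in shard_names(prefix, count)]


# Assumption: the shard prefix matches the repository name without the "-gguf" suffix.
urls = shard_urls(
    repo="phymbert/dbrx-16x12b-instruct-iq3_xxs-gguf",
    prefix="dbrx-16x12b-instruct-iq3_xxs",
    count=10,
)
```

Once every shard is in the same directory, pointing llama.cpp at the first shard should be enough for it to pick up the remaining ones.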

@phymbert I can download https://huggingface.co/phymbert/dbrx-16x12b-instruct-iq3_xxs-gguf in the browser without logging in, but inside the plugin I get 403 Forbidden. Is this to be expected with the databricks-open-model-license (other) license? Do you think DBRX is not particularly suited as a coding assistant? The smallest one is a huge 53 GB 😅

reneleonhardt avatar Apr 23 '24 06:04 reneleonhardt

DBRX is a gated model, so I believe you have to pass a read token. There is an open issue on llama.cpp to support this.

phymbert avatar Apr 23 '24 07:04 phymbert
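If the 403 is indeed caused by gating, as suggested above, the usual approach is to send a Hugging Face read token as a bearer token in the `Authorization` header. The Python sketch below only builds the request and does not send it. The `build_request` helper is hypothetical, and reading the token from the `HF_TOKEN` environment variable is an assumption about how the client would be configured.

```python
import os
import urllib.request
from typing import Optional


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Create a download request, attaching a bearer token when one is given."""
    request = urllib.request.Request(url)
    if token:
        # Hugging Face accepts access tokens as "Authorization: Bearer <token>".
        request.add_header("Authorization", f"Bearer {token}")
    return request


# Assumption: the user has stored a read token in the HF_TOKEN environment variable.
token = os.environ.get("HF_TOKEN")
request = build_request(
    "https://huggingface.co/phymbert/dbrx-16x12b-instruct-iq3_xxs-gguf",
    token,
)
```

Without a token the request goes out anonymously, so the same code path would still work for models that are not gated.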