bumblebee
Support Mixtral
It seems that bumblebee is not capable of loading the Mixtral-8x7B models (base or instruct). I've checked the files, and in theory it should be able to load the model, since it can load the Mistral-7B files, but I keep getting:
** (ArgumentError) could not match the class name "MixtralForCausalLM" to any of the supported models, please specify the :module and :architecture options
(bumblebee 0.5.3) lib/bumblebee.ex:409: Bumblebee.do_load_spec/4
(bumblebee 0.5.3) lib/bumblebee.ex:578: Bumblebee.maybe_load_model_spec/3
(bumblebee 0.5.3) lib/bumblebee.ex:566: Bumblebee.load_model/2
#cell:4lsfbdujnkoinq5p:3: (file)
Both models' configuration files contain the same MixtralForCausalLM class name, and Mixtral-8x7B has the safetensors files.
Mix.install(
  [
    {:bumblebee, "~> 0.5.3"},
    {:exla, ">= 0.0.0"}
  ],
  config: [nx: [default_backend: EXLA.Backend]]
)
repo = {:hf, "mistralai/Mixtral-8x7B-Instruct-v0.1"}
{:ok, model_info} = Bumblebee.load_model(repo, type: :bf16, backend: EXLA.Backend) # errors
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)
@WebCloud Bumblebee needs a dedicated implementation for each model type in order to load it. Mixtral is not implemented currently, while Mistral is.
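For context, the `:module`/`:architecture` options mentioned in the error are only an escape hatch for when the class name doesn't match an implemented model; they don't add support for a new architecture. A sketch of what that would look like (pointing Mixtral at the existing Mistral module; expect this to fail or mismatch, since Mixtral's mixture-of-experts layers have no counterpart in Mistral):

```elixir
repo = {:hf, "mistralai/Mixtral-8x7B-Instruct-v0.1"}

# Force the implemented Mistral module instead of the unrecognized
# MixtralForCausalLM class. This only bypasses the class-name lookup;
# the Mixtral expert weights won't map onto Mistral's layers, so this
# is NOT a real workaround, just an illustration of the options.
{:ok, model_info} =
  Bumblebee.load_model(repo,
    module: Bumblebee.Text.Mistral,
    architecture: :for_causal_lm
  )
```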
I see! Well, I'm quite new to Elixir, but I'd be happy to help however I can.
For an example of adding a model, see how Mistral was added in #264. The corresponding hf/transformers code is modeling_mixtral.py.
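Very roughly, adding Mixtral would mean a new module alongside the existing Mistral one. The skeleton below is illustrative only (field names follow the HF `config.json` keys; mirror the actual Mistral module from #264 for the exact callbacks and protocol implementations Bumblebee requires):

```elixir
# Hypothetical skeleton, NOT working code -- see Bumblebee.Text.Mistral
# for the real structure a model module needs.
defmodule Bumblebee.Text.Mixtral do
  # Config struct parsed from config.json. The Mixtral-specific fields
  # are the mixture-of-experts ones (names follow the HF config keys).
  defstruct architecture: :for_causal_lm,
            vocab_size: 32_000,
            hidden_size: 4096,
            num_local_experts: 8,
            num_experts_per_tok: 2

  # The real module must also:
  #   * declare the supported architectures (:base, :for_causal_lm, ...)
  #   * build the Axon graph -- largely Mistral's attention stack, plus
  #     the sparse MoE feed-forward block (top-k routing over expert MLPs)
  #   * map the "model.layers.{n}. ..." checkpoint parameter names onto
  #     the Axon layer names so the safetensors weights load correctly
end
```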
However, note that Nx/EXLA doesn't support quantization yet, and the bf16 model is around 100GB, so it is not very practical for running on the GPU at this point.
Cool, thanks for the resources!