Gemma3 1b 4b inference support
Hi, Gemma 3 is currently one of the best local models. If Gemma 3 could be supported, Candle would become even more popular!
Doesn't https://github.com/huggingface/candle/blob/2f3bf42bcba225e956efe086b9534ae53a59213e/candle-transformers/src/models/gemma3.rs cover this?
Currently Candle supports the Gemma 3 1B variant; the 4B variant needs something fixed. When I try to load it, I get an error while loading the tokenizer:
retrieved the files in 114.371148ms
Error: missing field `attention_bias` at line 38 column 1
It looks like something is different in the tokenizer for the 4B compared to the 1B.
Well, this is actually a serde error about not finding `attention_bias` by the end of the config.json file, not a tokenizer issue. If you compare the Gemma-3 1B config (https://huggingface.co/google/gemma-3-1b-it/blob/main/config.json) with the Gemma-3 4B config (https://huggingface.co/google/gemma-3-4b-it/blob/main/config.json), you'll see the config files are different, and the 4B config.json ends at line 38, which is exactly where serde errors out! I don't know if there is a difference between their architectures, but I will look into it.
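For reference, the two configs are shaped differently at the top level: if I'm reading the files right, the 1B config is flat (with `attention_bias`, `hidden_size`, etc. directly at the root and `"architectures": ["Gemma3ForCausalLM"]`), while the 4B config nests the text fields. Roughly (values and most keys elided, so treat this as a sketch rather than the exact file):

```json
{
  "architectures": ["Gemma3ForConditionalGeneration"],
  "model_type": "gemma3",
  "text_config": {
    "model_type": "gemma3_text",
    "hidden_size": "..."
  },
  "vision_config": {
    "model_type": "siglip_vision_model"
  }
}
```

A flat serde `Config` struct parses the 1B file fine but reaches the end of the 4B file without ever seeing `attention_bias`, hence the error.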
The Gemma 3 1B uses a text-only architecture, while the 4B and larger models are multimodal (vision + text). This means that not only is the config different to support this, the weights are also laid out slightly differently. Supporting text-only inference for the 4B and larger models (for now) could be relatively easy to do.
The vision model is SigLIP, and glancing over the code in the siglip.rs module, it probably applies pretty straightforwardly too.
Is there any update on this?
I am also facing similar challenges. I am trying to load "google/gemma-3-4b-it", but it fails:
Failed to load the model: missing field `attention_bias` at line 34 column 1
I looked closely at the example provided here: https://github.com/huggingface/candle/blob/main/candle-examples/examples/gemma/main.rs but couldn't get it to work. I tried both the pt and it versions.