Gemma3 1b 4b inference support

Open MonolithFoundation opened this issue 8 months ago • 7 comments

Hi, Gemma3 is now the best local model. If Gemma3 were supported, candle would become even more popular!

MonolithFoundation avatar Mar 13 '25 03:03 MonolithFoundation

Doesn't https://github.com/huggingface/candle/blob/2f3bf42bcba225e956efe086b9534ae53a59213e/candle-transformers/src/models/gemma3.rs cover this?

BerserkerMother avatar Apr 07 '25 18:04 BerserkerMother

Currently Candle supports the Gemma 3 1B variant. The 4B variant needs something fixed: when I try to load it, I get an error while loading the tokenizer.

retrieved the files in 114.371148ms
Error: missing field `attention_bias` at line 38 column 1

It looks like something in the tokenizer differs between 4B and 1B.

AlpineVibrations avatar Apr 07 '25 19:04 AlpineVibrations

Well, this is a serde error: it did not find attention_bias by the end of the config.json file. If you look at the Gemma-3 1B config (https://huggingface.co/google/gemma-3-1b-it/blob/main/config.json) vs. the Gemma-3 4B config (https://huggingface.co/google/gemma-3-4b-it/blob/main/config.json), you'll see the config files are different, and the 4B config.json ends at line 38, which is where serde errors out! I don't know if there is a difference between their architectures, but I will look into it.

BerserkerMother avatar Apr 07 '25 19:04 BerserkerMother

The Gemma 3 1B model uses a text-only architecture, while the 4B and larger models are vision models. This means that not only is the config different to support this, but the weights are also slightly different. Supporting text-only inference for the 4B and larger models (for now) could be relatively easy to do.

EricLBuehler avatar Apr 07 '25 19:04 EricLBuehler

The vision model is SigLIP and, glancing over the code in the siglip.rs module, it probably applies pretty straightforwardly too.

jremb avatar Apr 07 '25 22:04 jremb

Is there any update about this subject?

barel-mishal avatar Aug 29 '25 08:08 barel-mishal

I am also facing a similar problem. Trying to load "google/gemma-3-4b-it" fails:

Failed to load the model: missing field `attention_bias` at line 34 column 1

I looked closely at the example provided here: https://github.com/huggingface/candle/blob/main/candle-examples/examples/gemma/main.rs but couldn't get it to work. I tried both the pt and it versions.

boersmamarcel avatar Oct 12 '25 18:10 boersmamarcel
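
For anyone hitting the same `missing field` error, a quick way to see what the loader is complaining about is to check which keys actually sit at the top level of the downloaded config.json. The sketch below is only an illustration in Python, not candle's code: `diagnose_config`, the `REQUIRED_TOP_LEVEL` list, and both sample strings are hypothetical. It assumes the multimodal config nests its language-model settings under a `text_config` key, which should be verified against the config files linked earlier in the thread.

```python
import json

# Hypothetical subset of fields a text-only loader might require at the top
# level of config.json. candle's real config struct has more fields.
REQUIRED_TOP_LEVEL = ("attention_bias", "hidden_size", "num_hidden_layers")


def diagnose_config(raw: str) -> dict:
    """Report which assumed-required keys are absent from the top level,
    and whether a nested `text_config` object is present."""
    config = json.loads(raw)
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in config]
    nested = config.get("text_config")
    return {
        "missing_top_level": missing,
        "has_text_config": isinstance(nested, dict),
    }


# Illustrative stand-ins with placeholder values, NOT the real config files.
FLAT_SAMPLE = '{"attention_bias": false, "hidden_size": 8, "num_hidden_layers": 2}'
NESTED_SAMPLE = (
    '{"model_type": "gemma3",'
    ' "text_config": {"hidden_size": 8, "num_hidden_layers": 2},'
    ' "vision_config": {}}'
)

if __name__ == "__main__":
    print(diagnose_config(FLAT_SAMPLE))
    print(diagnose_config(NESTED_SAMPLE))
```

To try it on a real file, pass the contents of the downloaded config.json to `diagnose_config`. If the required keys are missing at the top level while `has_text_config` is true, that is consistent with the explanation above that the 4B checkpoint is a vision model with a differently shaped config.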