Gemma 3 - passthrough merge errors
Updated to the latest Mergekit and the correct transformers version for Gemma 3; getting the following errors (simple pass-through merge, same model, no other models).
Model: 12b Gemma 3 it (regular, not the "pt" / multimodal version)
NOTE: "mergekit7" is the newest install; I have several.
mergekit-yaml --copy-tokenizer --allow-crimes --cuda --out-shard-size 5B --lazy-unpickle --clone-tensors f:/mergefiles/Gemma-3-12B-exp40-3.txt E:/Gemma-3-12B-exp40-3
WARNING:mergekit.merge:Unable to set number of layers for module multi_modal_projector in output config - you may need to manually correct it.
Traceback (most recent call last):
  File "F:\mergekit7\mergekit\mergekit\merge.py", line 300, in _model_out_config
    set_config_value(res, cfg_key, module_layers[module_name])
  File "F:\mergekit7\mergekit\mergekit\common.py", line 37, in set_config_value
    parts = key.split(".")
            ^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'
Warmup loader cache: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "
With models like this slicing gets a little bit trickier - since the vision tower and the language model have a different number of layers it doesn't make sense to specify slices in such a way. You can use this new syntax:
merge_method: passthrough
modules:
  text_decoder:
    # frankenmerge the text model
    slices:
      - sources:
          - model: google/gemma-3-12b-it
            layer_range: [0, 24]
      - sources:
          - model: google/gemma-3-12b-it
            layer_range: [8, 48]
  vision_tower:
    # keep the vision tower as is
    models:
      - model: google/gemma-3-12b-it
    # or also frankenmerge it?
    # slices:
    #   - sources:
    #       - model: google/gemma-3-12b-it
    #         layer_range: [0, 16]
    #   - sources:
    #       - model: google/gemma-3-12b-it
    #         layer_range: [8, 27]
  multi_modal_projector:
    # no layers in this module, just a single set of weights
    models:
      - model: google/gemma-3-12b-it
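Saved to a YAML file, this can be run the same way as the command at the top of the thread; the config filename and output directory here are just placeholders:

mergekit-yaml --copy-tokenizer --cuda --out-shard-size 5B gemma3-modules.yaml ./gemma-3-12b-frankenmerge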
It gives the following error:
e-tensors --cuda --trust-remote-code
Traceback (most recent call last):
  File "/home/sicarius/mergekit/env/bin/mergekit-yaml", line 8, in
Sorry about that! Turns out I didn't fully update the config sanity checks and it was not allowing some valid configs. This config works on main now.
Thank you!!!
Quick question: does the same format work with DARE TIES and other merge methods?
I tried it with another finetune and it throws:
> RuntimeError: Tensor language_model.model.layers.47.post_feedforward_layernorm.weight required but not present in model
Has anyone had a similar experience?
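For what it's worth, here is a hedged sketch of how the same per-module layout might look with DARE TIES, assuming the modules syntax composes with dare_ties the way it does with passthrough; the finetune path is a placeholder and the density/weight values are just illustrative:

merge_method: dare_ties
base_model: google/gemma-3-12b-it
modules:
  text_decoder:
    models:
      - model: google/gemma-3-12b-it
      # placeholder path for a hypothetical text-only finetune
      - model: your-org/gemma-3-12b-finetune
        parameters:
          density: 0.5
          weight: 0.5
  vision_tower:
    # keep the vision tower from the base model
    models:
      - model: google/gemma-3-12b-it
  multi_modal_projector:
    models:
      - model: google/gemma-3-12b-it

The "required but not present" error above would be consistent with a finetune that ships fewer tensors than the merge expects, but that is only a guess.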
> With models like this slicing gets a little bit trickier - since the vision tower and the language model have a different number of layers it doesn't make sense to specify slices in such a way. You can use this new syntax:
I am attempting a similar merge, but with 27b instead of 12b.
merge_method: passthrough
base_model: '/dataset/models/mergekitstuff/Fallen-Gemma3-27B-v1'
modules:
  text_decoder:
    slices:
      - sources:
          - model: '/dataset/models/mergekitstuff/Fallen-Gemma3-27B-v1'
            layer_range: [0, 62]
      - sources:
          - model: '/dataset/models/mergekitstuff/X-Ray_Alpha_27B_Base'
            layer_range: [0, 62]
  vision_tower:
    models:
      - model: '/dataset/models/mergekitstuff/X-Ray_Alpha_27B_Base'
  multi_modal_projector:
    models:
      - model: '/dataset/models/mergekitstuff/X-Ray_Alpha_27B_Base'
but this gets:
  File "/dataset/tool/mergekit/mergekit/merge_methods/passthrough.py", line 27, in execute
    raise RuntimeError("Passthrough merge expects exactly one tensor")
RuntimeError: Passthrough merge expects exactly one tensor
I pulled the HF source repos for both locally, but I don't know what else I need to do to get this resolved, and I couldn't find any documentation on this error. If this is unrelated to the issue I can move it to another one.
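One way to narrow this down, mirroring the single-model 12b config above that is known to work, might be to first run passthrough with only one source model per module and then reintroduce the second source:

merge_method: passthrough
modules:
  text_decoder:
    slices:
      - sources:
          - model: '/dataset/models/mergekitstuff/Fallen-Gemma3-27B-v1'
            layer_range: [0, 62]
  vision_tower:
    models:
      - model: '/dataset/models/mergekitstuff/Fallen-Gemma3-27B-v1'
  multi_modal_projector:
    models:
      - model: '/dataset/models/mergekitstuff/Fallen-Gemma3-27B-v1'

If this single-source version succeeds, the failure is presumably triggered somewhere the two-source config hands passthrough more than one input for the same output tensor, which is exactly the condition the raised error checks for.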