Critical Merging Bug just started...
Confirming the exact same error: mergekit cannot find the "base_model", even when the path is a local (absolute) path on Windows.
The odd part is that some merges work fine with no issue, whereas others fail for the reasons below. Merges I ran in late September 2024 are now inconsistent: some fail, others are fine.
Example: Llama 3 models merge fine with no issue, while Gemma models now break as noted below... but not all of them.
This works fine:
models:
  - model: G:/9B/gemma-2-9b-it-abliterated
    parameters:
      weight: .4
merge_method: dare_ties
base_model: G:/9B/gemma2-gutenberg-9B
tokenizer_source: union
dtype: bfloat16
BUT THIS DIES:
models:
  - model: G:/9B/Gemma-2-Ataraxy-9B
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
  - model: G:/9B/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
  - model: G:/9B/gemma-2-Ifable-9B
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
merge_method: dare_ties
base_model: E:/Gemma-Dark-Writer3-mega-ab
dtype: bfloat16
Yet the exact same setup as above (3 models plus a base model, dare_ties) works fine for a Llama 3/3.1 merge.
Other Gemma merges of this same type (dare_ties, 3 models plus a base model) that did work in September 2024 now crash and burn.
Even if I change "base_model: E:/Gemma-Dark-Writer3-mega-ab" to a different model, it still dies, no matter what. Notably, if I point base_model at a location that does not exist, I get the normal "not found" error instead, so the path itself is being read (see the sanity-check sketch below).
Please advise.
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Program Files\Python312\Scripts\mergekit-yaml.exe\__main__.py", line 7, in <module>
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "F:\mergekit3\mergekit\mergekit\options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "F:\mergekit3\mergekit\mergekit\scripts\run_yaml.py", line 47, in main
    run_merge(
  File "F:\mergekit3\mergekit\mergekit\merge.py", line 96, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "F:\mergekit3\mergekit\mergekit\graph.py", line 197, in run
    res = task.execute(**arguments)
  File "F:\mergekit3\mergekit\mergekit\merge_methods\generalized_task_arithmetic.py", line 126, in execute
    tvs, base = get_task_vectors(
  File "F:\mergekit3\mergekit\mergekit\merge_methods\generalized_task_arithmetic.py", line 201, in get_task_vectors
    base = tensors[base_model]
KeyError: ModelReference(model=ModelPath(path='G:/9B/gemma2-gutenberg-9B', revision=None), lora=None, override_architecture=None)
Originally posted by @David-AU-github in #446
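For context on the KeyError in the quoted traceback: get_task_vectors looks the base model up in a dict keyed by ModelReference objects, so the failure means the base model's tensor for the weight being merged was never loaded into that dict (the path was parsed fine, which is consistent with a wrong path producing a different "not found" error instead). A stand-alone illustration of that failure shape, not mergekit's actual code:

# Stand-alone illustration (not mergekit code): the tensors dict is keyed by
# model-reference objects, and the base model's entry is simply absent, so
# plain indexing raises KeyError just like in the traceback.
from dataclasses import dataclass

@dataclass(frozen=True)
class FakeModelReference:  # stand-in for mergekit's ModelReference
    path: str

tensors = {
    FakeModelReference("G:/9B/Gemma-2-Ataraxy-9B"): "tensor for this weight",
    # ...but no entry was ever gathered for the base model...
}
base_model = FakeModelReference("G:/9B/gemma2-gutenberg-9B")

try:
    base = tensors[base_model]
except KeyError as err:
    print("KeyError:", err)  # same shape of failure as in the traceback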
Added issue: it seems that even when using the mergekit workaround, mergekit is no longer creating "tokenizer.model" for Gemma models (previously it did).
RESULT: without this file, the models can't be quantized from source in llama.cpp; I reused a tokenizer.model from a previous mergekit merge.
I think something has changed upstream, maybe in Transformers?
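A sketch of the manual stopgap described above, with placeholder paths rather than the actual setup: if the merge output is missing tokenizer.model, copy the one from the base model so llama.cpp conversion can proceed.

# Stopgap sketch: llama.cpp conversion needs the SentencePiece tokenizer.model
# for these models; if the merge output lacks it, fall back to the base model's
# copy. Both paths below are placeholders.
import shutil
from pathlib import Path

merged_dir = Path("E:/gemma-merge-output")    # hypothetical merge output directory
base_dir = Path("G:/9B/gemma2-gutenberg-9B")  # hypothetical base model directory

if not (merged_dir / "tokenizer.model").exists():
    src = base_dir / "tokenizer.model"
    if src.exists():
        shutil.copy2(src, merged_dir / "tokenizer.model")
        print("copied tokenizer.model from the base model")
    else:
        print("base model has no tokenizer.model either (tokenizer.json only?)")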
slices:
- sources:
- model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
layer_range:
- 0
- 28
- model: taareshg/Llama-3.2-3B-Instruct-En-Hi-merge-200k
layer_range:
- 0
  - 28
merge_method: slerp
base_model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
parameters:
  t:
- filter: self_attn
value:
- 0
- 0.5
- 0.3
- 0.7
- 1
- filter: mlp
value:
- 1
- 0.5
- 0.7
- 0.3
- 0
  - value: 0.5
dtype: bfloat16
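For reference, my understanding of the t lists above: mergekit treats a list as a gradient and interpolates it across the layers in the slice, so self_attn and mlp tensors each get a per-layer t, while everything else falls back to the plain value of 0.5. A rough illustration under that assumption, not mergekit's implementation:

# Rough illustration (an assumption about gradient handling, not mergekit code):
# spread the anchor values evenly over the layer range and interpolate linearly
# to get a per-layer t for SLERP.
import numpy as np

def per_layer_t(anchors, num_layers):
    # positions of each layer along the anchor list
    positions = np.linspace(0, len(anchors) - 1, num_layers)
    return np.interp(positions, np.arange(len(anchors)), anchors)

print(per_layer_t([0, 0.5, 0.3, 0.7, 1], 28))  # t per layer for self_attn tensors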
[2025-01-06 12:05:07] [INFO] Merge configuration saved in /tmp/tmp26k63_vv/merged/config.yaml
[2025-01-06 12:05:07] [INFO] Creating repo bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Repo created: https://huggingface.co/bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Running mergekit-yaml config.yaml merge --copy-tokenizer --cuda --low-cpu-memory --allow-crimes --lora-merge-cache /tmp/tmp26k63_vv/.lora_cache
[2025-01-06 12:05:10] [INFO]
[2025-01-06 12:05:10] [INFO]
[2025-01-06 12:05:18] [INFO] Warmup loader cache: 0%| | 0/2 [00:00<?, ?it/s][A
[2025-01-06 12:05:18] [INFO]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 50%|█████ | 1/2 [00:07<00:07, 7.34s/it][A
[2025-01-06 12:05:20] [INFO]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 4.62s/it][A
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 5.03s/it]
[2025-01-06 12:05:22] [INFO]
[2025-01-06 12:05:22] [INFO]
[2025-01-06 12:05:29] [INFO] Executing graph: 0%| | 0/1272 [00:00<?, ?it/s][A
[2025-01-06 12:05:29] [INFO]
[2025-01-06 12:05:30] [INFO] Executing graph: 0%| | 5/1272 [00:07<30:20, 1.44s/it][A
[2025-01-06 12:05:30] [INFO] Executing graph: 1%| | 14/1272 [00:07<11:05, 1.89it/s]
[2025-01-06 12:05:30] [INFO] Traceback (most recent call last):
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/bin/mergekit-yaml", line 8, in <module>