
Critical Merging Bug just started...

Open David-AU-github opened this issue 1 year ago • 2 comments

Confirming the exact same error: mergekit cannot find the "base_model", even if the path is local (absolute) on Windows.

Funny thing is that some merges work fine with no issue, whereas others fail for the reasons below. And of the merges I did in late Sept 2024, SOME now fail; others are fine ?!?!

Example: L3 models -> merge fine, no issue. Gemmas: now break as noted below... but not all of them (??!?!)

This works fine:

models:
  - model: G:/9B/gemma-2-9b-it-abliterated
    parameters:
      weight: .4
merge_method: dare_ties
base_model: G:/9B/gemma2-gutenberg-9B
tokenizer_source: union
dtype: bfloat16

BUT THIS DIES:

models:
  - model: G:/9B/Gemma-2-Ataraxy-9B
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
  - model: G:/9B/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
  - model: G:/9B/gemma-2-Ifable-9B
    parameters:
      weight: [1,1,.75,.5,.25,.25,.05,.01]
merge_method: dare_ties
base_model: E:/Gemma-Dark-Writer3-mega-ab
dtype: bfloat16

But the exact SAME setup as above (3 models, base, dare_ties) for a Llama 3/3.1 merge works fine (??)

Other GEMMA merges of the same type (3 models, base, dare_ties) that DID work (sept 2024) now crash and burn.

Even if I change "base_model: E:/Gemma-Dark-Writer3-mega-ab" to a different path, it still dies, no matter what. If I put in a bad location, it gives the normal "not found" error too (??)

Likewise any "Gemma" merges like the one above that DID WORK fine, now crash and burn. (specifically: dare_ties, 3 models + base model)

Please advise.

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Program Files\Python312\Scripts\mergekit-yaml.exe\__main__.py", line 7, in <module>
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python312\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\mergekit3\mergekit\mergekit\options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "F:\mergekit3\mergekit\mergekit\scripts\run_yaml.py", line 47, in main
    run_merge(
  File "F:\mergekit3\mergekit\mergekit\merge.py", line 96, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "F:\mergekit3\mergekit\mergekit\graph.py", line 197, in run
    res = task.execute(**arguments)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\mergekit3\mergekit\mergekit\merge_methods\generalized_task_arithmetic.py", line 126, in execute
    tvs, base = get_task_vectors(
                ^^^^^^^^^^^^^^^^^
  File "F:\mergekit3\mergekit\mergekit\merge_methods\generalized_task_arithmetic.py", line 201, in get_task_vectors
    base = tensors[base_model]
           ~~~~~~~^^^^^^^^^^^^
KeyError: ModelReference(model=ModelPath(path='G:/9B/gemma2-gutenberg-9B', revision=None), lora=None, override_architecture=None)
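The `KeyError` is raised at `base = tensors[base_model]`, which suggests that for at least one tensor the base model had no entry in the collected tensors. One possible cause (an assumption, not confirmed anywhere in this thread) is that the base checkpoint lacks a tensor name that the other models have. A minimal sketch for comparing tensor names, assuming each model folder holds a sharded checkpoint with a `model.safetensors.index.json` file containing a `weight_map`; the helper names are hypothetical and not part of mergekit:

```python
import json
from pathlib import Path


def tensor_names(model_dir):
    """Return the set of tensor names listed in a sharded safetensors index."""
    index_path = Path(model_dir) / "model.safetensors.index.json"
    with open(index_path, encoding="utf-8") as f:
        index = json.load(f)
    # "weight_map" maps each tensor name to the shard file that stores it.
    return set(index["weight_map"])


def missing_from_base(base_dir, other_dirs):
    """Map each model dir to the tensor names it has that the base lacks."""
    base = tensor_names(base_dir)
    report = {}
    for d in other_dirs:
        extra = tensor_names(d) - base
        if extra:
            report[str(d)] = sorted(extra)
    return report
```

Calling `missing_from_base("E:/Gemma-Dark-Writer3-mega-ab", ["G:/9B/Gemma-2-Ataraxy-9B"])` with the paths from the failing config would return an empty dict if the base covers every tensor name, in which case the cause lies elsewhere.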

Originally posted by @David-AU-github in #446

David-AU-github avatar Nov 12 '24 12:11 David-AU-github

Added issue: it seems that even when using the "mergekit" workaround, mergekit is not creating "tokenizer.model" for Gemma models (previously it did).

RESULT: Can't quantize models from source in llama.cpp without this file; I used one from a previous (mergekit) merge.
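The manual workaround described here (reusing `tokenizer.model` from an earlier merge) can be sketched as a small helper. This is only an illustration: the function name and arguments are hypothetical, and it is only safe under the assumption that the donor folder uses the same tokenizer and vocabulary as the merged model.

```python
import shutil
from pathlib import Path


def ensure_tokenizer_model(merged_dir, donor_dir, filename="tokenizer.model"):
    """Copy the tokenizer file from donor_dir if merged_dir lacks it.

    Returns True if a copy was made, False if the file was already present.
    """
    target = Path(merged_dir) / filename
    if target.exists():
        return False
    source = Path(donor_dir) / filename
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found")
    # copy2 preserves file metadata along with the contents.
    shutil.copy2(source, target)
    return True
```

Running it against the merge output folder before quantizing leaves an existing `tokenizer.model` untouched and only fills in a missing one.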

I think something has changed upstream? Transformers?

David-AU-github avatar Nov 16 '24 02:11 David-AU-github

slices:
  - sources:
      - model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
        layer_range: [0, 28]
      - model: taareshg/Llama-3.2-3B-Instruct-En-Hi-merge-200k
        layer_range: [0, 28]
merge_method: slerp
base_model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

[2025-01-06 12:05:07] [INFO] Merge configuration saved in /tmp/tmp26k63_vv/merged/config.yaml
[2025-01-06 12:05:07] [INFO] Creating repo bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Repo created: https://huggingface.co/bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Running mergekit-yaml config.yaml merge --copy-tokenizer --cuda --low-cpu-memory --allow-crimes --lora-merge-cache /tmp/tmp26k63_vv/.lora_cache
[2025-01-06 12:05:18] [INFO] Warmup loader cache: 0%| | 0/2 [00:00<?, ?it/s]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 50%|█████ | 1/2 [00:07<00:07, 7.34s/it]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 4.62s/it]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 5.03s/it]
[2025-01-06 12:05:29] [INFO] Executing graph: 0%| | 0/1272 [00:00<?, ?it/s]
[2025-01-06 12:05:30] [INFO] Executing graph: 0%| | 5/1272 [00:07<30:20, 1.44s/it]
[2025-01-06 12:05:30] [INFO] Executing graph: 1%| | 14/1272 [00:07<11:05, 1.89it/s]
[2025-01-06 12:05:30] [INFO] Traceback (most recent call last):
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/bin/mergekit-yaml", line 8, in <module>
[2025-01-06 12:05:30] [INFO]     sys.exit(main())
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
[2025-01-06 12:05:30] [INFO]     return self.main(*args, **kwargs)
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 1078, in main
[2025-01-06 12:05:30] [INFO]     rv = self.invoke(ctx)
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
[2025-01-06 12:05:30] [INFO]     return ctx.invoke(self.callback, **ctx.params)
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 783, in invoke
[2025-01-06 12:05:30] [INFO]     return __callback(*args, **kwargs)
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/options.py", line 82, in wrapper
[2025-01-06 12:05:30] [INFO]     f(*args, **kwargs)
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/scripts/run_yaml.py", line 47, in main
[2025-01-06 12:05:30] [INFO]     run_merge(
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/merge.py", line 96, in run_merge
[2025-01-06 12:05:30] [INFO]     for _task, value in exec.run(quiet=options.quiet):
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/graph.py", line 197, in run
[2025-01-06 12:05:30] [INFO]     res = task.execute(**arguments)
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/merge_methods/slerp.py", line 60, in execute
[2025-01-06 12:05:30] [INFO]     slerp(
[2025-01-06 12:05:30] [INFO]   File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/merge_methods/slerp.py", line 137, in slerp
[2025-01-06 12:05:30] [INFO]     dot = np.sum(v0 * v1)
[2025-01-06 12:05:30] [INFO] ValueError: operands could not be broadcast together with shapes (12582912,1) (3072,8192)
[2025-01-06 12:05:30] [ERROR] Command exited with code 1
[2025-01-06 12:05:30] [ERROR] Merge failed. Deleting repo as no model is uploaded.
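The two shapes in the `ValueError` are worth a closer look: the first operand holds exactly half as many elements as the second. That would be consistent with the `-bnb-4bit` checkpoint storing its weights packed two 4-bit values per byte, while the other model stores the full weight matrix. This reading is an assumption and is not confirmed in the thread. A small pure-Python sketch of the arithmetic and of the broadcasting rule that fails (`can_broadcast` is a hypothetical helper, not a NumPy or mergekit function):

```python
from math import prod


def can_broadcast(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy broadcasting rules."""
    # Compare trailing dimensions; each pair must match or contain a 1.
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True


packed = (12582912, 1)  # shape of one operand, as reported in the log
full = (3072, 8192)     # shape of the other operand, as reported in the log

print(can_broadcast(packed, full))     # False
print(prod(full) == 2 * prod(packed))  # True
```

If that is indeed the cause, pointing the config at an unquantized copy of the base model may avoid the mismatch, since an elementwise slerp needs both tensors in the same layout.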

bhuvneshsaini avatar Jan 06 '25 11:01 bhuvneshsaini