ValueError: operands could not be broadcast together with shapes (12582912,1) (3072,8192)
slices:
  - sources:
      - model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
        layer_range:
          - 0
          - 28
      - model: taareshg/Llama-3.2-3B-Instruct-En-Hi-merge-200k
        layer_range:
          - 0
          - 28
merge_method: slerp
base_model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
parameters:
  t:
    - filter: self_attn
      value:
        - 0
        - 0.5
        - 0.3
        - 0.7
        - 1
    - filter: mlp
      value:
        - 1
        - 0.5
        - 0.7
        - 0.3
        - 0
    - value: 0.5
dtype: bfloat16
error:
[2025-01-06 12:05:07] [INFO] Merge configuration saved in /tmp/tmp26k63_vv/merged/config.yaml
[2025-01-06 12:05:07] [INFO] Creating repo bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Repo created: https://huggingface.co/bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Running mergekit-yaml config.yaml merge --copy-tokenizer --cuda --low-cpu-memory --allow-crimes --lora-merge-cache /tmp/tmp26k63_vv/.lora_cache
[2025-01-06 12:05:18] [INFO] Warmup loader cache: 0%| | 0/2 [00:00<?, ?it/s]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 50%|█████ | 1/2 [00:07<00:07, 7.34s/it]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 4.62s/it]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 5.03s/it]
[2025-01-06 12:05:29] [INFO] Executing graph: 0%| | 0/1272 [00:00<?, ?it/s]
[2025-01-06 12:05:30] [INFO] Executing graph: 0%| | 5/1272 [00:07<30:20, 1.44s/it]
[2025-01-06 12:05:30] [INFO] Executing graph: 1%| | 14/1272 [00:07<11:05, 1.89it/s]
[2025-01-06 12:05:30] [INFO] Traceback (most recent call last):
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/bin/mergekit-yaml", line 8, in
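You should try using meta-llama/Llama-3.2-3B-Instruct instead of unsloth/Llama-3.2-3B-Instruct-bnb-4bit, both under sources and as base_model. The bnb-4bit checkpoint is already quantized: its weights are stored packed, two 4-bit values per byte, in a flat (N/2, 1) tensor. For a 3072 x 8192 matrix that gives 3072 * 8192 / 2 = 12582912 elements, which is exactly the (12582912,1) shape in your error. slerp cannot interpolate that packed tensor against the full (3072,8192) weight from the other model, so the merge needs the unquantized checkpoint.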
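The shape arithmetic behind the error can be checked with a short NumPy sketch. The two shapes are copied from the error message; the helper `can_broadcast` is a hypothetical name used only for illustration, and the "two values per byte" packing is an assumption about how the 4-bit checkpoint stores its weights. This is not mergekit code, only a way to see why the two tensors cannot be combined.

```python
import numpy as np

# Shapes copied from the error message in this thread.
full_shape = (3072, 8192)      # unquantized weight matrix
packed_shape = (12582912, 1)   # tensor loaded from the bnb-4bit checkpoint

# Assumption: 4-bit quantization packs two values into each byte, so the
# packed tensor should hold half as many elements as the full matrix.
n_full = full_shape[0] * full_shape[1]
n_packed = packed_shape[0] * packed_shape[1]
values_per_byte = n_full // n_packed


def can_broadcast(a, b):
    """Return True if shapes a and b are broadcast-compatible (hypothetical helper)."""
    try:
        np.broadcast_shapes(a, b)
        return True
    except ValueError:
        return False


print("elements in full matrix:  ", n_full)
print("elements in packed tensor:", n_packed)
print("values per packed byte:   ", values_per_byte)
print("packed vs full compatible:", can_broadcast(packed_shape, full_shape))
print("full vs full compatible:  ", can_broadcast(full_shape, full_shape))
```

The element counts differ by a factor of two, and the shapes fail the broadcasting check, which matches the ValueError above. With an unquantized base model both sources expose (3072,8192) tensors and the check passes.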