
Bug: Error when trying to use `./llama-gguf-split --merge` to merge split model gguf files back

Open tybalex opened this issue 1 year ago • 3 comments

What happened?

I was unable to merge split gguf model files back with the following command:

./llama-gguf-split --merge rubra_q4-0000*-of-00006.gguf rubra-q4.gguf

and got this error:

gguf_merge: rubra_q4-00001-of-00006.gguf -> rubra_q4-00002-of-00006.gguf
gguf_merge: reading metadata rubra_q4-00001-of-00006.gguf done
gguf_merge: reading metadata rubra_q4-00002-of-00006.gguf ...gguf_init_from_file: invalid magic characters 'U'

gguf_merge:  failed to load input GGUF from rubra_q4-00001-of-00006.gguf

Here is how I split the model; that step worked:

./llama-gguf-split --split ./rubra-meta-llama-3-70b-instruct.Q4_K_M.gguf ./rubra_q4
n_split: 6
split 00001: n_tensors = 128, total_size = 8030M
split 00002: n_tensors = 128, total_size = 7326M
split 00003: n_tensors = 128, total_size = 7193M
split 00004: n_tensors = 128, total_size = 7044M
split 00005: n_tensors = 128, total_size = 7167M
split 00006: n_tensors = 83, total_size = 5758M
Writing file ./rubra_q4-00001-of-00006.gguf ... done
Writing file ./rubra_q4-00002-of-00006.gguf ... done
Writing file ./rubra_q4-00003-of-00006.gguf ... done
Writing file ./rubra_q4-00004-of-00006.gguf ... done
Writing file ./rubra_q4-00005-of-00006.gguf ... done
Writing file ./rubra_q4-00006-of-00006.gguf ... done
gguf_split: 6 gguf split written with a total of 723 tensors.

Am I missing something here?

Name and Version

./llama-cli --version
version: 3285 (a27152b6) built with cc (Debian 12.2.0-14) 12.2.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

tybalex avatar Jul 02 '24 23:07 tybalex

Same here with "Qwen/Qwen2-72B-Instruct-GGUF": merging "qwen2-72b-instruct-q8_0-00001-of-00002.gguf" and "qwen2-72b-instruct-q8_0-00002-of-00002.gguf" fails the same way. I tried a few times; the error message says "gguf_init_from_file: invalid magic characters '?'", and the second split file shrinks to 0 GB after the merge operation.
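A quick way to check whether a split still has a valid header (my own helper sketch, not part of llama.cpp; every valid GGUF file begins with the 4-byte ASCII magic `GGUF`, which is what "invalid magic characters" is complaining about, and a file truncated to 0 bytes necessarily fails the check):

```python
GGUF_MAGIC = b"GGUF"  # the 4 bytes every valid GGUF file starts with

def has_gguf_magic(path: str) -> bool:
    """Return True if the file begins with the GGUF magic bytes."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == GGUF_MAGIC
    except OSError:
        # unreadable or missing file counts as invalid
        return False
```

Running this over each split before merging shows which file was clobbered.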

harrychih avatar Jul 17 '24 08:07 harrychih

I have the same issue as well, and the merge command also resets the 00002 file to 0 bytes. This forces me to re-download the second file, which is quite annoying.

bss03arg avatar Jul 23 '24 06:07 bss03arg

MacOS Ultra-m2 192GB machine

~/LLAMA/llama.cpp/llama-gguf-split --merge gemma-2-27b-it.BF16/gemma-2-27b-it.BF16-0000* gemma-2-27b-it.BF16.gguf
gguf_merge: gemma-2-27b-it.BF16/gemma-2-27b-it.BF16-00001-of-00003.gguf -> gemma-2-27b-it.BF16/gemma-2-27b-it.BF16-00002-of-00003.gguf
gguf_merge: reading metadata gemma-2-27b-it.BF16/gemma-2-27b-it.BF16-00001-of-00003.gguf done
gguf_merge: reading metadata gemma-2-27b-it.BF16/gemma-2-27b-it.BF16-00002-of-00003.gguf ...gguf_init_from_file: invalid magic characters ''

gguf_merge: failed to load input GGUF from gemma-2-27b-it.BF16/gemma-2-27b-it.BF16-00001-of-00003.gguf

~/LLAMA/llama.cpp/llama-gguf-split --version
version: 3441 (081fe431) built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.4.0

ls -l
total 59957248
-rw-r--r--  1 ccui  staff  23883518720  7 23 12:13 gemma-2-27b-it.BF16-00001-of-00003.gguf
-rw-r--r--  1 ccui  staff            0  7 23 14:14 gemma-2-27b-it.BF16-00002-of-00003.gguf
-rw-r--r--  1 ccui  staff   6795218944  7 23 14:08 gemma-2-27b-it.BF16-00003-of-00003.gguf

bss03arg avatar Jul 23 '24 06:07 bss03arg

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Sep 07 '24 01:09 github-actions[bot]

This bug is still there. With version b3678 I get the following:

(python3.12_env) ccui@Mac-Studio bk % ls -l
total 146435664
-rw-r--r--  1 ccui  staff  36831830368  9  3 10:42 Hermes-3-Llama-3.1-70B-Q8_0-00001-of-00003.gguf
-rw-r--r--  1 ccui  staff  36946630080  9  3 11:26 Hermes-3-Llama-3.1-70B-Q8_0-00002-of-00003.gguf
-rw-r--r--  1 ccui  staff   1196589408  9  3 09:10 Hermes-3-Llama-3.1-70B-Q8_0-00003-of-00003.gguf
(python3.12_env) ccui@Mac-Studio bk % /Users/ccui/LLAMA/llama.cpp-bin/bin/llama-gguf-split --merge Hermes-3-Llama-3.1-70B-Q8_0-0000* Hermes-3-Llama-3.1-70B-Q8_0.gguf
gguf_merge: Hermes-3-Llama-3.1-70B-Q8_0-00001-of-00003.gguf -> Hermes-3-Llama-3.1-70B-Q8_0-00002-of-00003.gguf
gguf_merge: reading metadata Hermes-3-Llama-3.1-70B-Q8_0-00001-of-00003.gguf done
gguf_merge: reading metadata Hermes-3-Llama-3.1-70B-Q8_0-00002-of-00003.gguf ...gguf_init_from_file: invalid magic characters ''

gguf_merge: failed to load input GGUF from Hermes-3-Llama-3.1-70B-Q8_0-00001-of-00003.gguf

(python3.12_env) ccui@inateckdeMac-Studio bk % /Users/ccui/LLAMA/llama.cpp-bin/bin/llama-gguf-split --version
version: 3678 (9b2c24c0) built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.6.0

(python3.12_env) ccui@Mac-Studio bk % ls -l
total 74274272
-rw-r--r--  1 ccui  staff  36831830368  9  3 10:42 Hermes-3-Llama-3.1-70B-Q8_0-00001-of-00003.gguf
-rw-r--r--  1 ccui  staff            0  9  7 13:49 Hermes-3-Llama-3.1-70B-Q8_0-00002-of-00003.gguf
-rw-r--r--  1 ccui  staff   1196589408  9  3 09:10 Hermes-3-Llama-3.1-70B-Q8_0-00003-of-00003.gguf

The second file is reset to 0 bytes again.
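A plausible cause (my assumption, not confirmed upstream): `--merge` expects only two arguments, the first split as input and the output path; it discovers the remaining splits from the metadata. When a glob is passed, the shell expands it to all the split files before the tool runs, so the second split lands in the output slot and is opened for writing, i.e. truncated to 0 bytes. That would also explain why the log above reads `00001-of-00003.gguf -> 00002-of-00003.gguf`. The expansion can be demonstrated with empty placeholder files (demo filenames only, no real models touched):

```shell
# Create six dummy split files in a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch rubra_q4-0000{1..6}-of-00006.gguf
# The shell expands the glob before llama-gguf-split ever sees it:
args=(rubra_q4-0000*-of-00006.gguf rubra-q4.gguf)
echo "arguments passed: ${#args[@]}"   # 7 arguments, not the 2 the tool expects
echo "argument 2: ${args[1]}"          # the 00002 split sits where the output path belongs
```

If that is the cause, passing only the first split explicitly, e.g. `llama-gguf-split --merge Hermes-3-Llama-3.1-70B-Q8_0-00001-of-00003.gguf Hermes-3-Llama-3.1-70B-Q8_0.gguf`, should avoid clobbering the 00002 file.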

bss03arg avatar Sep 07 '24 05:09 bss03arg