fix: wrong input type for raw_dtype in the GGML-to-GGUF conversion script
- [x] I have read the contributing guidelines
- Self-reported review complexity:
- [x] Low
- [ ] Medium
- [ ] High
A wrong data type was being passed from add_tensor and add_tensor_info to this function, causing a second exception to be raised while the first one was still being handled. So I changed the input types to match the name raw_dtype and converted the value to a GGMLQuantizationType object afterwards, which also handles invalid arguments itself.
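For context, a minimal sketch of the behavior this relies on, assuming gguf.GGMLQuantizationType is an IntEnum (as in the gguf Python package) where Q4_0 maps to 2: constructing it from a raw integer validates the value, and constructing it from an existing member returns that member unchanged.

```python
# Minimal sketch: GGMLQuantizationType as an IntEnum validates raw values
# and is idempotent, so it is safe to convert whatever is passed in.
from gguf import GGMLQuantizationType

raw_dtype = 2                                 # raw integer, e.g. read from a GGML file
dtype = GGMLQuantizationType(raw_dtype)       # -> GGMLQuantizationType.Q4_0 (assumed mapping)
assert GGMLQuantizationType(dtype) is dtype   # converting a member again is a no-op

try:
    GGMLQuantizationType(9999)                # an out-of-range raw value
except ValueError as err:
    print(err)                                # e.g. "9999 is not a valid GGMLQuantizationType"
```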
Related issue: #8929
Thanks for finding this and fixing it. There have been many refactors lately where the old convert_llama_ggml_to_gguf.py was not tested at all (mostly because I don't have old GGML models around to test with).
But I think the type of the raw_dtype parameter of GGUFWriter.add_tensor and GGUFWriter.add_tensor_info should stay GGMLQuantizationType (because it's easier to figure out how to use that parameter when its type is more specific than int), and that convert_llama_ggml_to_gguf.py should be fixed instead.
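For illustration only, a hedged sketch of the annotation style being argued for (this is not the real GGUFWriter signature; the parameter list is abbreviated):

```python
from typing import Sequence
from gguf import GGMLQuantizationType

# Hedged sketch, not the actual GGUFWriter API: an enum-typed raw_dtype
# tells callers exactly which values are valid, whereas a plain
# `raw_dtype: int` would give no such hint.
def add_tensor_info(name: str, shape: Sequence[int],
                    raw_dtype: GGMLQuantizationType | None = None) -> None:
    ...

# A caller then passes a named member instead of a magic number:
add_tensor_info("blk.0.attn_q.weight", (4096, 4096),
                raw_dtype=GGMLQuantizationType.Q4_0)
```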
Would this work?
```diff
diff --git a/convert_llama_ggml_to_gguf.py b/convert_llama_ggml_to_gguf.py
index 7b00b439..701df869 100755
--- a/convert_llama_ggml_to_gguf.py
+++ b/convert_llama_ggml_to_gguf.py
@@ -116,7 +116,7 @@ class Tensor:
         assert quant is not None, 'Unknown tensor type'
         (blksize, tysize) = quant
         offset += 12
-        self.dtype = dtype
+        self.dtype = gguf.GGMLQuantizationType(dtype)
         self.dims = struct.unpack(f'<{n_dims}I', data[offset:offset + (4 * n_dims)])
         offset += 4 * n_dims
         self.name = bytes(data[offset:offset + name_len])
```
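To see why the one-line change is sufficient, here is a hedged sketch of the surrounding logic in Tensor.load: the dtype arrives as a raw uint32 unpacked from the file, and wrapping it in the enum at that point both validates it and hands later GGUFWriter calls the type they expect. (The three-field layout mirrors the struct.unpack calls visible in the script; the values below are fabricated for the example.)

```python
import struct
import gguf

# Hedged sketch of the relevant part of Tensor.load: the tensor header packs
# n_dims, name_len and dtype as three little-endian uint32 values.
data = struct.pack('<3I', 2, 18, 2)                 # fabricated example header
n_dims, name_len, dtype = struct.unpack('<3I', data[:12])

# The fix: convert the raw integer once, at the point where it is read.
dtype = gguf.GGMLQuantizationType(dtype)            # ValueError on unknown codes
print(dtype)                                        # GGMLQuantizationType.Q4_0 (assumed)
```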
@compilade
You're right, using GGMLQuantizationType is more readable, and the conversion is idempotent. I tested your suggested fix (created a patch and applied it), and it resolves the issue. I adopted your fix and committed the changes again.
There was another problem with this script too; I'll create an issue for it soon. I'm trying to fix it, but I need to get to know the GGML and GGUF formats better first.
Thanks for your review.
Is the failed CI check required for merging this PR, and do I need to do anything about it? It does not seem to be related to this PR.
> Is the failed CI check required for merging this PR, and do I need to do anything about it? It does not seem to be related to this PR.
You don't need to do anything about it; a fix is pending in #8982, and the source of the problem was identified in https://github.com/ggerganov/llama.cpp/pull/7599#discussion_r1712901202.
@compilade @ggerganov
Since this PR has been approved and labeled as merge-ready but has had no activity in the past 5 days, I'm mentioning you in the comments to prevent it from becoming stale and to prompt some action on it.
Thanks for the reminder!