llama.cpp
Encountered a warning when using convert.py to create a q8_0 quant of mergekit output
The copy of convert.py I used was pulled on March 18. (I'm not sure how to get the exact version.)
This warning cropped up when quantizing the output of a frankenmerge of two 7B models produced by mergekit:
llama.cpp\convert.py:99: RuntimeWarning: invalid value encountered in divide
  qs = (blocks / d[:, None]).round()
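For anyone trying to reproduce this without the full model, here is a minimal NumPy sketch of how a division like the one in the warning can emit "invalid value encountered in divide". It is not the actual convert.py code: the block size of 32, the scale computation, and the helper names `scale_and_divide`, `diagnose_blocks`, and `divide_warnings` are my assumptions for illustration. The idea is that a block containing only zeros gets a scale of 0, so the division becomes 0/0 (and a block containing inf would give inf/inf), both of which NumPy reports as an invalid value.

```python
import warnings
import numpy as np

QK8_0 = 32  # assumed q8_0 block size; not taken from the report


def scale_and_divide(blocks):
    """Hypothetical stand-in for the quoted line: per-block scale, then divide."""
    d = np.abs(blocks).max(axis=1) / np.float32(127)
    qs = (blocks / d[:, None]).round()  # 0/0 or inf/inf yields NaN here
    return d, qs


def diagnose_blocks(tensor):
    """Count blocks that would make the division above produce NaN."""
    blocks = np.asarray(tensor, dtype=np.float32).reshape(-1, QK8_0)
    all_zero = int((np.abs(blocks).max(axis=1) == 0).sum())
    non_finite = int((~np.isfinite(blocks)).any(axis=1).sum())
    return {"all_zero_blocks": all_zero, "non_finite_blocks": non_finite}


def divide_warnings(blocks):
    """Run the division and return the text of any warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        scale_and_divide(blocks)
    return [str(w.message) for w in caught]


if __name__ == "__main__":
    # One ordinary block followed by one all-zero block.
    demo = np.concatenate([
        np.linspace(-1, 1, QK8_0, dtype=np.float32),
        np.zeros(QK8_0, dtype=np.float32),
    ]).reshape(-1, QK8_0)
    print(diagnose_blocks(demo))
    print(divide_warnings(demo))
```

If the merged tensors contain all-zero blocks, the warning may be harmless as long as the converter zeroes those blocks afterwards; non-finite values in the weights would be a more serious sign. Running something like `diagnose_blocks` over each tensor would show which case applies.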
The following YAML configuration was used to produce this model using mergekit:
slices:
  - sources:
      - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
        layer_range: [0, 8]
  - sources:
      - model: grimjim/kukulemon-7B
        layer_range: [4, 12]
  - sources:
      - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
        layer_range: [9, 16]
  - sources:
      - model: grimjim/kukulemon-7B
        layer_range: [13, 20]
  - sources:
      - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
        layer_range: [17, 24]
  - sources:
      - model: grimjim/kukulemon-7B
        layer_range: [21, 28]
  - sources:
      - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
        layer_range: [25, 32]
merge_method: passthrough
dtype: float16
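As a sanity check on the configuration above, the size of the merged model can be worked out from the slices. This is a small sketch that assumes each `layer_range` is half-open (start included, end excluded), which is my understanding of mergekit's convention rather than something verified here; the `SLICES` list and the `merged_layer_count` helper are hypothetical names used only for this illustration.

```python
# (source model, start, end) for each slice, copied from the YAML config.
SLICES = [
    ("SanjiWatsuki/Kunoichi-DPO-v2-7B", 0, 8),
    ("grimjim/kukulemon-7B", 4, 12),
    ("SanjiWatsuki/Kunoichi-DPO-v2-7B", 9, 16),
    ("grimjim/kukulemon-7B", 13, 20),
    ("SanjiWatsuki/Kunoichi-DPO-v2-7B", 17, 24),
    ("grimjim/kukulemon-7B", 21, 28),
    ("SanjiWatsuki/Kunoichi-DPO-v2-7B", 25, 32),
]


def merged_layer_count(slices):
    """Sum slice lengths, assuming half-open [start, end) ranges."""
    return sum(end - start for _, start, end in slices)


if __name__ == "__main__":
    print(merged_layer_count(SLICES))  # 8 + 8 + 7 + 7 + 7 + 7 + 7 = 51
```

Under that assumption the passthrough merge stacks 51 layers, compared with 32 in each source model.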