
force int cast

Open ostix360 opened this issue 1 year ago • 7 comments

.0 in the config file for the lora_alpha param and I got this error

fout.write(struct.pack("ii", int(params["r"]), params["lora_alpha"]))
struct.error: required argument is not an integer

I just cast
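For reference, a minimal reproduction of the error and of the cast. The values for `r` and `lora_alpha` are made up for illustration; only the `struct.pack` line comes from the conversion script.

```python
import struct

# Hypothetical config values; the ".0" makes lora_alpha a Python float.
params = {"r": 8, "lora_alpha": 16.0}

try:
    # Same call as in the traceback: the "i" format code rejects floats.
    struct.pack("ii", int(params["r"]), params["lora_alpha"])
    raised = False
except struct.error as exc:
    raised = True
    print(f"struct.error: {exc}")

# With the cast applied to both fields, packing succeeds.
header = struct.pack("ii", int(params["r"]), int(params["lora_alpha"]))
```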

ostix360 avatar Apr 25 '23 07:04 ostix360

Shouldn't we fix the config file rather than silencing this bug?

prusnak avatar Apr 25 '23 14:04 prusnak

I got this error just by downloading the LoRA config from Hugging Face, so I think I'm not the only one with a .0 for the lora_alpha param. And the patch costs nothing in terms of performance.

ostix360 avatar Apr 25 '23 14:04 ostix360

The mistake here seems to be assuming that lora_alpha is an integer, if it is actually a float. In that case, it should be exported as a float as well. At the very least, we need to check that int(lora_alpha) == lora_alpha.
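A rough sketch of that check, assuming a helper named `lora_alpha_as_int` (a hypothetical name, not something in the conversion script): it accepts values such as 16.0 and refuses values that would lose information when cast.

```python
def lora_alpha_as_int(value):
    """Return value as an int, refusing values with a fractional part.

    Hypothetical helper for illustration only.
    """
    as_int = int(value)
    if as_int != value:
        # A plain int() cast would silently turn 16.5 into 16.
        raise ValueError(f"lora_alpha must be integer-valued, got {value!r}")
    return as_int
```

With this, `lora_alpha_as_int(16.0)` returns `16`, while `lora_alpha_as_int(16.5)` raises `ValueError` instead of truncating.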

slaren avatar Apr 25 '23 14:04 slaren

Indeed, lora_alpha can be a float value, but if the equality int(lora_alpha) == lora_alpha does not hold, what are we supposed to write in the file? In most cases lora_alpha is an integer, but if it's not and we write it as a float value in the ggml file, does that change anything when it is injected into the model we run for inference?

ostix360 avatar Apr 25 '23 15:04 ostix360

The documentation here says that lora_alpha is an int: https://opendelta.readthedocs.io/en/latest/modules/deltas.html

I think there are two ways how to proceed:

  1. use this PR, but add assert int(lora_alpha) == lora_alpha to the conversion script
  2. change lora_alpha from int to double in both convert-lora-to-ggml.py and llama.cpp

Option 2 has two subvariants:

  a. do the change and do not bump the version
  b. do the change and bump the version
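To make option 2 concrete, here is a sketch of what writing lora_alpha as a double could look like on the Python side. The `"<id"` format string and the sample values are assumptions for illustration. The existing script uses the native `"ii"` layout, and the reader in llama.cpp would need a matching change, which is why the version bump question comes up.

```python
import struct

r, lora_alpha = 8, 16.5  # hypothetical values

# "<" selects little-endian standard sizes with no padding:
# a 4-byte int followed by an 8-byte double, 12 bytes in total.
header = struct.pack("<id", r, lora_alpha)

# Round trip to confirm the fractional alpha survives.
r_read, alpha_read = struct.unpack("<id", header)
```

Without the `<` prefix, `struct` uses native alignment and may insert padding between the int and the double, so the byte layout would depend on the platform.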

Which way do you want to go @slaren? 1, 2a or 2b?

prusnak avatar Apr 25 '23 15:04 prusnak

If the documentation says that it is an int, just go with option 1 as a sanity check and be done with it.

slaren avatar Apr 25 '23 15:04 slaren

Note to self: don't try to write valid code as a GitHub suggestion on a phone 😅

prusnak avatar Apr 25 '23 17:04 prusnak