llama.cpp Improved quantize script

Open SuajCarrot opened this issue 1 year ago • 2 comments

I improved the quantize script by adding error handling and by allowing several models to be selected for quantization at once on the command line. I also converted it to Python to make it more general and easier to extend.

Mar 17 '23 03:03 SuajCarrot
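
For readers who have not opened the diff, the idea described above (several models passed on the command line, with an error reported when an input file is missing) can be sketched roughly as follows. This is not the script from this pull request: the function names, the `--models-dir` and `--quantize-bin` options, the file names, and the argument list passed to the quantize binary are all assumptions made for illustration. The sketch only prints the commands it would run.

```python
import argparse
import sys
from pathlib import Path


def build_quantize_commands(models, models_dir, quantize_bin):
    """Return one argv list per model; raise if an input file is missing."""
    commands = []
    for model in models:
        # Hypothetical layout: <models_dir>/<model>/<file>. File names are assumed.
        src = Path(models_dir) / model / "ggml-model-f16.bin"
        dst = Path(models_dir) / model / "ggml-model-q4_0.bin"
        if not src.is_file():
            raise FileNotFoundError(f"missing input file for model {model!r}: {src}")
        # Assumed argument order; the real binary may expect more arguments.
        commands.append([str(quantize_bin), str(src), str(dst)])
    return commands


def main(argv=None):
    parser = argparse.ArgumentParser(description="Quantize several models in one call (sketch).")
    parser.add_argument("models", nargs="+", help="model directory names, e.g. 7B 13B")
    parser.add_argument("--models-dir", default="models")
    parser.add_argument("--quantize-bin", default="./quantize")
    args = parser.parse_args(argv)

    try:
        commands = build_quantize_commands(args.models, args.models_dir, args.quantize_bin)
    except FileNotFoundError as err:
        # Report the problem and return a non-zero exit code instead of a traceback.
        print(f"error: {err}", file=sys.stderr)
        return 1

    for cmd in commands:
        # A real script would execute the command here, e.g. subprocess.run(cmd, check=True).
        print(" ".join(cmd))
    return 0
```

Calling `main(["7B", "13B"])` checks both model directories before printing anything, so one missing input stops the whole batch with a single error message. Using `pathlib` and an argv list rather than a shell string is what keeps a script like this independent of a Unix shell.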

I think it's a good idea to remove the requirement for a Unix shell. See also #285 which would be made obsolete by this, assuming the Python script works on Windows?

I suggest removing quantize.sh and updating the README to use the Python script.

Mar 19 '23 10:03 sw

Thank you for your comment @sw, I just pushed a commit that applies the changes you suggested. Let me know if there's anything else that should be done to achieve full compatibility with Windows.

Mar 19 '23 16:03 SuajCarrot

Should we merge now or wait for someone to test on Windows?

@SuajCarrot maybe keep the .sh for now and add a comment that it is deprecated. We will remove it later.

Mar 19 '23 18:03 ggerganov

Thank you for merging! Should I create another pull request with the Bash script added back as well as its deprecation notice in the README?

Mar 19 '23 19:03 SuajCarrot

It was confirmed in #285 that it works on Windows, so there is no need to do that.

Mar 19 '23 19:03 ggerganov

That's great, thank you.

Mar 19 '23 19:03 SuajCarrot
