llama.cpp

Parallel Quantize.sh, add &

Open tljstewart opened this issue 1 year ago • 8 comments

@prusnak

./quantize "$i" "${i/f16/q4_0}" 2 &

tljstewart avatar Mar 13 '23 23:03 tljstewart

The fix needs to be more elaborate: if you pass --remove-f16, the rm command is called before ./quantize has finished.

Can you come up with a solution that does not have this issue?

prusnak avatar Mar 14 '23 08:03 prusnak
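The race described above comes from backgrounding each job with & and then deleting the input right away. The sketch below shows one way to avoid it: start every job, wait for all of them, and only then remove the f16 files. It is not the code that was posted in this thread; `quantize_all` and `QUANTIZE_BIN` are hypothetical names, and the `./quantize` arguments are copied from the one-liner in the issue.

```shell
#!/usr/bin/env bash
# Sketch only: quantize_all and QUANTIZE_BIN are hypothetical names.
# QUANTIZE_BIN defaults to the ./quantize binary used in the issue.
QUANTIZE_BIN="${QUANTIZE_BIN:-./quantize}"

# quantize_all REMOVE FILE...
#   REMOVE is "yes" to delete the f16 inputs after all jobs have succeeded.
quantize_all() {
    local remove_f16="$1"; shift
    [ "$#" -gt 0 ] || return 0

    local f pid status=0
    local pids=()

    # Start one background job per input file.
    for f in "$@"; do
        "$QUANTIZE_BIN" "$f" "${f/f16/q4_0}" 2 &
        pids+=("$!")
    done

    # Block until every job has exited; remember whether any of them failed.
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1
    done

    # Delete the inputs only after all conversions have finished successfully.
    if [ "$status" -eq 0 ] && [ "$remove_f16" = "yes" ]; then
        rm -- "$@"
    fi
    return "$status"
}
```

A call might look like `quantize_all yes models/7B/ggml-model-f16.bin*`, where the path is an assumption about the local model layout. Note that all f16 files stay on disk until the last job finishes.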

This should work:

Yes, this works. But now I realise that this completely defeats the purpose of the remove flag. The flag is there to save disk space by deleting each f16 file as soon as its conversion is done, so it only makes sense when the files are processed one after another.

@ggerganov Do you think it makes sense to run the script in parallel by default and switch to serial processing when --remove-f16 is provided, or do we want a separate, orthogonal flag for parallel/serial processing?

prusnak avatar Mar 14 '23 16:03 prusnak
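The first option in the question (parallel by default, serial when the remove flag is given) could be sketched as follows. This is an illustration of the trade-off, not code from the repository; `quantize_parts` and `QUANTIZE_BIN` are hypothetical names, and the `./quantize` arguments are taken from the one-liner in the issue.

```shell
#!/usr/bin/env bash
# Sketch only: quantize_parts and QUANTIZE_BIN are hypothetical names.
QUANTIZE_BIN="${QUANTIZE_BIN:-./quantize}"

# quantize_parts REMOVE FILE...
#   REMOVE = "yes": serial, delete each f16 file right after its conversion.
#   otherwise:      parallel, keep all f16 files.
quantize_parts() {
    local remove_f16="$1"; shift
    local f pid status=0
    local pids=()

    if [ "$remove_f16" = "yes" ]; then
        # Serial path: at most one extra q4_0 file exists before its
        # f16 source is removed, which keeps peak disk usage low.
        for f in "$@"; do
            "$QUANTIZE_BIN" "$f" "${f/f16/q4_0}" 2 || return 1
            rm -- "$f"
        done
    else
        # Parallel path: start all jobs, then wait for each of them.
        for f in "$@"; do
            "$QUANTIZE_BIN" "$f" "${f/f16/q4_0}" 2 &
            pids+=("$!")
        done
        for pid in "${pids[@]}"; do
            wait "$pid" || status=1
        done
    fi
    return "$status"
}
```

The serial path stops at the first failed conversion so that an f16 file is never deleted without a finished q4_0 counterpart.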

Ah, I see what you mean: swapping disk resources.

tljstewart avatar Mar 14 '23 16:03 tljstewart

I think it is better to multi-thread the quantize.cpp program. Each tensor is divided into n parts and each of the n threads quantizes the corresponding part. This way, even when quantizing the 7B model which has only 1 part, we will utilize all available CPU resources and still gain performance.

If you agree, either reformulate this issue and add "good first issue" tag or create a new one and close this.

ggerganov avatar Mar 14 '23 19:03 ggerganov
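The actual change would live in C++ inside quantize.cpp, but the partitioning arithmetic behind "each tensor is divided into n parts" can be illustrated on its own. The sketch below splits a range of items into n contiguous, near-equal parts, one per thread. `split_range` is a hypothetical helper, and the unit of an "item" (row, block, or element) is an assumption; a real implementation would need part boundaries that do not cut through a quantization block.

```shell
#!/usr/bin/env bash
# Sketch only: split_range is a hypothetical helper, not part of llama.cpp.

# split_range TOTAL PARTS
#   Prints one line per part: "<index> <start> <end>", with <end> exclusive.
#   Integer division makes the parts contiguous and differ in size by at most 1.
split_range() {
    local total="$1" parts="$2" t
    for ((t = 0; t < parts; t++)); do
        echo "$t $(( t * total / parts )) $(( (t + 1) * total / parts ))"
    done
}

split_range 10 4
# prints:
# 0 0 2
# 1 2 5
# 2 5 7
# 3 7 10
```

Because the ranges depend only on the tensor size and the thread count, this approach also helps when a model consists of a single file, which per-file parallelism in the shell script cannot speed up.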

> I think it is better to multi-thread the quantize.cpp program.

I agree. This makes sense especially for this reason:

> This way, even when quantizing the 7B model which has only 1 part, we will utilize all available CPU resources

> If you agree, ...

ACK

FWIW, I really respect your shell skills @tljstewart 👍

prusnak avatar Mar 14 '23 20:03 prusnak

Done another way (rewritten in Python) in https://github.com/ggerganov/llama.cpp/pull/222

prusnak avatar Mar 19 '23 19:03 prusnak