airllm
Configure the chunk split size
Mac M1 Max 32GB user here, without the ability to quantize with bitsandbytes.

Is there a way to configure the chunk split size so that inference is quicker? I think the 32GB of memory is not being used efficiently.
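For reference, a minimal loading sketch assuming airllm's standard `AutoModel.from_pretrained` entry point. I haven't found a documented chunk/split-size argument, so the `chunk_size` kwarg below is purely hypothetical and only illustrates the kind of knob I'm asking about:

```python
# Minimal airllm sketch on Apple Silicon.
# NOTE: `chunk_size` is hypothetical -- it shows the kind of parameter
# I am asking about; it is not (to my knowledge) a real airllm option.
from airllm import AutoModel

model = AutoModel.from_pretrained(
    "garage-bAInd/Platypus2-70B-instruct",   # any large HF model
    # compression='4bit',  # not usable here: requires bitsandbytes (CUDA-only)
    # chunk_size=...,      # hypothetical: larger per-layer chunks to fill the 32GB
)

input_tokens = model.tokenizer(
    ["What is the capital of the United States?"],
    return_tensors="pt",
    truncation=True,
    max_length=128,
)

output = model.generate(
    input_tokens["input_ids"],
    max_new_tokens=20,
    use_cache=True,
)
print(model.tokenizer.decode(output[0]))
```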