FastChat
Reduce the peak memory requirement when applying delta
We have seen many complaints that the peak memory required to apply the delta is too high. We would like to keep the peak memory of the following command below 8 GB.
python3 -m fastchat.model.apply_delta \
--base /path/to/llama-13b \
--target /output/path/to/vicuna-13b \
--delta lmsys/vicuna-13b-delta-v1.1
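To check a run against the 8 GB target, one option is to read the peak resident set size of the child process after it exits. The sketch below is only an illustration: `peak_rss_mib` is a hypothetical helper, not part of FastChat, and it relies on the Unix-only `resource` module. It reports the largest peak among all child processes waited for so far, so it is best called once from a fresh interpreter.

```python
import resource
import subprocess
import sys


def peak_rss_mib(cmd):
    """Run cmd to completion and return the peak RSS of child processes in MiB.

    Hypothetical helper for illustration; not part of FastChat.
    """
    subprocess.run(cmd, check=True)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return usage.ru_maxrss / divisor
```

Passing the command above as a list, for example `["python3", "-m", "fastchat.model.apply_delta", "--base", ...]`, would return its peak in MiB.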
I am working on reducing the peak memory; it is now 11372.2 MiB. See #402. We can improve it further.
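One general way to lower the peak is to process the weights one shard at a time instead of loading both full models. The sketch below is a toy illustration of that idea and not necessarily what #402 does: the function names are hypothetical, shards are plain pickled dicts of float lists rather than real checkpoints, and it assumes base and delta shards are aligned (same order, same tensor names in each).

```python
import gc
import os
import pickle


def save_shard(path, tensors):
    """Write one shard (a dict of name -> list of floats) to disk."""
    with open(path, "wb") as f:
        pickle.dump(tensors, f)


def load_shard(path):
    """Read one shard back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def apply_delta_sharded(base_paths, delta_paths, target_dir):
    """Add delta to base one shard at a time and return the written paths.

    Assumption: the i-th base shard and the i-th delta shard hold the
    same tensor names with the same lengths.
    """
    os.makedirs(target_dir, exist_ok=True)
    written = []
    for base_path, delta_path in zip(base_paths, delta_paths):
        base = load_shard(base_path)
        delta = load_shard(delta_path)
        for name, values in base.items():
            delta_values = delta[name]
            # Add in place so no third copy of the tensor is created.
            for i in range(len(values)):
                values[i] += delta_values[i]
        out_path = os.path.join(target_dir, os.path.basename(base_path))
        save_shard(out_path, base)
        written.append(out_path)
        # Release this shard before loading the next one.
        del base, delta
        gc.collect()
    return written
```

With this layout, only one base shard and one delta shard are in memory at any time, so the peak is bounded by roughly twice the largest shard size rather than by the size of the whole model.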
@andy-yang-1 Great work, thanks!