offsite-tuning
How to run distributed evaluation for the big models used in this paper?
It seems that all of the evaluation for the LLMs is done on a single GPU. Can you suggest ways to run distributed evaluation?