Paul Richmond
Results: 2 comments
I am also encountering this behaviour whilst trying to fine-tune Llama3-8B using QLoRA. However, in my case I'm not using DeepSpeed (at least there's no `deepspeed_config` parameter in my accelerator...
Hi @SunMarc, thanks for the quick reply! I'm running my script on an HPC cluster where I only request 2 GPUs from a node comprising 4 GPUs in total....