lit-llama
Adapter finetuning does not run on two cards (A100 40G)
I am trying to finetune on two cards, but adapter.py hangs after printing the output below. I waited a long time (about an hour, went for food) and nothing more happened. What could be the reason?
from jsonargparse.cli import CLI
[2023-10-02 11:27:18,651] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
initializing deepspeed distributed: GLOBAL_RANK: 1, MEMBER: 2/2
Enabling DeepSpeed BF16. Model parameters and inputs will be cast to bfloat16
[rank: 0] Seed set to 1337
[rank: 1] Seed set to 1338
Number of trainable parameters: 1229760
Number of trainable parameters: 1229760
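In case it helps narrow things down, here is a minimal two-GPU NCCL sanity check, just a sketch of my own (the filename and torchrun launch line are not from lit-llama), to see whether the two A100s can communicate at all outside of adapter.py:

```python
# nccl_check.py -- hypothetical filename, minimal two-GPU communication test.
# Launch with: torchrun --nproc_per_node=2 nccl_check.py
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK/WORLD_SIZE/MASTER_ADDR env vars, so env:// init works.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)
    # Each rank contributes (rank + 1); after all_reduce both ranks should print 3.0.
    t = torch.tensor([float(rank + 1)], device="cuda")
    dist.all_reduce(t)
    print(f"rank {rank}: all_reduce result = {t.item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

If a test like this also hangs, that would point at NCCL/P2P communication between the two cards rather than at the adapter finetuning script itself.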