electra
Multi-GPU training
Hi Kevin,
Thanks for the great work and for releasing the code and models. Have you tried multi-GPU training for ELECTRA-base and ELECTRA-large, and does the current code support it? If so, do you have any stats from multi-GPU experiments?
Also, do you have stats for single-GPU training of ELECTRA-base and ELECTRA-large (how many days are needed until they converge to decent performance)?
Thanks! -Hamid
I've tried the current starter code for pretraining a small network. It seems the model is trained on a single GPU.
Yes, it works on a single GPU. I'm looking for multi-GPU support. @clarkkev
Gentle follow-up, Kevin: any thoughts?
Thanks, -Hamid
Are there any plans for multi-GPU support? @clarkkev
Hoping for multi-GPU support along with a detailed configuration. @clarkkev
@008karan @Palang2014 How are you getting the model to run on a GPU? Even with a GPU available, I'm only able to run on CPU. Mostly interested in running the fine-tuning, not the pre-training.
I had the same issue and realised it was because the program could not find the CUDA libraries. Check whether you get messages like "Successfully opened dynamic library libcudnn.so.7". If you don't, or if you see errors saying it couldn't find some CUDA libraries, you may have the same problem.
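If it helps, here is a rough way to run the same check outside of the training script. It is only a sketch using Python's standard `ctypes` module, not anything from the ELECTRA repo: it asks the dynamic loader to open each library by name and reports whether that worked. The helper names (`can_load`, `report`) are made up for this example, and apart from `libcudnn.so.7` (taken from the log message above) the library names are assumptions, so swap in whatever names your TensorFlow build prints in its logs.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False


# Assumed names: libcudnn.so.7 comes from the log message quoted above;
# libcudart.so.10.0 is a guess and depends on your CUDA version.
CANDIDATE_LIBRARIES = ["libcudart.so.10.0", "libcudnn.so.7"]


def report(library_names=CANDIDATE_LIBRARIES):
    """Map each library name to whether it could be loaded."""
    return {name: can_load(name) for name in library_names}


if __name__ == "__main__":
    for name, found in report().items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

If a library shows up as not found on Linux, a likely cause is that its directory is not on the loader's search path (for example, it is missing from `LD_LIBRARY_PATH`), though the exact fix depends on how CUDA was installed.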