RecBole
Implementing multi-GPU Training for RecBole
Hi there,
I am a fan of the RecBole framework. Given the complexity of the framework, I provide an easy but feasible method to achieve multi-GPU training.
The core idea is to re-wrap RecBole's internal Interaction data type into a PyTorch DataLoader object. For more details, please check my pull request branch "fix_multi_gpus".
Note that this is just one promising way to realize multi-GPU training; I hope it inspires you to come up with an even better approach.
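The re-wrapping idea above can be sketched roughly as follows, using a simplified stand-in for RecBole's Interaction (a plain dict of equal-length feature tensors). The class `InteractionDataset` and helper `interaction_to_dataloader` are hypothetical names for illustration, not part of RecBole:

```python
# A minimal sketch, assuming Interaction behaves like a dict of
# equal-length tensors (one row per user-item interaction).
import torch
from torch.utils.data import DataLoader, Dataset


class InteractionDataset(Dataset):
    """Expose a dict of feature tensors as a map-style PyTorch Dataset."""

    def __init__(self, interaction: dict):
        self.interaction = interaction
        # All feature tensors share the same first dimension (number of rows).
        self.length = len(next(iter(interaction.values())))

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # Return one interaction row as a dict of scalar tensors.
        return {k: v[idx] for k, v in self.interaction.items()}


def interaction_to_dataloader(interaction, batch_size=4, distributed=False):
    """Wrap an Interaction-like dict in a DataLoader (hypothetical helper)."""
    dataset = InteractionDataset(interaction)
    sampler = None
    if distributed:
        # Each spawned process then sees a disjoint shard of the data;
        # this requires torch.distributed.init_process_group to be set up.
        sampler = torch.utils.data.distributed.DistributedSampler(dataset)
    return DataLoader(dataset, batch_size=batch_size,
                      shuffle=(sampler is None), sampler=sampler)


if __name__ == "__main__":
    inter = {"user_id": torch.arange(10), "item_id": torch.arange(10, 20)}
    loader = interaction_to_dataloader(inter, batch_size=4)
    print(len(list(loader)))  # 10 rows at batch_size=4 -> 3 batches
```

With `distributed=True`, the default collate function batches each feature key into a tensor of shape `(batch_size,)`, so downstream model code can keep indexing by feature name.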
To use multi-GPU training with a model (e.g. BERT4Rec), you just need to:

- Set `multi_gpus: True` in your `config.yaml` file.
- Run the following command:

```shell
$ python -m torch.distributed.launch --nproc_per_node=3 run_recbole.py --model=BERT4Rec --config_files recbole/properties/model/BERT4Rec.yaml
```
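For context, each of the processes spawned by `torch.distributed.launch` follows the standard torch.distributed recipe: initialize a process group, then wrap the model in DistributedDataParallel. The sketch below is a generic illustration of that recipe, not RecBole's actual internals; it uses a single CPU process with the gloo backend so it runs without GPUs, and `init_and_wrap` is a hypothetical helper name:

```python
# Generic sketch of the per-process setup behind the launch command,
# assuming one CPU process and the gloo backend (no GPUs required).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def init_and_wrap(model):
    """Initialize the process group (if needed) and wrap the model in DDP."""
    # torch.distributed.launch sets RANK/WORLD_SIZE/LOCAL_RANK per process;
    # we supply defaults so this sketch also runs as a single process.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29511")
    rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    if not dist.is_initialized():
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
    # With CUDA you would move the model to its local_rank device and
    # pass device_ids=[local_rank] to DDP.
    return DDP(model)


if __name__ == "__main__":
    model = init_and_wrap(torch.nn.Linear(8, 2))
    out = model(torch.randn(3, 8))
    print(out.shape)  # torch.Size([3, 2])
    dist.destroy_process_group()
```

Gradients are then all-reduced across processes during `backward()`, which is what makes the per-process data shards train a single consistent model.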
Best Regards, John
@juyongjiang Hi, thanks for your PR; we will check it carefully.
@juyongjiang Cool!
Hello, I used your method to implement multi-GPU training on KGAT, but after setting `multi_gpus: True`, the parameter doesn't seem to take effect (it isn't printed in the log). Is there any other setting I might have missed?