Anandamoy Bandyopadhyay
I ran the following code (bloom-accelerate-trainer-minimal.py) on a machine with 8 A4500 GPUs, each with 20 GB of VRAM:
```
import argparse
import os
from transformers import AdamW, get_linear_schedule_with_warmup
from datasets import load_dataset
...
```
> @muellerzr the problem is in the forward though ;-) And it should work for training as long as there is no offload. There isn't CPU offload as far as...
Did you mean `model.hf_device_map`? There is no attribute `_hf_device_map`. The output of `print_rank0(model.hf_device_map)` is simply `{'': 7}`, which is perhaps not correct. I set `device_map="balanced"` and `num_processes =...
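For anyone reading a device map like the one above: the empty-string key refers to the root module, so `{'': 7}` means the whole model sits on device 7 rather than being spread across GPUs. Here is a minimal sketch for summarizing such a map. The helper names `summarize_device_map` and `is_sharded` and the module names in `sharded_example` are hypothetical, made up for illustration; this is plain dictionary handling, not an accelerate API.

```python
from collections import defaultdict


def summarize_device_map(device_map):
    """Group module names by the device they are assigned to."""
    by_device = defaultdict(list)
    for module_name, device in device_map.items():
        # An empty module name stands for the root module, i.e. the whole model.
        by_device[device].append(module_name if module_name else "<whole model>")
    return dict(by_device)


def is_sharded(device_map):
    """Return True if the map places modules on more than one device."""
    return len(set(device_map.values())) > 1


# The map reported in this thread: everything on device 7.
single = {"": 7}

# Hypothetical map for comparison; these module names are illustrative only.
sharded_example = {
    "transformer.word_embeddings": 0,
    "transformer.h.0": 0,
    "transformer.h.1": 1,
    "lm_head": 1,
}

print(summarize_device_map(single))
print(is_sharded(single))
print(summarize_device_map(sharded_example))
print(is_sharded(sharded_example))
```

If `is_sharded` returns `False` while `device_map="balanced"` was requested on a multi-GPU machine, that suggests the process only had one device available when the model was loaded.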
> Oh the problem is quite clear then, the process only sees GPU 7. I think it all stems from the fact that you use `num_processes=2` in your accelerate config....
@harshit-777 No, I abandoned this codebase long ago. Try scripting your own confusion matrix code using the bbox coordinates from the model outputs.
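A rough sketch of what that could look like, in case it helps. It assumes boxes are `(x1, y1, x2, y2)` tuples, class labels are integers starting at 0, predictions are already sorted by confidence, and a match requires IoU of at least 0.5. None of this comes from the original codebase; the function names and the input format are assumptions, so adapt them to whatever your model actually returns.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_confusion_matrix(gt, preds, num_classes, iou_thr=0.5):
    """Build a (num_classes + 1) x (num_classes + 1) matrix.

    Rows are true classes, columns are predicted classes. The extra last
    row/column is "background": false positives land in the last row,
    missed ground-truth boxes land in the last column.
    gt and preds are lists of (box, class_id) pairs for one image.
    """
    background = num_classes
    matrix = [[0] * (num_classes + 1) for _ in range(num_classes + 1)]
    matched = set()

    # Greedy matching: each prediction takes the best unmatched ground truth.
    for p_box, p_cls in preds:
        best_iou, best_idx = 0.0, None
        for i, (g_box, _) in enumerate(gt):
            if i in matched:
                continue
            value = iou(p_box, g_box)
            if value > best_iou:
                best_iou, best_idx = value, i
        if best_idx is not None and best_iou >= iou_thr:
            matched.add(best_idx)
            matrix[gt[best_idx][1]][p_cls] += 1
        else:
            matrix[background][p_cls] += 1  # false positive

    # Ground-truth boxes that no prediction matched count as misses.
    for i, (_, g_cls) in enumerate(gt):
        if i not in matched:
            matrix[g_cls][background] += 1

    return matrix


# Toy data: one correct detection, one false positive, one missed object.
gt = [((0, 0, 10, 10), 0), ((20, 20, 30, 30), 1)]
preds = [((1, 1, 10, 10), 0), ((50, 50, 60, 60), 1)]
for row in detection_confusion_matrix(gt, preds, num_classes=2):
    print(row)
```

To cover a whole dataset, sum the per-image matrices element-wise.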