bf16 with A100 GPUs
A100 GPUs should support bf16 unless I am mistaken. I see it's currently only supported with TPUs. Are there any plans to support this?
Any progress on this? My initial thought was to just pass the appropriate data type in the call to torch.cuda.amp.autocast in the fairseq_task.py file. I'm not sure whether it would require changes anywhere else; I presume the underlying PyTorch should handle the rest. Any pointers on that?
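To illustrate the idea, here is a minimal sketch (not fairseq's actual code; the function names `autocast_dtype` and `train_step` are made up for illustration) of selecting bf16 as the autocast dtype when the GPU supports it. `torch.cuda.amp.autocast` accepts a `dtype` argument in PyTorch >= 1.10, and `torch.cuda.is_bf16_supported()` reports bf16 capability, which is true on Ampere GPUs such as the A100:

```python
import torch

def autocast_dtype():
    # Prefer bf16 where the hardware supports it (e.g. A100),
    # otherwise fall back to fp16.
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float16

def train_step(model, sample, optimizer):
    # Run the forward pass under mixed precision with the chosen dtype.
    with torch.cuda.amp.autocast(dtype=autocast_dtype()):
        loss = model(sample)
    # bf16 shares fp32's exponent range, so loss scaling (GradScaler)
    # is generally unnecessary, unlike fp16 mixed precision.
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

One caveat with this sketch: fairseq's fp16 path also manages loss scaling and flat-parameter optimizers, so a real change would likely need to touch those code paths as well, not just the autocast call.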