Support BFloat16?
Describe the issue
When will the BFloat16 tensor type be supported?
To reproduce
Urgency
No response
Platform
Linux
OS Version
Ubuntu 18.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.12.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
No response
What API do you have in mind? You can use BFloat16 with the C/C++/C# APIs.
Some operators already support bfloat16. To find which ones, search for "bfloat16" in https://github.com/microsoft/onnxruntime/blob/454f77cd94901cb95c92b20c60565408b2be045c/docs/OperatorKernels.md
@ildoonet, which operators (or which model) do you want bfloat16 support added for?
@yuslepukhin I'm using python.
@tianleiwu How can I use an input tensor as bfloat16? I can't find any documentation, since numpy has no bfloat16 type.
@ildoonet First, use torch to generate a bfloat16 input: https://pytorch.org/docs/stable/generated/torch.Tensor.bfloat16.html
Then use I/O binding (search for "PyTorch tensor" in the following document):
https://onnxruntime.ai/docs/api/python/api_summary.html
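For concreteness, a minimal sketch of that suggestion (my own illustration, not tested code from this thread): generate the bfloat16 tensor with torch and grab the raw device pointer that I/O binding needs.

```python
import torch

# Create a bfloat16 tensor on the GPU; torch.Tensor.bfloat16() casts an
# existing tensor, and torch.randn also accepts dtype=torch.bfloat16.
a = torch.randn(1, 3, 10, 10, device="cuda").bfloat16()
print(a.dtype)       # torch.bfloat16
print(a.data_ptr())  # raw device pointer, the buffer_ptr for bind_input
```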
@tianleiwu
What element_type are you specifying in bind_input? It appears that a valid numpy dtype is expected; however, numpy does not currently support bfloat16 to my knowledge.
https://github.com/microsoft/onnxruntime/blob/0c6037b5abe571fc43a55ef7a9d2f846820fbe5d/onnxruntime/python/onnxruntime_pybind_iobinding.cc#L67
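For reference, a quick check confirms that numpy has no bfloat16 dtype:

```python
import numpy as np

# numpy has no native bfloat16 dtype, so an element_type that must be a
# valid numpy dtype cannot describe a bfloat16 buffer:
np.dtype("bfloat16")  # raises TypeError: data type 'bfloat16' not understood
```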
Following your suggestion, I wrote the toy code below:
```python
import torch
import torch.nn as nn
import onnx
import onnxruntime
import numpy as np

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.relu = nn.ReLU()

    def forward(self, a):
        return self.relu(torch.add(a, a))

device = torch.device('cuda')
model = CNN()
model.to(device)
model.eval()

a = torch.randn((1, 3, 10, 10), dtype=torch.bfloat16)
model_name = "model_bf16.onnx"
torch.onnx.export(model,
                  a,
                  model_name,
                  export_params=True,
                  opset_version=17,
                  do_constant_folding=True,
                  input_names=['ina'],
                  output_names=['outc'])

sess = onnxruntime.InferenceSession(model_name, providers=["CUDAExecutionProvider"])
binding = sess.io_binding()
a_tensor = a.contiguous()
binding.bind_input(
    name='ina',
    device_type='cuda',
    device_id=0,
    element_type=np.float32,  # torch.bfloat16
    shape=tuple(a_tensor.shape),
    buffer_ptr=a_tensor.data_ptr(),
)
sess.run_with_iobinding(binding)
```
If I assign element_type=torch.bfloat16 in bind_input, the following error occurs.
If I assign element_type=np.float32, I get an error again.

Test environment:
cudatoolkit 11.3.1 h2bc3f7f_2
cudnn 8.2.1 cuda11.3_0
numpy 1.21.6 pypi_0 pypi
onnx 1.12.0 pypi_0 pypi
onnxruntime-gpu 1.13.1 pypi_0 pypi
python 3.7.15 haa1d7c7_0
torch 1.13.0 pypi_0 pypi
So I am just wondering whether there is a working example with bfloat16 as input.
BFloat16 support is not present in the ONNX Runtime Python API, and you certainly cannot lie to the computer about the type; otherwise, it will get its revenge.
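A small illustration of why the type lie backfires (my own sketch): bfloat16 and float32 have different element sizes, so a buffer bound under the wrong type is read at the wrong stride.

```python
import torch

# bfloat16 is 2 bytes per element, float32 is 4: binding a bfloat16 buffer
# as float32 makes the runtime read twice as many bytes as the tensor owns
# and misinterpret every element's bit pattern.
print(torch.empty(1, dtype=torch.bfloat16).element_size())  # 2
print(torch.empty(1, dtype=torch.float32).element_size())   # 4
```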
I see that the issue is closed, but from the comments it is not clear whether the BFloat16 data type is supported natively in ONNX Runtime as of March 2023. Can anybody clarify?
For the Python API, bfloat16 is not supported in I/O binding (you can use bfloat16 internally in the ONNX graph, but not in graph inputs/outputs).
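Building on that, one possible workaround sketch (my own, not from this thread, and untested; it assumes the execution provider has bfloat16 kernels for the ops involved, and CastWrapper is a hypothetical name): keep the graph inputs/outputs float32 and cast to/from bfloat16 inside the model, so the exported graph uses bfloat16 only internally and plain numpy float32 arrays work at the Python boundary.

```python
import torch
import torch.nn as nn
import numpy as np
import onnxruntime

class CastWrapper(nn.Module):
    """Hypothetical wrapper: float32 at the graph boundary, bfloat16 inside."""
    def __init__(self, core):
        super().__init__()
        self.core = core

    def forward(self, a):
        # Exported as Cast(float32 -> bfloat16) -> core ops -> Cast(-> float32).
        return self.core(a.to(torch.bfloat16)).to(torch.float32)

wrapped = CastWrapper(nn.ReLU()).eval()
a = torch.randn(1, 3, 10, 10, dtype=torch.float32)
torch.onnx.export(wrapped, a, "model_bf16_internal.onnx",
                  opset_version=17, input_names=["ina"], output_names=["outc"])

# Graph I/O is float32, so ordinary numpy arrays are accepted.
sess = onnxruntime.InferenceSession("model_bf16_internal.onnx",
                                    providers=["CUDAExecutionProvider"])
out = sess.run(None, {"ina": a.numpy().astype(np.float32)})[0]
```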
@kartikpodugu, you can try the C/C++ API, since it is not limited by numpy.