Support BFloat16?

Describe the issue

When will the tensor type BFloat16 be supported?

To reproduce

Urgency

No response

Platform

Linux

OS Version

Ubuntu 18.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.12.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

ildoonet avatar Sep 18 '22 15:09 ildoonet

What API do you have in mind? You can use BFloat16 with the C/C++/C# APIs.

yuslepukhin avatar Sep 19 '22 19:09 yuslepukhin

Some operators support bfloat16. To find out which, search for "bfloat16" in https://github.com/microsoft/onnxruntime/blob/454f77cd94901cb95c92b20c60565408b2be045c/docs/OperatorKernels.md

@ildoonet, which operators do you want bfloat16 support added to (or let us know which model)?

tianleiwu avatar Sep 24 '22 23:09 tianleiwu
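
For reference, a quick way to filter that kernel table from Python (a small sketch; it assumes the usual raw.githubusercontent.com mirror of the file at that commit):

import urllib.request

# Raw view of OperatorKernels.md at the commit linked above (assumed URL).
url = ("https://raw.githubusercontent.com/microsoft/onnxruntime/"
       "454f77cd94901cb95c92b20c60565408b2be045c/docs/OperatorKernels.md")

with urllib.request.urlopen(url) as resp:
    text = resp.read().decode("utf-8")

# Print every kernel-table row that mentions bfloat16.
for line in text.splitlines():
    if "bfloat16" in line.lower():
        print(line)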

@yuslepukhin I'm using Python.

@tianleiwu How can I use an input tensor as bfloat16? I can't find any documentation, since numpy has no bfloat16 type.

ildoonet avatar Sep 28 '22 04:09 ildoonet

@ildoonet First, use torch to generate a bfloat16 input: https://pytorch.org/docs/stable/generated/torch.Tensor.bfloat16.html

Then use IO Binding (search for "PyTorch tensor" in the following document): https://onnxruntime.ai/docs/api/python/api_summary.html

tianleiwu avatar Sep 30 '22 00:09 tianleiwu
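
For reference, the general bind-a-PyTorch-tensor pattern from that API summary looks roughly like this (a minimal sketch using float16, which numpy does support; the model file and tensor names here are illustrative, and whether the pattern extends to bfloat16 is exactly what the rest of the thread works through):

import numpy as np
import torch
import onnxruntime

# Assumes "model_fp16.onnx" has a float16 input named 'x' (illustrative).
sess = onnxruntime.InferenceSession("model_fp16.onnx",
                                    providers=["CUDAExecutionProvider"])
x = torch.randn((1, 3, 10, 10), dtype=torch.float16, device="cuda").contiguous()

binding = sess.io_binding()
binding.bind_input(
    name="x",
    device_type="cuda",
    device_id=0,
    element_type=np.float16,   # a numpy dtype is expected here
    shape=tuple(x.shape),
    buffer_ptr=x.data_ptr(),
)
binding.bind_output("y")       # let ORT allocate the output
sess.run_with_iobinding(binding)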

@tianleiwu

What element_type are you specifying in bind_input? It appears that a valid numpy dtype is expected; however, numpy does not currently support bfloat16, to my knowledge.

https://github.com/microsoft/onnxruntime/blob/0c6037b5abe571fc43a55ef7a9d2f846820fbe5d/onnxruntime/python/onnxruntime_pybind_iobinding.cc#L67

gaziqbal avatar Oct 25 '22 18:10 gaziqbal
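
That limitation is easy to confirm (a tiny sketch):

import numpy as np

np.dtype("float16")  # fine: numpy ships a float16 dtype
try:
    np.dtype("bfloat16")
except TypeError as err:
    print(err)  # "data type 'bfloat16' not understood"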

@ildoonet First, use torch to generate a bfloat16 input: https://pytorch.org/docs/stable/generated/torch.Tensor.bfloat16.html

Then use IO Binding (search for "PyTorch tensor" in the following document): https://onnxruntime.ai/docs/api/python/api_summary.html

Following your suggestion, I wrote the toy code below:

import torch
import torch.nn as nn

import onnx
import onnxruntime

import numpy as np

class CNN(nn.Module):
  def __init__(self):
    super(CNN, self).__init__()
    self.relu = nn.ReLU()

  def forward(self, a):
    return self.relu(torch.add(a, a))

device = torch.device('cuda')
model = CNN()
model.to(device)
model.eval()
a = torch.randn((1,3,10,10), dtype=torch.bfloat16)
model_name = "model_bf16.onnx"
torch.onnx.export(model,
                  a,
                  model_name,
                  export_params=True,
                  opset_version=17,
                  do_constant_folding=True,
                  input_names=['ina'],
                  output_names=['outc'])

sess = onnxruntime.InferenceSession(model_name, providers=["CUDAExecutionProvider"])
binding = sess.io_binding()
a_tensor = a.to(device).contiguous()  # move to CUDA: the bound buffer must live on the declared device

binding.bind_input(
    name='ina',
    device_type='cuda',
    device_id=0,
    element_type=np.float32,  # also tried torch.bfloat16; both fail (errors below)
    shape=tuple(a_tensor.shape),
    buffer_ptr=a_tensor.data_ptr(),
    )

sess.run_with_iobinding(binding)

If I assign element_type=torch.bfloat16 in bind_input, the following error occurs (error screenshot attached). If I assign element_type=np.float32, I get an error again (error screenshot attached).

Test environment:

cudatoolkit               11.3.1               h2bc3f7f_2  
cudnn                     8.2.1                cuda11.3_0  
numpy                     1.21.6                   pypi_0    pypi
onnx                      1.12.0                   pypi_0    pypi
onnxruntime-gpu           1.13.1                   pypi_0    pypi
python                    3.7.15               haa1d7c7_0  
torch                     1.13.0                   pypi_0    pypi

So I am just wondering whether there is a working example with bfloat16 as input.

linbaiwpi avatar Nov 15 '22 18:11 linbaiwpi

BFloat16 support is not present in the ONNX Runtime Python API, and you certainly cannot lie to the computer about the type; otherwise, it will get its revenge.

yuslepukhin avatar Nov 15 '22 19:11 yuslepukhin
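
To make the size mismatch behind that warning concrete: bfloat16 is 2 bytes per element while float32 is 4, so declaring a bfloat16 buffer as np.float32 asks the consumer to read twice the bytes the tensor owns (a small illustration):

import torch

a = torch.randn((1, 3, 10, 10), dtype=torch.bfloat16)
print(a.element_size())              # 2 bytes per bfloat16 element
print(a.numel() * a.element_size())  # 600 bytes actually allocated
# Declaring this buffer as np.float32 would make the runtime read
# 4 * 300 = 1200 bytes, twice what the tensor owns.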

I see that the issue is closed, but from the comments it is not clear whether the BFloat16 data type is supported natively in ONNX Runtime as of March 2024. Can anybody clarify?

kartikpodugu avatar Mar 06 '24 03:03 kartikpodugu

For the Python API, bfloat16 is not supported in I/O binding (you can use bfloat16 internally in the ONNX graph, but not in graph inputs/outputs).

@kartikpodugu, you can try the C/C++ API, since it is not limited by numpy.

tianleiwu avatar Mar 06 '24 22:03 tianleiwu
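
In practice, then, a Python-only workaround is to cast at the boundary: keep bfloat16 inside the graph where the kernels support it, but expose float32 (or float16) inputs and outputs. A minimal sketch along the lines of the toy code above (the model file here is assumed to have been exported with float32 I/O):

import numpy as np
import torch
import onnxruntime

# Data lives in bfloat16 on the PyTorch side, but ORT is handed a
# float32 copy, since the Python API cannot describe bfloat16 I/O.
a_bf16 = torch.randn((1, 3, 10, 10), dtype=torch.bfloat16, device="cuda")
a_f32 = a_bf16.float().contiguous()  # explicit cast at the boundary

sess = onnxruntime.InferenceSession("model_f32.onnx",
                                    providers=["CUDAExecutionProvider"])
binding = sess.io_binding()
binding.bind_input(
    name="ina",
    device_type="cuda",
    device_id=0,
    element_type=np.float32,  # now truthful: the bound buffer is float32
    shape=tuple(a_f32.shape),
    buffer_ptr=a_f32.data_ptr(),
)
binding.bind_output("outc")
sess.run_with_iobinding(binding)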