Tianlei Wu
@ildoonet First, use torch to generate bfloat16 input: https://pytorch.org/docs/stable/generated/torch.Tensor.bfloat16.html Then use IO Binding (search `PyTorch tensor` in the following document): https://onnxruntime.ai/docs/api/python/api_summary.html
The error message is related to `{'enable_cuda_graph': True}`. It is an advanced feature, and it cannot be applied to every model due to limitations: https://natke.github.io/onnxruntime/docs/performance/tune-performance.html#using-cuda-graphs-in-the-cuda-ep Try removing it or update...
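One way to try the first suggestion is to filter the option out of the provider list before creating the session. This is a minimal sketch, assuming the providers are given in the usual onnxruntime form (a name string or a `(name, options)` tuple). `drop_cuda_graph` is a hypothetical helper name, not part of onnxruntime, and the `device_id` value is only an illustration.

```python
def drop_cuda_graph(providers):
    """Return a copy of `providers` with the 'enable_cuda_graph' option removed.

    Hypothetical helper for illustration; not an onnxruntime API.
    """
    cleaned = []
    for provider in providers:
        if isinstance(provider, str):
            # Plain provider name, no options to filter.
            cleaned.append(provider)
        else:
            name, options = provider
            kept = {k: v for k, v in options.items() if k != "enable_cuda_graph"}
            cleaned.append((name, kept))
    return cleaned


# Example provider list (values are assumptions for illustration).
providers = [
    ("CUDAExecutionProvider", {"device_id": 0, "enable_cuda_graph": True}),
    "CPUExecutionProvider",
]
safe_providers = drop_cuda_graph(providers)
```

The resulting `safe_providers` list can then be passed as the `providers` argument when constructing `onnxruntime.InferenceSession`.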
@tvkai. Thanks for reporting the issue and identifying the root cause. Since you have a fix, could you please submit a pull request? @snnn, could we add a build pipeline...
Latest script can be found here: https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md Example script to convert FP32 to FP16: ``` # You can clone the source code of onnxruntime to run this script as the...
@cprivitere, thanks for the feedback. To improve accuracy, the ONNX model needs to be converted to mixed precision by adding some operators (like LayerNormalization, Gelu, etc.) to op_block_list in the script. The list...
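As a rough sketch of the mixed-precision idea, operators named in `op_block_list` are kept in FP32 while the rest of the graph is converted to FP16. The example below uses `convert_float_to_float16` from `onnxruntime.transformers.float16`. The helper names, the file paths, and the placeholder `existing_blocked_ops` list are assumptions for illustration and do not reproduce the script's actual list.

```python
def extend_block_list(existing_ops, extra_ops):
    """Append extra operator types to a block list, skipping duplicates."""
    merged = list(existing_ops)
    for op in extra_ops:
        if op not in merged:
            merged.append(op)
    return merged


def convert_to_mixed_precision(fp32_path, fp16_path, op_block_list):
    """Convert a model to FP16 while keeping blocked operators in FP32."""
    # Imported here so extend_block_list works without these packages installed.
    import onnx
    from onnxruntime.transformers.float16 import convert_float_to_float16

    model = onnx.load(fp32_path)
    model_fp16 = convert_float_to_float16(
        model,
        keep_io_types=True,           # graph inputs and outputs stay FP32
        op_block_list=op_block_list,  # these operator types are not converted
    )
    onnx.save(model_fp16, fp16_path)


# Placeholder for whatever the script already blocks (assumption).
existing_blocked_ops = ["Gelu"]
block_list = extend_block_list(existing_blocked_ops, ["LayerNormalization", "Gelu"])

# Hypothetical paths:
# convert_to_mixed_precision("model_fp32.onnx", "model_fp16.onnx", block_list)
```

Blocking more operators usually trades some speed for accuracy, so it may be worth adding them one at a time and comparing outputs against the FP32 model.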
@BowenBao, thanks for raising the feature request. We are working on encoder and decoder support in the beam search op (currently using T5 as an example). After that is ready, we will...
I think you will need to export a model that adds labels to the inputs and loss to the outputs. See the corresponding part in Python: https://github.com/huggingface/transformers/blob/8cf4a6f0a63ed3aeed68192a9304fed2bd0ce100/src/transformers/models/gpt2/modeling_gpt2.py#L1087-L1094 You will need to update the interface...
@OriAlpha, see the related code: https://github.com/huggingface/transformers/blob/8cf4a6f0a63ed3aeed68192a9304fed2bd0ce100/src/transformers/models/gpt2/modeling_gpt2.py#L1096-L1107 The first element of the returned tuple is the loss, so we can define `def get_loss(result): return result[0]`. If you use return_dict=True, the result contains a field called loss.
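The two cases can be handled by one small function. This is a sketch, assuming labels were passed to the model so that the loss is present. The sample outputs are stand-ins (a plain tuple and a `SimpleNamespace`), not real model results.

```python
from types import SimpleNamespace


def get_loss(result):
    """Return the loss from a tuple-style or return_dict-style result."""
    if isinstance(result, tuple):
        # Tuple output: the loss is the first element.
        return result[0]
    # return_dict=True output: the loss is exposed as a field.
    return result.loss


# Stand-in outputs for illustration only.
tuple_output = (0.25, "logits placeholder")
dict_style_output = SimpleNamespace(loss=0.25, logits="logits placeholder")

loss_from_tuple = get_loss(tuple_output)
loss_from_field = get_loss(dict_style_output)
```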
@OriAlpha, you will need other changes. For example, creating dummy inputs needs a real tensor for labels, and you need to add dynamic axes settings for labels and loss. If you are familiar...
@0xideas, it is a little complicated, but it is possible. First, you need to prepare a list of operators that need such conversion. Then, for each node of those operators, you...