Tianlei Wu comments

Results: 108 comments of Tianlei Wu

Support BFloat16 ?

@ildoonet First, use torch to generate bfloat16 input: https://pytorch.org/docs/stable/generated/torch.Tensor.bfloat16.html Then use IO Binding (search for `PyTorch tensor` in the following document): https://onnxruntime.ai/docs/api/python/api_summary.html

[Performance] cuda graphs optimization refuses to apply to a cuda provider model

The error message is related to `{'enable_cuda_graph': True}`. CUDA graphs are an advanced feature that cannot be applied to every model because of these limitations: https://natke.github.io/onnxruntime/docs/performance/tune-performance.html#using-cuda-graphs-in-the-cuda-ep Try removing it or update...
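A minimal sketch of the "remove it" suggestion, assuming the session is created from Python with the CUDA execution provider. The helper names `cuda_providers` and `create_session` are hypothetical and not part of onnxruntime; only the `enable_cuda_graph` provider option comes from the comment above.

```python
def cuda_providers(enable_cuda_graph=False, device_id=0):
    """Build a providers list for onnxruntime.InferenceSession.

    Hypothetical helper: the 'enable_cuda_graph' option is only added
    when explicitly requested, so the default session avoids the error.
    """
    cuda_options = {"device_id": device_id}
    if enable_cuda_graph:
        cuda_options["enable_cuda_graph"] = True
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]


def create_session(model_path, enable_cuda_graph=False):
    """Create a session; model_path is whatever ONNX file you are loading."""
    # Imported lazily so cuda_providers() can be used without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path, providers=cuda_providers(enable_cuda_graph)
    )
```

Calling `create_session(path)` with the default arguments leaves CUDA graphs off; passing `enable_cuda_graph=True` restores the original behavior for models that satisfy the documented limitations.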

Big endian issue: Graph Transformation Attention Fusion tests are failing

@tvkai, thanks for reporting the issue and identifying the root cause. Since you have a fix, could you please submit a pull request? @snnn, could we add a build pipeline...

Is it possible to convert the onnx model to fp16 model?

Latest script can be found here: https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/tools/transformers/models/stable_diffusion/README.md Example script to convert FP32 to FP16: ``` # You can clone the source code of onnxruntime to run this script as the...

Is it possible to convert the onnx model to fp16 model?

@cprivitere, thanks for the feedback. To improve accuracy, the ONNX model needs to be converted to mixed precision by adding some operators (like LayerNormalization, Gelu, etc.) to op_block_list in the script. The list...
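The idea of keeping selected operators in FP32 can be sketched as follows. This is not the script referred to above: it swaps in `onnxconverter_common.float16.convert_float_to_float16`, which also accepts an `op_block_list` argument. The helper names and file paths are hypothetical, and which operators to block depends on the model.

```python
def build_op_block_list(extra_ops, base=()):
    """Return base + extra_ops, in order, without duplicates."""
    merged = []
    for op in list(base) + list(extra_ops):
        if op not in merged:
            merged.append(op)
    return merged


def convert_to_mixed_precision(input_path, output_path, extra_ops):
    """Convert an FP32 model to FP16 while keeping blocked ops in FP32."""
    # Imported lazily so build_op_block_list() works without these packages.
    import onnx
    from onnxconverter_common import float16

    # Passing op_block_list replaces the converter's default list,
    # so extend the default instead of discarding it.
    block_list = build_op_block_list(extra_ops, base=float16.DEFAULT_OP_BLOCK_LIST)
    model = onnx.load(input_path)
    model_fp16 = float16.convert_float_to_float16(
        model, keep_io_types=True, op_block_list=block_list
    )
    onnx.save(model_fp16, output_path)
```

For example, `convert_to_mixed_precision("model.onnx", "model_fp16.onnx", ["LayerNormalization", "Gelu"])` would keep those two operator types in FP32, assuming they are the ones causing the accuracy loss.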

Create script to export BART encoder and decoder for use with custom beam search op

@BowenBao, thanks for raising the feature request. We are working on encoder and decoder support in the beam search op (currently using T5 as an example). After that is ready, we will...

Computing loss within onnxrunitme inference (GPT2 model)

I think you will need to export a model with labels added to the inputs and loss added to the outputs. See the corresponding part in Python: https://github.com/huggingface/transformers/blob/8cf4a6f0a63ed3aeed68192a9304fed2bd0ce100/src/transformers/models/gpt2/modeling_gpt2.py#L1087-L1094 You will need to update the interface...

Computing loss within onnxrunitme inference (GPT2 model)

@OriAlpha, see the related code: https://github.com/huggingface/transformers/blob/8cf4a6f0a63ed3aeed68192a9304fed2bd0ce100/src/transformers/models/gpt2/modeling_gpt2.py#L1096-L1107 The first element of the returned tuple is the loss, so we can define `def get_loss(result): return result[0]`. If you use return_dict=True, the result contains a field called loss.
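A slightly expanded sketch of `get_loss` that covers both cases mentioned above. It assumes labels were passed to the model so that the loss is present; the sample values below are placeholders, not real model outputs.

```python
from types import SimpleNamespace


def get_loss(result):
    """Return the loss from a model result.

    Handles a plain tuple (loss is the first element) and an object
    that exposes a `loss` field, as with return_dict=True.
    """
    if isinstance(result, (tuple, list)):
        return result[0]
    return result.loss


# Placeholder results standing in for real model outputs.
tuple_result = (0.25, "logits")
dict_style_result = SimpleNamespace(loss=0.25, logits="logits")

print(get_loss(tuple_result))
print(get_loss(dict_style_result))
```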

Computing loss within onnxrunitme inference (GPT2 model)

@OriAlpha, you will need other changes. For example, creating the dummy inputs requires a real tensor for labels, and you need to add dynamic axes settings for labels and loss. If you are familiar...

How can I convert a onnx model from INT64 to INT32

@0xideas, it is a little complicated, but it is possible. First, you need to prepare a list of operators that need such conversion. Then, for each node of those operators, you...
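The first step, collecting the nodes whose operator type is in the prepared list, could look roughly like this. The helper names and the operator names in the example are hypothetical; the per-node conversion itself is not shown because the comment above is truncated. Note that this only walks the top-level graph, not subgraphs.

```python
from types import SimpleNamespace


def nodes_needing_conversion(nodes, op_types):
    """Return the nodes whose op_type is in the prepared operator list."""
    wanted = set(op_types)
    return [node for node in nodes if node.op_type in wanted]


def load_candidate_nodes(model_path, op_types):
    """Load an ONNX model and select candidate nodes from its main graph."""
    # Imported lazily so nodes_needing_conversion() works without onnx installed.
    import onnx

    model = onnx.load(model_path)
    return nodes_needing_conversion(model.graph.node, op_types)


# Stand-in nodes for illustration; real nodes come from model.graph.node.
sample_nodes = [
    SimpleNamespace(name="n0", op_type="Gather"),
    SimpleNamespace(name="n1", op_type="MatMul"),
    SimpleNamespace(name="n2", op_type="Gather"),
]
selected = nodes_needing_conversion(sample_nodes, ["Gather"])
print([node.name for node in selected])
```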