
[QST] Edge compute using Transformers4Rec models with ONNX Runtime

fire opened this issue Feb 23 '22 · 5 comments

❓ Questions & Help

Details

I was wondering if there is a tutorial or workflow for using Transformers4Rec models on edge compute with the ONNX Runtime?

I would be interested in collaborating.

fire · Feb 23 '22

@fire hello. Thanks for your question. We do not currently have an example with the ONNX Runtime. You are of course welcome to contribute one.

rnyak · Feb 23 '22

Any suggestions on how to approach this?

  1. Generate a model using Transformers4Rec
  2. ???
  3. Convert the model from TorchScript to an ONNX model (a sketch follows below)
  4. Execute the ONNX model with ONNX Runtime on DirectML (NVIDIA, Intel, AMD), CUDA directly, or CPU.
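
For step 3, a minimal export sketch via `torch.onnx.export`, which traces an eager module (it also accepts a TorchScript `ScriptModule`). The tiny stand-in model, tensor names, and shapes below are hypothetical; a real workflow would export the trained Transformers4Rec module instead:

```python
import torch

# Tiny stand-in for a trained session-based model; a real workflow would
# export the Transformers4Rec-trained torch.nn.Module instead (hypothetical).
class TinyModel(torch.nn.Module):
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab_size, dim)
        self.head = torch.nn.Linear(dim, vocab_size)

    def forward(self, input_ids):
        # Mean-pool the item embeddings, then score every item in the vocab.
        return self.head(self.emb(input_ids).mean(dim=1))

model = TinyModel().eval()
dummy = torch.randint(0, 1000, (1, 20))  # batch of 1, session length 20

torch.onnx.export(
    model, (dummy,), "model.onnx",
    input_names=["input_ids"], output_names=["logits"],
    # Mark batch and sequence axes as dynamic so edge inputs can vary in size.
    dynamic_axes={"input_ids": {0: "batch", 1: "sequence"},
                  "logits": {0: "batch"}},
    opset_version=13,
)
```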

fire · Feb 23 '22

@fire that'd be a question for the NVIDIA Triton team. You can ask your question on the Triton repo. Thanks.

rnyak · Feb 23 '22

I cross-posted to https://github.com/triton-inference-server/server/issues/3976. Feel free to keep this open or close it if you want.

fire · Feb 23 '22

You can just export the trained torch model to ONNX and use ONNX Runtime for inference with the same inputs as used during training. See, for example, this repo for how to do transformer inference in the browser using ONNX Runtime: https://github.com/jobergum/browser-ml-inference
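
As a hedged sketch of that inference step, assuming a `model.onnx` exported as in the earlier sketch, with an input named `input_ids` and an output named `logits` (both assumed names), and selecting only execution providers the installed onnxruntime build actually exposes:

```python
import numpy as np
import onnxruntime as ort

# Prefer GPU providers when the installed build has them, fall back to CPU.
preferred = ["CUDAExecutionProvider", "DmlExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in preferred if p in ort.get_available_providers()]
session = ort.InferenceSession("model.onnx", providers=providers)

# Feed the model the same input layout it saw during training/export.
input_ids = np.random.randint(0, 1000, size=(1, 20), dtype=np.int64)
(logits,) = session.run(["logits"], {"input_ids": input_ids})
print(logits.shape)  # e.g. (1, 1000) for the toy export above
```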

jobergum · Feb 24 '22