Filipe Oliveira (Personal)
Further references:
- https://github.com/Microsoft/onnxruntime/issues/133
- https://github.com/Microsoft/onnxruntime/tree/master/onnxruntime/test/perftest
- https://cloudblogs.microsoft.com/opensource/2020/01/21/microsoft-onnx-open-source-optimizations-transformer-inference-gpu-cpu/
Given that the backends expect different dimension orders, we should be able to prepare the tensors in either HxWxC or CxHxW format.
Models to be assessed:
- DLRM: https://github.com/facebookresearch/dlrm
- [x] RedisAI
- [ ] Triton
Each runner struct of `inference.Processor` type should implement a common `CollectRunTimeMetrics()` method that asks the specific runner to fetch (opt-in) runtime stats, which are then stored...
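A rough sketch of that pattern, assuming `CollectRunTimeMetrics()` returns a stats map (the interface shape, the `redisAIProcessor` type, and the stat names are illustrative assumptions, not the repository's actual API):

```go
package main

import (
	"fmt"
	"time"
)

// Processor is a minimal stand-in for the inference.Processor interface
// described above; only the metrics method is sketched here.
type Processor interface {
	// CollectRunTimeMetrics asks the specific runner for its (opt-in)
	// runtime stats, returned as a generic key/value map.
	CollectRunTimeMetrics() (map[string]interface{}, error)
}

// redisAIProcessor is a hypothetical runner implementing Processor.
type redisAIProcessor struct {
	collectStats bool // opt-in flag for runtime stats collection
}

func (p *redisAIProcessor) CollectRunTimeMetrics() (map[string]interface{}, error) {
	if !p.collectStats {
		// stats collection is opt-in: return nothing when disabled
		return nil, nil
	}
	// a real runner would query the backend here instead of
	// returning fixed values
	return map[string]interface{}{
		"calls":    int64(42),
		"duration": 150 * time.Millisecond,
	}, nil
}

func main() {
	var p Processor = &redisAIProcessor{collectStats: true}
	stats, err := p.CollectRunTimeMetrics()
	if err != nil {
		panic(err)
	}
	fmt.Println(stats["calls"]) // 42
}
```

Keeping the method on the interface lets the benchmark loop collect and store stats uniformly, while each runner decides what (if anything) its backend can report.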
- 10K models; reference data would be interesting to have
Following the changes in https://github.com/RedisAI/RedisAI/pull/383
Please note that even though the examples pass, the TLS example is not run, since a required artifact is missing there... We did this on purpose so that users...