Add focus set models to SHARK Tank
We've identified some models that we would like to track closely, some of which are not yet in the SHARK Tank. Could you please add the following:
- [ ] GPT2 in JAX: https://github.com/iree-org/iree-jax/tree/main/models/gpt2
- [ ] RetinaNet-ResNeXT50 800x800: PyTorch (https://zenodo.org/record/6617981/files/resnext50_32x4d_fpn.pth) or ONNX (https://zenodo.org/record/6617879/files/resnext50_32x4d_fpn.onnx)
- [ ] t5-11b: Both float and quantized versions. Colab for loading the model and quantizing it: https://colab.sandbox.google.com/drive/1YORPWx4okIHXnjW7MSAidXN29mPVNT7F?usp=sharing#scrollTo=3HibeFxJnwq7
- [ ] MoviNet: https://tfhub.dev/google/collections/movinet/1
- [ ] FastNeRF: https://github.com/houchenst/FastNeRF
- [ ] MCUNetV2: https://hanlab.mit.edu/projects/tinyml/mcunet/release/mcunet-256kb-1mb_imagenet.tflite
Please also add:
- [ ] BertForMaskedLM: https://huggingface.co/docs/transformers/v4.21.1/en/model_doc/bert#transformers.BertForMaskedLM
This model is currently 17x slower than Torch on an A100 GPU.
I added two issues to the SHARK repo:
- Add RNNT: https://github.com/nod-ai/SHARK/issues/329
- Add BertForMaskedLM: https://github.com/nod-ai/SHARK/issues/324
RNNT work to start next week.
Hi @mariecwhite, are these models still relevant for the IREE team? We are looking to add more models to the SHARK Tank. Some of them may require extra support in SHARK, so I wanted to verify that they are still high priority. Are there any other models that the IREE team would like to track closely?
These models are no longer high priority. If you are planning to add models, it would be useful to draw them from existing benchmark suites such as MLPerf Inference, HuggingFace Transformers, TorchBench, timm, and NVIDIA DeepLearningExamples. This would give us an idea of performance and feature gaps on models that are being actively benchmarked.
I think priorities have shifted here. Do we still want to track SHARK Tank in this repository? Maybe move the issue to another repo?
Thank you for the attention here, @ScottTodd. SHARK Tank is on its way out in favor of moving model tests/benchmarks to SHARK-Turbine. We should track any integrations of the Turbine CI separately, so it's safe to close this issue.