
Add focus set models to SHARK Tank

Open mariecwhite opened this issue 3 years ago • 2 comments

We've identified some models that we would like to track closely. Some of these are not in the Shark Tank. Can you please add the following:

  • [ ] GPT2 in JAX: https://github.com/iree-org/iree-jax/tree/main/models/gpt2

  • [ ] RetinaNet-ResNeXT50 800x800: PyTorch (https://zenodo.org/record/6617981/files/resnext50_32x4d_fpn.pth) or ONNX (https://zenodo.org/record/6617879/files/resnext50_32x4d_fpn.onnx)

  • [ ] t5-11b: Both float and quantized versions. Colab for loading the model and quantizing it: https://colab.sandbox.google.com/drive/1YORPWx4okIHXnjW7MSAidXN29mPVNT7F?usp=sharing#scrollTo=3HibeFxJnwq7

  • [ ] MoviNet: https://tfhub.dev/google/collections/movinet/1

  • [ ] FastNeRF: https://github.com/houchenst/FastNeRF

  • [ ] MCUNetV2: https://hanlab.mit.edu/projects/tinyml/mcunet/release/mcunet-256kb-1mb_imagenet.tflite

mariecwhite avatar Aug 24 '22 21:08 mariecwhite

Please also add:

  • [ ] BertForMaskedLM: https://huggingface.co/docs/transformers/v4.21.1/en/model_doc/bert#transformers.BertForMaskedLM

This model is currently 17x slower than Torch on an A100 GPU.

mariecwhite avatar Sep 01 '22 01:09 mariecwhite

I added two issues to the SHARK repo:

  • Add RNNT: https://github.com/nod-ai/SHARK/issues/329

  • Add BertForMaskedLM: https://github.com/nod-ai/SHARK/issues/324

mariecwhite avatar Sep 15 '22 22:09 mariecwhite

RNNT work to start next week.

erob710 avatar Sep 29 '22 22:09 erob710

Hi @mariecwhite, are these models still relevant for the IREE team? We are looking to add more models to the SHARK Tank. Some of them may require extra support in SHARK, so I wanted to verify that these are still high priority. Are there any other models that the IREE team would like to track closely?

monorimet avatar Feb 01 '23 22:02 monorimet

These models are no longer high priority. If you are planning to add models, it would be useful to pick models from existing benchmark suites such as MLPerf Inference, HuggingFace Transformers, TorchBench, timm, and NVIDIA DeepLearningExamples. This would give us an idea of performance and feature gaps on models that are being actively benchmarked.

mariecwhite avatar Feb 07 '23 09:02 mariecwhite

I think priorities have shifted here. Do we still want to track SHARK Tank in this repository? Maybe move the issue to another repo?

ScottTodd avatar Dec 15 '23 23:12 ScottTodd

Thank you for the attention here, @ScottTodd. SHARK Tank is on its way out in favor of moving model tests/benchmarks to SHARK-Turbine. We should track any integrations of the Turbine CI separately, so it's safe to close this issue.

monorimet avatar Dec 15 '23 23:12 monorimet