mariecwhite
> @mariecwhite Something I mentioned in our meeting the other day was incorrect. What I added here does not run for 3 hours, as it doesn't generate new artifacts,...
I double-checked with Dan and these tests run over a couple of hours (more models have been added recently). @GMNGeoffrey @pzread How do we want to handle test failures if...
@GMNGeoffrey The test_shark_model_suite workflow is stuck at finding a runner:

```
Requested labels: self-hosted, runner-group=, environment=, gpu, os-family=Linux
Job defined at: iree-org/iree/.github/workflows/ci.yml@refs/pull/9748/merge
Waiting for a runner to pick up this...
```
The Shark tank run succeeded and took 38 min. @dan-garvey @monorimet Is it possible to run a smaller set?
Here is the set that is currently submitted:

```shell
export MODEL_LIST="bert_base_cased or mobilebert_uncased or MiniLM_L12_H384_uncased or module_resnet50 or mobilenet_v3 or squeezenet1_0 or vit_base_patch16_224"
pytest tank/test_models.py -k "cpu and ($MODEL_LIST)"
```

When running...
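For what it's worth, the `-k` expression is just a boolean filter over test names, so a smaller subset is only a shorter `MODEL_LIST`. Below is a minimal sketch of that matching behavior (simplified: real pytest `-k` supports `and`, `not`, and parentheses too; the test names and the smaller two-model list here are illustrative, not the actual suite):

```python
# Simplified sketch of pytest's -k name filtering: "or" between
# substring matches against the collected test names.
def matches(test_name, k_expr):
    terms = [t.strip() for t in k_expr.split(" or ")]
    return any(term in test_name for term in terms)

# Hypothetical smaller subset instead of the full MODEL_LIST above.
model_list = "bert_base_cased or squeezenet1_0"
collected = [
    "test_bert_base_cased_cpu",
    "test_mobilenet_v3_cpu",
    "test_squeezenet1_0_cpu",
]
selected = [t for t in collected if matches(t, model_list)]
print(selected)  # ['test_bert_base_cased_cpu', 'test_squeezenet1_0_cpu']
```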
These models are now compiling, but the numerical differences are large:

```
> python tflitehub/mobilebert_tf2_quant_test.py --config=vulkan
...
I0503 12:13:31.038540 140054783264576 test_util.py:72] Max error (0): 44.021111
I0503 12:13:31.038809 140054783264576 test_util.py:72] Max...
```
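For context on what that number means, here is a minimal sketch of a max-absolute-error comparison of the kind the log reports; the actual `test_util.py` implementation is not shown here and may differ (e.g. it likely operates on tensors rather than flat lists):

```python
# Sketch: max absolute elementwise difference between a reference output
# and the output under test. A value like 44.02 on logits means the two
# runs disagree badly, not just by rounding noise.
def max_error(expected, actual):
    return max(abs(e - a) for e, a in zip(expected, actual))

# Tiny illustrative (flattened) outputs.
reference = [0.10, 0.20, 0.30]
result = [0.10, 0.25, 0.30]
print(max_error(reference, result))
```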
Reassigning to Lei to take a further look.
A few things to double-check to make sure the models are close to apples-to-apples:
- Since the Shark model has a smaller input and vocab size, this would suggest that the...
@powderluv @monorimet Given the differences, can we update the Shark model to match what is in TorchBench?
> What kind of threading is the IREE team interested in having reported in benchmark results? PyTorch and TF have APIs for fetching the number of inter- and intra-op parallelism...
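As a sketch of what could be reported, a benchmark record might carry both knobs explicitly. The field names below are hypothetical; in PyTorch the values would come from `torch.get_num_threads()` (intra-op) and `torch.get_num_interop_threads()` (inter-op), TF has analogous getters under `tf.config.threading`, and the fallback here just uses `os.cpu_count()`:

```python
import os

def thread_config(intra_op=None, inter_op=None):
    # Hypothetical benchmark-report fields for the two parallelism knobs.
    # In a real harness, intra_op/inter_op would be read from the
    # framework (e.g. torch.get_num_threads()) rather than passed in.
    cores = os.cpu_count() or 1
    return {
        "intra_op_threads": intra_op if intra_op is not None else cores,
        "inter_op_threads": inter_op if inter_op is not None else 1,
    }

print(thread_config(intra_op=8, inter_op=2))
```

Recording both values alongside each result would make runs comparable across frameworks regardless of their defaults.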