Failing Torchbench Models: tracking issue
Summary of Contributions (9th Feb)
- Improve the number of models in TorchBench that work with Dynamo as a tracer: these pass rates are now comparable to those of torch.compile using Inductor. Some of the fixes also improved the previous (non-Dynamo) tracer that PyTorch/XLA used.

  | | Inference | Training |
  |---|---|---|
  | Inductor | 87 | 63 |
  | Dynamo | 60 to 82 | 41 to 53 |
  | Non-Dynamo | 79 to 82 | 54 to 56 |

- Improve the benchmarking tools used by Google: the initial Google runs benchmarking these models showed a discrepancy of about 15 models relative to the results reported here. We identified and fixed 10+ issues, which helped reconcile Google's benchmarks with those reported here and, in turn, with the PyTorch HUD.
Current State
This post has two lists:
- Failing inference models
- Failing training models
Each of them shows the models that fail when:
- Tracing without Dynamo (eager mode)
- Tracing with Dynamo into openxla (Dynamo+openxla); a minimal sketch of both modes follows this list
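For reference, a minimal sketch of the two tracing modes (assuming a working torch_xla installation; the tiny linear model is just a placeholder):

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

device = xm.xla_device()
model = nn.Linear(16, 4).to(device)   # placeholder model
x = torch.randn(8, 16, device=device)

# Non-Dynamo (eager) tracing: ops are recorded lazily; mark_step() cuts
# the graph and triggers XLA compilation/execution.
out = model(x)
xm.mark_step()

# Dynamo tracing: torch.compile captures the graph with Dynamo and hands
# it to the openxla backend.
compiled = torch.compile(model, backend="openxla")
out = compiled(x)
```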
These lists were created using the benchmarking scripts that currently live in the upstream repository. The following command was executed:
python xla/benchmarks/experiment_runner.py \
--suite-name torchbench \
--accelerator cuda \
--xla PJRT --xla None \
--dynamo openxla --dynamo inductor --dynamo None \
--test eval --test train \
--repeat 30 --iterations-per-run 5 \
--print-subprocess \
--no-resume
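To debug a single benchmark, the same script can be narrowed down. A sketch (the --filter flag is the one mentioned later in this thread for running one benchmark alone; exact matching semantics may vary):

```sh
python xla/benchmarks/experiment_runner.py \
  --suite-name torchbench \
  --accelerator cuda \
  --xla PJRT \
  --dynamo openxla \
  --test eval \
  --repeat 1 --iterations-per-run 1 \
  --filter moco
```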
Environment
- GPU: A100 40GB
Inference
Non-Dynamo. Pass rate: 87/99 (87%)
- [x] DALLE2_pytorch
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] cm3leon_generate
- Issue: #6004
- [ ] hf_Longformer
- Issue: #5835
- [ ] hf_T5_generate
- Issue: #6004
- [ ] moco
- Issue: #6083
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [x] nvidia_deeprecommender
- Issue: #6006
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [x] pytorch_CycleGAN_and_pix2pix
- Issue: #6007
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [ ] simple_gpt
- RTX 2060 doesn't support BF16
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] simple_gpt_tp_manual
- RTX 2060 doesn't support BF16
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] tacotron2
- Issue: #6112
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [x] timm_efficientdet
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] vision_maskrcnn
- PyTorch/XLA PR: #5743
- PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
- SKIP because of incompatible model and experiment configs
Dynamo+openxla. Pass rate: 86/99 (86%)
- [x] DALLE2_pytorch
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [x] Super_SloMo
- PyTorch/XLA PR: #5707
- PyTorch/benchmark PR: https://github.com/pytorch/benchmark/pull/2038
- [ ] cm3leon_generate
- Issue: #5967
- [x] detectron2_fasterrcnn_r_101_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_101_dc5
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_101_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_dc5
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_fasterrcnn_r_50_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_fcos_r_50_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_101_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_101_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_50_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] detectron2_maskrcnn_r_50_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] dlrm
- PyTorch/XLA PR: #5743
- PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
- [x] hf_BigBird
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] hf_GPT2
- PyTorch/XLA PR: #5922
- [x] hf_GPT2_large
- PyTorch/XLA PR: #5922
- [ ] hf_Longformer
- Issue: #5835
- [ ] hf_Reformer
- Issue: #5837
- [ ] hf_T5_generate
- Issue: #5967
- [ ] moco
- Issue: #6083
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [x] nvidia_deeprecommender
- Issue: #6006
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [x] pyhpc_isoneutral_mixing
- PyTorch/XLA PR: #5743
- PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
- [x] pyhpc_turbulent_kinetic_energy
- PyTorch/XLA PR: #5743
- PyTorch PR: https://github.com/pytorch/pytorch/pull/112202
- [x] pytorch_CycleGAN_and_pix2pix
- Issue: #6007
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [x] speech_transformer
- PyTorch/XLA PR: #5823
- [x] timm_efficientdet
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
Models also Failing on Inductor
Inference Failing on Inductor CUDA with the Same Error
Benchmarks that raise the same error on inductor:
- [ ] hf_clip
- 'str' object has no attribute 'shape'
- [ ] mobilenet_v2_quantized_qat
- [ ] resnet50_quantized_qat
Inference Failing on Inductor CUDA with Different Errors
- [ ] doctr_det_predictor
- Issue: #6005
- [ ] simple_gpt
- RTX 2060 doesn't support BF16
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] simple_gpt_tp_manual
- RTX 2060 doesn't support BF16
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] tacotron2
- Issue: #6005
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
Training
Non-Dynamo. Pass rate: 67/99 (67%)
- [ ] DALLE2_pytorch
- Issue: #6084
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] demucs
- Issue: #6003
- [ ] densenet121
- Issue: #6003
- [x] detectron2_fasterrcnn_r_101_c4
- Issue: #6004
- [x] detectron2_fasterrcnn_r_101_dc5
- Issue: #6004
- [x] detectron2_fasterrcnn_r_101_fpn
- Issue: #6004
- [x] detectron2_fasterrcnn_r_50_c4
- Issue: #6004
- [x] detectron2_fasterrcnn_r_50_dc5
- Issue: #6004
- [x] detectron2_fasterrcnn_r_50_fpn
- Issue: #6004
- [ ] detectron2_fcos_r_50_fpn
- Skipped by the benchmarking script
- [x] detectron2_maskrcnn_r_101_c4
- Issue: #6004
- [x] detectron2_maskrcnn_r_101_fpn
- Issue: #6004
- [x] detectron2_maskrcnn_r_50_c4
- Issue: #6004
- [x] detectron2_maskrcnn_r_50_fpn
- Issue: #6004
- [ ] dlrm
- Issue: #6008
- [ ] hf_GPT2_large
- Issue: #6003
- [ ] hf_Longformer
- Issue: #5835
- [ ] hf_T5_base
- Issue: #6003
- [ ] llama_v2_7b_16h
- Issue: #6003
- [ ] moco
- Issue: #6083
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] nvidia_deeprecommender
- RTX 2060 OOM
- Issue: #6006
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [x] pytorch_CycleGAN_and_pix2pix
- Issue: #6007
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [ ] stable_diffusion_unet
- Issue: #6003
- [ ] tacotron2
- Issue: #6112
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [x] timm_efficientdet
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] timm_nfnet
- Issue: #6003
- [ ] timm_vision_transformer_large
- Issue: #6003
- [x] yolov3
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
Dynamo+openxla. Pass rate: 57/99 (57%)
- [ ] densenet121
- Issue: #6003
- [ ] dlrm
- Issue: #6008
- [x] hf_BigBird
- Issue: #5966
- PyTorch/XLA PR: #6170
- [x] hf_GPT2
- PyTorch/XLA PR: #5922
- [x] hf_GPT2_large
- PyTorch/XLA PR: #5922
- [ ] hf_Longformer
- Issue: #5835
- [ ] hf_Reformer
- Issue: #6009
- [ ] moco
- Issue: #6083
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] nvidia_deeprecommender
- Issue: #6084
- Issue: #6006
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [x] pytorch_CycleGAN_and_pix2pix
- Issue: #6007
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- [ ] stable_diffusion_unet
- Issue: #6003
- [ ] timm_efficientdet
- Issue: #6003
- Issue: #6011
- PyTorch/XLA PR: #6296
- PyTorch/XLA PR: #6076
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [x] timm_vision_transformer
- Issue: #6003
- [x] torch_multimodal_clip
- Issue: #6005
- [x] yolov3
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
Models also Failing on Inductor
No Training Support on Inductor CUDA
Benchmarks that raise the error `Model's DEFAULT_TRAIN_BSIZE is not implemented` (see the sketch after this list):
- [ ] cm3leon_generate
- [ ] detectron2_fcos_r_50_fpn
- [ ] doctr_det_predictor
- [ ] doctr_reco_predictor
- [ ] hf_T5_generate
- [ ] llama
- [ ] phi_1_5
- [ ] pyhpc_equation_of_state
- [ ] pyhpc_isoneutral_mixing
- [ ] pyhpc_turbulent_kinetic_energy
- [ ] sam
- [ ] simple_gpt
- [ ] simple_gpt_tp_manual
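This error comes from torchbench's model harness: a benchmark only supports the train test when its class declares a default training batch size. A minimal sketch of the pattern (hypothetical class, not torchbench's actual code):

```python
class BenchmarkModel:
    # Hypothetical sketch: subclasses opt into the "train" test by
    # overriding this with a concrete batch size.
    DEFAULT_TRAIN_BSIZE = None

    def __init__(self, test: str):
        if test == "train" and self.DEFAULT_TRAIN_BSIZE is None:
            # Mirrors the error quoted above.
            raise NotImplementedError(
                "Model's DEFAULT_TRAIN_BSIZE is not implemented")
        self.test = test
```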
Training Failing on Inductor CUDA with the Same Error
Benchmarks that raise the same error on inductor:
- [ ] DALLE2_pytorch
- Issue: #6084
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
- [ ] demucs
- Issue: #6003
- [ ] llama_v2_7b_16h
- Issue: #6003
- [ ] maml
- Issue: #6084
- [ ] timm_vision_transformer_large
- Issue: #6003
- [ ] vision_maskrcnn
- Error: "targets should not be none when in training mode"
- Fix: https://github.com/pytorch/pytorch/pull/114774
Training Failing on Inductor CUDA with Different Errors
- [ ] detectron2_fasterrcnn_r_101_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_fasterrcnn_r_101_dc5
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_fasterrcnn_r_101_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_fasterrcnn_r_50_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_fasterrcnn_r_50_dc5
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_fasterrcnn_r_50_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_maskrcnn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_maskrcnn_r_101_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_maskrcnn_r_101_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_maskrcnn_r_50_c4
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] detectron2_maskrcnn_r_50_fpn
- Issue: #5966
- PyTorch/XLA PR: #6170
- [ ] opacus_cifar10
- Issue: #5967
- [ ] tacotron2
- Issue: #6005
- Issue: #6010
- PyTorch/XLA PR: #6060
- PyTorch/XLA PR: #6071
cc @JackCaoG @miladm
State after 7 weeks of work:
Models fixed so far:
- pyhpc_isoneutral_mixing
- pyhpc_turbulent_kinetic_energy
- dlrm
- Super_SloMo
- speech_transformer
PRs to fix the models. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/5688
- https://github.com/pytorch/xla/pull/5689
- https://github.com/pytorch/xla/pull/5707
- https://github.com/pytorch/xla/pull/5743
- https://github.com/pytorch/xla/pull/5769
- https://github.com/pytorch/xla/pull/5823
- https://github.com/pytorch/xla/pull/5914
- https://github.com/pytorch/pytorch/pull/112202
- https://github.com/pytorch/pytorch/pull/114626
- https://github.com/pytorch/benchmark/pull/2038
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/pytorch/pull/114932
- https://github.com/pytorch/xla/pull/5922
- https://github.com/pytorch/xla/pull/5960
- https://github.com/pytorch/xla/pull/5963
- https://github.com/pytorch/xla/pull/5939
- https://github.com/pytorch/benchmark/pull/2072
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/5835
- https://github.com/pytorch/xla/issues/5837
- https://github.com/pytorch/xla/issues/5839
- https://github.com/pytorch/xla/issues/5932
- https://github.com/pytorch/xla/issues/5942
- https://github.com/pytorch/pytorch/issues/111033
- https://github.com/pytorch/pytorch/issues/114302
Weekly update (Dec 1~Dec 10):
Models fixed:
- DALLE2_pytorch
- training is now failing with the same error as inductor
- stable_diffusion_unet
- training is still failing with OOM
- stable_diffusion_text_encoder
- hf_GPT2
- hf_GPT2_large
- training without dynamo is still failing
- yolov3
- Failing possibly due to a cuDNN error, which is likely an OOM, on an RTX 2060. Haven't tested it on an A100 yet, though
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/5922
- https://github.com/pytorch/xla/pull/5939
- https://github.com/pytorch/xla/pull/6060
- https://github.com/pytorch/xla/pull/6068
- https://github.com/pytorch/xla/pull/6069
- https://github.com/pytorch/xla/pull/6071
- https://github.com/pytorch/benchmark/pull/2072
- https://github.com/pytorch/pytorch/pull/114932
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6076
- https://github.com/pytorch/xla/pull/6067
- https://github.com/pytorch/xla/pull/6070
- https://github.com/pytorch/xla/pull/6072
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/5966
- https://github.com/pytorch/xla/issues/5967
- https://github.com/pytorch/xla/issues/6003
- https://github.com/pytorch/xla/issues/6004
- https://github.com/pytorch/xla/issues/6005
- https://github.com/pytorch/xla/issues/6008
- https://github.com/pytorch/xla/issues/6009
- https://github.com/pytorch/xla/issues/6083
- https://github.com/pytorch/xla/issues/6085
- https://github.com/pytorch/xla/issues/6086
Weekly update (Dec 11~Dec 15):
Models fixed:
- pytorch_CycleGAN_and_pix2pix
- nvidia_deeprecommender
- dynamo+openxla training is still failing
- simple_gpt and simple_gpt_tp_manual
- failing due to the same reasons as inductor
- moco
- failing due to distributed backend
- timm_efficientdet
- dynamo+openxla training is still failing
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6072
- https://github.com/pytorch/xla/pull/6076
- https://github.com/pytorch/xla/pull/6130
- https://github.com/pytorch/xla/pull/6153
- https://github.com/pytorch/xla/pull/6182
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6070
- https://github.com/pytorch/xla/pull/6160
- https://github.com/pytorch/xla/pull/6170
- https://github.com/pytorch/xla/pull/6178
- https://github.com/pytorch/xla/pull/6180
- https://github.com/pytorch/pytorch/pull/115924
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6084
- https://github.com/pytorch/xla/issues/6112
- https://github.com/pytorch/pytorch/issues/115900
Can we please add a pass rate table in the weekly report that includes:
Inference
- Inductor, Dynamo+PyTorch/XLA:GPU, Non-Dynamo+PyTorch/XLA:GPU
Training
- Inductor, Dynamo+PyTorch/XLA:GPU, Non-Dynamo+PyTorch/XLA:GPU
Weekly update (Jan 8 ~ Jan 12):
Pass rate (out of 99 benchmarks):
| | Inference | Training |
|---|---|---|
| Inductor | 91 | 64 |
| Non-Dynamo | 87 | 67 |
| Dynamo | 86 | 57 |
Models fixed:
- detectron2 models (inference with dynamo)
- hf_BigBird (inference and training with dynamo)
- torch_multimodal_clip (training with dynamo)
- timm_vision_transformer (training with dynamo)
- Likely not due to the merged PRs below:
- detectron2 models: all but detectron2_fcos_r_50_fpn (training without dynamo)
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/pytorch/pull/115924
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6302
- https://github.com/pytorch/xla/pull/6296
- https://github.com/pytorch/xla/pull/6160
- https://github.com/pytorch/xla/pull/6070
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6292
Weekly update (Jan 15 ~ Jan 19):
Pass rate (out of 99 benchmarks):
| | Inference | Training |
|---|---|---|
| Inductor | 85 | 62 |
| Non-Dynamo | 70 | 57 |
| Dynamo | 71 | 55 |
Models that started failing:
- After #6296:
- detectron2_fasterrcnn_r_101_c4
- detectron2_fasterrcnn_r_101_dc5
- detectron2_fasterrcnn_r_101_fpn
- detectron2_fasterrcnn_r_50_c4
- detectron2_fasterrcnn_r_50_dc5
- detectron2_fasterrcnn_r_50_fpn
- detectron2_fcos_r_50_fpn
- detectron2_maskrcnn_r_101_c4
- detectron2_maskrcnn_r_101_fpn
- detectron2_maskrcnn_r_50_c4
- detectron2_maskrcnn_r_50_fpn
- mobilenet_v3_large
- timm_regnet
- hf_Bart
- Started being skipped:
- pytorch_CycleGAN_and_pix2pix
- pytorch_unet
- Unsupported precision:
- pytorch_unet
- yolov3
- cuDNN error:
- Super_SloMo (inductor)
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6302
- https://github.com/pytorch/xla/pull/6296
- https://github.com/pytorch/xla/pull/6325
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6160
- https://github.com/pytorch/xla/pull/6070
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6336
Can we track separate passrate tables for L4 and A100 GPUs going forward @ysiraichi?
cc @frgossen @golechwierowicz @cota
Weekly update (Jan 22 ~ Jan 26):
Pass rate (out of 99 benchmarks):
| | Inference | Training |
|---|---|---|
| Inductor | 88 | 63 |
| Non-Dynamo | 69 | 57 |
| Dynamo | 72 | 55 |
Models fixed:
- (inductor) moco
- (inductor) Super_SloMo
  - Failed when executed with all other benchmarks
  - Passed when executed alone (by specifying the --filter argument)
- (inference) llama_v2_7b_16h
Models that started failing:
- (inference + non-dynamo) timm_efficientnet (to be fixed by: #6389)
- (inference + non-dynamo) timm_nfnet (to be fixed by: #6389)
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6350
- https://github.com/pytorch/xla/pull/6374
- https://github.com/pytorch/xla/pull/6375
- https://github.com/pytorch/benchmark/pull/2124
- https://github.com/pytorch/pytorch/pull/118032
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6389
- https://github.com/pytorch/xla/pull/6160
- https://github.com/pytorch/xla/pull/6070
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6348
- https://github.com/pytorch/xla/issues/6353
- https://github.com/pytorch/xla/issues/6366
- https://github.com/pytorch/xla/issues/6367
- https://github.com/pytorch/xla/issues/6380
- https://github.com/pytorch/xla/issues/6391
Weekly update (Jan 29 ~ Feb 2):
Pass rate (out of 99 benchmarks):
A100
| | Inference | Training |
|---|---|---|
| Inductor | 87 (last: 88) | 63 |
| Non-Dynamo | 82 (last: 69) | 56 (last: 57) |
| Dynamo | 82 (last: 72) | 53 (last: 55) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 86 | 60 |
| Non-Dynamo | 81 | 53 |
| Dynamo | 82 | 49 |
Models Summary (for A100)
- Inductor: Inference (-4, +3)
  - (fail) New skips by PyTorch's torchbench skip list:
    - detectron2_maskrcnn
    - hf_Bert
    - hf_Bert_large
    - maml
  - (pass) Remove outdated skip:
    - vision_maskrcnn
  - (pass) AMP supported:
    - pytorch_unet
    - yolov3
- Inductor: Training (-3, +3)
  - (fail) New skips by PyTorch's torchbench skip list:
    - hf_Bert
    - hf_Bert_large
  - (fail) Failing due to sparse error:
    - dlrm
  - (pass) AMP supported:
    - pytorch_unet
  - (pass) No OOM:
    - demucs
    - opacus_cifar10
- XLA:GPU (non-dynamo): Inference (-3, +16)
  - (fail) New skips by PyTorch's torchbench skip list:
    - detectron2_maskrcnn
    - hf_Bert
    - hf_Bert_large
  - (pass) Forcing `fp32` precision (while setting `XLA_USE_FP16`):
    - detectron2 benchmarks (11)
    - mobilenet_v3_large
    - timm_efficientnet
    - timm_nfnet
    - timm_regnet
  - (pass) AMP supported:
    - yolov3
- XLA:GPU (non-dynamo): Training (-2, +1)
  - (fail) New skips by PyTorch's torchbench skip list:
    - hf_Bert
    - hf_Bert_large
  - (pass) No OOM:
    - hf_GPT2_large
- XLA:GPU (dynamo): Inference (-4, +14)
  - (fail) New skips by PyTorch's torchbench skip list:
    - detectron2_maskrcnn
    - hf_Bert
    - hf_Bert_large
    - maml
  - (pass) Remove outdated skip:
    - vision_maskrcnn
  - (pass) Forcing `fp32` precision (while setting `XLA_USE_FP16`):
    - detectron2 benchmarks (11)
    - hf_Bart
  - (pass) AMP supported:
    - yolov3
- XLA:GPU (dynamo): Training (-2, +0)
  - (fail) New skips by PyTorch's torchbench skip list:
    - hf_Bert
    - hf_Bert_large
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6070
- https://github.com/pytorch/xla/pull/6160
- https://github.com/pytorch/xla/pull/6389
- https://github.com/pytorch/xla/pull/6402
- https://github.com/pytorch/xla/pull/6407
- https://github.com/pytorch/xla/pull/6416
- https://github.com/pytorch/xla/pull/6419
- https://github.com/pytorch/xla/pull/6421
- https://github.com/pytorch/xla/pull/6446
- https://github.com/pytorch/xla/pull/6447
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/pytorch/pull/118783
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6403
- https://github.com/pytorch/xla/issues/6404
Weekly update (Feb 5 ~ Feb 9):
Pass rate (out of 99 benchmarks):
A100
| | Inference | Training |
|---|---|---|
| Inductor | 87 (last: 87) | 63 |
| Non-Dynamo | 82 (last: 82) | 57 (last: 56) |
| Dynamo | 84 (last: 82) | 53 (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 86 | 60 |
| Non-Dynamo | 81 | 53 |
| Dynamo | 84 | 49 |
Models Summary
- XLA:GPU (non-dynamo): Training (0, +1)
  - (pass) No OOM:
    - densenet121
- XLA:GPU (dynamo): Inference (0, +2)
  - (pass) Increased compilation cache:
    - cm3leon_generate
    - hf_T5_generate
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6484
- https://github.com/pytorch/xla/pull/6491
- https://github.com/pytorch/xla/pull/6509
- https://github.com/pytorch/xla/pull/6512
- https://github.com/pytorch/pytorch/pull/118783
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6518
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6483
- https://github.com/pytorch/xla/issues/6511
- https://github.com/pytorch/pytorch/issues/119680
Weekly update (Feb 12 ~ Feb 16):
Pass rate (out of 99 benchmarks):
Could not run the benchmarks this time, due to a compilation issue: #6564
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6518
- https://github.com/pytorch/xla/pull/6558
- https://github.com/pytorch/xla/pull/6550
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6542
- https://github.com/pytorch/pytorch/pull/120117
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6520
- https://github.com/pytorch/xla/issues/6521
- https://github.com/pytorch/xla/issues/6540
- https://github.com/pytorch/xla/issues/6556
- https://github.com/pytorch/xla/issues/6557
- https://github.com/pytorch/xla/issues/6564
- https://github.com/pytorch/pytorch/issues/120115
Weekly update (Feb 19 ~ Feb 23):
Pass rate (out of 99 benchmarks):
There was an error in the benchmarking scripts that prevented us from running with XLA: https://github.com/pytorch/xla/pull/6612
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6597
- https://github.com/pytorch/pytorch/pull/120117
- https://github.com/pytorch/pytorch/pull/120299
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6542
- https://github.com/pytorch/xla/pull/6612
- https://github.com/pytorch/pytorch/pull/120435
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/pytorch/issues/120336
- https://github.com/pytorch/pytorch/issues/120585
Pass rate (out of 99 benchmarks):
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 87) | 65 (last: 63) |
| Non-Dynamo | 72 (last: 82) | 61 (last: 57) |
| Dynamo | 73 (last: 84) | 54 (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 86) | 62 (last: 60) |
| Non-Dynamo | 71 (last: 81) | 57 (last: 53) |
| Dynamo | 73 (last: 84) | 52 (last: 49) |
Models Summary
- Inductor: Inference (-10, +4)
  - (fail) "roi_align_forward_kernel" not implemented for 'BFloat16' (after: #6518)
    - detectron2 benchmarks (10)
  - (pass) Remove outdated skips
    - hf_Bert and hf_Bert_large
    - maml
    - pytorch_CycleGAN_and_pix2pix
- Inductor: Training (-3, +5)
  - (fail) Running on AMP (after: #6518)
    - mobilenet_v2_quantized_qat
    - resnet50_quantized_qat
  - (pass) Remove outdated skips
    - hf_Bert and hf_Bert_large
    - pytorch_CycleGAN_and_pix2pix
- XLA:GPU (non-dynamo): Inference (-15, +5)
  - (fail) Error while lowering: `aten::upsample_bilinear2d` (after: #6518) (issue: #6520)
    - Background_Matting
  - (fail) CPU fallback does not work with mixed dtypes (issue: #6336)
    - detectron2 benchmarks (11)
  - (fail) Seen floating point types of different precisions in HLO (after: #6518) (issue: #6521)
    - hf_GPT2 and hf_GPT2_large
  - (fail) Indices types are not Long (they are Int) (after: #6518) (issue: #6648)
    - llama
  - (pass) Remove outdated skips
    - hf_Bert and hf_Bert_large
    - maml
    - pytorch_CycleGAN_and_pix2pix
    - pytorch_unet
- XLA:GPU (non-dynamo): Training (0, +4)
  - (pass) Remove outdated skips
    - hf_Bert and hf_Bert_large
    - pytorch_CycleGAN_and_pix2pix
    - pytorch_unet
- XLA:GPU (dynamo): Inference (-16, +5)
  - (fail) expected scalar type Float but found Half (after: #6518) (issue: #6556)
    - Super_SloMo
  - (fail) CPU fallback does not work with mixed dtypes (issue: #6336)
    - detectron2 benchmarks (11)
  - (fail) Seen floating point types of different precisions in HLO (after: #6518) (issue: #6521)
    - hf_GPT2 and hf_GPT2_large
  - (fail) Indices types are not Long (they are Int) (after: #6518) (issue: #6648)
    - llama
  - (fail) Slice size at index 0 in gather op is out of range, must be within [0, 1), got 1. (issue: #6557)
    - vision_maskrcnn
- XLA:GPU (dynamo): Training (-4, +5)
  - (fail) expected scalar type Float but found Half (after: #6518) (issue: #6556)
    - Super_SloMo
  - (fail) Seen floating point types of different precisions in HLO (after: #6518)
  - (pass) Remove outdated skips
    - hf_Bert and hf_Bert_large
    - pytorch_CycleGAN_and_pix2pix
    - pytorch_unet
  - (pass) No OOM
    - stable_diffusion_unet
Weekly update (Feb 26 ~ Mar 01):
Pass rate (out of 99 benchmarks):
- PyTorch commit: d9db9e62e3d2d58d4e76a43f30c15db389e51c17
- PyTorch/XLA commit: 5a113aff98ce42420891c724843ccb30691dc24a
- PyTorch/benchmark commit: 62f4e9c6427b467ba77d06fc9952bf4a28204488
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 65 (last: 65) |
| Non-Dynamo | 72 (last: 72) | 61 (last: 61) |
| Dynamo | 73 (last: 73) | 56 (last: 54) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 63 (last: 62) |
| Non-Dynamo | 72 (last: 71) | 58 (last: 57) |
| Dynamo | 71 (last: 73) | 54 (last: 52) |
Models Summary
- XLA:GPU (non-dynamo): Training (-1, +1)
  - (fail) Timeout:
    - timm_efficientdet
  - (pass) Smaller batch size:
    - demucs
- XLA:GPU (dynamo): Inference (-2, 0)
  - (fail) Timeout:
    - cm3leon_generate
    - hf_T5_generate
- XLA:GPU (dynamo): Training (0, +2)
  - (pass) Smaller batch size:
    - densenet121
    - timm_efficientdet
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6542
- https://github.com/pytorch/xla/pull/6612
- https://github.com/pytorch/xla/pull/6632
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6659
- https://github.com/pytorch/xla/pull/6624
- https://github.com/pytorch/xla/pull/6661
- https://github.com/pytorch/pytorch/pull/120435
- https://github.com/pytorch/pytorch/pull/121007
- https://github.com/pytorch/pytorch/pull/121074
- https://github.com/pytorch/pytorch/pull/121075
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6648
- https://github.com/pytorch/xla/pull/6649
Weekly update (Mar 04 ~ Mar 08):
Pass rate (out of 99 benchmarks):
- PyTorch commit: c253d1c1db06beb128f6bb4db861cd08a3c23c6b
- PyTorch/XLA commit: 57f4780d2d5efd04e85e4a2c288eefdb596d2200
- PyTorch/benchmark commit: 62f4e9c6427b467ba77d06fc9952bf4a28204488
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 66 (last: 65) |
| Non-Dynamo | 72 (last: 72) | 61 (last: 61) |
| Dynamo | 71 (last: 71) | 57 (last: 56) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 64 (last: 63) |
| Non-Dynamo | 72 (last: 72) | 58 (last: 58) |
| Dynamo | 71 (last: 71) | 55 (last: 54) |
Models Summary (A100)
- Inductor: Training (0, +1)
  - (pass) Reason unknown:
    - dlrm
- XLA:GPU (dynamo): Training (0, +1)
  - (pass) `Tensor.new` dynamo support:
    - hf_Reformer
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6624
- https://github.com/pytorch/pytorch/pull/121075
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6659
- https://github.com/pytorch/xla/pull/6661
- https://github.com/pytorch/xla/pull/6697
- https://github.com/pytorch/pytorch/pull/121007
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Weekly update (Mar 11 ~ Mar 15):
Pass rate (out of 99 benchmarks):
- PyTorch commit: 5f601a41e0a8c91ecf7ca5e4b95d752166ed9093
- PyTorch/XLA commit: dbe2bc2aa9c680e42c49cb9a0c3a2c0a562082f8
- PyTorch/benchmark commit: 62f4e9c6427b467ba77d06fc9952bf4a28204488
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 66 (last: 66) |
| Non-Dynamo | 37 (last: 72) | 28 (last: 61) |
| Dynamo | 31 (last: 71) | 18 (last: 57) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 64 (last: 63) |
| Non-Dynamo | 45 (last: 72) | 38 (last: 58) |
| Dynamo | 44 (last: 71) | 22 (last: 55) |
Models Summary (A100)
No summary this week because:
- Diff is too big
- It might be due to a pin update
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6718
- https://github.com/pytorch/xla/pull/6745
- https://github.com/pytorch/xla/pull/6697
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6659
- https://github.com/pytorch/xla/pull/6661
- https://github.com/pytorch/pytorch/pull/121007
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6750
- https://github.com/pytorch/pytorch/pull/121926
@ysiraichi The regression you saw might be due to https://github.com/pytorch/xla/pull/6677 (open xla pin update). Our team is looking into this issue.
Weekly update (Mar 18 ~ Mar 21):
Pass rate (out of 99 benchmarks):
- PyTorch commit: 5f601a41e0a8c91ecf7ca5e4b95d752166ed9093
- PyTorch/XLA commit: dbe2bc2aa9c680e42c49cb9a0c3a2c0a562082f8
- PyTorch/benchmark commit: 62f4e9c6427b467ba77d06fc9952bf4a28204488
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 66 (last: 66) |
| Non-Dynamo | 76 (last: 72) | 64 (last: 61) |
| Dynamo | 73 (last: 71) | 58 (last: 57) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 80 (last: 81) | 64 (last: 64) |
| Non-Dynamo | 76 (last: 72) | 61 (last: 58) |
| Dynamo | 74 (last: 71) | 56 (last: 55) |
Models Summary (A100)
- XLA:GPU (non-dynamo): Inference (0, +4)
  - (pass) `as_strided_copy` new implementation:
    - hf_Longformer
  - (pass) `pow` data-type promotion fixed:
    - hf_GPT2
    - hf_GPT2_large
  - (pass) Loosen `Embedding` index type requirement:
    - llama
- XLA:GPU (non-dynamo): Training (0, +3)
  - (pass) `as_strided_copy` new implementation:
    - hf_Longformer
  - (pass) Unknown reason:
    - hf_T5_base
    - timm_efficientdet
- XLA:GPU (dynamo): Inference (-2, +4)
  - (pass) `as_strided_copy` new implementation:
    - hf_Longformer
  - (pass) `pow` data-type promotion fixed:
    - hf_GPT2
    - hf_GPT2_large
  - (pass) Loosen `Embedding` index type requirement:
    - llama
  - (fail) Unknown reason:
    - doctr_reco_predictor https://github.com/pytorch/xla/issues/6832
    - speech_transformer https://github.com/pytorch/xla/issues/6831
- XLA:GPU (dynamo): Training (-2, +3)
  - (pass) `as_strided_copy` new implementation:
    - hf_Longformer
  - (pass) `pow` data-type promotion fixed:
    - hf_GPT2
    - hf_GPT2_large
  - (fail) Unknown reason:
    - hf_Reformer https://github.com/pytorch/xla/issues/6830
    - speech_transformer https://github.com/pytorch/xla/issues/6831
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6659
- https://github.com/pytorch/xla/pull/6661
- https://github.com/pytorch/xla/pull/6814
- https://github.com/pytorch/pytorch/pull/121007
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Last week, the results were unchanged. We are preparing for performance optimizations. cc @ysiraichi
Weekly update (Apr 1 ~ Apr 5):
Pass rate (out of 99 benchmarks):
- PyTorch commit: 72662bf05b3499ce96aae9183a489c78f0c44c84
- PyTorch/XLA commit: 5c48be19e6ded305bb524b3d1231fd4ce4d46208
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 66 (last: 66) |
| Non-Dynamo | 75 (last: 76) | 63 (last: 64) |
| Dynamo | 73 (last: 73) | 53 (last: 58) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 80) | 65 (last: 64) |
| Non-Dynamo | 75 (last: 76) | 61 (last: 61) |
| Dynamo | 74 (last: 74) | 51 (last: 56) |
Models Summary (A100)
- Inductor: Inference (-1, +1)
  - (pass) dlrm
  - (fail) maml
- XLA:GPU (non-dynamo): Inference (-1, 0)
  - (fail) timm_efficientdet https://github.com/pytorch/xla/issues/6889
- XLA:GPU (non-dynamo): Training (-1, 0)
  - (fail) timm_efficientdet: OOM
- XLA:GPU (dynamo): Inference (-1, +1)
  - (pass) speech_transformer
  - (fail) timm_efficientdet https://github.com/pytorch/xla/issues/6899
- XLA:GPU (dynamo): Training (-7, +2)
  - (pass) hf_Reformer and speech_transformer
  - (fail) hf_GPT2 and hf_GPT2_large https://github.com/pytorch/xla/issues/6900
  - (fail) hf_T5, hf_T5_base, stable_diffusion_unet, and timm_vision_transformer_large: OOM
  - (fail) hf_T5_large https://github.com/pytorch/xla/issues/6901
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6814
- https://github.com/pytorch/xla/pull/6881
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6659
- https://github.com/pytorch/xla/pull/6661
- https://github.com/pytorch/pytorch/pull/121007
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6833
- https://github.com/pytorch/xla/pull/6899
- https://github.com/pytorch/xla/pull/6900
- https://github.com/pytorch/xla/pull/6901
Weekly update (Apr 8 ~ Apr 12):
Pass rate (out of 99 benchmarks):
- PyTorch commit: f5331aade57725b03c36d5cc6c683f6a6bc0692d
- PyTorch/XLA commit: 58a412cb271a3f98ae2e01fd1d24bdbb66645d4e
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 66 (last: 66) |
| Non-Dynamo | 74 (last: 75) | 64 (last: 63) |
| Dynamo | 74 (last: 73) | 53 (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 65 (last: 65) |
| Non-Dynamo | 75 (last: 75) | 61 (last: 61) |
| Dynamo | 75 (last: 74) | 51 (last: 51) |
Models Summary (A100)
- XLA:GPU (non-dynamo): Inference (-1, 0)
  - (fail) doctr_reco_predictor: TIMEOUT
- XLA:GPU (non-dynamo): Training (0, +1)
  - (pass) timm_efficientdet
- XLA:GPU (dynamo): Inference (0, +1)
  - (pass) hf_Reformer
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6659
- https://github.com/pytorch/xla/pull/6661
- https://github.com/pytorch/pytorch/pull/121007
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Weekly update (Apr 15 ~ Apr 19):
Pass rate (out of 99 benchmarks):
- PyTorch commit: f5331aade57725b03c36d5cc6c683f6a6bc0692d
- PyTorch/XLA commit: b06c9c7700e13b7731a2b2f3b9ddbbfef2d0793c
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | ? (last: 81) | ? (last: 66) |
| Non-Dynamo | ? (last: 74) | ? (last: 64) |
| Dynamo | ? (last: 74) | ? (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 65 (last: 65) |
| Non-Dynamo | 76 (last: 75) | 61 (last: 61) |
| Dynamo | 76 (last: 75) | 51 (last: 51) |
Models Summary (A100)
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6933
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Weekly update (Apr 22 ~ Apr 26):
Pass rate (out of 99 benchmarks):
- PyTorch commit: f5331aade57725b03c36d5cc6c683f6a6bc0692d
- PyTorch/XLA commit: 2a204e9b473831776def499c8106bafe2c418d24
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 66 (last: 66) |
| Non-Dynamo | 75 (last: 74) | 64 (last: 64) |
| Dynamo | 75 (last: 74) | 53 (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 82) | 65 (last: 65) |
| Non-Dynamo | 76 (last: 76) | 61 (last: 61) |
| Dynamo | 76 (last: 76) | 51 (last: 51) |
Models Summary (A100)
- XLA:GPU (non-dynamo): Inference (0, +1)
  - (pass) timm_efficientdet
- XLA:GPU (dynamo): Inference (0, +1)
  - (pass) timm_efficientdet
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/6958
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/6988
Weekly update (Apr 29 ~ May 3):
Pass rate (out of 99 benchmarks):
- PyTorch commit: 489b4586e95752dc65a1821a4383b9679ccd5b6b
- PyTorch/XLA commit: d1235858628417ed7abc0d61e6e9be50df3e1a87
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 81 (last: 81) | 66 (last: 66) |
| Non-Dynamo | 76 (last: 75) | 64 (last: 64) |
| Dynamo | 75 (last: 75) | 53 (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 81) | 65 (last: 65) |
| Non-Dynamo | 76 (last: 76) | 61 (last: 61) |
| Dynamo | 76 (last: 76) | 51 (last: 51) |
Models Summary (A100)
- XLA:GPU (non-dynamo): Inference (0, +1)
  - (pass) doctr_reco_predictor
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/6958
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Weekly update (May 6 ~ May 10):
Pass rate (out of 99 benchmarks):
- CUDA version: 12.1 (before: 11.8)
- Python version: 3.10 (before: 3.8)
  - Reason: networkx had dropped support for Python 3.9 (see issue update)
- PyTorch commit: 946b96fd54fdaa05d2f5b1e49d837124fbace983
- PyTorch/XLA commit: 40f7e1f54b506475d40b40c0f49193411de6d68f
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 81) | 66 (last: 66) |
| Non-Dynamo | 76 (last: 75) | 64 (last: 64) |
| Dynamo | 75 (last: 75) | 53 (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 65 (last: 65) |
| Non-Dynamo | 76 (last: 76) | 61 (last: 61) |
| Dynamo | 76 (last: 76) | 51 (last: 51) |
Notes
- Inductor on L4 started failing with: `SyntaxError: unterminated string literal`
  - Oddly enough, A100 didn't have the same error
  - Didn't update the results of L4
Models Summary (A100)
- Inductor: Inference (0, +1)
  - (pass) maml
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/pytorch/pull/125876
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Weekly update (May 13 ~ May 17):
Pass rate (out of 99 benchmarks):
- CUDA version: 12.1
- Python version: 3.10
- PyTorch commit: 8619fe6214cd8f31345ae73c5b90024a0233dc40
- PyTorch/XLA commit: 62c3ba652ea09e2076a27f200ad755541f37daeb
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 66 (last: 66) |
| Non-Dynamo | 77 (last: 76) | 61 (last: 64) |
| Dynamo | 78 (last: 75) | 55 (last: 53) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 65 (last: 65) |
| Non-Dynamo | 77 (last: 76) | 59 (last: 61) |
| Dynamo | 78 (last: 76) | 52 (last: 51) |
Models Summary (A100)
All the differences shown below are likely the result of #7067, which fixes AMP. Reason: (i) training benchmarks use AMP by default; and (ii) some inference benchmarks use AMP instead of `bfloat16`. (A minimal AMP sketch follows the summary below.)
- XLA:GPU (non-dynamo): Inference (0, +1)
  - (pass) detectron2_fcos_r_50_fpn
- XLA:GPU (non-dynamo): Training (-5, +2)
  - (fail) Super_SloMo
  - (fail) mobilenet_v2_quantized_qat
  - (fail) resnet50_quantized_qat
  - (fail) timm_efficientdet
  - (fail) timm_nfnet
  - (pass) stable_diffusion_unet
  - (pass) timm_vision_transformer_large
- XLA:GPU (dynamo): Inference (0, +3)
  - (pass) Super_SloMo
  - (pass) detectron2_fcos_r_50_fpn
  - (pass) doctr_reco_predictor
- XLA:GPU (dynamo): Training (0, +2)
  - (pass) Super_SloMo
  - (pass) timm_nfnet
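For context, the AMP setup mentioned above follows the standard torch.autocast pattern; a minimal sketch (plain PyTorch autocast, not the benchmark harness itself):

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 4).cuda()
x = torch.randn(8, 16, device="cuda")

# AMP: ops inside the autocast region run in the lower precision where
# it is considered safe, and stay in float32 elsewhere.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    out = model(x)
```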
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/7067
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/7080
- https://github.com/pytorch/xla/pull/7081
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Weekly update (May 20 ~ May 24):
Pass rate (out of 99 benchmarks):
- CUDA version: 12.1
- Python version: 3.10
- PyTorch commit: 8619fe6214cd8f31345ae73c5b90024a0233dc40
- PyTorch/XLA commit: cb8533be03c228a84db26ab7d44fdf0a2311462f
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 66 (last: 66) |
| Non-Dynamo | 77 (last: 77) | 63 (last: 61) |
| Dynamo | 78 (last: 78) | 55 (last: 55) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 65 (last: 65) |
| Non-Dynamo | 77 (last: 77) | 61 (last: 59) |
| Dynamo | 78 (last: 78) | 52 (last: 52) |
Models Summary (A100)
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/7080
- https://github.com/pytorch/xla/pull/7081
- https://github.com/pytorch/xla/pull/7090
- https://github.com/pytorch/xla/pull/7091
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/7111
- https://github.com/pytorch/xla/pull/7113
- https://github.com/pytorch/xla/pull/7116
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/7095
Weekly update (May 27 ~ May 29):
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/7111
- https://github.com/pytorch/xla/pull/7113
- https://github.com/pytorch/xla/pull/7116
- https://github.com/pytorch/xla/pull/7130
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/7168
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
Weekly update (June 3 ~ June 6):
Pass rate (out of 99 benchmarks):
- CUDA version: 12.1
- Python version: 3.10
- PyTorch commit: f5328542b5365741176e71dd8a2954e0f350b9bc
- PyTorch/XLA commit: aec273056a95d8119279c15d36c0f48f739fb810
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 65 (last: 66) |
| Non-Dynamo | 79 (last: 77) | 61 (last: 63) |
| Dynamo | 79 (last: 78) | 55 (last: 55) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 64 (last: 65) |
| Non-Dynamo | 79 (last: 77) | 60 (last: 61) |
| Dynamo | 79 (last: 78) | 52 (last: 52) |
Models Summary (A100)
- Inductor: Training (-1, +0)
  - (fail) dlrm
- XLA:GPU (non-dynamo): Inference (-0, +2)
- XLA:GPU (non-dynamo): Training (-3, +1)
- XLA:GPU (dynamo): Inference (-0, +1)
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
- https://github.com/pytorch/xla/pull/7168
Issues identified that the PRs in flight do not fix. For an updated list see [XLA, pytorch/pytorch]
- https://github.com/pytorch/xla/issues/7198
- https://github.com/pytorch/pytorch/issues/128165
Weekly update (June 10 ~ June 14):
Pass rate (out of 99 benchmarks):
- CUDA version: 12.1
- Python version: 3.10
- PyTorch commit: 0344f95c2ea944cc916290097133470f963a5532
- PyTorch/XLA commit: 286b31f0c0c752306e4a80a566b1ec9e82653991
- PyTorch/benchmark commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b
A100
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 65 (last: 65) |
| Non-Dynamo | 79 (last: 79) | 63 (last: 61) |
| Dynamo | 79 (last: 79) | 55 (last: 55) |
L4
| | Inference | Training |
|---|---|---|
| Inductor | 82 (last: 82) | 64 (last: 64) |
| Non-Dynamo | 79 (last: 79) | 61 (last: 60) |
| Dynamo | 79 (last: 79) | 52 (last: 52) |
Models Summary (A100)
- XLA:GPU (non-dynamo): Training (-1, +3)
  - (pass) drq
  - (pass) stable_diffusion_unet
  - (pass) timm_vision_transformer_large
  - (fail) timm_nfnet #7271
PRs merged. For an updated list see [XLA, pytorch/benchmarks, pytorch/pytorch]
PRs in flight. For an updated list see [XLA, pytorch/pytorch, pytorch/benchmarks]
- https://github.com/pytorch/xla/pull/7257
- https://github.com/pytorch/benchmark/pull/2292