
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.

Results 259 benchmark issues

TorchBench CI has detected a performance signal. Base PyTorch version: 1.13.0.dev20220810+cu113 Base PyTorch commit: e1007950484aa1df4a2f87c9c14b514ffd7736a5 Affected PyTorch version: 1.13.0.dev20220811+cu113 Affected PyTorch commit: 3aeb5e4ff9d56ecd680401cfa3f23e97a279efbe Affected Tests: - test_train[hf_BigBird-cpu-eager]: +11.08978% - test_eval[mnasnet1_0-cpu-jit]:...

torchbench-perf-report
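The percentages in a report like the one above read as relative changes between the base and affected builds. A minimal sketch of that arithmetic follows, assuming each test yields a single scalar measurement per build. The function names, the 7% threshold, and the sample numbers are all hypothetical and are not taken from TorchBench CI.

```python
def percent_change(base: float, affected: float) -> float:
    """Relative change of `affected` versus `base`, in percent."""
    return (affected - base) / base * 100.0


def flag_signals(base_results, affected_results, threshold_pct=7.0):
    """Return tests whose absolute relative change meets the threshold.

    `threshold_pct` is an illustrative value, not the CI's real setting.
    """
    signals = {}
    for name, base in base_results.items():
        if name not in affected_results:
            continue  # test missing from the affected run; nothing to compare
        delta = percent_change(base, affected_results[name])
        if abs(delta) >= threshold_pct:
            signals[name] = delta
    return signals


# Made-up measurements for two hypothetical tests.
base = {
    "test_train[model_a-cpu-eager]": 2.0,
    "test_eval[model_b-cuda-eager]": 1.0,
}
affected = {
    "test_train[model_a-cpu-eager]": 2.25,
    "test_eval[model_b-cuda-eager]": 1.01,
}

signals = flag_signals(base, affected)
for name, delta in signals.items():
    print(f"- {name}: {delta:+.5f}%")
```

Only the first test crosses the illustrative threshold, so only it would appear under "Affected Tests".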

Running `python run_sweep.py -m pytorch_unet -t eval -d cuda --jit` raises `AttributeError: 'RecursiveScriptModule' object has no attribute 'n_classes'`. Add a final mark to keep this attribute...

cla signed

The default allclose() correctness check doesn't work for bfloat16 kernels due to precision differences between fp32 and bfloat16. For trt kernels this is addressed by switching to cosine_similarity. Instead of...

cla signed
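The precision issue described in the bfloat16 item above can be illustrated without a GPU. The sketch below uses only the standard library: it emulates bfloat16 by truncating the float32 bit pattern (an approximation, since real hardware typically rounds to nearest), then compares an element-wise check using `torch.allclose`'s default tolerances against a cosine-similarity check. The helper names and the 0.999 threshold are assumptions for illustration, not values from the TorchBench code.

```python
import math
import struct


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of the float32 encoding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def allclose(a, b, rtol=1e-5, atol=1e-8):
    """Element-wise |a - b| <= atol + rtol * |b|, mirroring torch.allclose defaults."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors given as flat lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in for a full-precision output and the same output in bfloat16.
reference = [0.1 * i + 0.37 for i in range(1, 65)]
low_precision = [to_bfloat16(v) for v in reference]

strict_ok = allclose(low_precision, reference)
# 0.999 is a hypothetical acceptance threshold.
cosine_ok = cosine_similarity(low_precision, reference) > 0.999

print(f"allclose: {strict_ok}, cosine similarity check: {cosine_ok}")
```

With only about 8 bits of significand, bfloat16 values can differ from the reference by close to 1% per element, which is far outside a 1e-5 relative tolerance, while the overall direction of the output vector is almost unchanged.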

TorchBench CI has detected a performance signal. Base PyTorch version: 1.13.0.dev20220810+cu113 Base PyTorch commit: e1007950484aa1df4a2f87c9c14b514ffd7736a5 Affected PyTorch version: 1.13.0.dev20220811+cu113 Affected PyTorch commit: 3aeb5e4ff9d56ecd680401cfa3f23e97a279efbe Affected Tests: - test_eval[fastNLP-cuda-eager]: -8.62826% - test_train[hf_BigBird-cpu-eager]:...

torchbench-perf-report

Add an argument `--torchexpert` to enable TorchExpert analysis for profiling. Add submodule [TorchExpert](https://github.com/FindHao/TorchExpert). TorchExpert is an analysis tool for profiling results of PyTorch Profiler.

cla signed

Command: `python run.py yolov3 -d cpu -t train --bs 1` Error: ```python Running train method from yolov3 on cpu in eager mode. Traceback (most recent call last): File "run.py", line...

Added a BertLarge model with 16 attention heads, 24 hidden layers, and a hidden size of 1024. Added ddp_trainer as an example that allows benchmarking the ddp, all_reduce and non_ddp version...

cla signed

Add world_language_model transformer and lstm. They are also used in release testing.

cla signed