
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.

Results 259 benchmark issues

The current TFLOPS is only for FP32. Need to add support for other floating point formats such as TF32 and FP16.

When there are multiple metrics, the DCGM API may return `-1` values for some metrics. In general, most metrics use `max` as the default aggregator function while our FP32 metric...
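One way to keep `-1` values from skewing an aggregate is to drop them before applying the aggregator. The sketch below is only an illustration of that idea: `aggregate_metric` and `INVALID_SAMPLE` are hypothetical names, not part of TorchBench or the DCGM API, and it assumes that `-1` always marks a missing sample rather than a real reading.

```python
from typing import Callable, Optional, Sequence

# Assumption: -1 is a sentinel for "no sample", as described in the issue.
INVALID_SAMPLE = -1


def aggregate_metric(
    samples: Sequence[float],
    aggregator: Callable[[Sequence[float]], float] = max,
) -> Optional[float]:
    """Aggregate metric samples after discarding sentinel values.

    Returns None when every sample is invalid, so callers can tell
    "no data" apart from a genuine reading.
    """
    valid = [s for s in samples if s != INVALID_SAMPLE]
    if not valid:
        return None
    return aggregator(valid)
```

With `max` as the aggregator, a stray `-1` is harmless as long as one valid sample exists, but with an averaging aggregator it would pull the result down, which is why the filter runs before any aggregator is applied.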

bug

Stack from [ghstack](https://github.com/ezyang/ghstack): * __->__ #1275 On some OSS models we see CUDA OOM if we enable train correctness checks. For certain models, we can prevent this OOM by copying...

cla signed

# Patching CVE-2007-4559 Hi, we are security researchers from the Advanced Research Center at [Trellix](https://www.trellix.com). We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a...
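CVE-2007-4559 concerns path traversal when extracting archives with Python's `tarfile` module: a member named something like `../evil.txt` can be written outside the destination directory. The sketch below shows one common style of mitigation, checking every member path before extraction. The helper names `is_within_directory` and `safe_extract` are illustrative and not taken from the pull request itself, and the check does not cover symlink or hard-link members.

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """Return True if `target` resolves to a path inside `directory`."""
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extract(tar: tarfile.TarFile, path: str = ".") -> None:
    """Extract `tar` into `path`, refusing members that escape `path`."""
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            raise ValueError(f"Blocked path traversal attempt: {member.name}")
    # All member names were validated above; link targets are not checked.
    tar.extractall(path)
```

Recent Python releases also accept a `filter` argument on `TarFile.extractall` (for example `filter="data"`), which may be preferable where it is available.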

The TorchDynamo inductor backend does not see the `torch.no_grad()` setting from the TorchBench framework. This is the reproducer: `python run_benchmark.py cpu --model hf_Bert --test eval --torchdynamo inductor`. Here is the code...

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2255 python run.py resnet50 -d cuda -t train --backend torchscript_trace Differential Revision: [D56849015](https://our.internmc.facebook.com/intern/diff/D56849015)

cla signed

Summary: https://fb.workplace.com/groups/257735836456307/permalink/657458576484029/ upload cprofile to manifold D56696397 has a script to convert profiler stats to dot graphs (see its test plan) Differential Revision: D56679561

cla signed

Test Plan: ``` $ python run_benchmark.py triton --ci { "name": "triton", "environ": { "pytorch_git_version": "734a000f16b60b3a4e18404e5047a467e2bc96d4", "pytorch_version": "2.4.0.dev20240425+cu121", "triton_version": "3.0.0+45fff310c8" }, "metrics": { "tritonbench_softmax[x_256-naive_softmax-gbps]": 186.1818167276619, "tritonbench_softmax[x_256-naive_softmax-latency]": 0.04505600035190582, "tritonbench_softmax[x_256-triton_softmax-gbps]": 546.1333471298221, "tritonbench_softmax[x_256-triton_softmax-latency]": 0.015359999611973763,...
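The metric names in this output appear to follow the pattern `benchmark[input-implementation-metric]`. Assuming that pattern holds (it is inferred from the sample keys above, not from any documented schema), a small parser can group values by implementation for comparison. `MetricKey`, `parse_metric_key`, and `group_by_implementation` are hypothetical helpers, not TorchBench functions.

```python
import re
from typing import Dict, NamedTuple, Tuple


class MetricKey(NamedTuple):
    benchmark: str
    input_id: str
    implementation: str
    metric: str


# Assumed key layout: "<benchmark>[<input>-<implementation>-<metric>]".
_KEY_PATTERN = re.compile(r"^(?P<benchmark>[^\[\]]+)\[(?P<body>[^\[\]]+)\]$")


def parse_metric_key(key: str) -> MetricKey:
    """Split a metric name into its four assumed components."""
    match = _KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"Unrecognized metric key: {key!r}")
    parts = match.group("body").split("-")
    if len(parts) != 3:
        raise ValueError(f"Expected three '-'-separated fields in {key!r}")
    input_id, implementation, metric = parts
    return MetricKey(match.group("benchmark"), input_id, implementation, metric)


def group_by_implementation(
    metrics: Dict[str, float],
) -> Dict[Tuple[str, str, str], Dict[str, float]]:
    """Group values as {(benchmark, input, implementation): {metric: value}}."""
    grouped: Dict[Tuple[str, str, str], Dict[str, float]] = {}
    for key, value in metrics.items():
        parsed = parse_metric_key(key)
        group = (parsed.benchmark, parsed.input_id, parsed.implementation)
        grouped.setdefault(group, {})[parsed.metric] = value
    return grouped
```

Passing the `"metrics"` dictionary from the JSON output to `group_by_implementation` would place the `gbps` and `latency` values for `naive_softmax` and `triton_softmax` side by side.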

cla signed

Issue Description I encounter a RuntimeError related to gradient computation when enabling accuracy checks during the training of yolov3 in a GPU docker environment. The training runs without issues when...

**Issue Description** I encounter a RuntimeError related to gradient computation when enabling accuracy checks during the training of DALLE2_pytorch in a GPU docker environment. The training runs without issues when...