
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.

Results 314 benchmark issues

We are now in a state where the existing models on the main branch are fairly stable, so we can run all models on the main branch instead of a v*...

Enabling Unet1d to track OP performance: conv_transpose1d. This PR enables Huggingface/Diffusers/UNET1D, which includes conv_transpose1d in its upsampling and downsampling phases. (In [#120982](https://github.com/pytorch/pytorch/issues/120982), a notable performance regression happens to...
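For context on the op being tracked: `conv_transpose1d` is the standard PyTorch operator for 1D transposed (upsampling) convolution. A minimal sketch of how a UNet-style upsampling step might invoke it — the shapes here are illustrative, not taken from the UNET1D model:

```python
import torch
import torch.nn.functional as F

# (batch, in_channels, length): a short 1D feature map to upsample
x = torch.randn(1, 32, 8)
# (in_channels, out_channels, kernel_size) for conv_transpose1d
w = torch.randn(32, 16, 4)

# stride=2 with padding=1 doubles the temporal length: (8-1)*2 - 2*1 + 4 = 16
y = F.conv_transpose1d(x, w, stride=2, padding=1)
print(tuple(y.shape))  # (1, 16, 16)
```

Tracking this op in TorchBench lets regressions like the one referenced above show up in per-operator performance data rather than only in end-to-end model latency.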


Summary: without it, each run purges svg files when previewed locally; no effect by default. Reviewed By: Yuzhen11. Differential Revision: D55453161


Test workflow: https://github.com/pytorch/benchmark/actions/runs/9793666962


Show more help messages for Tritonbench:

```
usage: run_benchmark.py [-h] [--op OP] [--mode {fwd,bwd,fwd_bwd}] [--bwd] [--fwd_bwd]
                        [--device DEVICE] [--warmup WARMUP] [--iter ITER] [--csv]
                        [--dump-csv] [--skip-print] [--plot] [--ci] [--metrics METRICS]
                        [--only...
```


Make the version used in https://github.com/pytorch/test-infra/blob/main/.github/actions/setup-nvidia/action.yml and https://github.com/pytorch/benchmark/blob/main/docker/infra/daemonset.yaml#L73 consistent

ARC Kubernetes mode has the highest security level among all runner modes. We can keep using CPU-only instances for docker building, but all GPU instances should use ARC Kubernetes mode....

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #2349 * __->__ #2348 Overall context: Before looking further into the bf16xint4 matmul, I'm planning to look into a bf16xint16 matmul first. The...


Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2349 * #2348 Modify the bf16xint16 kernel from the previous PR to actually do as intended: load the int16 input, convert it...
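The core idea of that kernel — load int16 data, convert it to bf16, then run a normal bf16 matmul — can be illustrated at the tensor level in eager PyTorch. This is only a sketch of the numerics, not the Triton kernel itself; the shapes and value range are made up for the example:

```python
import torch

# bf16 activations and int16 "weights" (stand-ins for the kernel's inputs)
a = torch.randn(64, 32, dtype=torch.bfloat16)
b_int16 = torch.randint(-8, 8, (32, 16), dtype=torch.int16)

# Convert the int16 operand to bf16 at load time, then do a standard bf16 matmul.
# In the Triton kernel this conversion happens per-tile inside the inner loop.
c = a @ b_int16.to(torch.bfloat16)
print(c.dtype, tuple(c.shape))  # torch.bfloat16 (64, 16)
```

The benefit measured by such a benchmark is memory bandwidth: the int16 operand is read at half the bytes of bf16 and widened on the fly, so the conversion cost trades against reduced memory traffic.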
