test-infra
This repository hosts code that supports the testing infrastructure for the main PyTorch repo. For example, this repo hosts the logic to track disabled tests and slow tests, as well as our continuatio...
Within the last 50 commits, there are the following failures on the main branch of pytorch: - [inductor / rocm6.1-py3.8-inductor / test (inductor, 1, 1, linux.rocm.gpu.2)](https://hud.pytorch.org/minihud?name_filter=inductor%20/%20rocm6.1-py3.8-inductor%20/%20test%20%28inductor%2C%201%2C%201%2C%20linux.rocm.gpu.2%29) failed consecutively starting with...
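To illustrate what "failed consecutively starting with" a commit means in an alert like the one above, here is a minimal sketch. It is not the alerting code from this repo: the function name `failure_streak_start` and the `(sha, passed)` input shape are hypothetical, and it assumes results for a single job are listed newest first.

```python
from typing import List, Optional, Tuple


def failure_streak_start(results: List[Tuple[str, bool]]) -> Optional[str]:
    """Return the oldest commit in the unbroken run of failures at the head.

    `results` is assumed to be ordered newest first, one (sha, passed)
    entry per commit for a single job. Returns None when the newest
    commit passed or when there are no results at all.
    """
    start = None
    for sha, passed in results:
        if passed:
            # The streak ends at the first passing commit we reach.
            break
        start = sha
    return start
```

With this shape, a job whose two newest commits failed and whose third newest passed would report the second newest commit as the start of the streak; older failures before the passing commit are ignored.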
Within the last 5 minutes, these machines had long queues (exact numbers may be out of date): - linux.aws.h100, 3 machines, 4.39 hours - linux.rocm.gpu.mi300.2, 45 machines, 4.16 hours -...
I forgot that reruns get put under their own element. Also say if the test ultimately succeeded or failed. Format is usually ``` { rerun: [{stuff},{stuff}], failure: {stuff}, job_id: jobid...
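A minimal sketch of how a record in roughly this shape could be summarized. The `rerun`, `failure`, and `job_id` keys come from the snippet above. Everything else is an assumption: the function name `final_outcome` is hypothetical, a present `failure` element is taken to mean the test ultimately failed, and a lone rerun is assumed to possibly appear as a single object instead of a list.

```python
from typing import Any, Dict, Tuple


def final_outcome(record: Dict[str, Any]) -> Tuple[str, int]:
    """Return (outcome, rerun_count) for one test record.

    Assumption: a non-null `failure` element means the final attempt
    failed; otherwise the test ultimately succeeded.
    """
    reruns = record.get("rerun") or []
    if isinstance(reruns, dict):
        # Assumption: a single rerun may be stored as one object.
        reruns = [reruns]
    failed = record.get("failure") is not None
    return ("failed" if failed else "succeeded", len(reruns))
```

For example, a record with two reruns and no `failure` element would be reported as succeeded with a rerun count of 2, which is the "ultimately succeeded" case the comment asks to surface.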
Display test counts + times info. Maybe add skips in a future PR? https://torchci-p0oj9kqfs-fbopensource.vercel.app/pytorch/pytorch/commit/e10b2ba357f7998edaec5a181352c02ae503ccb9
The container in question is https://hub.docker.com/r/pytorch/manylinux-builder/tags. There are reports from ExecuTorch and Torchtune about the problematic way the Nova Linux build job runs as root inside the container. 1. On ExecuTorch,...
Sometimes the paramselector doesn't update if you click on the prev page/back button. This fixes that by turning the input into a controlled component: https://react.dev/reference/react-dom/components/input#controlling-an-input-with-a-state-variable cc @huydhn
In the **FIRST** comment you can specify the configuration for each experiment. The format is: - Above the line break, you have yaml formatted text listing all experiments and their...
Within the last 50 commits, there are the following failures on the main branch of pytorch: - [linux-binary-libtorch / libtorch-rocm6_2_4-shared-with-deps-release-build / build](https://hud.pytorch.org/minihud?name_filter=linux-binary-libtorch%20/%20libtorch-rocm6_2_4-shared-with-deps-release-build%20/%20build) failed consecutively starting with commit [233790922b73fc0c273ad786fdbd7adde0b42267](https://hud.pytorch.org/commit/pytorch/pytorch/233790922b73fc0c273ad786fdbd7adde0b42267) - [linux-binary-manywheel...
* Follow up to https://github.com/pytorch/test-infra/pull/5164 and https://github.com/pytorch/test-infra/pull/5124. The startup failures only show up as queued on Rockset since GitHub doesn't send a webhook for them, so we can backfill the...