algorithmic-efficiency
algorithmic-efficiency copied to clipboard
Two copies of `criteo_resnet_pytorch` exist in .github/workflows/regression_tests_variants.yml
Two copies of criteo_resnet_pytorch exist in .github/workflows/regression_tests_variants.yml
Is this intentional? If not, which is the correct version? Thanks!!
criteo_resnet_pytorch:
runs-on: self-hosted
needs: build_and_push_pytorch_docker_image
steps:
- uses: actions/checkout@v2
- name: Run containerized workload
run: |
docker pull us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_${{ github.head_ref || github.ref_name }}
docker run -v $HOME/data/:/data/ -v $HOME/experiment_runs/:/experiment_runs -v $HOME/experiment_runs/logs:/logs --gpus all --ipc=host us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_${{ github.head_ref || github.ref_name }} -d criteo1tb -f pytorch -s reference_algorithms/paper_baselines/adamw/pytorch/submission.py -w criteo1tb_resnet -t reference_algorithms/paper_baselines/adamw/tuning_search_space.json -e tests/regression_tests/adamw -m 10 -c False -o True -r false
criteo_resnet_pytorch:
runs-on: self-hosted
needs: build_and_push_pytorch_docker_image
steps:
- uses: actions/checkout@v2
- name: Run containerized workload
run: |
docker pull us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_${{ github.head_ref || github.ref_name }}
docker run -v $HOME/data/:/data/ -v $HOME/experiment_runs/:/experiment_runs -v $HOME/experiment_runs/logs:/logs --gpus all --ipc=host us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_pytorch_${{ github.head_ref || github.ref_name }} -d criteo1tb -f pytorch -s reference_algorithms/paper_baselines/adamw/pytorch/submission.py -w criteo1tb_embed_init -t reference_algorithms/paper_baselines/adamw/tuning_search_space.json -e tests/regression_tests/adamw -m 10 -c False -o True -r false
Nope, that is not intentional. The regression tests for the variants are a work in progress. It looks like the bottom one actually tests criteo1tb_embed_init and not criteo1tb_resnet, so the top one runs the intended workload.