Alexander Grund comments

Results: 1184 comments of Alexander Grund

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0

The H100 failures are mostly from `inductor/test_cutlass_backend (39 failed, 1 passed, 2 skipped, 0 errors)`. I expect most failures to be caused by "BytesWarning". That needs a rebuild of...

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0

Test report by @Flamefire ~~**FAILED**~~ Build succeeded for 6 out of 7 (7 easyconfigs in total) n1450.barnard.hpc.tu-dresden.de - Linux RHEL 9.6, x86_64, Intel(R) Xeon(R) Platinum 8470 (sapphirerapids), Python 3.9.21 See...

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0

Test report by @Flamefire **SUCCESS** Build succeeded for 7 out of 7 (7 easyconfigs in total) c92 - Linux Rocky Linux 9.6, x86_64, AMD EPYC 9334 32-Core Processor (zen4), 4...

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0

Test report by @Flamefire **SUCCESS** Build succeeded for 7 out of 7 (7 easyconfigs in total) i8018 - Linux Rocky Linux 9.6, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8...

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0

4 (of 8) failures are in test_cpu_select_algorithm and test_select_algorithm, which I assume have the same cause. However, the errors are not in the gist, so I can't tell. Is it possibly...

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0

Rebased

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0

> ``` > FAILED [15.1010s] inductor/test_cpu_select_algorithm.py::TestSelectAlgorithmDynamicShapesCPU::test_linear_with_embedding_dynamic_shapes_batch_size_384_in_features_196_out_features_384_bias_True_cpu_bfloat16 - AssertionError: Scalars are not equal! > ``` Looks different to my error > ``` > FAILED [8.5177s] inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution2 - torch._inductor.exc.InductorError: AssertionError: Incorrect result...

Alexander Grund

{ai}[foss/2024a] PyTorch v2.7.1 w/ CUDA 12.6.0


Add `--debug-module-cmds` option

Enforcing compilers in CMake: -DCMAKE_XXX_COMPILER is not sufficient; we should consider CMAKE_TOOLCHAIN_FILE

Explicitly call `PythonPackage` and `Cargo` configure step in `CargoPythonPackage` easyblock