Alexander Grund
The H100 failures are mostly from `inductor/test_cutlass_backend` (39 failed, 1 passed, 2 skipped, 0 errors). I expect most failures to be caused by "BytesWarning". That needs a rebuild of...
Test report by @Flamefire ~~**FAILED**~~ Build succeeded for 6 out of 7 (7 easyconfigs in total) n1450.barnard.hpc.tu-dresden.de - Linux RHEL 9.6, x86_64, Intel(R) Xeon(R) Platinum 8470 (sapphirerapids), Python 3.9.21 See...
Test report by @Flamefire **SUCCESS** Build succeeded for 7 out of 7 (7 easyconfigs in total) c92 - Linux Rocky Linux 9.6, x86_64, AMD EPYC 9334 32-Core Processor (zen4), 4...
Test report by @Flamefire **SUCCESS** Build succeeded for 7 out of 7 (7 easyconfigs in total) i8018 - Linux Rocky Linux 9.6, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8...
4 (of 8) failures are in test_cpu_select_algorithm and test_select_algorithm, which I assume have the same cause. However, the errors are not in the gist, so I can't tell. Is it possibly...
> ``` > FAILED [15.1010s] inductor/test_cpu_select_algorithm.py::TestSelectAlgorithmDynamicShapesCPU::test_linear_with_embedding_dynamic_shapes_batch_size_384_in_features_196_out_features_384_bias_True_cpu_bfloat16 - AssertionError: Scalars are not equal! > ``` Looks different to my error > ``` > FAILED [8.5177s] inductor/test_select_algorithm.py::TestSelectAlgorithm::test_convolution2 - torch._inductor.exc.InductorError: AssertionError: Incorrect result...
Looks like I identified and fixed the same bug in the tests on multiple occasions. Rebased.
My feeling is that this is a bit too much. If we notice a misbehaving program we could just patch out the relevant line in the CML. Otherwise we might...
It was "just missing". So the question worth a comment would rather be why it *does not* need to be this way. If we don't do this we only call...