Xiaodong Wang

Results: 23 comments by Xiaodong Wang

Thanks @jeffdaily! Quick question: are you going to expose the icache stuff through an env var, or just the API?

@jeffdaily do you think the newly reported error is related? (It looks like it is.)

It seems to complain that lld is a generic tool, and asks me to use the specific ones: stderr: lld is a generic driver. Invoke ld.lld (Unix), ld64.lld (macOS), lld-link...

Looks like the failed test was disabled: https://github.com/pytorch/pytorch/pull/92945. Landing.

Sorry, just saw this. Thanks @jeffdaily for pinging. This will break our internal codebase, but it should be an easy fix. I'm not objecting to the idea, if you can ping...

hmm, do you think this test is already broken on trunk because of the first tunableop PR? Regardless, it may be a good idea to add "gemm_internal_cublas" into the error...

Discussed offline; it looks like a pre-existing flaky test. @jeffdaily will also add one more commit switching the defaults for rotating tensor and icache flush to true.

@scxiao thanks! Do you have the performance results (on both AMD and H100)?