Huy Do
> @huydhn can we just forward fix and skip the test in ROCm?

Yes, plz go ahead with the fix, I can stamp it if you need.
@jbschlosser I have also just noticed another periodic failure coming from this PR https://hud.pytorch.org/pytorch/pytorch/commit/2a41fc03903de63270d325bd1886a50faf32d7e4#26340619959. It's a CUDA memory leak failure (we only run the memory leak check periodically) and your...
@sanketpurandare I'm seeing the new test `test_tracker_multi_group_eager` failing in the ROCm distributed job https://hud.pytorch.org/pytorch/pytorch/commit/287c68c5eca2e15bf73b84fe9e39755ae3f842ba#26578545778. Could you help take a look? The job only runs periodically, so its signal was missed...
Btw, I disabled the test in https://github.com/pytorch/pytorch/issues/129390 to keep trunk sane. In your fix PR, please add "Fixes https://github.com/pytorch/pytorch/issues/129390" to the PR description so that the test runs on your PR.
@pytorchbot drci
@pytorchbot revert -m 'Sorry for reverting your change, but there are real failures on the PR that snuck in during the log classifier outage' -c weird
@pytorchbot rebase
Answer to the capacity question: https://github.com/pytorch/pytorch/pull/125399#issuecomment-2345746062
@AlekseiNikiforovIBM I have sent you an invite with write permission to the repo so that you have permission to run CI on your end without our...
@pytorchbot merge