Stas Bekman
Oh, and btw, my earlier suggestion to move `sudo` outside the shell script was a bad idea, since it then tries to run from a different environment and...
I propose a different approach: before running the benchmark, have a test script that simply validates that everything in the env is ready, e.g. if it can't build the required...
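Something along these lines, as a rough sketch; the module and binary names below are just placeholders for whatever the benchmark actually needs:

```python
#!/usr/bin/env python
# preflight.py - fail fast if the environment isn't ready for the benchmark
import importlib
import shutil
import sys

required_modules = ["torch", "deepspeed"]   # placeholder list
required_binaries = ["nvcc"]                # e.g. needed to build CUDA extensions

errors = []
for name in required_modules:
    try:
        importlib.import_module(name)
    except ImportError as exc:
        errors.append(f"cannot import {name}: {exc}")

for binary in required_binaries:
    if shutil.which(binary) is None:
        errors.append(f"missing binary on PATH: {binary}")

if errors:
    print("environment is not ready:\n  " + "\n  ".join(errors))
    sys.exit(1)
print("environment looks good")
```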
I will also post here my experiment with staggered loading (which appears to make no difference peak-CPU-memory-wise). I have used 2 approaches: 1. The `flock`-based approach...
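A minimal sketch of what I mean by the `flock`-based approach, assuming every rank runs the same script and the weight-loading step is serialized through a shared lock file (the lock path, model name, and use of `transformers` are placeholders):

```python
import fcntl
from transformers import AutoModel

def load_model_staggered(model_name, lock_path="/tmp/load_model.lock"):
    """Serialize model loading across local processes via an exclusive flock,
    so only one rank materializes the weights in CPU memory at a time."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)   # blocks until the lock is free
        try:
            model = AutoModel.from_pretrained(model_name)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return model

# every rank calls this; the loads then happen one after another
model = load_model_staggered("t5-small")
```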
I am not sure why you're asking me. I reported the issue, so I'm in the same boat as you are. And there has been no follow-up since I...
Tunji, it's pretty safe to assume that this problem impacts anybody with available CPU memory < total GPU memory if they load the GPU memory to the brim, e.g. this...
This is for the DeepSpeed team to solve. As far as I know, it hasn't been resolved. cc: @tjruwase
Thank you, @ezyang, for the ping. I have already followed up here: https://github.com/pytorch/pytorch/issues/94788#issuecomment-1430518601 but let me copy it here: ------------------------- @mlazos, if it's the way proposed by your PR, please...
Additionally, this PR invents some sort of new logging-level definition semantics, which again looks very neat for devs, but this is not what I think is needed for non-PyTorch...
Also, from the description in the OP I don't see it proposing to cover everything; e.g. it doesn't look like `torch.distributed` is there, and it can be pretty noisy on...
Your summary looks right to me, @mlazos. `TORCH_COMPILE_DEBUG` is by definition written for compilation-related sub-modules and is debug-related. So a general one could be `TORCH_VERBOSITY` or `TORCH_LOG_LEVEL` or something similar. You...
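To illustrate the kind of user-facing knob I have in mind (this is not an existing PyTorch API, just a sketch of how a hypothetical `TORCH_LOG_LEVEL` variable could be consumed):

```python
import logging
import os

# hypothetical: map a single env var onto standard Python logging levels
level_name = os.environ.get("TORCH_LOG_LEVEL", "WARNING").upper()
level = getattr(logging, level_name, logging.WARNING)

logging.getLogger("torch").setLevel(level)
logging.getLogger("torch.distributed").setLevel(level)  # the noisy sub-module mentioned above
```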