Vedant Roy

96 comments by Vedant Roy

Quick question: if we were to use ESM, wouldn't it still use the "node" key in the "exports" object? And the "node" key refers to a CommonJS file, which is...
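To make the concern above concrete, here is a minimal sketch of how conditional `exports` keys are matched: the first key, in declaration order, that is among the active conditions wins. This is a simplified model, not Node's actual resolver. It ignores nested conditions and subpaths, and the file paths and condition sets below are hypothetical, chosen only for illustration.

```python
# Simplified model of conditional "exports" matching (an assumption-laden sketch,
# not Node's real implementation): keys are tried in declaration order and the
# first key that is an active condition wins.
def resolve_target(exports_entry, active_conditions):
    """Return the target of the first matching condition, or None if none match."""
    for condition, target in exports_entry.items():  # dicts keep insertion order
        if condition in active_conditions:
            return target
    return None


# Hypothetical package layout, for illustration only.
exports_entry = {
    "node": "./dist/index.cjs",
    "import": "./dist/index.mjs",
    "default": "./dist/index.mjs",
}

# Assumed condition sets for an ESM `import` and a CommonJS `require` under Node.
esm_conditions = {"node", "import", "default"}
cjs_conditions = {"node", "require", "default"}

print(resolve_target(exports_entry, esm_conditions))
print(resolve_target(exports_entry, cjs_conditions))
```

Under this model, both lookups land on the "node" target because "node" is listed first. Listing "import" ahead of "node" would let an ESM import reach the `.mjs` file instead.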

Also running into this issue for `xonsh` version 0.13.3.

@hanlint If I provide a GitHub repository + a Dockerfile, would that be helpful? I've also filed an issue here: https://github.com/pytorch/pytorch/issues/83824 since it might be a PyTorch issue.

@hanlint Also to be clear, I can reliably reproduce this issue when training with multiple GPUs. It is somewhat inconsistent at 2, but it happens at >= 6 every time....

OK, additional details. The error is happening because my process is receiving a SIGCHLD signal, which is causing the interruption. I can work around the error by doing a `sleep` before...

@kobindra
```
contrastive_train-contrastive_train-1 | Traceback (most recent call last):
contrastive_train-contrastive_train-1 |
contrastive_train-contrastive_train-1 | File "contrastive_train.py", line 63, in
contrastive_train-contrastive_train-1 | app()
contrastive_train-contrastive_train-1 |
contrastive_train-contrastive_train-1 | File "contrastive_train.py", line 52, in...
```

@kobindra Is there a way to specify the folder name for the checkpoints? For example, I don't really want it to be "some random integer + a word", I would...

Doesn't work, see:
```
contrastive_train-contrastive_train-1 | Traceback (most recent call last):
contrastive_train-contrastive_train-1 | File "/root/miniconda3/envs/video-rec/lib/python3.8/site-packages/boto3/s3/transfer.py", line 288, in upload_file
contrastive_train-contrastive_train-1 | future.result()
contrastive_train-contrastive_train-1 | File "/root/miniconda3/envs/video-rec/lib/python3.8/site-packages/s3transfer/futures.py", line 103, in result...
```

Setting `num_concurrent_uploads=1` doesn't help.