opeide
I've also had this issue! It was very hard to track down with a DataLoader using multiple workers; I had to set num_workers=0 to even get a traceback pointing at where the code was hanging...
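A minimal sketch of what I mean (the dataset and batch size here are placeholders): with num_workers=0 the loading code runs in the main process, so the real traceback shows up instead of an opaque worker hang.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder dataset; substitute your own.
my_dataset = TensorDataset(torch.randn(32, 3, 64, 64),
                           torch.zeros(32, dtype=torch.long))

# num_workers=0 runs the dataset code in the main process, so any exception
# or hang points directly at the offending line instead of dying in a worker.
loader = DataLoader(my_dataset, batch_size=8, num_workers=0)

for images, labels in loader:
    pass  # your training step here
```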
For anyone hitting similar issues: I had trouble tracing until I added a torch.jit.is_tracing() check in Anchor's forward so that last_anchors is not used during tracing.
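Roughly what the guard looks like; this is a pared-down stand-in for the real Anchors module, not the actual implementation:

```python
import torch
import torch.nn as nn

class Anchors(nn.Module):
    """Simplified stand-in; the real module computes anchor boxes."""

    def __init__(self):
        super().__init__()
        self.last_anchors = {}  # device -> anchors cached from a previous call

    def forward(self, image):
        # Skip the Python-level cache while tracing, otherwise the trace
        # only records the cached branch and breaks on later inputs.
        if not torch.jit.is_tracing() and image.device in self.last_anchors:
            return self.last_anchors[image.device]

        anchors = torch.zeros(1, 100, 4, device=image.device)  # placeholder computation
        self.last_anchors[image.device] = anchors
        return anchors
```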
I had the same issue with DDP; in my case the culprit was torch.nn.SyncBatchNorm.convert_sync_batchnorm(model) (in my own code). I guess a BN layer ended up being the unused ("hidden") output in my model.
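For reference, the call in question (this is a toy model, and the "unused BN layer" explanation is just my guess); removing the conversion, or the layer that never contributed to the loss, resolved the hang for me:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3),
    nn.BatchNorm2d(8),   # if a BN layer never feeds into the loss,
    nn.ReLU(),           # the synced version can stall DDP's collective ops
)

# The line that caused the hang in my setup:
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
```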
For me the issue was apparently in my augmentations. In albumentations there are some transforms that can loop infinitely, like RandomFog. I was only able to see where the code froze...
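Roughly how I narrowed it down (the exact transforms here are just examples): comment transforms out one at a time until the hang disappears, and run the pipeline on a dummy image outside the DataLoader where a freeze is easy to spot.

```python
import numpy as np
import albumentations as A

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    # A.RandomFog(p=0.5),  # disabled while bisecting; this one froze for me
    A.RandomBrightnessContrast(p=0.5),
])

# Dummy image so the pipeline can be tested in isolation.
image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]
```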