Ross Wightman comments

Results: 510 comments of Ross Wightman

Using timm gcvit models and getting errors when using resolution other than 224x224

@sarmientoj24 I made the window size adapt to img input size, so that fixes one issue for scaling, but the relpos indices still need interpolation.... Unfortunately, the interpolation itself is...
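To illustrate why the relative position indices depend on the window size, here is a minimal pure-Python sketch of a Swin-style relative position index. It is an assumption for illustration only: the function name `relative_position_index` is hypothetical, the window sizes 7 and 12 are arbitrary examples, and the gcvit code in `timm` may lay out its table differently. The point is that the bias table has `(2 * window_size - 1) ** 2` entries, so a table learned for one window size cannot be indexed directly at another size without resizing (for example by interpolation).

```python
def relative_position_index(window_size):
    """Swin-style relative position index for a square window (illustrative sketch).

    Returns (table_size, index), where index[q][k] selects the bias-table row
    for query token q and key token k inside one window.
    """
    ws = window_size
    # Token coordinates in row-major order within the window.
    coords = [(y, x) for y in range(ws) for x in range(ws)]
    span = 2 * ws - 1  # number of distinct offsets along each axis
    index = [
        # Shift offsets from [-(ws-1), ws-1] to [0, 2*ws-2], then flatten to one id.
        [(qy - ky + ws - 1) * span + (qx - kx + ws - 1) for (ky, kx) in coords]
        for (qy, qx) in coords
    ]
    return span * span, index


# Hypothetical window sizes, chosen only to show that the table size changes.
size_small, index_small = relative_position_index(7)
size_large, index_large = relative_position_index(12)
print(size_small, size_large)
```

Because the two tables differ in size, pretrained bias values for the smaller window have to be resampled before they can be used with the larger one.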

Using timm gcvit models and getting errors when using resolution other than 224x224

@mosheliv I don't believe Hugging Face transformers has it implemented for Swin. It is also implemented for vanilla vit in `timm`, but models like gcvit, swin, maxvit, etc that use...

training hparam clarifications

@ahatamiz thank you for the detailed response, my LR needs a bit of adjustment based on that info, I'll try another run with that and a new seed, I noticed...

[GPUNet/PyTorch] Model padding is incorrect, layer naming goes against norms.

@linnanwang thanks for updating the copyright/license info. The issue with padding is that the 3x3 convs in the EdgeResidual (FusedMBConv) layers have 0 padding, for typical PyTorch use they should...
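As a rough illustration of what zero padding does to a 3x3 conv, the sketch below applies the output-size formula documented for PyTorch's `Conv2d` in plain Python. The helper name `conv_out_size` and the input size 56 are assumptions made for the example, and padding 1 is shown only as the common size-preserving choice for a 3x3 kernel, not as a statement of what the GPUNet code uses after its fix.

```python
def conv_out_size(size, kernel=3, stride=1, padding=0, dilation=1):
    """Spatial output size of a conv layer (same formula as PyTorch Conv2d docs)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


feature_size = 56  # hypothetical feature-map height/width

# 3x3 conv, stride 1: zero padding shrinks the map by 2, padding 1 preserves it.
no_pad = conv_out_size(feature_size, padding=0)
same_pad = conv_out_size(feature_size, padding=1)

# 3x3 conv, stride 2: zero padding gives an off-by-one size vs. the usual size // 2.
no_pad_s2 = conv_out_size(feature_size, stride=2, padding=0)
same_pad_s2 = conv_out_size(feature_size, stride=2, padding=1)

print(no_pad, same_pad, no_pad_s2, same_pad_s2)
```

With zero padding every such layer trims the border of the feature map, so spatial sizes drift away from the clean halving that downstream code usually expects.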

[GPUNet/PyTorch] Model padding is incorrect, layer naming goes against norms.

In terms of the naming / 'modelling interface', when dealing with models across frameworks, exporting, deploying, etc it's helpful to follow some norms in naming and these ones are quite...

[GPUNet/PyTorch] Model padding is incorrect, layer naming goes against norms.

@linnanwang thanks for the updates, will the models be retrained with a padding fix? As it stands, if the padding is fixed to follow a consistent scheme, the accuracy of...

[GPUNet/PyTorch] Model padding is incorrect, layer naming goes against norms.

@linnanwang that's looking much better, yes. Looks like the padding is fixed too? Only other comment: it's a bit odd that the stem/prologue and head/epilogue have different activations, was that...

[GPUNet/PyTorch] Model padding is incorrect, layer naming goes against norms.

@linnanwang yes, I can turn on/off the se for InvertedResidual per block in timm, whether it's identity or None, doesn't matter for the weight loading

Simulator performance degrades over time / drastically uneven step times

Note the vectored setup is based on OpenAI vectored environments and has been used in the same state with other simulators and absolutely no such issues were observed.

Simulator performance degrades over time / drastically uneven step times

Thanks, some helpful comments. So to summarize: 1. a high degree of variance in step times is expected depending on what's happening in the sim. Like mattias, I have also...