Ross Wightman

Results: 510 comments by Ross Wightman

@isaaccorley already on the radar but haven't had a chance to come up with a design yet. See #617. I was going to rename the title of that but I'll...

@isaaccorley some help could be useful here, I need to ask torchvision team and figure out what the fx solution might look like, whether it has a chance of working...

@JarvisKevin mnv3 is a completely different situation. If you read the comments in that model definition, that architecture, by design, does not match other convnets due to its 'efficient head'...

@NProkoptsev the version here should match the official version that was eventually added to PyTorch. In terms of use, yeah, the pytorch ver is different than tf in terms of...

this is what I mean, 'like_tf' is what tf impl does

```python
def trunc_normal_(tensor, mean=0., std=1., a=-2., b=2., like_tf=False):
    if like_tf:
        _no_grad_trunc_normal_(tensor, 0, 1.0, a, b)
        tensor.mul_(std).add_(mean)
        return tensor
    ...
```
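To make the `like_tf` distinction concrete, here is a minimal pure-Python sketch, not the timm or PyTorch implementation. It assumes the difference is where the bounds apply: in the tf-like path `a` and `b` truncate a standard normal before scaling by `std` and shifting by `mean`, while in the other path `a` and `b` bound the final values directly. The function name `sample_trunc_normal` and the rejection-sampling approach are illustrative assumptions only.

```python
import random


def sample_trunc_normal(mean=0.0, std=1.0, a=-2.0, b=2.0, like_tf=False, rng=random):
    """Draw one truncated-normal sample by rejection (illustration only).

    Hypothetical helper: real implementations operate on tensors in place
    and avoid rejection loops. Rejection can be very slow when [a, b]
    lies far out in the tails of the distribution being sampled.
    """
    if like_tf:
        # tf-like: truncate N(0, 1) to [a, b], then scale and shift.
        # The final value lies in [mean + a * std, mean + b * std].
        while True:
            x = rng.gauss(0.0, 1.0)
            if a <= x <= b:
                return x * std + mean
    # Otherwise: sample N(mean, std) and treat a, b as absolute bounds
    # on the final value.
    while True:
        x = rng.gauss(mean, std)
        if a <= x <= b:
            return x
```

With a small `std` such as 0.02 and the default bounds, the tf-like path confines samples to roughly ±0.04, whereas the absolute-bounds path leaves the distribution effectively untruncated because ±2 is about 100 standard deviations away.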

@lixiaolx the accuracy is wrong before quantization as well so not sure what your setup is, it should be between 80.3 and 80.4 for default image pipeline.

@lixiaolx have you tried V2 torchvision weights? they have a training recipe that's closer to the current ones in timm https://pytorch.org/vision/stable/models.html#initializing-pre-trained-models ... I feel there is a chance that training...

Like #68, this would be nice to have ... more useful than advprop. Needs some interesting larger datasets to work with, had thoughts on OpenImages, have yet to find time...

@hal-314 I'm hesitant to add it because I haven't seen anything indicating it'd be better than heavy EMA + avg_checkpoints and a decent LR schedule. I have yet to see...

@ardeal unlikely any time soon, you're probably already aware that they are here, and were implemented using some parts / code from timm via deit/swin https://github.com/microsoft/SPACH they would be a...