Ross Wightman comments

Results: 510 comments by Ross Wightman

[FEATURE] Vision Transformer Feature Extraction

@isaaccorley already on the radar but haven't had a chance to come up with a design yet. See #617. I was going to rename the title of that but I'll...

[FEATURE] Vision Transformer Feature Extraction

@isaaccorley some help could be useful here, I need to ask torchvision team and figure out what the fx solution might look like, whether it has a chance of working...

[FEATURE] Vision Transformer Feature Extraction

@JarvisKevin mnv3 is a completely different situation. If you read the comments in that model definition, that architecture, by design, does not match other convnets due to its 'efficient head'...

[BUG] Truncated normal initialization

@NProkoptsev the version here should match the official version that was eventually added to PyTorch. In terms of use, yeah, the PyTorch version is different than TF in terms of...

[BUG] Truncated normal initialization

this is what I mean, 'like_tf' is what the TF impl does

```python
def trunc_normal_(tensor, mean=0., std=1., a=-2., b=2., like_tf=False):
    if like_tf:
        _no_grad_trunc_normal_(tensor, 0, 1.0, a, b)
        tensor.mul_(std).add_(mean)
        return tensor
    ...
```
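
To make the difference concrete, here is a minimal stdlib-only sketch. It is not the timm or PyTorch code: the helper names (`trunc_normal_sample`, `trunc_normal_absolute`, `trunc_normal_like_tf`) are hypothetical, and it uses simple rejection sampling on Python lists rather than tensors. The only assumption carried over from the snippet above is the `like_tf` behaviour: truncate a standard normal to `[a, b]`, then multiply by `std` and add `mean`, so the bounds act in units of `std`. The other convention treats `a` and `b` as absolute cutoffs.

```python
import random


def trunc_normal_sample(mean, std, a, b, rng):
    """Rejection-sample one value from N(mean, std^2) restricted to [a, b]."""
    while True:
        x = rng.gauss(mean, std)
        if a <= x <= b:
            return x


def trunc_normal_absolute(n, mean=0.0, std=1.0, a=-2.0, b=2.0, seed=0):
    """a and b are absolute cutoff values, independent of std."""
    rng = random.Random(seed)
    return [trunc_normal_sample(mean, std, a, b, rng) for _ in range(n)]


def trunc_normal_like_tf(n, mean=0.0, std=1.0, a=-2.0, b=2.0, seed=0):
    """Truncate a standard normal to [a, b], then scale by std and shift by mean."""
    rng = random.Random(seed)
    return [trunc_normal_sample(0.0, 1.0, a, b, rng) * std + mean for _ in range(n)]


# With a small std, the two conventions give very different effective bounds.
absolute = trunc_normal_absolute(10_000, std=0.02)
like_tf = trunc_normal_like_tf(10_000, std=0.02)
print("absolute bounds, max |x|:", max(abs(v) for v in absolute))
print("like_tf bounds,  max |x|:", max(abs(v) for v in like_tf))
```

With `std=0.02` and the default `a=-2, b=2`, the absolute-bounds variant is effectively an untruncated normal (the cutoffs sit 100 standard deviations out), while the `like_tf` variant never exceeds `2 * std = 0.04` in magnitude. Rejection sampling is only reasonable here because the accepted region holds most of the probability mass; it would be slow for bounds far in the tail.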

[BUG] ResNet50 model with wrong precision after quantization with TensorRT INT8 PTQ

@lixiaolx the accuracy is wrong before quantization as well, so I'm not sure what your setup is. It should be between 80.3 and 80.4 for the default image pipeline.

[BUG] ResNet50 model with wrong precision after quantization with TensorRT INT8 PTQ

@lixiaolx have you tried V2 torchvision weights? they have a training recipe that's closer to the current ones in timm https://pytorch.org/vision/stable/models.html#initializing-pre-trained-models ... I feel there is a chance that training...

Add noisystudent train method

Like #68, this would be nice to have ... more useful than advprop. Needs some interesting larger datasets to work with, had thoughts on OpenImages, have yet to find time...

Add SWA support

@hal-314 I'm hesitant to add it because I haven't seen anything indicating it'd be better than heavy EMA + avg_checkpoints and a decent LR schedule. I have yet to see...

[FEATURE] Implement SPACH models (A Battle of Network Structures)

@ardeal unlikely any time soon, you're probably already aware that they are here, and were implemented using some parts / code from timm via deit/swin https://github.com/microsoft/SPACH they would be a...