Ross Wightman
@AmbiTyga that adds a significant amount of non-trivial code to the base model for a fairly specific feature, considering that there are now vit/deit, pit, tnt, swin, soon cait and...
I should also add that I do have plans to add feature extraction for the vit networks, like I have for the convnets, where activations of internal transformer blocks can...
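The kind of internal-activation extraction described here can be sketched with PyTorch forward hooks. This is a hedged illustration on a toy block stack, not timm's actual feature-extraction API; `TinyViT` and `extract_features` are hypothetical names invented for the example.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a ViT: a stack of "blocks" (not timm's real model)
class TinyViT(nn.Module):
    def __init__(self, dim=32, depth=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim), nn.GELU())
            for _ in range(depth)
        )

    def forward(self, x):
        for blk in self.blocks:
            x = blk(x)
        return x

def extract_features(model, x, indices):
    """Capture the outputs of selected internal blocks via forward hooks."""
    feats, hooks = {}, []
    for i in indices:
        def make_hook(idx):
            def hook(mod, inp, out):
                feats[idx] = out.detach()
            return hook
        hooks.append(model.blocks[i].register_forward_hook(make_hook(i)))
    try:
        model(x)
    finally:
        for h in hooks:
            h.remove()  # always clean up hooks
    return feats

model = TinyViT()
feats = extract_features(model, torch.randn(2, 16, 32), [1, 3])
```

Hooks avoid touching the model's forward code at all, which is one way to keep this feature from adding "non-trivial code to the base model".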
@ngimel thanks for demonstrating this... it works, but like my prev impl, it is a hack. I actually just pulled my is_contiguous version out. It was causing too many problems,...
@ngimel a lot of people still use scripting for serving / export, not just performance. 'aot' vs 'torchscript' on 1.12 is interesting, they are still quite different in some cases...
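For context on the serving/export use of scripting mentioned here: `torch.jit.script` compiles a module to TorchScript, which can be saved and loaded without the original Python source. A minimal sketch (the model here is just a placeholder):

```python
import torch
import torch.nn as nn

# Any ordinary module; a trivial stand-in for a real network
model = nn.Sequential(nn.Linear(4, 4), nn.ReLU())

# Compile to TorchScript for serialization / deployment outside Python
scripted = torch.jit.script(model)
# scripted.save("model.pt")  # would produce a self-contained, loadable artifact

out = scripted(torch.randn(1, 4))
```

This is why scripting failures matter even when eager-mode performance is fine: a model that won't script can't be exported this way.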
I will add though, per the original torch issue w/ LN + axis... regardless of performance, not having a native norm layer that covers this use case (C-dim), without needing...
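The C-dim use case being described, normalizing an NCHW tensor over its channel axis, is typically worked around by permuting to NHWC so the channel dim is last, then calling the native layer norm. A hedged sketch of that workaround (the class name `LayerNorm2d` is illustrative, not necessarily the exact timm implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm2d(nn.LayerNorm):
    """LayerNorm over the channel dim (C) of an NCHW tensor.

    nn.LayerNorm only normalizes trailing dims, so we permute C to the
    last position, normalize, and permute back -- the exact kind of
    hack a native C-dim norm layer would avoid.
    """
    def forward(self, x):
        x = x.permute(0, 2, 3, 1)                       # NCHW -> NHWC
        x = F.layer_norm(x, self.normalized_shape,
                         self.weight, self.bias, self.eps)
        return x.permute(0, 3, 1, 2)                    # NHWC -> NCHW

x = torch.randn(2, 8, 4, 4)
y = LayerNorm2d(8)(x)
```

The permutes are where the contiguity and performance trouble discussed in these threads comes from.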
@csarofeen keep in mind, I'm likely working with something a bit older than you, I was doing some testing on 1.12 release (cuda 11.3) via torchscript and aot-autograd. If you're...
@ngimel @csarofeen I ran a whole lot of benchmark runs on both 3090 and V100. As you can see, it's messy, really messy, without a clear-cut win...
> We explicitly tested on 1.12 release, CC @ptrblck and @kevinstephano in case we were testing something slightly different. Definitely keep us posted, we're highly motivated to get our codegen...
There should probably be another location for 'nvfuser' + timm concerns, but I'll put this observation here for now. A number of torchscript + nvfuser failures are due to handling...
@ngimel @csarofeen I spent a bit more time hacking around with this, as I keep getting frustrated by the lack of performance of non-BN layers w/ PyTorch + GPU...
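The kind of comparison behind these complaints can be sketched with a rough micro-benchmark. This is not the actual harness used in the threads above, just a minimal CPU illustration; `bench` is a hypothetical helper, and GroupNorm(1, C) stands in here as a representative non-BN norm layer.

```python
import time
import torch
import torch.nn as nn

def bench(layer, x, iters=20):
    """Average forward time in seconds (CPU; on GPU you would need
    torch.cuda.synchronize() around the timed region)."""
    for _ in range(3):      # warmup
        layer(x)
    t0 = time.perf_counter()
    for _ in range(iters):
        layer(x)
    return (time.perf_counter() - t0) / iters

x = torch.randn(8, 64, 32, 32)
bn = nn.BatchNorm2d(64).eval()
# num_groups=1 normalizes each sample over all of C,H,W (a LayerNorm-like,
# non-BN alternative for NCHW inputs)
gn = nn.GroupNorm(1, 64)

with torch.no_grad():
    bn_t = bench(bn, x)
    gn_t = bench(gn, x)
```

Actual numbers depend heavily on device, dtype, and whether a fuser (nvfuser, etc.) kicks in, which is exactly why the results above were "messy".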