Ross Wightman
@csarofeen yes we're normalizing across C in the NCHW tensor. Thanks for the insight. At a high level, it was hard for me to fathom how the end results could...
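For reference, a minimal sketch of what normalizing across C in an NCHW tensor looks like (the actual timm variants being tested differ in implementation details):
```
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm2d(nn.LayerNorm):
    """LayerNorm over the channel dim of an NCHW tensor (minimal sketch)."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # move C last so F.layer_norm normalizes over it, then move it back
        x = x.permute(0, 2, 3, 1)
        x = F.layer_norm(x, self.normalized_shape, self.weight, self.bias, self.eps)
        return x.permute(0, 3, 1, 2)
```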
@ngimel @csarofeen so, a quick check of apex LN in the ResNet50 case (which is quite a bit worse in my measurements than many of the hybrid cnn-transformer models). It's...
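Roughly the kind of micro-benchmark I mean here (a sketch; sizes are just an example, the apex import is optional and assumes apex is installed):
```
import torch
from torch import nn

# example NHWC-shaped activation (already permuted), roughly ResNet50 stage 1
x = torch.randn(64, 56, 56, 256, device='cuda', dtype=torch.half)

def bench(mod, iters=100):
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        mod(x)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # ms per call

print('native LN:', bench(nn.LayerNorm(256).cuda().half()))
try:
    from apex.normalization import FusedLayerNorm
    print('apex fused LN:', bench(FusedLayerNorm(256).cuda().half()))
except ImportError:
    print('apex not installed')
```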
FYI my resnet50 test case for ln was a quick hack in resnet.py
```
@register_model
def resnet50_ln(pretrained=False, **kwargs):
    from .layers.norm import LayerNormExp2d, LayerNormExpNg2d, LayerNorm2d  # different LN experiments
    model_args = ...
```
A difference this big isn't adding up. Also, Natalia's promising fusion codegen tests were done in float16, while all of mine were in AMP (since that's how I train). And looking...
> Yes, layer_norm is force-cast to fp32 by amp (tbh, I don't know if it's strictly necessary or is it out of abundance of caution, I've heard some stories where...
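A quick way to see that behavior (a sketch assuming a recent PyTorch with `torch.autocast`):
```
import torch
import torch.nn.functional as F

x = torch.randn(8, 196, 768, device='cuda', dtype=torch.float16)
w = torch.randn(768, 768, device='cuda', dtype=torch.float16)

with torch.autocast('cuda', dtype=torch.float16):
    y_mm = x @ w                    # matmul runs in fp16 under autocast
    y_ln = F.layer_norm(x, (768,))  # layer_norm is run in fp32 under autocast
print(y_mm.dtype, y_ln.dtype)       # torch.float16 torch.float32
```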
> Sorry, wouldn't `_cast_if_autocast_enabled` cast all inputs to fp32? I couldn't find this function.

@ngimel it casts the args to `get_autocast_gpu_dtype()`.
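Roughly, the helper does something like this (a sketch of the described behavior, not the exact apex code):
```
import torch

def _cast_if_autocast_enabled(*args):
    # sketch: when autocast is active, cast floating-point tensors to the
    # autocast GPU dtype (fp16 or bf16); otherwise pass args through untouched
    if not torch.is_autocast_enabled():
        return args
    dtype = torch.get_autocast_gpu_dtype()
    return tuple(
        a.to(dtype) if isinstance(a, torch.Tensor) and a.is_floating_point() else a
        for a in args
    )
```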
@wuye9036 that is a known issue. I don't run into it frequently because I rarely run LR plateau schedules long enough to care much about resume (usually...
By the hack I mean editing line 233 in main to `lr_scheduler.step(start_epoch, metric=-100)`, or the opposite if your metric scale is reversed.
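In context, the workaround looks roughly like this (a sketch against timm's PlateauLRScheduler-style `step(epoch, metric=...)` signature; -100 just acts as a sentinel that won't register as an improvement when higher is better):
```
# after resuming a checkpoint and recovering start_epoch
if lr_scheduler is not None and start_epoch > 0:
    # flip the sign (e.g. a large positive value) if lower is better for your metric
    lr_scheduler.step(start_epoch, metric=-100)
```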
@Epiphqny yes, video is going to become a focus soon. I'm working on collecting some datasets and will start building/experimenting with model architectures and data loading/augmentation pipelines soonish. I have...
@tmabraham thanks, might take you up on that. Currently thinking through the abstractions, trying to hide most of the cuda + distributed config vs xla + distributed config without making too...
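Very roughly, the kind of abstraction I'm thinking about (a sketch; names here are hypothetical, not a final API, and the XLA branch assumes torch_xla is installed):
```
from dataclasses import dataclass
import torch

@dataclass
class DeviceEnv:
    device: torch.device
    world_size: int = 1
    local_rank: int = 0
    distributed: bool = False

def initialize_device(xla: bool = False) -> DeviceEnv:
    if xla:
        import torch_xla.core.xla_model as xm
        ws = xm.xrt_world_size()
        return DeviceEnv(device=xm.xla_device(), world_size=ws,
                         local_rank=xm.get_ordinal(), distributed=ws > 1)
    # cuda / cpu fallback; torch.distributed init left out for brevity
    return DeviceEnv(device=torch.device('cuda' if torch.cuda.is_available() else 'cpu'))
```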