Ross Wightman
@SimJeg in addition to @rom1504's comments, I will warn you that increasing the resolution in a ViT model results in a SIGNIFICANT increase in compute & memory consumption. Also, with...
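A rough back-of-the-envelope sketch of why raising resolution hits ViT so hard (the patch size and resolutions below are illustrative assumptions, not figures from the thread): token count grows quadratically with resolution, and self-attention cost grows quadratically with token count.

```python
# Sketch: how ViT self-attention cost scales with input resolution.
# Assumes a plain ViT with square images and 16x16 patches (hypothetical setup).

def vit_tokens(resolution: int, patch: int = 16) -> int:
    """Number of patch tokens (class token excluded)."""
    return (resolution // patch) ** 2

def attention_cost(resolution: int, patch: int = 16) -> int:
    """Relative per-layer self-attention cost, ~O(tokens^2)."""
    return vit_tokens(resolution, patch) ** 2

base = attention_cost(224)    # 196 tokens
double = attention_cost(448)  # 784 tokens
print(double / base)          # 16.0 -- 2x resolution => ~16x attention compute
```

Memory for the attention matrices scales the same way, which is why the increase is significant rather than linear.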
@SimJeg thanks for the report. I've got some convnext fiddling in the works, so maybe some more 'resizable' models in the future...
@rom1504 after thinking about this a while back, I feel that --dataset-resampled to create your own 'virtual' epochs is the best approach; adding checkpoint support in the middle of...
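A minimal sketch of the 'virtual epoch' idea (the function name, shard list, and counts are hypothetical, not timm or webdataset internals): instead of iterating every shard exactly once per epoch, sample shards with replacement and declare an epoch to be a fixed sample budget, so epoch and checkpoint boundaries no longer depend on a full pass over the dataset.

```python
import random

def virtual_epoch(shards, samples_per_epoch, samples_per_shard, seed=0):
    """Yield shard names sampled with replacement until the virtual
    epoch's sample budget is reached (a sketch of resampled behavior)."""
    rng = random.Random(seed)
    seen = 0
    while seen < samples_per_epoch:
        yield rng.choice(shards)
        seen += samples_per_shard

shards = [f"shard-{i:04d}.tar" for i in range(8)]  # hypothetical shard list
epoch = list(virtual_epoch(shards, samples_per_epoch=1000, samples_per_shard=100))
print(len(epoch))  # 10 shard visits per virtual epoch, regardless of dataset size
```

Because the epoch length is a constant you choose, resuming from a mid-training checkpoint only needs the sample count and RNG seed, not a position within a fixed shard ordering.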
@SamuelGabriel thanks for bringing TrivialAugment to my attention, I took a quick look through the code and paper. There is a lot of overlap with RandAugment, especially with some of...
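For readers unfamiliar with the comparison: TrivialAugment applies a single uniformly chosen op at a uniformly random strength per image, whereas RandAugment applies n ops at a fixed magnitude m, which is where the overlap comes from. A toy sketch (the op set and value ranges are illustrative, not the paper's ops):

```python
import random

# Toy op set: each op maps (value, strength in [0, 1]) -> value.
OPS = {
    "brightness": lambda x, s: x * (1 + s),
    "contrast":   lambda x, s: x * (1 - 0.5 * s),
    "identity":   lambda x, s: x,
}

def trivial_augment(x, rng):
    """TrivialAugment: one uniformly random op at one uniformly
    random strength, with no tunable hyper-parameters."""
    name = rng.choice(list(OPS))
    return OPS[name](x, rng.random())

rng = random.Random(42)
out = trivial_augment(1.0, rng)
print(0.5 <= out <= 2.0)  # output stays within the toy ops' range
```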
@SamuelGabriel I don't believe you've shown that TA is 'hyper-parameter free'; the experiments in the paper max out at ResNet-50 on ImageNet with fairly minor improvements. Being able to adjust...
@SamuelGabriel yes, something like the V2-M using the same hparams would be an interesting data point. m=6 is magnitude 6; the # of layers is n, which is 5 there. The...
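To make the m/n notation concrete (a hedged sketch: the config string mirrors timm's 'rand-mX-nY' convention, but the parser and op loop here are illustrative, not timm's implementation): n is how many randomly chosen ops are applied per image, and m is the shared magnitude every op uses, on a 0-10 scale.

```python
import random
import re

def parse_randaug(cfg: str):
    """Parse a 'rand-m6-n5'-style string into (magnitude, num_layers)."""
    m = int(re.search(r"m(\d+)", cfg).group(1))
    n = int(re.search(r"n(\d+)", cfg).group(1))
    return m, n

def rand_augment(x, cfg, ops, rng):
    """Apply n randomly chosen ops, each at the fixed magnitude m."""
    m, n = parse_randaug(cfg)
    for _ in range(n):
        op = rng.choice(ops)
        x = op(x, m / 10.0)  # map the 0-10 magnitude to a [0, 1] strength
    return x

print(parse_randaug("rand-m6-n5"))  # (6, 5): magnitude 6, 5 layers
```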
@tgisaturday thanks for the PR, I'm going to leave it here for now (unmerged). There are too many duplications of loader, model, and layer code, etc. for me to merge this...
@tgisaturday or anyone who reads this... With the beta FX feature in PyTorch 1.8, a better approach to post-quantization appears to be on the horizon; I'd be open to a...
> FYI I intend to review (can't set myself as a reviewer)

Seems I can't add you as a formal reviewer either; it might require the reviewer to be added as a collaborator....
@csarofeen @kevinstephano @xwang233 putting a few comments down here that relate to the whole PR. One of the reasons I haven't put time into exploring the graph replay in train...