Daya Khudia
Is this ready for review?
Do we clear the already-computed gradients if we run out of memory while backward is partially executed? PyTorch will do `+=` for such already-computed gradients on retry.
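For context, a minimal sketch (not tied to this PR's code) of the behavior being asked about: PyTorch accumulates into `.grad` with `+=` on every `backward()` call, so a retry after a partial backward double-counts any gradients that were already written, unless they are cleared first.

```python
import torch

w = torch.ones(3, requires_grad=True)

loss = (w * 2).sum()
loss.backward()          # w.grad is now [2., 2., 2.]

# Simulated "retry": a second backward accumulates instead of overwriting.
loss = (w * 2).sum()
loss.backward()          # w.grad is now [4., 4., 4.]

# Clearing before the retry avoids the double count:
w.grad = None            # or optimizer.zero_grad(set_to_none=True)
loss = (w * 2).sum()
loss.backward()          # w.grad is back to [2., 2., 2.]
```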
Are we going with `Mathematically Equivalent`? I can't think of a better short name, but something like `convergence preserving` would be more accurate.
Since others have already reviewed, I am just rubber-stamping it.
@andrewilyas Thanks a lot for the explanation. In https://github.com/mosaicml/composer, we have a set of data augmentation algorithms that can be applied flexibly at any time during training. The dataloader by that time...
Thanks. Does FFCV already respect the `is_active` flag?
@andrewilyas Thanks. No problem. I see how this can work for the transformations I write. I was wondering how to make it work for the existing FFCV transformations.
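To make the question concrete, here is a hypothetical sketch of what gating a user-written transform on an `is_active` flag could look like. This is not FFCV's actual `Operation` API; the wrapper class, flag name, and callable interface are all assumptions for illustration. The open question above is how to get the same toggling behavior out of FFCV's built-in transformations.

```python
import numpy as np

class ToggleableTransform:
    """Hypothetical wrapper: applies `transform` only while `is_active` is True,
    so a training algorithm can switch the augmentation on partway through."""

    def __init__(self, transform):
        self.transform = transform
        self.is_active = False  # flipped on externally by the training loop

    def __call__(self, batch: np.ndarray) -> np.ndarray:
        if not self.is_active:
            return batch        # no-op while inactive
        return self.transform(batch)

# Usage: start inactive, enable once some epoch is reached.
flip = ToggleableTransform(lambda x: x[..., ::-1])
# ... later, when the algorithm activates:
flip.is_active = True
```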
Thanks @andrewilyas, eagerly waiting for the next release, as it has some of the other improvements as well.
> This looks good! Do you mind fixing the merge conflict so that I can add it to `v1.0.0`?

@GuillaumeLeclerc Thanks. Just updated.
IMO it's ok to return "something" from apply methods if it makes sense (e.g., the number of layers replaced, as @mvpatel2000 suggested), but not the model if the apply method is modifying the...
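As a concrete example of that pattern (a sketch with a hypothetical helper name, not Composer's actual API): an apply-style function that does module surgery on the model in place and returns the replacement count rather than the model, so callers aren't misled into thinking a new model was created.

```python
import torch.nn as nn

def apply_gelu_surgery(model: nn.Module) -> int:
    """Hypothetical apply-style method: swap ReLU modules for GELU in place
    and return the number of layers replaced (not the model itself)."""
    num_replaced = 0
    for name, child in model.named_children():
        if isinstance(child, nn.ReLU):
            setattr(model, name, nn.GELU())   # in-place surgery on the parent
            num_replaced += 1
        else:
            num_replaced += apply_gelu_surgery(child)  # recurse into submodules
    return num_replaced

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2), nn.ReLU())
print(apply_gelu_surgery(model))  # -> 2; `model` now uses GELU activations
```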