Fartash Faghri

2 issues by Fartash Faghri

Hi, I think the way `optim.sgd` is called in the examples has issues with momentum. The `config` passed to `optim.sgd(_,_,config,state)` should be kept from one training iteration to another when no...
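To illustrate the kind of problem this issue describes, here is a minimal Python sketch, not the Torch code itself. It assumes the momentum buffer is stored in whichever table is passed as `state`, falling back to `config` when `state` is omitted. The helper names `sgd_step` and `train`, the scalar parameter, and the simplified update rule (no dampening, no Nesterov) are all hypothetical and only serve the illustration.

```python
def sgd_step(grad_fn, x, config, state=None):
    """Simplified, hypothetical stand-in for an optim.sgd-style call.

    Assumption: the momentum buffer ("dfdx") lives in `state`,
    and `config` doubles as the state when no state is given.
    """
    state = config if state is None else state
    lr = config.get("learningRate", 1e-3)
    mom = config.get("momentum", 0.0)
    g = grad_fn(x)
    if mom != 0:
        if "dfdx" not in state:
            state["dfdx"] = g                          # first call: buffer starts at the gradient
        else:
            state["dfdx"] = mom * state["dfdx"] + g    # later calls: accumulate momentum
        g = state["dfdx"]
    return x - lr * g


def train(steps, reuse_config):
    """Minimise f(x) = x^2 and return the final x and the last config used."""
    grad_fn = lambda x: 2.0 * x
    x = 1.0
    config = {"learningRate": 0.1, "momentum": 0.9}
    for _ in range(steps):
        if not reuse_config:
            # Rebuilding the table every iteration discards the buffer,
            # so each step behaves like plain SGD without momentum.
            config = {"learningRate": 0.1, "momentum": 0.9}
        x = sgd_step(grad_fn, x, config)
    return x, config


x_fresh, cfg_fresh = train(3, reuse_config=False)
x_reused, cfg_reused = train(3, reuse_config=True)
print(x_fresh, x_reused)
```

Under these assumptions the two runs diverge after the first step: when the table is rebuilt each iteration, the buffer never accumulates and the momentum setting has no effect, whereas keeping one table across iterations lets the buffer carry over.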

I observe a potential memory leak when training with OpenCLIP on a WebDataset. The memory usage is flat during an epoch and then increases at the beginning of the next...

bug