pytorch_optimizer
optimizer & lr scheduler & loss function collections in PyTorch
Hi, thank you so much for your repo. I am using the SAM optimizer but I am facing this error; how can I fix it? RuntimeError: [-] Sharpness Aware Minimization (SAM) requires...
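SAM performs two forward/backward passes per update, so a plain single `optimizer.step()` is not enough. A minimal sketch of the two-step pattern is below, assuming the `SAM` wrapper exposes `first_step`/`second_step` as in the upstream SAM reference implementation (names and signatures may differ in your installed version):

```python
import torch
from pytorch_optimizer import SAM  # assumes SAM is exported at the package root

model = torch.nn.Linear(10, 2)
criterion = torch.nn.CrossEntropyLoss()

# SAM wraps a base optimizer (SGD here) and perturbs the weights before the second pass.
optimizer = SAM(model.parameters(), base_optimizer=torch.optim.SGD, lr=0.1, momentum=0.9)

x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))

# First forward/backward: gradients at the current weights, then climb to the perturbed point.
criterion(model(x), y).backward()
optimizer.first_step(zero_grad=True)

# Second forward/backward: gradients at the perturbed weights, then the actual update.
criterion(model(x), y).backward()
optimizer.second_step(zero_grad=True)
```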
Hi, I just discovered your repo and I would like to try it to fine-tune my ParlAI BlenderBot2 model (see https://github.com/facebookresearch/ParlAI). However, I am running the model in FP16 precision...
## Paper or Code

REX LR scheduler, from https://arxiv.org/abs/2107.04197. Implementation is based on https://github.com/Nerogar/OneTrainer/blob/2c6f34ea0838e5a86774a1cf75093d7e97c70f03/modules/util/lr_scheduler_util.py#L66
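For reference, a minimal sketch of a REX-style schedule is shown below, using the parametrization multiplier(z) = (1 − z) / ((1 − d) + d · (1 − z)) with progress z = step / total_steps. Both the choice d = 0.9 and the `LambdaLR` wiring are assumptions for illustration, not the linked implementation:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

def rex_lambda(total_steps: int, d: float = 0.9):
    # REX-style profile: multiplier = (1 - z) / ((1 - d) + d * (1 - z)), z in [0, 1].
    # d = 0.9 is an assumed knee parameter; the paper/implementation may use another value.
    def fn(step: int) -> float:
        z = min(step / max(total_steps, 1), 1.0)
        return (1.0 - z) / ((1.0 - d) + d * (1.0 - z))
    return fn

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scheduler = LambdaLR(optimizer, lr_lambda=rex_lambda(total_steps=1000))

for step in range(1000):
    optimizer.step()   # training step elided
    scheduler.step()   # decay the lr along the REX curve
```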
## SAM as an Optimal Relaxation of Bayes

> Sharpness-aware minimization (SAM) and related adversarial deep-learning methods can drastically improve generalization, but their underlying mechanisms are not yet fully understood....
In `pytorch-optimizer` v3, loss functions will be added, so the optimizers, LR schedulers, and loss functions will finally all be in one package.

## Feature

- [x] support at least...
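Once loss functions land, an all-in-one workflow might look roughly like the sketch below. `load_optimizer` resolving an optimizer by name is how the package works today; the availability and names of v3 loss classes are assumptions, so a built-in PyTorch loss stands in for them here:

```python
import torch
from pytorch_optimizer import load_optimizer  # looks an optimizer class up by name

model = torch.nn.Linear(16, 1)

# Optimizer resolved by name; 'adamp' is one of the optimizers the package ships.
optimizer = load_optimizer('adamp')(model.parameters(), lr=1e-3)

# LR scheduler: plain PyTorch here; the package also ships its own schedulers.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

# Loss: a v3 loss class (e.g. a focal loss) would slot in here; its exact exported
# name is an assumption, so a standard BCE-with-logits loss is used instead.
criterion = torch.nn.BCEWithLogitsLoss()

x, y = torch.randn(8, 16), torch.randint(0, 2, (8, 1)).float()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
scheduler.step()
```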
#params = 151111638
#non emb params = 41066400
| epoch 1 step 50 | 50 batches | lr 0.06 | ms/batch 1378.43 | loss 7.85 | ppl 2570.784
| epoch...
## Paper and Code

Paper: [Memory Efficient Optimizers with 4-bit States](https://arxiv.org/abs/2309.01507)
Code: https://github.com/thu-ml/low-bit-optimizers/blob/main/lpmm/optim/optimizer.py
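The memory saving comes from storing the optimizer's moment tensors in 4 bits with per-block scales. A rough, self-contained sketch of block-wise 4-bit quantization is below; it uses a simple linear codebook, whereas the paper and the linked lpmm code use tuned non-linear quantization maps, so treat this as an illustration of the idea only:

```python
import torch

# Block-wise 4-bit quantization of an optimizer state tensor:
# per-block absmax scaling plus a linear 16-level codebook.

def quantize_4bit(state: torch.Tensor, block_size: int = 128):
    flat = state.flatten()
    pad = (-flat.numel()) % block_size
    flat = torch.cat([flat, flat.new_zeros(pad)])
    blocks = flat.view(-1, block_size)
    scales = blocks.abs().amax(dim=1, keepdim=True).clamp(min=1e-12)
    # Map each value to one of 15 signed levels in -7..7, stored as 0..14.
    codes = (torch.round(blocks / scales * 7).clamp(-7, 7) + 7).to(torch.uint8)
    # Pack two 4-bit codes per byte.
    packed = (codes[:, 0::2] << 4) | codes[:, 1::2]
    return packed, scales, state.shape, pad

def dequantize_4bit(packed, scales, shape, pad):
    hi = (packed >> 4).to(torch.int8) - 7
    lo = (packed & 0x0F).to(torch.int8) - 7
    codes = torch.stack([hi, lo], dim=2).flatten(1)   # restore interleaved order
    flat = (codes.float() / 7) * scales
    flat = flat.flatten()
    if pad:
        flat = flat[:-pad]
    return flat.view(shape)

exp_avg = torch.randn(1000)                       # e.g. Adam's first-moment state
packed, scales, shape, pad = quantize_4bit(exp_avg)
restored = dequantize_4bit(packed, scales, shape, pad)
print((exp_avg - restored).abs().max())           # small per-block quantization error
```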
I just swapped out the Nero optimizer in my Lightning AI loop and gave the new Shampoo a try. There is something going on with it, as this card is typically...
https://arxiv.org/abs/2211.09760

> While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach...