Results: 2 issues of Masahiro Wada
## What does this PR do? This PR provides some workarounds for using DeepSpeed during finetuning. Currently, DeepSpeed cannot work with pytorch-lightning completely, because its parameter loading and storing...
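As background for the kind of loading/storing mismatch the description points at (this sketch is not taken from the PR itself): under DeepSpeed ZeRO, parameters are sharded across ranks, so a checkpoint saved by a Lightning run using the DeepSpeed strategy is a directory of shards rather than a single state dict. A minimal sketch using pytorch-lightning's existing `convert_zero_checkpoint_to_fp32_state_dict` utility; the checkpoint paths are hypothetical.

```python
# Sketch: consolidating a sharded DeepSpeed (ZeRO) checkpoint so it can be
# loaded for finetuning without DeepSpeed. All paths below are hypothetical.
import pytorch_lightning as pl
from pytorch_lightning.utilities.deepspeed import (
    convert_zero_checkpoint_to_fp32_state_dict,
)

# Training under the DeepSpeed strategy writes a *directory* of parameter
# and optimizer shards instead of a single-file state dict.
trainer = pl.Trainer(strategy="deepspeed_stage_2", accelerator="gpu", devices=2)
# trainer.fit(model)  # produces e.g. "epoch=0-step=100.ckpt/" (a directory)

# Merge the shards into one fp32 state dict that a plain checkpoint load
# (torch.load / LightningModule.load_from_checkpoint) can consume.
convert_zero_checkpoint_to_fp32_state_dict(
    "lightning_logs/version_0/checkpoints/epoch=0-step=100.ckpt",  # shard dir
    "consolidated.ckpt",                                           # output file
)
```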
## What does this PR do? If invalid parameters are passed as arguments, `_adjust_batch_size` cannot adjust the batch size correctly, so this PR adds code to raise a ValueError in that...
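To illustrate the validation pattern the description refers to: a sketch only. The real `_adjust_batch_size` lives in pytorch-lightning's batch-size tuner and takes the trainer plus tuning options, so the simplified signature and the specific checks below are assumptions, not the PR's actual code.

```python
# Hypothetical, simplified stand-in for the tuner's batch-size adjustment.
# The point is the fail-fast checks: invalid arguments raise ValueError
# instead of silently yielding a nonsensical batch size.
def _adjust_batch_size(batch_size: int, factor: float) -> int:
    """Scale batch_size by factor, validating both arguments first."""
    if batch_size < 1:
        raise ValueError(f"batch_size must be a positive int, got {batch_size}")
    if factor <= 0:
        raise ValueError(f"factor must be greater than 0, got {factor}")
    return max(1, int(batch_size * factor))
```

With checks like these, a call such as `_adjust_batch_size(32, factor=0)` fails immediately with a clear message rather than returning a batch size of 0.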
Labels: `trainer: tune`, `has conflicts`, `code quality`, `community`, `pl`