Results: 2 issues by Masahiro Wada

## What does this PR do?

This PR provides some workarounds for using DeepSpeed in finetuning. DeepSpeed does not fully work with pytorch-lightning because its parameter loading and storing...
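
A minimal sketch of the kind of workaround this description points at, assuming the problem is that DeepSpeed ZeRO shards parameters across ranks, so a Lightning checkpoint directory cannot be reloaded as a plain state dict. The helper `convert_zero_checkpoint_to_fp32_state_dict` is a utility available in recent pytorch-lightning versions; its use here, along with the paths and the `MyModel` class, is an illustration of one possible approach, not necessarily the change in this PR.

```python
# Hedged sketch: consolidate a sharded DeepSpeed ZeRO checkpoint into a single
# fp32 checkpoint so finetuned weights can be reloaded without DeepSpeed.
# The paths and `MyModel` are hypothetical.
from pytorch_lightning.utilities.deepspeed import (
    convert_zero_checkpoint_to_fp32_state_dict,
)

# Directory produced by a DeepSpeed-enabled Trainer's checkpointing.
sharded_ckpt_dir = "checkpoints/epoch=2-step=1000.ckpt"
single_file_ckpt = "checkpoints/finetuned_fp32.ckpt"

# Merge the per-rank ZeRO shards into one ordinary checkpoint file.
convert_zero_checkpoint_to_fp32_state_dict(sharded_ckpt_dir, single_file_ckpt)

# The consolidated checkpoint can then be loaded like any Lightning checkpoint:
# model = MyModel.load_from_checkpoint(single_file_ckpt)
```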

## What does this PR do?

If wrong parameters are given as arguments, `_adjust_batch_size` cannot adjust the batch size correctly, so this PR adds some code to raise ValueErrors in that...
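
A minimal sketch of the kind of validation described, assuming a helper loosely shaped like Lightning's batch-size scaling code; the parameter names, supported modes, and function body below are illustrative rather than the PR's actual diff.

```python
# Hedged sketch: reject invalid arguments up front instead of silently
# "adjusting" the batch size with meaningless values. The signature only
# loosely mirrors Lightning's batch-size tuner helper and is not the real code.
def _adjust_batch_size(
    batch_size: int,
    factor: float = 2.0,
    mode: str = "power",
) -> int:
    # Only these search strategies are assumed to be supported by the tuner.
    if mode not in ("power", "binsearch"):
        raise ValueError(f"mode should be 'power' or 'binsearch', got {mode!r}")
    # A non-positive batch size or scaling factor can never be adjusted sensibly.
    if batch_size < 1:
        raise ValueError(f"batch_size should be a positive integer, got {batch_size}")
    if factor <= 0:
        raise ValueError(f"factor should be positive, got {factor}")
    return int(batch_size * factor)


# Example: an unsupported mode now fails fast with a clear error.
# _adjust_batch_size(16, mode="linear")  -> ValueError
```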

Labels: trainer: tune, has conflicts, code quality, community, pl