Results 2 issues of flycser

## 🐛 Bug [FloatingPointError: Minimum loss scale reached (0.0001).](https://github.com/facebookresearch/fairseq/issues/1529#issuecomment-567955547) I often hit this assertion when training in fp16. I found it may be avoided by modifying the way...

bug
needs triage

## 🐛 Bug When I ran a model in distributed mode (2 nodes, each with 2 GPUs) via hydra_train.py, Hydra could not accept arguments starting with "--", while...

bug
needs triage