DeepHyperX
Why is the loss value sometimes NaN at the beginning of training?
Describe the bug A clear and concise description of what the bug is.
To Reproduce Steps to reproduce the behavior (e.g. the command that you used).
Expected behavior A clear and concise description of what you expected to happen.
Desktop (please complete the following information):
- OS: [e.g. Linux/Windows]
- CUDA : yes/no
With what parameters? Can you give some more details to reproduce this problem?