I'm wondering about this as well. 1.414 is approximately sqrt(2), which is the length of the diagonal of a unit square. Perhaps they are normalizing by the diagonal here? But their doc (
I have the same problem. I think it fails when the input size is too big.
I see your point. It would be bad to lose a checkpoint one wants. I train my models on a Slurm server that often preempts and requeues my jobs, so...
Thanks! I will try writing a callback myself.