imaginaire
UNIT/MUNIT training gets stuck after FID computation
I followed the official settings but substituted my own dataset, and training hangs right after the FID computation. I also found that if I replace the None return value of the FID function for non-master processes (https://github.com/NVlabs/imaginaire/blob/c6f74845c699c58975fd12b778c375b72eb00e8d/imaginaire/evaluation/fid.py#L66) with a fixed float, the hang goes away, whereas an integer (Long) value such as -1984 does not help. Judging from the error output, there appears to be a reducer that sums the FID values from all processes and returns their average.
Currently, I let every process compute the FID, like this:
    if is_master() or True:
        fid = _calculate_frechet_distance(
            fake_act, real_act)["FID"]
        if return_act:
            return fid, real_act, fake_act
        else:
            return fid
    elif return_act:
        return None, None, None
    else:
        return None
Would this cause any problems? For reference, my environment is Python 3.6 with torch 1.7.0.
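To make the suspected failure mode concrete, here is a minimal pure-Python sketch of an averaging reducer. It is only a simulation under stated assumptions, not imaginaire's code: `mean_reduce` is a hypothetical stand-in, the FID value 12.5 is made up, and a missing value raises an error here where the real training run appears to hang.

```python
from typing import Optional, Sequence


def mean_reduce(per_rank_values: Sequence[Optional[float]]) -> float:
    """Toy stand-in for a cross-process mean reducer (hypothetical helper).

    Every rank has to contribute a number; a missing value is reported
    as an error here instead of blocking.
    """
    missing = [rank for rank, v in enumerate(per_rank_values) if v is None]
    if missing:
        raise TypeError(f"ranks {missing} returned None; nothing to reduce")
    return sum(per_rank_values) / len(per_rank_values)


# Original behaviour: only rank 0 (master) returns a number.
master_only = [12.5, None, None, None]

# Hot-fix behaviour: every rank computes and returns the same FID.
all_ranks = [12.5, 12.5, 12.5, 12.5]

# The mean of identical values is that value, so the logged FID is unchanged.
averaged = mean_reduce(all_ranks)
```

If the reducer really does average across ranks, the hot-fix should leave the reported FID unchanged as long as every rank computes it from the same activations. That last condition is an assumption I have not verified.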
Have you found the solution? I encountered the same problem.
Upgrading torch to 1.8.1 solved it for me. Alternatively, just try the hot-fix code above.