PIDM
fix checkpoint
- Make `use_reentrant=True` explicit, because it defaults to true when it is not assigned.
- Fix gradient checkpointing when it is used with dropout turned on. If `preserve_rng_state=False`, dropout will definitely not work: the forward pass is recomputed during backward with a different dropout mask, so gradients flow into the wrong input cells.
  This can be seen in a failed 300-epoch training run with `preserve_rng_state=False` and `use_checkpoint=True`.
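A minimal sketch of the dropout mismatch, not the code from this repository: `dropout_block` and `masks_agree` are hypothetical names used only for illustration. It runs a single dropout call through `torch.utils.checkpoint.checkpoint` and checks whether the elements kept in the forward output are the same elements that receive gradient in backward.

```python
import torch
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint


def dropout_block(x):
    # Hypothetical stand-in for a checkpointed layer that contains dropout.
    return F.dropout(x, p=0.5, training=True)


def masks_agree(preserve_rng_state):
    # Returns True when the dropout mask used in the forward pass matches
    # the mask used when the block is recomputed during backward.
    torch.manual_seed(0)
    x = torch.ones(1000, requires_grad=True)
    y = checkpoint(
        dropout_block,
        x,
        use_reentrant=True,
        preserve_rng_state=preserve_rng_state,
    )
    y.sum().backward()
    kept_in_forward = y.detach() != 0   # elements that survived dropout
    kept_in_backward = x.grad != 0      # elements that received gradient
    return bool(torch.equal(kept_in_forward, kept_in_backward))


if __name__ == "__main__":
    print("preserve_rng_state=True :", masks_agree(True))
    print("preserve_rng_state=False:", masks_agree(False))
```

With `preserve_rng_state=True` the RNG state is restored before recomputation, so both masks match. With `preserve_rng_state=False` the recomputation draws a fresh mask, so gradient reaches elements that were dropped in the forward pass and misses some that were kept.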
some samples at the end of one failed training: