
fix checkpoint

Open gameofdimension opened this issue 8 months ago • 0 comments

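  1. Make use_reentrant=True explicit, because it defaults to True when it is not assigned.
  2. Fix gradient checkpointing when it is used with dropout turned on. If preserve_rng_state=False, dropout cannot work correctly: the segment is recomputed during the backward pass with a different RNG state, so the regenerated dropout mask differs from the one used in the forward pass and gradients flow into the wrong input cells.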
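
The effect in item 2 can be illustrated without PyTorch. The following is only a toy model that uses Python's `random` module, not the actual `torch.utils.checkpoint` implementation; `dropout_mask` and `checkpointed_masks` are hypothetical helper names made up for this sketch. It assumes that preserving the RNG state amounts to saving the generator state before the forward pass and restoring it before the recomputation.

```python
import random


def dropout_mask(rng, n, p=0.5):
    """Draw a keep/drop mask of length n from rng (1 = keep, 0 = drop)."""
    return [0 if rng.random() < p else 1 for _ in range(n)]


def checkpointed_masks(seed, n, preserve_rng_state):
    """Toy model of a checkpointed dropout layer (hypothetical helper).

    Returns the mask drawn in the forward pass and the mask drawn again
    when the segment is recomputed for the backward pass.
    """
    rng = random.Random(seed)
    # Assumption: preserving the RNG state means stashing it before forward.
    saved_state = rng.getstate() if preserve_rng_state else None

    forward_mask = dropout_mask(rng, n)

    # Stand-in for other random ops that run before backward starts.
    rng.random()

    if preserve_rng_state:
        # Restore the stashed state so the recomputation replays the same draws.
        rng.setstate(saved_state)
    recomputed_mask = dropout_mask(rng, n)
    return forward_mask, recomputed_mask


fwd, rec = checkpointed_masks(seed=0, n=64, preserve_rng_state=True)
print("masks match with preserve_rng_state=True: ", fwd == rec)

fwd, rec = checkpointed_masks(seed=0, n=64, preserve_rng_state=False)
print("masks match with preserve_rng_state=False:", fwd == rec)
```

When the state is restored, the recomputed mask is identical to the forward mask. When it is not, the two masks are drawn from different points in the random stream, so the backward pass routes gradients through units that were dropped in the forward pass and blocks units that were kept.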

This can be shown with a failed 300-epoch training run that used preserve_rng_state=False and use_checkpoint=True.

Some samples from the end of one failed training run:

image

gameofdimension, Jun 11 '24 14:06