Checkpointing.jl

Correctness issue

Open wsmoses opened this issue 1 year ago • 5 comments

https://github.com/Argonne-National-Laboratory/Checkpointing.jl/blob/5cacc30d614963def9c78f1d152f6caaff576fc9/src/Rules/EnzymeRules.jl#L15

@michel2323 this line seems to have a correctness issue. It's not necessarily the cause of a different issue @swilliamson7 and I were debugging, but it seems sufficiently bad that it could be the culprit.

In essence you're only running the body of the loop (func.val) if the return is needed. However, if, for example, the return is nothing and the body of the function updates something in place, your rule will tell Enzyme not to execute the original body of the function, which is wrong.

cc @vchuravy

wsmoses • May 01 '24 22:05

The fix should basically just be moving func.val before the if statement
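
To illustrate the point, here is a minimal plain-Julia sketch of the pattern. It does not use Enzyme and is not the code in `EnzymeRules.jl`: `body!`, `primal_buggy`, `primal_fixed`, and the `needs_primal` flag are hypothetical names standing in for `func.val` and the "is the return needed" check.

```julia
# Hypothetical stand-in for the loop body: returns nothing, mutates its argument.
function body!(state::Vector{Float64})
    state .+= 1.0
    return nothing
end

# Problematic pattern: the body runs only when the primal return is requested,
# so its in-place side effects are lost whenever the return is not needed.
function primal_buggy(body, state, needs_primal::Bool)
    if needs_primal
        return body(state)
    end
    return nothing
end

# Suggested pattern: always run the body; only returning the value is conditional.
function primal_fixed(body, state, needs_primal::Bool)
    result = body(state)
    return needs_primal ? result : nothing
end

a = zeros(2)
primal_buggy(body!, a, false)   # body skipped, `a` is never updated

b = zeros(2)
primal_fixed(body!, b, false)   # body executed, `b` is updated in place
```

With the return not needed, `a` is left untouched while `b` is updated, which is the difference between the two orderings.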

wsmoses • May 01 '24 22:05

Same thing for the while loop, presumably

wsmoses • May 01 '24 22:05

Should be fixed by https://github.com/Argonne-National-Laboratory/Checkpointing.jl/pull/45. Though when doing multilevel checkpointing I still have an issue with the primal. Gotta look into it.

michel2323 • May 02 '24 19:05

Hopefully okay now, thank you Michel! I need to check derivatives, but they aren't zero anymore which is good 😊

swilliamson7 • May 02 '24 20:05

Fingers crossed. Ping me if not! @swilliamson7

michel2323 • May 02 '24 23:05

We should move to https://github.com/Argonne-National-Laboratory/Checkpointing.jl/pull/72.

michel2323 • Jun 09 '25 16:06