Revise.jl icon indicating copy to clipboard operation
Revise.jl copied to clipboard

Possibility of lost events due to use of edge-triggered Condition in entr(...)

Open frankier opened this issue 1 year ago • 5 comments

@JanisErdmanis caught a possible race-condition in Revise.entr(...)

In particular in these lines:

https://github.com/timholy/Revise.jl/blob/13a5eb7986ee1239ff938a92043c13fee04579cc/src/callbacks.jl#L163-L166

It looks like any event fired while Revise.revise() is running will be lost. I guess something level-triggered like https://docs.julialang.org/en/v1/base/parallel/#Base.Event could be used. This would assume that there is only one listener, which is probably reasonable.

frankier avatar Aug 09 '24 09:08 frankier

I believe this is still a problem even though the file watching tasks this revision_event watching loop should typically end up running in the same thread (the file watching tasks are sticky). The reason is is that the Julia interpreter can choose to yield at any time and particularly it may yield during IO performed in Revise.revise(...).

frankier avatar Aug 09 '24 10:08 frankier

I'm not sure there's a great solution for this short of file-locking. xref #841

timholy avatar Sep 24 '24 10:09 timholy

I'm not sure if locking one of the places the event is fired with the revision_lock solves the issue. The problem is that for Condition, notify(...) will only wake up all blocking tasks that are blocking at that moment in time.

Imagine we have two tasks, and task 1 is running entr(...), then:


          ----------------------------------------------------------------------------------------------
Task 1    |  wait(revision_event)  |      Revise.revise()     | wait(revision_event)
          ----------------------------------------------------------------------------------------------

Task 2                             |                       |
                             notify(revision_event)  notify(revision_event)

                                                           ^ Event is lost  ^ Waiting despite changes

Another issue is that it appears Base.Condition is being used which is not threadsafe, apparently https://docs.julialang.org/en/v1/base/parallel/#Base.Threads.Condition should be used in this case.

The solution is to use something level triggered. https://docs.julialang.org/en/v1/base/parallel/#Base.Event together with autoreset seems like it should work, although requires Julia 1.8. Note that in this case we need to have only a single "listener", which should probably be fine I guess? Or are there situations where there need to be multiple listeners? In the latter case, I guess some kind of pub/sub pattern would be needed where each listener adds its own autoresetting event, and then notify notifies all of them.

frankier avatar Sep 25 '24 10:09 frankier

Ah, I assumed it was because files were being added to revision_queue in the middle of revise running (which would also be a way for this to happen). So #846 should help, but you're right this doesn't completely solve the problem.

Soon I'll ditch support for anything less than Julia 1.10, so perhaps this can be fully fixed then.

timholy avatar Sep 25 '24 10:09 timholy

https://github.com/JuliaLang/julia/pull/55877 should fix this, with a couple of changes to Revise. Too many deadlines now but it's on the agenda.

timholy avatar Oct 03 '24 12:10 timholy