lean4 icon indicating copy to clipboard operation
lean4 copied to clipboard

Additional concurrency primitives

Open gebner opened this issue 3 years ago • 7 comments

Lean only has a single concurrency primitive right now, Task, which is very nice when it fits your use case (and also hard to misuse) but you can't easily use it to achieve mutual exclusion. The most complex concurrent program (to my knowledge) in Lean so far is the LSP server, and it already casually mentions possible race conditions in its comments:

  private def nextCmdSnap (ctx : WorkerContext) (m : DocumentMeta) (cancelTk : CancelToken)
      : AsyncElabM (Option Snapshot) := do
    cancelTk.check
    let s ← get
    let lastSnap := s.snaps.back
    if lastSnap.isAtEnd then  -- ...
      return none
    publishProgressAtPos m lastSnap.endPos ctx.hOut
    -- ...
    cancelTk.check
    /- [...] Specifically, it may happen that
      we interrupted this elaboration task right at this point and a newer elaboration task
      emits diagnostics, after which we emit old diagnostics because we did not yet detect
      the interrupt. Explicitly clearing diagnostics is difficult for a similar reason,
      because we cannot guarantee that no further diagnostics are emitted after clearing
      them. [...] -/
    publishDiagnostics m snap.diagnostics.toArray ctx.hOut
    publishIleanInfoUpdate m ctx.hOut #[snap]
    return some snap

To be clear, the concrete issue here is that something can happen between cancelTk.check and publishDiagnostics. (Of course, it's not just the publishDiagnostics call that is a problem. Every other publish* call falls prey to the same race condition.)

The obvious fix for this race condition is to add a mutex primitive and then just acquire the lock to execute cancelTk.check and publishDiagnostics atomically. A concrete API proposal could look like this:

BaseMutex : Type
BaseMutex.mk : BaseIO BaseMutex
BaseMutex.lock (mutex : BaseMutex) : BaseIO Unit
BaseMutex.unlock (mutex : BaseMutex) : BaseIO Unit

-- rust-style guarded state
structure Mutex (α : Type) where
  ref : IO.Ref α
  mutex : BaseMutex

def atomically [Monad M] [MonadLift BaseIO M] [MonadFinally M] (mutex : Mutex α) (k : StateT α M β) : M β := ...

Alternatively we could add proper locking to IO.Ref. Right now, IO.Refs use busy waiting in case of concurrent modifications, so they cannot (practically) be used for mutual exclusion. Adding locking to IO.Ref has several issues. One is a performance issue. We'd need to allocate a mutex for every IO.Ref used from multiple threads and the default mutex implementations take something like 40 bytes of memory and some overhead. We could do our own implementation or integrate something like parking lot to make this cheaper though. The other issue is when to allocate this mutex, e.g. a reference might first be taken in a single-thread context, and only then be sent to another thread while it is still taken.

Other useful primitives would be condition variables and (blocking) queues:

Condvar : Type
Condvar.mk : BaseIO Condvar
Condvar.wait : Condvar → BaseMutex → BaseIO Unit
Condvar.notifyOne : Condvar → BaseIO Unit
Condvar.notifyAll : Condvar → BaseIO Unit

Queue (α : Type) : Type
Queue.mk : BaseIO (Queue α)
Queue.send : α → Queue α → BaseIO Unit
Queue.recv : Queue α → BaseIO α

See also Haskell's MVars for another reference-like concurrency primitive that can be used to directly build queues.

gebner avatar Jul 04 '22 16:07 gebner

FWIW, I think the LSP comment should be restated to point out that this is only a problem with clients without (correct) PublishDiagnosticsClientCapabilities.versionSupport?. Which I'd very much hope is none of them in practice.

Kha avatar Jul 04 '22 16:07 Kha

Which I'd very much hope is none of them in practice.

I'm pretty sure that neovim completely ignores the version field (and doesn't set versionSupport at all). VSCode explicitly sets versionSupport to false. I couldn't figure out where publishDiagnostics is handled in emacs, but I'd be surprised if it does something useful with the version.

gebner avatar Jul 04 '22 18:07 gebner

Yikes, I assumed that they would at least ignore diagnostics with regressing version numbers... I only checked that lsp-mode set versionSupport to t, but it doesn't look like it does anything with it either

Kha avatar Jul 04 '22 19:07 Kha

Another interesting primitive is Promise, which allows you to create a Task whose value is provided later by calling Promise.resolve. This connects the "poll-based" world of tasks with the "push-based" world of IO and channels.

Promise (α : Type) : Type
Promise.new [Nonempty α] : BaseIO (Promise α)
Promise.resolve : α → Promise α → BaseIO Unit
Promise.result : Promise α → Task α

The semantics is that the task promise.result waits until promise.resolve a is called for the first time; after that the task has the value a, and subsequent calls to Promise.resolve do nothing.

gebner avatar Aug 08 '22 10:08 gebner

In some cases, it is interesting to split the above into Promise and a Future. The former is the producer/implementor and the latter is the consummer/caller. This is, for instance, how Scala models it (here and here), and Racket (here and here), for instance.

ydewit avatar Aug 08 '22 10:08 ydewit

@ydewit Thanks for the references! If you look closely the split is already there: Promise is the producer and Task is the consumer. The proposed API is essentially the same as Scala's except for the naming:

Lean Scala Racket
Task Future future
Promise Promise n/a?
Thunk lazy promise

gebner avatar Aug 08 '22 11:08 gebner

Yes, I see now! I think that I unconsciously was assuming that even though all Tasks are a Future, not necessarily all Futures are Tasks, if they can be represented as something other than a Task.

ydewit avatar Aug 08 '22 12:08 ydewit

@gebner Should we close this issue?

leodemoura avatar Oct 08 '22 02:10 leodemoura