coq-serapi
coq-serapi copied to clipboard
Enable Async support in SerAPI.
SerAPI should support the async proof mode. Experimental support is already in place, however more testing is needed. Remaining tasks are:
- [ ] Teach our internal document mode of
focus
operations. - [x] Document better Coq's upstream API / Flags for async, check we are using the API correctly.
- [x] Fix our handling of command line parameters [needs upstream help]
- [x] Provide worker control commands in the
Stm*
family.
Notes will be taken in notes/async-support.md
a5b5bf9d5755787f9d22f1f8794f0ae94423b3d6 added some preliminary flags.
I don't quite understand it either.
There seem to be many flags:
https://github.com/coq/coq/blob/trunk/toplevel/coqtop.ml#L493
-async-proofs on
seems like the first step.
Yeah I tried to imitate that, but I'm kind of doing things blindly here :dancer:
This script https://github.com/coq/coq/blob/trunk/test-suite/interactive/proof_block.v seems to be processed asynchronously now. YMMV.
Ok I did some more changes and arrived to this:
(Control (StmObserve 22))
....
sertop: unknown option `-m'.
Usage: sertop [OPTION]...
Try `sertop --help' for more information.
(Feedback ...
"Anomaly: Uncaught exception (Failure \"The spawned process did not connect back in 2.0s\").\
\nPlease report.")))
So indeed this looks promising! We just need to instruct the Stm
not to call sertop but the regular coqtop for child processes.
Hey Emilio,
Pretty cool :) Things seem to be going very nicely. Question: after compiling the latest build and passing async, I can see states being processed in parallel, but I'm not getting feedback for states inside of a proof. Here is an example of a transcript:
(add-ltac (Control (StmAdd 1 None "Ltac slow := do 1000 (do 1000 idtac).")))
(Answer add-ltac Ack)
(Answer add-ltac
(StmAdded 3
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 0)
(ep 37))
NewTip))
(Answer add-ltac Completed)
(add-lemma (Control (StmAdd 2 (Some 3) "Lemma a : True. Proof.")))
(Answer add-lemma Ack)
(Answer add-lemma
(StmAdded 4
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 0)
(ep 15))
NewTip))
(Answer add-lemma
(StmAdded 5
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 16)
(ep 22))
NewTip))
(Answer add-lemma Completed)
(add-proof (Control (StmAdd 4 (Some 5) "slow. slow. slow. auto.")))
(Answer add-proof Ack)
(Answer add-proof
(StmAdded 6
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 0)
(ep 5))
NewTip))
(Answer add-proof
(StmAdded 7
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 6)
(ep 11))
NewTip))
(Answer add-proof
(StmAdded 8
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 12)
(ep 17))
NewTip))
(Answer add-proof
(StmAdded 9
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 18)
(ep 23))
NewTip))
(Answer add-proof Completed)
(add-qed (Control (StmAdd 2 (Some 9) "Qed.")))
(Answer add-qed Ack)
(Answer add-qed
(StmAdded 10
((fname "")
(line_nb 1)
(bol_pos 0)
(line_nb_last 1)
(bol_pos_last 0)
(bp 0)
(ep 4))
NewTip))
(Answer add-qed Completed)
(Control (StmObserve 10))
(Answer 4 Ack)
(Feedback
((id
(State 10))
(contents
(ProcessingIn master))
(route 0)))
(Feedback
((id
(State 5))
(contents
(ProcessingIn master))
(route 0)))
(Feedback
((id
(State 4))
(contents
(ProcessingIn master))
(route 0)))
(Feedback
((id
(State 3))
(contents
(ProcessingIn master))
(route 0)))
(Feedback
((id
(State 2))
(contents Processed)
(route 0)))
(Feedback
((id
(State 3))
(contents
(Message Info
(Element
(_ nil
((PCData "slow is defined"))))))
(route 0)))
(Feedback
((id
(State 3))
(contents Processed)
(route 0)))
(Feedback
((id
(State 4))
(contents Processed)
(route 0)))
(Feedback
((id
(State 5))
(contents Processed)
(route 0)))
(Feedback
((id
(State 5))
(contents Processed)
(route 0)))
(Feedback
((id
(State 4))
(contents Processed)
(route 0)))
(Feedback
((id
(State 10))
(contents
(Message Info
(Element
(_ nil
((PCData "a is defined"))))))
(route 0)))
(Feedback
((id
(State 10))
(contents Incomplete)
(route 0)))
(Feedback
((id
(State 10))
(contents Processed)
(route 0)))
(Answer 4 Completed)
Although I get a message from state 10 saying that a
is defined, I don't get anything for states 6, 7, 8, 9 (the ones inside the Proof .. Qed
block). Am I doing something wrong?
Data point: apparently it works if I call observe separately on the last state of each proof.
Cool ! :) Here we are walking a bit on unexplored territory, it will be hard to figure things out if we cannot get @gares on board :)
A guess, does using the Join
method change things?
Also a good reference is to use CoqIde an see how it behaves. The mapping roughly is:
-
Go to point icon:
Add
the document up to point, doesObserve
in the tip of the Stm. -
Gears icon: does
Join
on the document.
Ok; let me try join :) Can you reproduce the thing I'm seeing in the example above btw?
I see exactly what you see. Plus if I do join:
(Control StmJoin)
(Answer 5 Ack)
(Feedback ((id (State 10)) (contents Processed) (route 0)))
(Feedback ((id (State 9)) (contents (ProcessingIn master)) (route 0)))
(Feedback ((id (State 8)) (contents (ProcessingIn master)) (route 0)))
(Feedback ((id (State 7)) (contents (ProcessingIn master)) (route 0)))
(Feedback ((id (State 6)) (contents (ProcessingIn master)) (route 0)))
(Feedback ((id (State 5)) (contents Processed) (route 0)))
(Feedback ((id (State 6)) (contents Processed) (route 0)))
(Feedback ((id (State 7)) (contents Processed) (route 0)))
(Feedback ((id (State 8)) (contents Processed) (route 0)))
(Feedback ((id (State 9)) (contents Processed) (route 0)))
(Feedback ((id (State 10)) (contents Complete) (route 0)))
(Answer 5 Completed)
Cool, thanks. Implemented Join
in https://github.com/cpitclaudel/elcoq/commit/17c73f175ad7a5f027fddb3bee7a7cee95db0a40. It seems to only process the last proof though (that is, if there are, say, 10 closed proofs, it just observes the last closed one).
https://asciinema.org/a/cqmk80fnyfn26xlvcmu9ttf61
Great, I'll have a look!
Wow it is beautiful !!! wrt to the join, indeed when you join the Stm goes backwards on the document (I have no clue if there is any reason for that), the moment it finds an error, it stops.
This is due to join checking the whole document I think, it really checks that the proofs the worker generated are right and this needs to be done somehow in a single threaded fashion to still ensure consistency wrt the kernel.
I guess that it depends on the interpretation we do of the feedback messages, I see in this example you get a few Incomplete
messages. This may indicate that the STM is being very very lazy....
But somehow I'm seeing now a different parallelism than before. I wonder why.
Wow it is beautiful !!!
Thanks :blush:
Now workers are created again, there was a problem with the ordering of flag-setting. I've chosen a conservative flag setup: SerAPI imitates CoqIDE. Feel free to play thou with the flags in sertop_init.ml and in particular with setting async_proofs_full
back to true.
Note however that there is a race condition in printing. I have no clue how to the workers send the feedback messages so I'll have to look into it!
Did you use a mutex as CoqIDE? https://github.com/coq/coq/blob/trunk/ide/ide_slave.ml#L467
Sorry, I did not see the commit. IIRC It is needed because the Stm uses 1 thread per worker, and feedback coming from workers is forwarded. A thread may be interrupted while sending a XML tree, hence you need the mutex. The feeder code is called by the thread.
Now that I think about it, I guess the proper place for the mutex is Pp.feedback ...
@gares IMHO it is not so bad to require clients to assume a re-entrant callback, but it should be documented of course.
Catching up with this thread. Stm.finish is like observing the tip (what the "satus false" XML message does, IIRC); while Stm.join also joins the environment ("status true" in XML) that means all proof terms produced by workers are type-checked by the kernel of main Coqtop and all universe constraints are also checked. A worker typechecks in his corner the proof term, but 1) I don't want to trust workers, 2) only the main coqtop has a complete view and can check universes (a worker alone can only check satisfiability locally to his constraints).
Both calls are blocking, but the latter waits for a complete checking of everything, while the former does not wait for delegated proofs (as GoTo-Point in CoqIDE).
Thanks Enrico! Note that Stm.join
seems broken for me when using --async_full
. Are we missing something there?
I think it is untested, since Coqoon has no join buttun. But should work (I see no if async_proofs_full in the code path). Broken in which way.
If I do:
$ rlwrap ./sertop.native --human --prelude ~/external/coq-git --async ~/external/coq-git/bin/coqtop --async-full
(Control (StmAdd 100 None "
Ltac slow := do 1000 (do 1000 idtac).
Ltac wrong := intro.
Lemma a: True.
Proof. slow. slow. slow. slow. slow. slow. auto. Qed.
Lemma b: True.
Proof. slow. slow. slow. auto. Qed.
Lemma c: True.
Proof. slow. slow. slow. auto. Qed.
Lemma d: True.
Proof. slow. slow. slow.
"))
(Control StmJoin)
master never responds again, it seems hung. Of course, it could well be a SerAPI bug.
Hum, I guess you need to observe the tip first... I can't say why now, I've to check.
I guess join crawls the document for futures, but nobody scheduled them... observe does.
It is a combination CoqIDE does not let you do, since adds are always followed by observe.
We do an observe first, but something seems to be going on... We need to investigate more. Somehow with our --async-full
we are getting very strange things.
It is a combination CoqIDE does not let you do, since adds are always followed by observe.
I see, thanks, we do n adds one observe; with usual CoqIDE mode it seems to work well.
By the way, I'd swear that I witnessed a race condition just now: somehow, after an observe that triggered async checking, a worker produced a spurious error! Interesting!