coq-serapi
coq-serapi copied to clipboard
[query] Support querying dependency data
Since the inception of SerAPI providing a way to query dependency data was a goal. In fact, this is possible to do now by clients and you can dump the actual proof term in sexp format and then perform the analysis yourself.
However, the very nice work on regression testing https://github.com/coq/coq/issues/9262 could indeed benefit from having specific support for dependency data. In particular, we may want to provide two calls:
-
(Query opts (Deps id))
: Output dependencies of identifiedid
, -
(Query opts (Digest id))
: Output digest ofid
's body, -
(Query opts (RangeDigest range))
: Output digest + deps for identifiers in a document range [in icoq style]
The interface for updating a prior proof is not clear yet, but likely the easiest for now is to piggyback on the .vio
interface.
There are many open questions about how to better do this, a start will be to embed the dep-tracking computation of https://github.com/Karmaki/coq-dpdgraph , and implement 1 and 2 so we can start experimenting.
cc: @palmskog
This is a great start, but I will make some additions to the wish list of commands.
In regression proving, we make a distinction between the digest/dependencies for the type of an identifier, and those for its body. In particular, when an identifier id
is opaque, another identifier that depends directly on id
does not transitively depend on the dependencies in the body of id
- only on the dependencies in the type of id
. Intuitively, the rationale is that only a change in the type of id
can make proofs that use id
fail.
On that note, I propose the following additional commands:
-
(Query opts (TypeDeps id))
: Output dependencies of the type ofid
-
(Query opts (TypeDigest id))
: Output digest of the type ofid
Note in particular that for opaque identifiers in .vio
files, the body of id
cannot be obtained. This was one of the reasons I had to fork dpdgraph for iCoq.
Furthermore, we are interested in getting dependencies/digests for all identifiers in a specific module, but this could be subsumed in a range
parameter.
Indeed maybe we should just have the query to return a struct:
type object_imprint =
{ digest : Digest.t
; deps : Digest.t list
}
type definition_imprint =
{ type : object_imprint
; body : object_imprint option
}
or something like that.
Regarding getting all the identifiers of a module we will need to use a hack :( :(
I have complained in https://github.com/coq/coq/issues/8495 and https://github.com/coq/coq/pull/8228 about this very strong regression [if we would like to use the environment to grab objects]. This is the kind of silly blockage that makes our life hard IMHO [the worst is that this change was accidentally introduced !!]
But indeed, the part of Coq that manages objects [libobject] is by far the one that poses the most problems for people willing to do the kind of work we are trying to do. Even Search
is implemented in Coq using a hack, due to lack of an structured representation for .vo
data :(
I've given this some more thought now, and I believe we need kernel names (e.g., mathcomp.ssreflect.fintype.uniq_enumP
) and not just digests for dependencies. Here is a revised proposal.
Data structures:
type const_deps =
{ type : Name.t list (* kernel names of type deps *)
; body : Name.t list option (* kernel names of body deps, if available *)
}
type const_digest =
{ type : Digest.t (* canonical checksum of the type *)
; body : Digest.t option (* canonical checksum of the body, if available *)
}
API calls:
-
(Query opts (Deps id))
: returnsconst_deps
forid
-
(Query opts (RangeDeps range))
: returnslist const_deps
forrange
-
(Query opts (Digest id))
: returnsconst_digest
forid
-
(Query opts (RangeDigest range))
: returnslist const_digest
forrange
Note that dependencies are assumed to be direct dependencies rather than the whole transitive closure (which usually cannot be obtained anyway from .vio
files).
Here is an example to make things more concrete.
Files in namespace ST
ListUtil.v
:
Require Import List. Import ListNotations.
Lemma remove_preserve : forall (A : Type) A_eq_dec (x y : A) xs,
x <> y -> In y xs -> In y (remove A_eq_dec x xs).
Proof.
induction xs; simpl; intros.
- intuition.
- case A_eq_dec; intros.
* apply IHxs; subst; intuition.
* intuition; subst; left; auto.
Qed.
Lemma in_remove : forall (A : Type) A_eq_dec (x y : A) xs,
In y (remove A_eq_dec x xs) -> In y xs.
Proof.
induction xs; simpl; intros; auto.
destruct A_eq_dec; simpl in *; intuition.
Qed.
Dedup.v
:
Require Import List ListUtil. Import ListNotations.
Fixpoint dedup (A : Type) A_eq_dec (xs : list A) : list A :=
match xs with
| [] => []
| x :: xs =>
if in_dec A_eq_dec x xs then dedup A A_eq_dec xs
else x :: dedup A A_eq_dec xs
end.
Lemma remove_dedup : forall A A_eq_dec (x : A) xs,
remove A_eq_dec x (dedup A A_eq_dec xs) =
dedup A A_eq_dec (remove A_eq_dec x xs).
Proof.
induction xs; intros; auto; simpl.
repeat (try case in_dec; try case A_eq_dec; simpl; intuition); auto using f_equal.
- exfalso. apply n0. apply remove_preserve; auto.
- exfalso. apply n. apply in_remove in i; intuition.
Qed.
File dependencies
-
Dedup.v
→ListUtil.v
Obtained by running coqdep
on both files.
Identifier partitioning
-
ListUtil.v
:-
ST.ListUtil.remove_preserve#type
-
ST.ListUtil.remove_preserve#body
(proof task) -
ST.ListUtil.in_remove#type
-
ST.ListUtil.in_remove#body
(proof task)
-
-
Dedup.v
:-
ST.Dedup.dedup
-
ST.Dedup.remove_dedup#type
-
ST.Dedup.remove_dedup#body
(proof task)
-
Obtained by quick-compiling both files in topological sort order and querying the .vio
files.
Identifier dependencies
-
ST.ListUtil.remove_preserve#body
→ST.ListUtil.remove_preserve#type
-
ST.ListUtil.in_remove#body
→ST.ListUtil.in_remove#type
-
ST.Dedup.remove_dedup#type
→ST.Dedup.remove
-
ST.Dedup.remove_dedup#body
→ST.Dedup.remove_dedup#type
-
ST.Dedup.remove_dedup#body
→ST.Dedup.remove
-
ST.Dedup.remove_dedup#body
→ST.ListUtil.remove_preserve#type
-
ST.Dedup.remove_dedup#body
→ST.ListUtil.in_remove#type
Obtained by querying the .vio
files and checking proofs.
Effects of a change
Suppose we change ST.Dedup.dedup
to the following equivalent definition:
Fixpoint dedup (A : Type) A_eq_dec (xs : list A) : list A :=
match xs with
| [] => []
| x :: xs =>
let tail := dedup A A_eq_dec xs in
if in_dec A_eq_dec x xs then tail else x :: tail
end.
We then perform the following steps:
- compute digests of
ListUtil.v
andDedup.v
; mark the latter as modified and topologically sort - quick-compile
ListUtil.v
andDedup.v
intoListUtil.vio
andDedup.vio
- compare digests to determine whether
ST.Dedup.remove
orST.Dedup.remove_dedup#type
are different; mark the former as modified - compute impacted proof tasks in the inverse of the identifier dependency graph above:
ST.Dedup.remove_dedup#body
- check the proof of
ST.Dedup.remove_dedup
usingDedup.vio
We came up with a JSON-based format for representing Coq dependency graphs at the file and proof levels. It may help understanding to show the above example in this format.
Old file-level dependency graph
[
{ "uri": "/home/user/projects/chip/examples/st/ListUtil.v",
"id": 3955365859,
"checkable": true,
"neighbors": [],
"contents": [3345550460,2461010471,3728870587,2163935719],
"checksum": "60e8a717"
},
{ "uri": "/home/user/projects/chip/examples/st/Dedup.v",
"id": 2758678171,
"checkable": true,
"neighbors": [3955365859],
"contents": [2433485337,591398152,2138507738],
"checksum": "9ffbce4b"
}
]
New file-level dependency graph
[
{ "uri": "/home/user/projects/chip/examples/st/ListUtil.v",
"id": 3955365859,
"checkable": true,
"neighbors": [],
"contents": [3345550460,2461010471,3728870587,2163935719],
"checksum": "60e8a717"
},
{ "uri": "/home/user/projects/chip/examples/st/Dedup.v",
"id": 2758678171,
"neighbors": [3955365859],
"contents": [2433485337,591398152,2138507738],
"checkable": true,
"checksum":"62f7cfdd"
}
]
Old proof-level dependency graph
[
{ "uri": "ST.ListUtil.remove_preserve#type",
"id": 3345550460,
"checkable": false,
"neighbors": [],
"checksum": "69c406b0"
},
{ "uri": "ST.ListUtil.remove_preserve#body",
"id": 3728870587,
"checkable": true,
"neighbors": [3345550460],
"checksum": "9e602d3e"
}
{ "uri": "ST.ListUtil.in_remove#type",
"id": 2163935719,
"checkable": false,
"neighbors": [],
"checksum": "6a7806de"
},
{ "uri": "ST.ListUtil.in_remove#body",
"id": 2461010471,
"checkable": true,
"neighbors": [2163935719],
"checksum": "30251b5b"
},
{ "uri": "ST.Dedup.remove_dedup#type",
"id": 2138507738,
"checkable": false,
"neighbors": [591398152],
"checksum": "69ad06ac"
},
{ "uri": "ST.Dedup.remove_dedup#body",
"id": 2433485337,
"checkable": true,
"neighbors": [3345550460,591398152,2163935719,2138507738],
"checksum": "dd804e41"
},
{ "uri": "ST.Dedup.dedup",
"id": 591398152,
"checkable": false,
"neighbors": [],
"checksum": "6a3206c51d6103c5"
}
]
New proof-level dependency graph
[
{ "uri": "ST.ListUtil.in_remove#type",
"id":2163935719,
"checkable": false,
"checksum": "6a7806de"
},
{ "uri": "ST.ListUtil.in_remove#body",
"id": 2461010471,
"checkable": true,
"checksum": "30251b5b"
},
{ "uri": "ST.ListUtil.remove_preserve#type",
"id": 3345550460,
"checkable": false,
"checksum": "69c406b0"
}
{ "uri": "ST.ListUtil.remove_preserve#body",
"id": 3728870587,
"checkable": true,
"checksum": "9e602d3e"
},
{ "uri": "ST.Dedup.remove_dedup#body",
"id": 2433485337,
"checkable": true,
"checksum":"dd804e41"
},
{ "uri": "ST.Dedup.remove_dedup#type",
"id": 2138507738,
"checkable": false,
"checksum": "69ad06ac"
},
{ "uri": "ST.Dedup.dedup",
"id": 591398152,
"checkable": false,
"checksum": "6a3206c51de703dc"
}
]
Expected output from topological sorting
/home/user/projects/chip/examples/st/ListUtil.v
/home/user/projects/chip/examples/st/Dedup.v
Expected output from selection
ST.Dedup.remove_dedup#body
Discussion
- The checksum for
/home/user/projects/chip/examples/st/Dedup.v
is different in the new file-level graph. - The checksum for
ST.Dedup.dedup
is different in the new proof-level graph. - Dependencies from the Coq standard library have been filtered out.
- There are no special requirements on identifiers (
id
field) except that they must be integers - for example, they could be generated from kernel names by hashing. - Topological sorting uses dependencies in the new file dependency graph, while the old file dependencies are used for selection.
- All Coq files have
checkable: true
, since we can runcoqc -quick
on them. - Only the bodies of opaque constants have
checkable: true
, since they are the ones we can check in isolation (viacoqc -check-vio-tasks
). - We simply concatenate the type and body checksums for transparent constants like
ST.Dedup.dedup
; one could instead compute new a checksum of their concatenated checksums to get something of the same length. - We have an OCaml tool, extracted from verified Coq code, for doing this type of hierarchical selection and topological sorting given the information above.
- We currently calculate file checksums on the raw
.v
files for selection; it may be better to calculate "smarter" checksums by, e.g., checksumming the sexprialized output ofsercomp
.
Very cool, thanks for the info.
I plan to use https://github.com/janestreet/ppx_hash for this; ideally we'd get this in Coq 8.11 by the way.
I mean, once Dune is the main build system of Coq, we can vendor ppx painlessly. So for example we could remove from serlib
big chunks of [@@deriving sexp]
.