feat: linter and script for `theorem` vs `lemma`
This PR contains a linter for flagging doc-string-less theorems, as well as a bash script that, once the linter ran on Mathlib, automatically replaces every linter-offending theorem by lemma.
The logic of the linter is as follows.
First, a linter is a command that takes the syntax of each command and does something with it. (Really, a linter can have access to more than just the syntax, since it can also read information contained in infotrees, but this linter does not need that.)
thmNoDoc inspects the syntax of each command. The match does the following:
- if the syntax is a declaration,
-
- if among its modifiers, the one for the docstring is empty,
-
- if among declarations is a
theorem(as opposed toinstance,def, ...),
- if among declarations is a
then it extracts the syntax node for theorem (that will be underlined by the warning) and the syntax node for the name of the theorem (to report that name in the warning).
With this information, the linter logs its warning!
If it helps, this is a "pictorial" representation of the syntax tree of a simple theorem:
inspect: '
@[simp]
nonrec theorem easy : True :=
.intro'
node Lean.Parser.Command.declaration, none
|-node Lean.Parser.Command.declModifiers, none
| |-node null, none
| |-node null, none
| | |-node Lean.Parser.Term.attributes, none
| | | |-atom original: ⟨⟩⟨⟩-- '@['
| | | |-node null, none
| | | | |-node Lean.Parser.Term.attrInstance, none
| | | | | |-node Lean.Parser.Term.attrKind, none
| | | | | | |-node null, none
| | | | | |-node Lean.Parser.Attr.simp, none
| | | | | | |-atom original: ⟨⟩⟨⟩-- 'simp'
| | | | | | |-node null, none
| | | | | | |-node null, none
| | | |-atom original: ⟨⟩⟨\n⟩-- ']'
| |-node null, none
| |-node null, none
| |-node null, none
| |-node null, none
| | |-node Lean.Parser.Command.nonrec, none
| | | |-atom original: ⟨⟩⟨\n⟩-- 'nonrec'
|-node Lean.Parser.Command.theorem, none
| |-atom original: ⟨⟩⟨ ⟩-- 'theorem'
| |-node Lean.Parser.Command.declId, none
| | |-ident original: ⟨⟩⟨ ⟩-- (easy,easy)
| | |-node null, none
| |-node Lean.Parser.Command.declSig, none
| | |-node null, none
| | |-node Lean.Parser.Term.typeSpec, none
| | | |-atom original: ⟨⟩⟨ ⟩-- ':'
| | | |-ident original: ⟨⟩⟨ ⟩-- (True,True)
| |-node Lean.Parser.Command.declValSimple, none
| | |-atom original: ⟨⟩⟨ ⟩-- ':='
| | |-node Lean.Parser.Term.dotIdent, none
| | | |-atom original: ⟨⟩⟨⟩-- '.'
| | | |-ident original: ⟨⟩⟨\n\n⟩-- (intro,intro)
| | |-node Lean.Parser.Termination.suffix, none
| | | |-node null, none
| | | |-node null, none
| | |-node null, none
and of a similar definition:
inspect: '
/-- docs -/
def easy : True :=
.intro'
node Lean.Parser.Command.declaration, none
|-node Lean.Parser.Command.declModifiers, none
| |-node null, none
| | |-node Lean.Parser.Command.docComment, none
| | | |-atom original: ⟨⟩⟨ ⟩-- '/--'
| | | |-atom original: ⟨⟩⟨\n⟩-- 'docs -/'
| |-node null, none
| |-node null, none
| |-node null, none
| |-node null, none
| |-node null, none
|-node Lean.Parser.Command.definition, none
| |-atom original: ⟨⟩⟨ ⟩-- 'def'
| |-node Lean.Parser.Command.declId, none
| | |-ident original: ⟨⟩⟨ ⟩-- (easy,easy)
| | |-node null, none
| |-node Lean.Parser.Command.optDeclSig, none
| | |-node null, none
| | |-node null, none
| | | |-node Lean.Parser.Term.typeSpec, none
| | | | |-atom original: ⟨⟩⟨ ⟩-- ':'
| | | | |-ident original: ⟨⟩⟨ ⟩-- (True,True)
| |-node Lean.Parser.Command.declValSimple, none
| | |-atom original: ⟨⟩⟨ ⟩-- ':='
| | |-node Lean.Parser.Term.dotIdent, none
| | | |-atom original: ⟨⟩⟨⟩-- '.'
| | | |-ident original: ⟨⟩⟨\n\n⟩-- (intro,intro)
| | |-node Lean.Parser.Termination.suffix, none
| | | |-node null, none
| | | |-node null, none
| | |-node null, none
| |-node null, none
The declModifiers can be doc-strings, attributes, visibility stuff, noncomputable, unsafe, partial,... The important part is that they must appear in a specific order, the first one being a doc-string.
Also: should the linter move to
Mathlib/Tactic/Linter? (Did the zulip discussion come to a conclusion?)
That discussion did not come to a conclusion...
Also: should the linter move to
Mathlib/Tactic/Linter? (Did the zulip discussion come to a conclusion?)That discussion did not come to a conclusion...
Given that some consensus was reached now: please move this linter to Tactic/Linter --- but this can happen after the mass replacements.
PR summary d88c5cb5c5
Import changes
No significant changes to the import graph
Declarations diff
+ getLinterHash
+ thmLemmaLinter
+ thmNoDoc
You can run this locally as follows
## summary with just the declaration names:
./scripts/no_lost_declarations.sh short <optional_commit>
## more verbose report:
./scripts/no_lost_declarations.sh <optional_commit>
I just realised: this linter already exists in batteries; it is the docBlameThm linter, defined here in batteries. (That linter is not autofixable by itself, so I see no pain in this linter being written. But I would rather run that linter in mathlib/find a way to run it from mathlib CI, than land a new linter re-invent the wheel.)
Let me ask on zulip about configuring runLinter to enable certain lints explicitly.
Do I understand this correctly that this PR should be adapted to use the docBlameThm linter instead, but the shell script might still be useful?