mathlib4
mathlib4 copied to clipboard
feat: Mathlib weekly reports
This PR introduces a weekly cron job that computes a "global" report on the evolution of Mathlib in the previous week.
It consists of
- a CI workflow with a cron job that runs at minight on Sunday (
.github/workflows/mathlib_stats.yaml); - a Lean file extracting a categorized list of "all" the declarations in Mathlib (
scripts/count_decls.lean); - a bash file computing Git-diff-related information and collating the data from the Lean file (
scripts/mathlib_stats.sh); - a convenience CI workflow that is triggered on adding the
test-cilabel and results in posting on the PR and on Zulip the report (.github/workflows/mathlib_stats_label.yaml).
The second CI workflow is intended to be removed just before/right after the PR is ready to merge.
Weekly stats (2024-05-13...2024-05-20)
| Type | Total | % |
|---|---|---|
| Theorems | 160,386 (+765 -104) | 0.41% |
| Definitions-Data | 57,821 (+306 -31) | 0.48% |
| Data | 8,121 (+354 -29) | 0.62% |
| Definitions-Props | 7,760 (+43 -3) | 0.52% |
| Types | 1,264 (+1 -0) | 0.00% |
| Definitions-Types | 411 (+1 -0) | 0.24% |
| Predicates | 5 (+0 -0) | 0.00% |
| Definitions-Predicates | 3 (+0 -0) | 0.00% |
866 files changed (+43 -10 ~805), 7508 lines changed (+19972 -12464)
Reference commits: old cec3ec0be2, new b2214f7318.
Take also a look at the Mathlib stats page.
PR summary
Import changes
No significant changes to the import graph
Diff of declarations
+ Tally
+ mkTally
+ toString
+ updateTally
You can run this locally as follows
## summary with just the declaration names:
./scripts/no_lost_declarations.sh short <optional_commit>
## more verbose report:
./scripts/no_lost_declarations.sh <optional_commit>
I'm sorry, but I still don't understand the row labels, shed think they need work. Are the dashes subtraction, or subcategories, etc. The labels need to be completely self explanatory.
Coming here for PR triage: is this PR still relevant? (Perhaps it is: we have the technical debt report, but that reports other statistics.) If it's relevant, what's missing to move this forward?