`pre-commit` hooks demo
This PR expands upon #7884 to demo some potential DX improvements through adding pre-commit hooks. It contains the following features:
- Run
shellcheck. - Re-implements
make check-amount-accessas a local pygrep hook. - ~Implements a clang-format hook to fail on suggestion warnings with additional configuration to sort includes.~
- Re-implements
make check-discouraged-functionsas a local pygrep hook. - Runs
codespellto check for common spelling errors. - Runs check-jsonschema to validate schemas and metaschemeas in the
doc/directory.- Last checked, this exposed a few invalidations in a few of the the schema definitions, and seems particularly useful.
- Pretty-formats the
JSONschemas - ~Checks Git commit message conforms to Conventional Commits standard, using a best-guess prototype for the list of application scopes.~
Many of these experiments output a lot of errors/warnings against the existing code. Generally, pre-commit should be only checking the current changeset and provoking the developer to address issues incrementally. Interested parties can run specific hooks on the entire codebase with pre-commit run --all-files [hook-id]
Particular checks and standards may not be applicable or aligned with the workflow of the maintainers. I'd be happy and interested to break this work into separate PRs for specific hooks where there is interest, or explore adding more.
Relates to #7765.
Checklist
Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:
- [x] The changelog has been updated in the relevant commit(s) according to the guidelines.
- [x] Documentation has been reviewed and updated as needed.
- [x] Related issues have been listed and linked, including any that this PR closes.
Changelog-None
@s373nZ hi! noting your recent reference to this PR.... do you plan on returning to this PR anytime soon? I'm doing some spring cleaning of our open PRs
Hi @madelinevibes! :wave: Updated this by rebasing against master. Happy to return to it if parts seem useful, but don't want to make too many assumptions about which checks would be acceptable to the team.
Marking this ready for review and removing the [WIP] to better flag it for someone to take a quick look, and hopefully provide initial feedback or further direction on requirements.
Hey! Thank you for this awesome PR! I had been running this heavily vibecoded (and very verbose) precommit hook to fix up my nits but this looks very neat and tidy! The repos used for the most part also seem reliable and maintained. I would also add whitespace fixing and flake8 fixing in this PR as well, I find that most of the time my prebuild checks fail because of whitespace and flake8 errors :(
Also, have you checked how long it takes to run this? Just wondering because we're making a fair few external requests.
Rebased against master and added trailing-whitespace and end-of-file-fixer. Also responded to other review comments.
Also, have you checked how long it takes to run this? Just wondering because we're making a fair few external requests.
First run ALL FILES (fresh install):
pre-commit clean
time pre-commit run --all-files
Executed in 37.23 secs fish external
usr time 47.16 secs 0.33 millis 47.16 secs
sys time 22.12 secs 1.19 millis 22.12 secs
Regular run ALL FILES:
time pre-commit run --all-files
Executed in 14.27 secs fish external
usr time 35.73 secs 0.00 micros 35.73 secs
sys time 19.07 secs 759.00 micros 19.07 secs
Regular run on common/* (a supposedly massive change set):
time pre-commit run --files common/*
Executed in 1.43 secs fish external
usr time 6.85 secs 1.15 millis 6.85 secs
sys time 2.31 secs 1.05 millis 2.31 secs
Regular run on cli/* (a supposedly more minimal change set):
time pre-commit run --files cli/*
Executed in 574.59 millis fish external
usr time 461.00 millis 0.03 millis 460.98 millis
sys time 399.49 millis 1.02 millis 398.47 millis
During a commit, pre-commit only runs on the files in the changeset, so most of the time it is processing very few files. Hooks are installed into the system at ~/.cache/pre-commit so we aren't making many external requests except for the first run.
It's always good to have an opinionated list. The downside is, we might all disagree with that list!
In particular, the conventionalcommits.org is crap. I hate "chore:" as a prefix. It's non-informative.
Prefixes should refer to subsystems, e.g:
- lightningd: xxx
- plugins: xxx
- askrene: xxx
- pytest: xxx
- common: xxx
- docs: xxx
If there's no logical subsystem, I sometimes omit it, and sometimes use "global". But usually I talk about what system, or subdaemon, or plugin is hit (even if, as a side effect, other code needs to be adapted!).
Will this be integrated into the CI, or is it intended only for local testing? My local run is currently failing on almost all steps :(.
- Rebased against
master. - REMOVED the
conventional-commitandclang-formathooks because they are highly experimental and will likely be far more intrusive than helpful for now. Just let me know if you want them back! Havingclang-formatactive is like opting in to fix every C file you touch and there are so many failures, I imagine it to be really annoying. Could be good to reintegrate after some configuration tuning or in a separate initiative. See below for conventional commits rationale. - Replaced
shellcheckwithshellcheck-pyto avoid dependency on Docker per @sangbida's feedback. - Added @sangbida's script in https://github.com/ElementsProject/lightning/pull/8193#discussion_r2604904006 to
tools/fix-style-errors(still usesclang-formatFYI). - Added
pre-committo thedevgroup inpyproject.tomlso it's installed along with other Python dependencies during a new development environment setup. - Added a section to the Contributor Workflow documentation to describe
pre-commitand how to opt-in / install it to a local repo. - Replied to all the review feedback, any unresolved comments are awaiting your resolution, feedback or direction.
It's always good to have an opinionated list. The downside is, we might all disagree with that list!
@rustyrussell this list was a best-guess demo derived by scanning a few pages of the commit history in April and configuring based on observed conventions.
In particular, the conventionalcommits.org is crap. I hate "chore:" as a prefix. It's non-informative.
Noted, and removed it as a commit linter. The prior configuration should have required chore(subsystem) or fix(subsystem), which isn't what you're looking for. Would probably be way more annoying than helpful for you, as well as new contributors.
Prefixes should refer to subsystems
If the team wanted to formalize conventions for commit messages and add them to the developer documentation, it may be possible to leverage the Conventional Commit hook to enforce it. Although, it might not make sense to bend that tool too far and consider a customized implementation.
Will this be integrated into the CI, or is it intended only for local testing? My local run is currently failing on almost all steps :(.
@ShahanaFarooqui Definitely not ready for CI yet, IMO. An idea for a "mini road map" toward this might be something like:
- Reach internal consensus on which hooks to accept for now in this PR.
- Merge this PR.
- Open follow up issues / PRs to address outstanding failures for the accepted hooks (spelling, formatting, schema etc).
- Interested developers opt-in and test-drive it daily to discover bugs, quirks and improvement ideas.
- When all the checks are passing using
--all-files(PRs in3addressed) and deemed non-intrusive, integrate into CI alongside existing checks. - Incrementally replace the checks in
make checkwithpre-commithook equivalents.
I would be interested / curious to help out with the follow ups in 3 to start with.
Hey! Here's a list of requirements that @ShahanaFarooqui and I came up with:
Must-haves:
- Compatible with the existing codebase: if code passes prebuild checks, it should also pass the pre-commit hook.
- Opt-in: developers should be able to commit "ugly" code if needed.
- Detect shell scripting bugs (check-shellcheck).
- Prevent direct access to amount_msat and amount_sat members (check-amount-access).
- Flag usage of discouraged functions (check-discouraged-functions).
- Enforce commit message format with subsystem: prefix.
- Automatic fixes for:
- Sorting #include statements in C files (check-hdr-include-order).
- Formatting C files (clang-format).
- Python code style (check-python-flake8).
- JSON schemas (check-fmt-schemas).
- Trailing whitespace, end-of-file issues, and extra lines.
Nice-to-haves:
- Spelling fixes.
- From @ShahanaFarooqui - I attempted to do this locally and after 100+ file fixed, I was just ~25% done. So yes, It's useful but should not be added at this time. We will revisit once the hook is ready to be added in the CI directly.
@s373nZ Let me know what you think?
@sangbida @ShahanaFarooqui This is a great list. Thanks! I'm happy you're interested in restoring clang-format and commit linting. A few questions:
Slightly concerned about a conflict between the requirements if code passes prebuild checks, it should also pass the pre-commit hook and Automatic fixes for.... If pre-commit performs automatic changes to the code, it will also fail the hook run. Should the automatic fixes be applied prior to the commit, or in a separate process? For example, if clang-format runs, on a C file in the change set, it won't pass the check until it's feedback has been resolved.
Python code style (check-python-flake8).
I assume this means you prefer flake8 over ruff, so I'll remove the existing ruff hook?
Enforce commit message format with subsystem: prefix.
Would you like to supply a list of available subsystems? I can also just take a first guess based on Rusty's and the commit log...
Given the present information, here's my current plan:
- Restore the
clang-formathook, explore removing--dry-runand-Werrorflags to accept its changes automatically. - Explore checking (and fixing?) the
#includesort order outside ofclang-formatwith a custom hook.- There is an interesting challenge in that the list of checked files is built dynamically from project's included
Makefiles. Looking into a clean way to leverage the list into a hook process.
- There is an interesting challenge in that the list of checked files is built dynamically from project's included
- Define which hooks should be run during the
manualstage, meaning they aren't run automatically for each commit, but must be invoked explicitly.- Could do this for
codespellfor now, to make it available, but not intrusive. clang-formatmight also be better used like this; it may be more intrusive thancodespell.
- Could do this for
- Explore commitizen as an alternative to conventionalcommits.org. Either could be suitable.
commitizenseems to have some other nice release process management features. :eyes:
A bit of work and a lot of testing ahead, but optimistic many of the requirements are already mostly addressed. Let me know if you have feedback or corrections.
@s373nZ Let's start with the following list of subsystem prefixes for now. We can continue refining it iteratively until it's sufficiently mature:
---------------
Daemons:
---------------
channeld
closingd
connectd
gossipd
hsmd
lightningd
onchaind
openingd
---------------
Related:
---------------
bitcoin
cli
cln-grpc
cln-rpc
db
wallet
wire
---------------
Extensions:
---------------
plugin-* (any string value after `plugin-` should be allowed like xpay, renepay, askrene...)
pyln-* (lightning, client, grpc-proto, proto, spec, testing)
tool-* (reckless, hsmtool, downgrade)
---------------
Others:
---------------
ci
common
contrib
devtools
docs
docker
github
global
meta
nit
nix
release
script
tests
@sangbida @ShahanaFarooqui This is a great list. Thanks! I'm happy you're interested in restoring clang-format and commit linting.
I "truely" tested clang-format today and they are quite intrusive. Now I am a bit wary of enabling it broadly :).
Slightly concerned about a conflict between the requirements if code passes prebuild checks, it should also pass the pre-commit hook and Automatic fixes for.... If pre-commit performs automatic changes to the code, it will also fail the hook run. Should the automatic fixes be applied prior to the commit, or in a separate process? For example, if clang-format runs, on a C file in the change set, it won't pass the check until it's feedback has been resolved.
My understanding is that we should run all prebuild checks (i.e., will cause the prebuild checks fail) in this stage. Then extra checks like clang-format or codespell can be added as manual-stage hooks so that they don’t block commits. @sangbida, please correct me if I am mistaken.
I assume this means you prefer flake8 over ruff, so I'll remove the existing ruff hook?
AFAIU, both Ruff and flake8 are Python-only tools. I may be mistaken, but if you and @cdecker prefer Ruff, then Ruff it is. I just want to ensure Ruff doesn’t miss any flake8 checks that would cause the hook to pass but the prebuild check to fail.
Would you like to supply a list of available subsystems? I can also just take a first guess based on Rusty's and the commit log...
Shared in my earlier comment above.
Given the present information, here's my current plan: Restore the clang-format hook, explore removing --dry-run and -Werror flags to accept its changes automatically.
Umm..How about we let clang-format and codespell take a back seat as manual hooks for now? Sorry for the mix-up earlier 😄. We can enable them once the codebase is more cleaned up accordingly.
Define which hooks should be run during the manual stage, meaning they aren't run automatically for each commit, but must be invoked explicitly. Could do this for codespell for now, to make it available, but not intrusive. clang-format might also be better used like this; it may be more intrusive than codespell.
Yes, agreed. These two seem like the right candidates for now, but feel free to include additional tools if needed.
Explore commitizen as an alternative to conventionalcommits.org. Either could be suitable. commitizen seems to have some other nice release process management features. 👀
It looks great, but is it not too big of a change at this stage?
A bit of work and a lot of testing ahead, but optimistic many of the requirements are already mostly addressed. Let me know if you have feedback or corrections.
Thank you so much. Truly appreciate it!!! ❤️