lightning icon indicating copy to clipboard operation
lightning copied to clipboard

`pre-commit` hooks demo

Open s373nZ opened this issue 9 months ago • 12 comments

This PR expands upon #7884 to demo some potential DX improvements through adding pre-commit hooks. It contains the following features:

  • Run shellcheck.
  • Re-implements make check-amount-access as a local pygrep hook.
  • ~Implements a clang-format hook to fail on suggestion warnings with additional configuration to sort includes.~
  • Re-implements make check-discouraged-functions as a local pygrep hook.
  • Runs codespell to check for common spelling errors.
  • Runs check-jsonschema to validate schemas and metaschemeas in the doc/ directory.
    • Last checked, this exposed a few invalidations in a few of the the schema definitions, and seems particularly useful.
  • Pretty-formats the JSON schemas
  • ~Checks Git commit message conforms to Conventional Commits standard, using a best-guess prototype for the list of application scopes.~

Many of these experiments output a lot of errors/warnings against the existing code. Generally, pre-commit should be only checking the current changeset and provoking the developer to address issues incrementally. Interested parties can run specific hooks on the entire codebase with pre-commit run --all-files [hook-id]

Particular checks and standards may not be applicable or aligned with the workflow of the maintainers. I'd be happy and interested to break this work into separate PRs for specific hooks where there is interest, or explore adding more.

Relates to #7765.

Checklist

Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:

  • [x] The changelog has been updated in the relevant commit(s) according to the guidelines.
  • [x] Documentation has been reviewed and updated as needed.
  • [x] Related issues have been listed and linked, including any that this PR closes.

Changelog-None

s373nZ avatar Apr 01 '25 09:04 s373nZ

@s373nZ hi! noting your recent reference to this PR.... do you plan on returning to this PR anytime soon? I'm doing some spring cleaning of our open PRs

madelinevibes avatar Dec 08 '25 05:12 madelinevibes

Hi @madelinevibes! :wave: Updated this by rebasing against master. Happy to return to it if parts seem useful, but don't want to make too many assumptions about which checks would be acceptable to the team.

Marking this ready for review and removing the [WIP] to better flag it for someone to take a quick look, and hopefully provide initial feedback or further direction on requirements.

s373nZ avatar Dec 08 '25 12:12 s373nZ

Hey! Thank you for this awesome PR! I had been running this heavily vibecoded (and very verbose) precommit hook to fix up my nits but this looks very neat and tidy! The repos used for the most part also seem reliable and maintained. I would also add whitespace fixing and flake8 fixing in this PR as well, I find that most of the time my prebuild checks fail because of whitespace and flake8 errors :(

sangbida avatar Dec 08 '25 23:12 sangbida

Also, have you checked how long it takes to run this? Just wondering because we're making a fair few external requests.

sangbida avatar Dec 09 '25 01:12 sangbida

Rebased against master and added trailing-whitespace and end-of-file-fixer. Also responded to other review comments.

Also, have you checked how long it takes to run this? Just wondering because we're making a fair few external requests.

First run ALL FILES (fresh install):

pre-commit clean
time pre-commit run --all-files
Executed in   37.23 secs    fish           external
   usr time   47.16 secs    0.33 millis   47.16 secs
   sys time   22.12 secs    1.19 millis   22.12 secs

Regular run ALL FILES:

time pre-commit run --all-files
Executed in   14.27 secs    fish           external
   usr time   35.73 secs    0.00 micros   35.73 secs
   sys time   19.07 secs  759.00 micros   19.07 secs

Regular run on common/* (a supposedly massive change set):

time pre-commit run --files common/*
Executed in    1.43 secs    fish           external
   usr time    6.85 secs    1.15 millis    6.85 secs
   sys time    2.31 secs    1.05 millis    2.31 secs

Regular run on cli/* (a supposedly more minimal change set):

time pre-commit run --files cli/*
Executed in  574.59 millis    fish           external
   usr time  461.00 millis    0.03 millis  460.98 millis
   sys time  399.49 millis    1.02 millis  398.47 millis

During a commit, pre-commit only runs on the files in the changeset, so most of the time it is processing very few files. Hooks are installed into the system at ~/.cache/pre-commit so we aren't making many external requests except for the first run.

s373nZ avatar Dec 09 '25 13:12 s373nZ

It's always good to have an opinionated list. The downside is, we might all disagree with that list!

In particular, the conventionalcommits.org is crap. I hate "chore:" as a prefix. It's non-informative.

Prefixes should refer to subsystems, e.g:

  • lightningd: xxx
  • plugins: xxx
  • askrene: xxx
  • pytest: xxx
  • common: xxx
  • docs: xxx

If there's no logical subsystem, I sometimes omit it, and sometimes use "global". But usually I talk about what system, or subdaemon, or plugin is hit (even if, as a side effect, other code needs to be adapted!).

rustyrussell avatar Dec 10 '25 03:12 rustyrussell

Will this be integrated into the CI, or is it intended only for local testing? My local run is currently failing on almost all steps :(.

ShahanaFarooqui avatar Dec 10 '25 03:12 ShahanaFarooqui

  • Rebased against master.
  • REMOVED the conventional-commit and clang-format hooks because they are highly experimental and will likely be far more intrusive than helpful for now. Just let me know if you want them back! Having clang-format active is like opting in to fix every C file you touch and there are so many failures, I imagine it to be really annoying. Could be good to reintegrate after some configuration tuning or in a separate initiative. See below for conventional commits rationale.
  • Replaced shellcheck with shellcheck-py to avoid dependency on Docker per @sangbida's feedback.
  • Added @sangbida's script in https://github.com/ElementsProject/lightning/pull/8193#discussion_r2604904006 to tools/fix-style-errors (still uses clang-format FYI).
  • Added pre-commit to the dev group in pyproject.toml so it's installed along with other Python dependencies during a new development environment setup.
  • Added a section to the Contributor Workflow documentation to describe pre-commit and how to opt-in / install it to a local repo.
  • Replied to all the review feedback, any unresolved comments are awaiting your resolution, feedback or direction.

It's always good to have an opinionated list. The downside is, we might all disagree with that list!

@rustyrussell this list was a best-guess demo derived by scanning a few pages of the commit history in April and configuring based on observed conventions.

In particular, the conventionalcommits.org is crap. I hate "chore:" as a prefix. It's non-informative.

Noted, and removed it as a commit linter. The prior configuration should have required chore(subsystem) or fix(subsystem), which isn't what you're looking for. Would probably be way more annoying than helpful for you, as well as new contributors.

Prefixes should refer to subsystems

If the team wanted to formalize conventions for commit messages and add them to the developer documentation, it may be possible to leverage the Conventional Commit hook to enforce it. Although, it might not make sense to bend that tool too far and consider a customized implementation.

Will this be integrated into the CI, or is it intended only for local testing? My local run is currently failing on almost all steps :(.

@ShahanaFarooqui Definitely not ready for CI yet, IMO. An idea for a "mini road map" toward this might be something like:

  1. Reach internal consensus on which hooks to accept for now in this PR.
  2. Merge this PR.
  3. Open follow up issues / PRs to address outstanding failures for the accepted hooks (spelling, formatting, schema etc).
  4. Interested developers opt-in and test-drive it daily to discover bugs, quirks and improvement ideas.
  5. When all the checks are passing using --all-files (PRs in 3 addressed) and deemed non-intrusive, integrate into CI alongside existing checks.
  6. Incrementally replace the checks in make check with pre-commit hook equivalents.

I would be interested / curious to help out with the follow ups in 3 to start with.

s373nZ avatar Dec 10 '25 13:12 s373nZ

Hey! Here's a list of requirements that @ShahanaFarooqui and I came up with:

Must-haves:

  • Compatible with the existing codebase: if code passes prebuild checks, it should also pass the pre-commit hook.
  • Opt-in: developers should be able to commit "ugly" code if needed.
  • Detect shell scripting bugs (check-shellcheck).
  • Prevent direct access to amount_msat and amount_sat members (check-amount-access).
  • Flag usage of discouraged functions (check-discouraged-functions).
  • Enforce commit message format with subsystem: prefix.
  • Automatic fixes for:
    • Sorting #include statements in C files (check-hdr-include-order).
    • Formatting C files (clang-format).
    • Python code style (check-python-flake8).
    • JSON schemas (check-fmt-schemas).
    • Trailing whitespace, end-of-file issues, and extra lines.

Nice-to-haves:

  • Spelling fixes.
    • From @ShahanaFarooqui - I attempted to do this locally and after 100+ file fixed, I was just ~25% done. So yes, It's useful but should not be added at this time. We will revisit once the hook is ready to be added in the CI directly.

@s373nZ Let me know what you think?

sangbida avatar Dec 12 '25 05:12 sangbida

@sangbida @ShahanaFarooqui This is a great list. Thanks! I'm happy you're interested in restoring clang-format and commit linting. A few questions:

Slightly concerned about a conflict between the requirements if code passes prebuild checks, it should also pass the pre-commit hook and Automatic fixes for.... If pre-commit performs automatic changes to the code, it will also fail the hook run. Should the automatic fixes be applied prior to the commit, or in a separate process? For example, if clang-format runs, on a C file in the change set, it won't pass the check until it's feedback has been resolved.

Python code style (check-python-flake8).

I assume this means you prefer flake8 over ruff, so I'll remove the existing ruff hook?

Enforce commit message format with subsystem: prefix.

Would you like to supply a list of available subsystems? I can also just take a first guess based on Rusty's and the commit log...

Given the present information, here's my current plan:

  • Restore the clang-format hook, explore removing --dry-run and -Werror flags to accept its changes automatically.
  • Explore checking (and fixing?) the #include sort order outside of clang-format with a custom hook.
    • There is an interesting challenge in that the list of checked files is built dynamically from project's included Makefiles. Looking into a clean way to leverage the list into a hook process.
  • Define which hooks should be run during the manual stage, meaning they aren't run automatically for each commit, but must be invoked explicitly.
    • Could do this for codespell for now, to make it available, but not intrusive.
    • clang-format might also be better used like this; it may be more intrusive than codespell.
  • Explore commitizen as an alternative to conventionalcommits.org. Either could be suitable. commitizen seems to have some other nice release process management features. :eyes:

A bit of work and a lot of testing ahead, but optimistic many of the requirements are already mostly addressed. Let me know if you have feedback or corrections.

s373nZ avatar Dec 12 '25 14:12 s373nZ

@s373nZ Let's start with the following list of subsystem prefixes for now. We can continue refining it iteratively until it's sufficiently mature:

  ---------------
  Daemons:
  ---------------
    channeld
    closingd
    connectd
    gossipd
    hsmd
    lightningd
    onchaind
    openingd
  
  ---------------
  Related:
  ---------------
    bitcoin
    cli
    cln-grpc
    cln-rpc
    db
    wallet
    wire
  
  ---------------
  Extensions:
  ---------------
    plugin-* (any string value after `plugin-` should be allowed like xpay, renepay, askrene...)
    pyln-* (lightning, client, grpc-proto, proto, spec, testing)
    tool-* (reckless, hsmtool, downgrade)
  
  ---------------
  Others:
  ---------------
    ci
    common
    contrib
    devtools
    docs
    docker
    github
    global
    meta
    nit
    nix
    release
    script
    tests

ShahanaFarooqui avatar Dec 12 '25 18:12 ShahanaFarooqui

@sangbida @ShahanaFarooqui This is a great list. Thanks! I'm happy you're interested in restoring clang-format and commit linting.

I "truely" tested clang-format today and they are quite intrusive. Now I am a bit wary of enabling it broadly :).

Slightly concerned about a conflict between the requirements if code passes prebuild checks, it should also pass the pre-commit hook and Automatic fixes for.... If pre-commit performs automatic changes to the code, it will also fail the hook run. Should the automatic fixes be applied prior to the commit, or in a separate process? For example, if clang-format runs, on a C file in the change set, it won't pass the check until it's feedback has been resolved.

My understanding is that we should run all prebuild checks (i.e., will cause the prebuild checks fail) in this stage. Then extra checks like clang-format or codespell can be added as manual-stage hooks so that they don’t block commits. @sangbida, please correct me if I am mistaken.

I assume this means you prefer flake8 over ruff, so I'll remove the existing ruff hook?

AFAIU, both Ruff and flake8 are Python-only tools. I may be mistaken, but if you and @cdecker prefer Ruff, then Ruff it is. I just want to ensure Ruff doesn’t miss any flake8 checks that would cause the hook to pass but the prebuild check to fail.

Would you like to supply a list of available subsystems? I can also just take a first guess based on Rusty's and the commit log...

Shared in my earlier comment above.

Given the present information, here's my current plan: Restore the clang-format hook, explore removing --dry-run and -Werror flags to accept its changes automatically.

Umm..How about we let clang-format and codespell take a back seat as manual hooks for now? Sorry for the mix-up earlier 😄. We can enable them once the codebase is more cleaned up accordingly.

Define which hooks should be run during the manual stage, meaning they aren't run automatically for each commit, but must be invoked explicitly. Could do this for codespell for now, to make it available, but not intrusive. clang-format might also be better used like this; it may be more intrusive than codespell.

Yes, agreed. These two seem like the right candidates for now, but feel free to include additional tools if needed.

Explore commitizen as an alternative to conventionalcommits.org. Either could be suitable. commitizen seems to have some other nice release process management features. 👀

It looks great, but is it not too big of a change at this stage?

A bit of work and a lot of testing ahead, but optimistic many of the requirements are already mostly addressed. Let me know if you have feedback or corrections.

Thank you so much. Truly appreciate it!!! ❤️

ShahanaFarooqui avatar Dec 12 '25 19:12 ShahanaFarooqui