Move linting from CI to a Git hook
Rather than to prompt the developer about linting issues post CI and then having them fix the issues we could always run the formatter prior to a commit for a better developer experience.
This would be especially benefit for work that goes on in branches for a while before reaching master, for newcomers, and for people outside of the OpenRefine org as they do not trigger CI automatically.
As a bonus it would speed up CI.
A downside would be that in addition to git clone users would need to run git config to get started developing.
We have another issue (somewhere) that suggests to use IDE plugins with style guides which could be preferred and easier setup for devs?
That is a nice idea, but I would be slightly worried that the hook would take quite long (because it involves invoking maven, which is a beast). So I guess I would want to wrap that in a shell script which:
- only invokes Maven if there are staged changes to java files
- hides the flood of log output, only showing a relevant error if there is any (perhaps it is enough to hide stdout and keep stderr?)
As a first step before imposing the hook to everyone, I would be in favour of having a few contributors try it out on their own, publish the hook in the developer docs (so that people can adopt it). Only once we are happy with the developer experience I would consider shipping it by default (and even then, I think I would keep in the CI, since people might have older clones without the hook and it's taking a small amount of time compared to the rest).
A downside would be that in addition to git clone users would need to run git config to get started developing.
What would the git config step look like?
I have come up with the following git hook (to be added as .git/hooks/pre-commit, with executable flag):
(note: script edited to take into account some problems mentioned below)
#!/bin/bash
set -eu
if git status -s -uno | grep "\.java$" > /dev/null; then
echo "Linting java files, please wait..."
mvn formatter:format -q
git status -s -uno | sed -e 's/^...//g' | grep "\.java$" | xargs git add
fi
When no changes to Java files have been made, it is instant, otherwise it takes about 20 seconds (if the cache of the linter has not been previously cleared by a mvn clean).
Another possibility would be to have some automation built around GitHub (as a GitHub Action or App, for instance), to offer people to lint directly from the pull request discussion. For instance, by automatically pushing a linting commit on top of a PR. Not sure if this is really helpful because adding commits on top of a PR branch can be seen as rude (the PR author will need to pull from their branch before they make further changes, or potentially have to solve conflicts).
If we go down the Git hook route, then we could also install the hook automatically in .git/hooks via the ./refine script the first time it is run (which is apparently a common practice). This avoids the need for people to type a git config command manually.
I think the git hooks option is the best from a developer experience perspective, particularly if the hook can automatically be installed. From my experience, this kind of thing can and should happen with no input from the developer, which the hook approach achieves.
On a technical note, I don't think the -P grep flag is available on macOS. It complains when I run the script on my machine, and I've checked the Linux and macOS man pages. It's clearly listed on the former but there's no sign of it on the latter.
Thanks a lot for trying it out!
Actually I think the -P flag is not needed here, let me remove it (by editing the post above).