yellowbrick
yellowbrick copied to clipboard
Implement black code formatting as pre-commit hook
This PR fixes #456 which requested a pre-commit hook for black.
I have made the following changes:
- Added
.pre-commit-config.yamlto configure pre-commit hooks.
Setup and Examples
Since we are deciding how best to have people set up the hook, here is an example of how to do so with the YAML file included in the first commit:
pip install pre-commit
pre-commit install # run this from the yellowbrick folder after checking out this branch
To start with, I have provided the following hooks:
- removal of trailing whitespace
- fix end of file
- check that the YAML for the config is valid
- check that the commit won't be adding large files
- run
black
The pre-commit hooks run automatically once you run git commit. Note that you don't have to install black because the first time you run this, pre-commit will build a testing environment for the staged changes (it pulls down the version of black specified in the YAML for you). I installed the hook before making the commits here; below is the output of my first commit:
$ git add .pre-commit-config.yaml
$ git commit -m "Add .pre-commit-config.yaml with black."
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Initializing environment for https://github.com/psf/black.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/psf/black.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
black................................................(no files to check)Skipped
[feature-pre_commit_config 622c818] Add .pre-commit-config.yaml with black.
1 file changed, 14 insertions(+)
create mode 100644 .pre-commit-config.yaml
Please note that you can bypass pre-commit hooks by running git commit -m "Your commit message" --no-verify.
You can use also use pre-commit on specific files even if you haven't changed them yet (i.e., you have nothing to commit yet):
pre-commit run --files yellowbrick/anscombe.py
More information and options can be found in the pre-commit docs.
TODOs and questions
Still to do:
- [ ] Add setup information to contributing docs.
- [ ] Potentially update
requirements.txtfiles (based on discussion with maintainers). - [ ] Other changes depending on answers to questions below.
Questions for the @DistrictDataLabs/team-oz-maintainers:
- [ ] After experimenting with this and reviewing #456, do you want to include the
pip installandpre-commitinstall commands in the contributors doc only? Or do you want to putpre-commitin therequirements.txtfile also (note that contributors will still need to run thepre-commit installcommand)? - [ ] If we are going to add
pre-committo therequirements.txt, which one would be appropriate – I see there are a few? - [ ] Are there any other checks you would like to include?
- [ ] Do you want to include the
blackbadge in the README? (You can see it on theblackREADME, if you aren't sure what it looks like).
CHECKLIST
- [X] Is the commit message formatted correctly?
- [ ] Have you noted the new functionality/bugfix in the release notes of the next release?
- [ ] Have you documented your new feature/functionality in the docs?
- [ ] Have you built the docs using
make html?
CC @bbengfort
@bbengfort - have you had a chance to try this out?
@stefmolin thanks for checking back in -- unfortunately, I haven't had a chance to take a look until just now; I was on holiday last week and it's been a busy week catching up again; so I definitely appreciate you're bumping this back to the top of my attention!
It looks like this is pretty straightforward - it's just an additional configuration in the repo; if contributors install pre-commit then this will run, but if they don't, then the configuration file is ignored. I think this is a really good compromise because the routine contributors can use it to check their work but it won't block newcomers from making contribs.
Based on this idea, I propose the following answers to your questions:
- Let's include the
pip installandpre-commitinstall commands in the contributors doc and add a commented-out line with the version inrequirements.txt; e.g. similar to the optional dependencies comment in that file. That will ensure contributors will install the correct version ofpre-commitbut make it optional. - I assume the list of hooks is here: https://pre-commit.com/hooks.html? I've added a few that I think would be good below.
- Sure, I'm happy to add a black badge to the README if you think that would be helpful.
Other useful checks:
check-json(our data files for downloading datasets are in json)check-merge-conflict(I was surprised at how often this happens)rst-backticks,rst-directive-colons, andrst-inline-touching-normalwould helps with docsgitleakscould be useful, though I'm not sure how well it works
Could we also add a GitHub Action check for pre-commit this would not be a required check for merging but would signal to the maintainers that code is correctly formatted in a PR.
Thank you so much for taking this issue on and for your thoughtful and complete PR!
Codecov Report
Merging #1269 (72ff6e7) into develop (b475614) will increase coverage by
0.13%. The diff coverage isn/a.
@@ Coverage Diff @@
## develop #1269 +/- ##
===========================================
+ Coverage 90.75% 90.88% +0.13%
===========================================
Files 93 93
Lines 5300 5301 +1
===========================================
+ Hits 4810 4818 +8
+ Misses 490 483 -7
| Impacted Files | Coverage Δ | |
|---|---|---|
| yellowbrick/model_selection/validation_curve.py | 98.14% <0.00%> (+0.03%) |
:arrow_up: |
| yellowbrick/classifier/threshold.py | 99.35% <0.00%> (+0.64%) |
:arrow_up: |
| yellowbrick/cluster/icdm.py | 95.27% <0.00%> (+0.78%) |
:arrow_up: |
| yellowbrick/regressor/residuals.py | 90.56% <0.00%> (+0.94%) |
:arrow_up: |
| yellowbrick/regressor/alphas.py | 95.60% <0.00%> (+1.09%) |
:arrow_up: |
| yellowbrick/model_selection/learning_curve.py | 100.00% <0.00%> (+1.88%) |
:arrow_up: |
| yellowbrick/regressor/prediction_error.py | 96.72% <0.00%> (+3.27%) |
:arrow_up: |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
Thanks for the feedback @bbengfort – I hope the holiday was restorative! I'll update with the additional checks you added. I have not used the gitleaks one, so I can't speak to its effectiveness; however, I think as a general rule you don't want to add too many checks either as they run in series and each one will add a little bit of time to execute. I also noticed that you have flake8 in your setup.cfg, so I will add that as well.
Sure, I'm happy to add a black badge to the README if you think that would be helpful.
I think it's nice to let potential contributors know what they can expect as far as coding standards you follow. If it helps you decide, scikit-learn has the badge in their README.
Could we also add a GitHub Action check for pre-commit this would not be a required check for merging but would signal to the maintainers that code is correctly formatted in a PR.
I haven't tried this personally, but I will take a look.
It was restorative thank you! If you think we shouldn't add extra checks then let's skip gitleaks; thanks for adding flake8!
I took a look at gitleaks, and you will either need to have Go or Docker installed, so there is a bit of overhead on that one – I left it off for now. To add flake8, I also had to update the flake8 config in setup.cfg, so that it would be compatible with black (more info here).
I updated the contributing docs and the requirements.txt file – let me know if those need any changes.
I'm going to experiment with the GitHub Action in my fork. Other than that, the only remaining thing is the black badge to the README: have you decided whether you want to include it?
I got the hooks working in GitHub actions. I set it to only look at files that changed in the PR, which made it much more difficult to configure, but otherwise, I imagine every PR would fail the check until every file was cleaned up :smile:
You can see that the CI check for "Linting" was performed for this PR, and it ran the checks for everything but JSON and Python files (output here) since there weren't any of those files here. To test, I purposely introduced some formatting issues in a file for black to find in my fork: you can see the results of that test (which fails since it finds those issues) here. If the checks fail, it will indicate what was changed so you can address it.
I added the black badge to the README as well.
Have you noted the new functionality/bugfix in the release notes of the next release?
Should I include an entry in the release notes for this PR?
@stefmolin never mind - I was able to get the PR merged in; no need to sync to develop!
@stefmolin oops, the linting GitHub actions check failed when I merged into develop :-) -- I know you included failures to demonstrate how the GitHub action worked; did I merge prematurely?
https://github.com/DistrictDataLabs/yellowbrick/runs/7878847521?check_suite_focus=true
@bbengfort - I'm happy I was able to help! It wasn't a premature merge, as those tests where in a separate branch in my fork. I think I know what is going on: the additional logic is to perform a diff across branches in a PR, but merging a branch in won't have those fields pulled in with the GitHub Actions. I'm learning about how to configure these actions through this PR, so it's a learning experience for me as well 😄
The linting test should only run on PRs, which I believe needs a separate workflow config. I will put in a new PR for that.
This brings up another question though: do you want to perform the linting checks on the whole repo upon merging? We can also add a badge to the README that displays whether the code base is currently passing the linting checks.
@stefmolin we're all about learning so I'm glad you're having a productive/applied learning experience :-). The new workflow makes sense to me - I agree that we really only want to run it on pull requests, not on branches or tags. I think the linting/pre-commit feature should be an optional feature for routine contributors to ensure they're submitting high-quality code; if we guard the whole codebase, even if it's optional, that big red X will appear on the main page and that doesn't look good to people who are thinking about using YB as a dependency.
@bbengfort - Thanks for the quick response. I will put in a PR for these changes then :smile:
@bbengfort - I implemented the changes discussed in https://github.com/DistrictDataLabs/yellowbrick/pull/1276.