yellowbrick icon indicating copy to clipboard operation
yellowbrick copied to clipboard

Implement black code formatting as pre-commit hook

Open stefmolin opened this issue 3 years ago • 4 comments

This PR fixes #456 which requested a pre-commit hook for black.

I have made the following changes:

  1. Added .pre-commit-config.yaml to configure pre-commit hooks.

Setup and Examples

Since we are deciding how best to have people set up the hook, here is an example of how to do so with the YAML file included in the first commit:

pip install pre-commit
pre-commit install      # run this from the yellowbrick folder after checking out this branch

To start with, I have provided the following hooks:

  • removal of trailing whitespace
  • fix end of file
  • check that the YAML for the config is valid
  • check that the commit won't be adding large files
  • run black

The pre-commit hooks run automatically once you run git commit. Note that you don't have to install black because the first time you run this, pre-commit will build a testing environment for the staged changes (it pulls down the version of black specified in the YAML for you). I installed the hook before making the commits here; below is the output of my first commit:

$ git add .pre-commit-config.yaml 
$ git commit -m "Add .pre-commit-config.yaml with black."
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Initializing environment for https://github.com/psf/black.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/psf/black.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
black................................................(no files to check)Skipped
[feature-pre_commit_config 622c818] Add .pre-commit-config.yaml with black.
 1 file changed, 14 insertions(+)
 create mode 100644 .pre-commit-config.yaml

Please note that you can bypass pre-commit hooks by running git commit -m "Your commit message" --no-verify.

You can use also use pre-commit on specific files even if you haven't changed them yet (i.e., you have nothing to commit yet):

pre-commit run --files yellowbrick/anscombe.py
Screen Shot 2022-07-29 at 7 55 27 PM

More information and options can be found in the pre-commit docs.

TODOs and questions

Still to do:

  • [ ] Add setup information to contributing docs.
  • [ ] Potentially update requirements.txt files (based on discussion with maintainers).
  • [ ] Other changes depending on answers to questions below.

Questions for the @DistrictDataLabs/team-oz-maintainers:

  • [ ] After experimenting with this and reviewing #456, do you want to include the pip install and pre-commit install commands in the contributors doc only? Or do you want to put pre-commit in the requirements.txt file also (note that contributors will still need to run the pre-commit install command)?
  • [ ] If we are going to add pre-commit to the requirements.txt, which one would be appropriate – I see there are a few?
  • [ ] Are there any other checks you would like to include?
  • [ ] Do you want to include the black badge in the README? (You can see it on the black README, if you aren't sure what it looks like).

CHECKLIST

  • [X] Is the commit message formatted correctly?
  • [ ] Have you noted the new functionality/bugfix in the release notes of the next release?
  • [ ] Have you documented your new feature/functionality in the docs?
  • [ ] Have you built the docs using make html?

stefmolin avatar Jul 30 '22 00:07 stefmolin

CC @bbengfort

stefmolin avatar Jul 30 '22 00:07 stefmolin

@bbengfort - have you had a chance to try this out?

stefmolin avatar Aug 11 '22 00:08 stefmolin

@stefmolin thanks for checking back in -- unfortunately, I haven't had a chance to take a look until just now; I was on holiday last week and it's been a busy week catching up again; so I definitely appreciate you're bumping this back to the top of my attention!

It looks like this is pretty straightforward - it's just an additional configuration in the repo; if contributors install pre-commit then this will run, but if they don't, then the configuration file is ignored. I think this is a really good compromise because the routine contributors can use it to check their work but it won't block newcomers from making contribs.

Based on this idea, I propose the following answers to your questions:

  1. Let's include the pip install and pre-commit install commands in the contributors doc and add a commented-out line with the version in requirements.txt; e.g. similar to the optional dependencies comment in that file. That will ensure contributors will install the correct version of pre-commit but make it optional.
  2. I assume the list of hooks is here: https://pre-commit.com/hooks.html? I've added a few that I think would be good below.
  3. Sure, I'm happy to add a black badge to the README if you think that would be helpful.

Other useful checks:

  • check-json (our data files for downloading datasets are in json)
  • check-merge-conflict (I was surprised at how often this happens)
  • rst-backticks, rst-directive-colons, and rst-inline-touching-normal would helps with docs
  • gitleaks could be useful, though I'm not sure how well it works

Could we also add a GitHub Action check for pre-commit this would not be a required check for merging but would signal to the maintainers that code is correctly formatted in a PR.

Thank you so much for taking this issue on and for your thoughtful and complete PR!

bbengfort avatar Aug 11 '22 01:08 bbengfort

Codecov Report

Merging #1269 (72ff6e7) into develop (b475614) will increase coverage by 0.13%. The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop    #1269      +/-   ##
===========================================
+ Coverage    90.75%   90.88%   +0.13%     
===========================================
  Files           93       93              
  Lines         5300     5301       +1     
===========================================
+ Hits          4810     4818       +8     
+ Misses         490      483       -7     
Impacted Files Coverage Δ
yellowbrick/model_selection/validation_curve.py 98.14% <0.00%> (+0.03%) :arrow_up:
yellowbrick/classifier/threshold.py 99.35% <0.00%> (+0.64%) :arrow_up:
yellowbrick/cluster/icdm.py 95.27% <0.00%> (+0.78%) :arrow_up:
yellowbrick/regressor/residuals.py 90.56% <0.00%> (+0.94%) :arrow_up:
yellowbrick/regressor/alphas.py 95.60% <0.00%> (+1.09%) :arrow_up:
yellowbrick/model_selection/learning_curve.py 100.00% <0.00%> (+1.88%) :arrow_up:
yellowbrick/regressor/prediction_error.py 96.72% <0.00%> (+3.27%) :arrow_up:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Aug 11 '22 01:08 codecov[bot]

Thanks for the feedback @bbengfort – I hope the holiday was restorative! I'll update with the additional checks you added. I have not used the gitleaks one, so I can't speak to its effectiveness; however, I think as a general rule you don't want to add too many checks either as they run in series and each one will add a little bit of time to execute. I also noticed that you have flake8 in your setup.cfg, so I will add that as well.

Sure, I'm happy to add a black badge to the README if you think that would be helpful.

I think it's nice to let potential contributors know what they can expect as far as coding standards you follow. If it helps you decide, scikit-learn has the badge in their README.

Could we also add a GitHub Action check for pre-commit this would not be a required check for merging but would signal to the maintainers that code is correctly formatted in a PR.

I haven't tried this personally, but I will take a look.

stefmolin avatar Aug 12 '22 01:08 stefmolin

It was restorative thank you! If you think we shouldn't add extra checks then let's skip gitleaks; thanks for adding flake8!

bbengfort avatar Aug 12 '22 19:08 bbengfort

I took a look at gitleaks, and you will either need to have Go or Docker installed, so there is a bit of overhead on that one – I left it off for now. To add flake8, I also had to update the flake8 config in setup.cfg, so that it would be compatible with black (more info here).

I updated the contributing docs and the requirements.txt file – let me know if those need any changes.

I'm going to experiment with the GitHub Action in my fork. Other than that, the only remaining thing is the black badge to the README: have you decided whether you want to include it?

stefmolin avatar Aug 13 '22 01:08 stefmolin

I got the hooks working in GitHub actions. I set it to only look at files that changed in the PR, which made it much more difficult to configure, but otherwise, I imagine every PR would fail the check until every file was cleaned up :smile:

You can see that the CI check for "Linting" was performed for this PR, and it ran the checks for everything but JSON and Python files (output here) since there weren't any of those files here. To test, I purposely introduced some formatting issues in a file for black to find in my fork: you can see the results of that test (which fails since it finds those issues) here. If the checks fail, it will indicate what was changed so you can address it.

I added the black badge to the README as well.

Have you noted the new functionality/bugfix in the release notes of the next release?

Should I include an entry in the release notes for this PR?

stefmolin avatar Aug 13 '22 03:08 stefmolin

@stefmolin never mind - I was able to get the PR merged in; no need to sync to develop!

bbengfort avatar Aug 17 '22 12:08 bbengfort

@stefmolin oops, the linting GitHub actions check failed when I merged into develop :-) -- I know you included failures to demonstrate how the GitHub action worked; did I merge prematurely?

https://github.com/DistrictDataLabs/yellowbrick/runs/7878847521?check_suite_focus=true

bbengfort avatar Aug 17 '22 12:08 bbengfort

@bbengfort - I'm happy I was able to help! It wasn't a premature merge, as those tests where in a separate branch in my fork. I think I know what is going on: the additional logic is to perform a diff across branches in a PR, but merging a branch in won't have those fields pulled in with the GitHub Actions. I'm learning about how to configure these actions through this PR, so it's a learning experience for me as well 😄

The linting test should only run on PRs, which I believe needs a separate workflow config. I will put in a new PR for that.

This brings up another question though: do you want to perform the linting checks on the whole repo upon merging? We can also add a badge to the README that displays whether the code base is currently passing the linting checks.

stefmolin avatar Aug 17 '22 13:08 stefmolin

@stefmolin we're all about learning so I'm glad you're having a productive/applied learning experience :-). The new workflow makes sense to me - I agree that we really only want to run it on pull requests, not on branches or tags. I think the linting/pre-commit feature should be an optional feature for routine contributors to ensure they're submitting high-quality code; if we guard the whole codebase, even if it's optional, that big red X will appear on the main page and that doesn't look good to people who are thinking about using YB as a dependency.

bbengfort avatar Aug 17 '22 14:08 bbengfort

@bbengfort - Thanks for the quick response. I will put in a PR for these changes then :smile:

stefmolin avatar Aug 17 '22 16:08 stefmolin

@bbengfort - I implemented the changes discussed in https://github.com/DistrictDataLabs/yellowbrick/pull/1276.

stefmolin avatar Aug 17 '22 17:08 stefmolin