NeuroKit [Feature] Simplify code checks for contributors

[Feature] Simplify code checks for contributors

Open rostro36 opened this issue 1 year ago • 4 comments

Describe the solution you'd like Currently, to run the code checks, there are five lines that need to be run per file/directory. This is a bit clunky, especially for many files and it was also mentioned before in #321 .

This should not make a difference to the whole actions in GitHub or installation of the package for users.

These scripts are what I was talking about:

isort myfile.py -l 120 --balanced --multi-line 3 --lines-between-types 1 --lines-after-imports 2 --trailing-comma black myfile.py --line-length 120 docformatter myfile.py --wrap-summaries 120 --wrap-descriptions 113 --blank --in-place

flake8 myfile.py --max-line-length=127 --max-complexity=10 --ignore E303,C901,E203,W503 pylint myfile.py --max-line-length=127 --load-plugins=pylint.extensions.docparams --load-plugins=pylint.extensions.docstyle --variable-naming-style=any --argument-naming-style=any --reports=n --suggestion-mode=y --disable=E303 --disable=R0913 --disable=R0801 --disable=C0114 --disable=E203 --disable=E0401 --disable=W9006 --disable=C0330 --disable=R0914 --disable=R0912 --disable=R0915 --disable=W0102 --disable=W0511 --disable=C1801 --disable=C0111 --disable=R1705 --disable=R1720 --disable=C0301 --disable=C0415 --disable=C0103 --disable=C0302 --disable=R1716 --disable=W0632 --disable=E1136 --extension-pkg-whitelist=numpy

How could we do it? Doing a little bit of research, I found two kinds of solutions:

hook e.g. pre-commit or git hook
- ✔️ Pro - Can't forget as it is automatic - Automatically only looks at differences - Industry standard
- ❌ Contra
  - Gets executed every time, even for the smallest commits and can take minutes of waiting
  - Denies action (e.g. commit) if there is an error (I assume this can be disabled). This is especially problematic if the code already has legacy errors not from the new commit. This might frustrate contributors not used to it.
  - It is rather a side effect, that may not be too easy to understand.
script e.g. bash
- ✔️ Pro
  - Easier to understand for beginners
  - Still allows the action to go through and can be fixed later until found during merge checks.
- ❌ Contra
  - Easier to forget and also contributor might not check for all files.
  - Surely not elegant.
  - There is a reason why commit hooks are standard, that I might have not considered.

The main trade-off I found is about accessibility. If you have time to understand and learn about hooks, then they deliver the more automatic end product. I think in this scenario, the bar of entry should be kept as low as possible and the script solution is more accessible; problems are spotted during merge and can be fixed then.

Let me know your thoughts and I can look into it.

Aug 10 '22 18:08 rostro36

Hi 👋 Thanks for reaching out and opening your first issue here! We'll try to come back to you as soon as possible. ❤️ kenobi

Aug 10 '22 18:08 welcome[bot]

I agree that running the checks locally is clunky. However, to be honest, I don't even run them locally anymore myself: I just check the checks run by the GH actions in the PRs.

So I wonder wether the easiest route would not be to change the documentation to remove the instructions about running checks locally, but simply say, in essence, "open a PR and then keep an eye on the checks that will be run and adjust accordingly".

Or do you see some other reasons to have users to run checks before contributing?

Aug 11 '22 00:08 DominiqueMakowski

(also, note that I'm not very familiar with all Python/dev continuous integration at all so any opinion from someone with a comp. sci. background is more than welcome ☺️)

Aug 11 '22 00:08 DominiqueMakowski

The main goal surely is to have good code quality, which is achieved with the actions. My question now is how to make it as painless as sensible for the contributor.

A contributor always has to push and wait for the GitHub actions, which can be a bit annoying if there are many (simple) mistakes and adds up the amount of commits. (Also with some rather awkward commits.) The last effect can be squashed together though. Doing it locally can go much faster and more precise, so it might not check every file ever and also can skip some of the longer running tests if needed.

Also, I think personally it is good to at least mention it in the contributor docs and not be greeted out of nowhere with some linting error, without ever hearing of linting. This is also good to hear/learn about in general.

Further, GitHub actions have a monthly free quota, which, if overstepped, cost a bit of money. I don't know though if that is ever going to be a problem here.

Lastly, if the action fails because of some other file from a previous commit, the action might fail before it even goes to the latest changed file (which should not happen, but could).

Removing the documentation surely is the easiest route, but I think changing the essence of the documentation to "Here are a bunch of checks, they do that and that. Try to run them, if something fails, push it and others help you out." would be better, but maybe I am not the person to optimize for with already having a Masters in CS and some limited working experience.

Here are some strategies of other Python projects as reference:

Numpy → linting script
Scipy → optional(?) linting script
pandas → hooks
sklearn point 9. → optional hook
Django → hook

Aug 11 '22 19:08 rostro36

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

Oct 14 '22 02:10 stale[bot]

NeuroKit NeuroKit copied to clipboard

[Feature] Simplify code checks for contributors

NeuroKit
NeuroKit copied to clipboard