Baobab
Baobab copied to clipboard
Standardise and configure formatting, linting, and type checking
Currently in Baobab there is no standard formatting applied to the project which has resulted in each developer applying their personal preferences, their IDE defaults or what they configured themselves. This results in a code base that's harder to read and makes PRs diffs unnecessarily larger than they need to be which distracts from the logical code changes that were made.
Additionally given that we have many volunteers that have never worked a large scale software development project before, it's also important that we add some guardrails to the project (with linters, type checkers, etc) to point out common stylistic and syntactical errors and even potentially insecure code usages. Even for the more experienced, this is still very important to have since it just makes development easier and saves time (extra important since people are volunteering).
The proposed stack to solve the above would be as follows
Formatters
black for Python prettier for JavaScript
Linters
flake8 for Python along with the below plugins
- bugbear for additional sylistic warnings to flake8
- builtins to make sure builtins aren't used as arguments
- comprehensions to suggest better comprehensions
- fixme to flag temp dev notes like TODOs
eslint for JavaScript
Type Checkers
mypy for Python. This one would require us to start using type hints though. Not sure if this would add more or less complexity for a junior developer. flow for JavaScript
Security
bandit for Python to check for common security issues in the code
- flake8-bandit to integrate with flake8 to centralise the warning messages
Now to automatically setup and apply the above to the project, I propose to apply these libraries at three levels, at the IDE, Git commit time, and finally on the CI build. I know it's a little bit of overkill :)
Level 1: IDE
This would arguably be the most important time to flag the warnings/suggestions as the IDE can give immediate feedback inline. So we would need to override whatever the person has setup and apply workspace/project level settings to the project (by committing them to the repo so no manual configuration necessary for the developer). The problem is everyone uses a different IDE so perhaps we can prioritise setup according to the most common IDEs or perhaps just maybe ask our current set of volunteers what they use. @SMHall94 do you think we can ask them or would it just annoy people? Based on what I've seen people use and what I've read I think setting up VS Code and PyCharm would cover most people.
Level 2: Git Hook
At commit time, the checks should be run again in case they changed their IDE settings or used something else entirely and if it fails the commit will be rejected with a message indicating the problems that need to be fixed (with many of the hooks even doing the fix for you). With pre-commit we should be able to run all of the above (I still need to double check the JS ones but I don't see why not) with some nice additonal hooks like checking for merge conflicts, large files, private keys, etc.
Level 3: Add checks to the CI build
Then finally again we can run the checks during our CI build (I've checked they should all be possible). If the developer didn't try to bypass the IDE and git hooks, then these should always pass which should hopefully reduce the number for review cycles required. On this note though, what does everyone think about switching to GitHub Actions? I've played around with it and I'm really liking it. And I think it would be nice to centralise everything within the GitHub ecosystem. @avishkar58 I saw you did start to setup something up in #618 and #655
Let me know what everyones thoughts are on this solution. I think @jaderabbit and @avishkar58 can you let me know your thoughts on the Python libraries and @avishkar58 and @KaleabTessera on the JS ones
Also I think waiting until we move to Python 3 (especially the mypy checks) would be best so dependent on #454
Thanks for describing this so comprehensively! I did start adding linting and formatting but ran into issues with python2.7 support, so decided to wait until we upgrade
On Wed, 15 Apr 2020, 12:01 amritpurshotam, [email protected] wrote:
Currently in Baobab there is no standard formatting applied to the project which has resulted in each developer applying their personal preferences, their IDE defaults or what they configured themselves. This results in a code base that's harder to read and makes PRs diffs unnecessarily larger than they need to be which distracts from the logical code changes that were made.
Additionally given that we have many volunteers that have never worked a large scale software development project before, it's also important that we add some guardrails to the project (with linters, type checkers, etc) to point out common stylistic and syntactical errors and even potentially insecure code usages. Even for the more experienced, this is still very important to have since it just makes development easier and saves time (extra important since people are volunteering).
The proposed stack to solve the above would be as follows Formatters
black https://black.readthedocs.io/en/stable/ for Python prettier https://github.com/prettier/prettier for JavaScript Linters
flake8 https://flake8.pycqa.org/en/latest/ for Python along with the below plugins
- bugbear https://github.com/PyCQA/flake8-bugbear for additional sylistic warnings to flake8
- builtins https://github.com/gforcada/flake8-builtins to make sure builtins aren't used as arguments
- comprehensions https://github.com/adamchainz/flake8-comprehensions to suggest better comprehensions
- fxime https://github.com/tommilligan/flake8-fixme to flag temp dev notes like TODOs
eslint https://github.com/eslint/eslint for JavaScript Type Checkers
mypy https://github.com/python/mypy for Python. This one would require us to start using type hints though. Not sure if this would add more or less complexity for a junior developer. flow https://github.com/facebook/flow for JavaScript Security
bandit https://github.com/PyCQA/bandit for Python to check for common security issues in the code
- flake8-bandit https://github.com/tylerwince/flake8-bandit to integrate with flake8 to centralise the warning messages
Now to automatically setup and apply the above to the project, I propose to apply these libraries at three levels, at the IDE, Git commit time, and finally on the CI build. I know it's a little bit of overkill :) Level 1: IDE
This would arguably be the most important time to flag the warnings/suggestions as the IDE can give immediate feedback inline. So we would need to override whatever the person has setup and apply workspace/project level settings to the project. The problem is everyone uses a different IDE so perhaps we can prioritise setup according to the most common IDEs https://insights.stackoverflow.com/survey/2019#technology-_-most-popular-development-environments or perhaps just maybe ask our current set of volunteers what they use. @SMHall94 https://github.com/SMHall94 do you think we can ask them or would it just annoy people? Level 2: Git Hook
At commit time, the checks should be run again in case they changed their IDE settings or used something else entirely. With pre-commit https://pre-commit.com/ we should be able to run all of the above (I still need to double check the JS ones but I don't see why not) with some nice additonal hooks https://github.com/pre-commit/pre-commit-hooks like checking for merge conflicts, large files, private keys, etc. Level 3: Add checks to the CI build
Then finally again we can run all of the above as checks during our CI build (I've checked they should all be possible). If they didn't try to bypass the IDE and git hooks, then these should always pass. On this note though, what does everyone think about switching to GitHub Actions? I've played around with it and I'm really liking it. And I think it would be nice to centralise everything within the GitHub ecosystem. @avishkar58 https://github.com/avishkar58 I saw you did start to setup something up in #618 https://github.com/deep-learning-indaba/Baobab/pull/618 and #655 https://github.com/deep-learning-indaba/Baobab/pull/655
Let me know what everyones thoughts are on this solution. I think @jaderabbit https://github.com/jaderabbit and @avishkar58 https://github.com/avishkar58 can you let me know your thoughts on the Python libraries and @avishkar58 https://github.com/avishkar58 and @KaleabTessera https://github.com/KaleabTessera on the JS ones
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/deep-learning-indaba/Baobab/issues/664, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFV53AFZLTGR5SGWGOMJ5LRMWHYPANCNFSM4MIPHRGA .
This is excellent! To answer your question, @amritpurshotam I don't think people will be annoyed. It might be worth sending out a "contributor survey" - we can have people indicate their preferred IDE and then use it as a tally. We can mix the question in with other feedback questions we might find useful (such as feedback on our documentation etc. - similar to the Hackathon feedback form)
I will start drafting some questions and share them for review and further brainstorming.
To reduce the burden of supporting multiple configurations we could add a recommended IDE to our docs (Visual Studio Code?) and start with configuration files for that, and tell people if they want to use another IDE they should try to configure it similarly
Okay sure that sounds great @SMHall94 !
Yeah I think Visual Studio Code too unless anyone else feels differently. I think I've pretty much figured those settings out for VS Code so that should be good to go. I just need to figure out some details around Docker but don't think that should be too complicated. And if we're settling on a recommended IDE, I think it would be useful to also include a selection of extensions to use with it (~these will obviously have to be in a separate section for manual setup for the person~ seems we can commit them to the repo). For example I use the following
- Python (obviously :P)
- GitLens
- Docker
- autoDocstring (we'll need to settle on a format here then)
- Bracket Pair Colorizer (optional)
@avishkar58 Do you have any suggestions for extensions that would be useful for the JavaScript/ReactJS?
I started a document in case we decide to go through with the contributor feedback form :)
Should add standardisation of line endings as well to the project. See commits in #886 where line endings had to be manually updated to make sure the diffs were correct.
@smhall97 @amritpurshotam You two okay if I give this a go?
@KaleabTessera sure it's fine with me
@KaleabTessera That sounds good to me!