web-client-ui
web-client-ui copied to clipboard
Automatic code formatting for notebooks
As a user, I want automatic code formatting in my notebooks so that my code looks pretty.
I would like to have formatting applied by something like black: https://github.com/psf/black.
Ruff wasm Python formatter and linter:
- [ ] if we are building it, need an action to do so and keep it updated, else adopt an active npm version (https://github.com/dprint/dprint-plugin-ruff has automation to self update and publish new versions, so maybe use that one?)
- [ ] Should be lazy loaded
- [ ] New settings menu section: "Editor" with options: for formatting on save and lint rules (via a code editor with ruff linting rules they have on the playground?)
- [ ] Format button in notebook overflow, and context menu
- [ ] Settings must be configurable in enterprise by a prop so that orgs can enforce which linting rules apply globally
I think this is a great idea. +1 on black.
https://github.com/charliermarsh/ruff is just a linter, but I believe they are working on adding an autoformatter.
The list of projects using it is very impressive, so I can imagine it might replace black in the near future.
Big fan of anything written in rust! Looks like a neat project, I'll definitely keep an eye on it.
https://github.com/charliermarsh/ruff/issues/4041
I believe we have plans to add a Ruff formatter, but it's a ways off. For now, black is still required.
That's right. For now, we don't autofix line wrapping, and we do recommend using Ruff alongside Black.
So, it looks like the devs of ruff like black too; my guess is if they eventually get a Ruff formatter, it will have a black-compatible mode. I think black is still our best bet today.
Black is probably the best choice today, but yeah that could change in the future https://github.com/charliermarsh/ruff/issues/1904
We use prettier for our stuff, so them combining black plus some basic config options like prettier allows would be great.
https://github.com/jpadilla/black-playground https://black.vercel.app
could be useful starting point?
It could be useful to use bidirection plugins to support this use case.
Ruff has a wasm package (have to build it from src) that can be used to format and lint completely client-side
Not something I will currently continue working on (I have other priorities), but I made a quick POC using ruff as wasm. https://github.com/mattrunyon/web-client-ui/tree/ruff-wasm
I built the wasm from source and just included it as part of the branch. Looked at https://github.com/astral-sh/ruff/blob/main/playground/src/Editor/SourceEditor.tsx for how to call the linter/formatter and map the linter results to actions and markers.
I roughly implemented
- Format on save (only on ctrl+s shortcut)
- Linter markers (no debouncing or deferring)
- Code actions for quick fixing
Would still need to cleanup the code and add
- Configuring linter settings and format on save via user settings menu
- Action to add ignore comment for linter rules (noqa comments. See this part of ruff-lsp for some more info)
- Adding a range formatting provider. Probably just format all the lines before the start of the range, then all lines + range and insert the diff
Some issues on my branch
- Duplicate code actions. Providers might be registered for every open editor. For autocomplete this doesn't cause an issue as when 1 provider resolves, the others are ignored I think. But for code actions it seems to call all resulting in duplicated code actions.
- Building prod doesn't copy the wasm file into assets, so it doesn't load properly. The wasm file should probably be part of an assets folder included w/ the console package.
- Code action lightbulb seems to be off. Not sure if we're doing some CSS to move another element that also moves that button
Potentially related would be catching syntax errors on the web UI; I don't know if the code formatting tools considered would expose the appropriate syntax error hints, but that would also be a great addition.
Two questions we should decide an answer on:
- Do we enable format on save by default? (my opinion is yes)
- Are there any linting rules we turn on beyond the default? Probably, but not sure. We should have our Pythonistas look through the available rule sets.
@jnumainville for investigating any additional default rules sets that are commonly enabled
https://docs.astral.sh/ruff/settings/#lintflake8-implicit-str-concat I think we should add this rule by default
t = t.update([
"X=X+1" # missing comma at the end not flagged by other rulesets
"Y=Y+1"
])
a quick survey indicates to me that there isn't much of a consistently chosen ruleset https://github.com/pola-rs/polars/blob/f6b4f48f3f7a981f6d022b5ca8a50b40a467d74c/py-polars/pyproject.toml#L133 https://github.com/jupyter/notebook/blob/a1e25b92bf10ef13a760353837114db9b498f242/pyproject.toml#L243 https://github.com/pandas-dev/pandas/blob/529bcc1e34d18f94c2839813d2dba597235c460d/pyproject.toml#L190 https://github.com/vega/altair/blob/f1b4e2c84da2fba220022c8a285cc8280f824ed8/pyproject.toml#L166 https://github.com/matplotlib/matplotlib/blob/46b39ab5bea6668e56140da568bde60276f6f906/pyproject.toml#L139 https://github.com/scikit-learn/scikit-learn/blob/4449ded95bdc7468f7465a3e89ea1b90b1426dfd/pyproject.toml#L143 https://github.com/pytorch/pytorch/blob/56b271fd7a87e03dc6cb4329a702eb5f5031f3e3/pyproject.toml#L91
We should probably default to no style specific rules and focus only on rules that can show actual problems with the code (like inconsistent tabs/spaces). Or if they are style related, they should be to prevent easy to miss mistakes like the implicit string concatenation. That rule is technically style because it's valid Python.
Another style rule we should consider is warning for unused imports/variables.
Users would be able to add the other rules if they wanted to enforce something like mandatory pep8 naming conventions. We shouldn't enforce those on everyone by default unless we feel that strongly about it.
I've been looking at the ruff rules https://docs.astral.sh/ruff/rules
I think we should enable
- Pyflakes (F). These seems to be all actual errors or potential errors at a language level. Like
if-tuplewhich is always true - pycodestyle E999 syntax-error. Definitely useful
- pycodestyle E100 and E200 levels. These are all addressing extra/inconsistent whitespace and don't appear to be style choices, but things that could cause actual parse errors.
- pycodestyle E711 (none-comparison) might be useful. I'm not sure if
x == Nonecan cause issues in Python or it's just recommended to usex is None. If the latter then we maybe leave it off by default since it's just a style choice. - pycodestyle W291 (trailing-whitespace) may already be covered by the E100/200s
- pycodestyle W293 (blank-line-with-whitespace) may be covered by E100/200s
- pycodestyle W605 (invalid-escape-sequence)
- isort I002 (missing-required-import) I would need a Python person to tell me if this is actually useful or not in DH
- flake8-bugbear B002 (unary-prefix-increment-decrement). Unless this is caught by another syntax error rule
- flake8-bugbear B015 (useless-comparison). Should prevent accidental bugs/errors
- flake8-bugbear B016 (raise-literal). Might be caught by other rules. Prevents a
TypeError - flake8-bugbear B018 (useless-expression). Probably prevent accidental bugs
- flake8-bugbear B020 (loop-variable-overrides-iterator). Prevent accidental bugs
- flake8-bugbear B023 (function-uses-loop-variable). This is DEFINITELY an issue for developers writing dh.ui components with a loop creating components since Python does not create closures over the loop iterations
- flake8-bugbear B032 (unintentional-type-annotation)
- flake8-bugbear B035 (static-key-dict-comprehension). Usually not intended
- flake8-bugbear B909 (loop-iterator-mutation). Can cause infinite loops if you're not careful
- flake8-commas COM818 (trailing-comma-on-bare-tuple). Seems like it would catch accidental bugs
- flake8-implicit-string-concatenation (ISC). I need to check what combo of these rules we would want. There is also a preference (separate from the rules) we probably want on to disable allowing multiline concatenation
- pylint error (PLE). These seem like actual errors. Not sure how much overlap there is with pyflakes.
- Ruff-specific rules RUF001 (ambiguous-unicode-character-string)
- RUF021 (parenthesize-chained-operators). Prevent unclear and/or logic
- RUF027 (missing-f-string-syntax). Warn if missing the f before what looks like an fstring
Maybe enable, but might not be very useful. I need a Python person to provide more insight
- flake8-async
- flake8-bugbear B029 (except-with-empty-tuple). Not sure how common this would be in DH code
- flake8-builtins (A). All prevent shadowing builtins. Not sure if this is an easy trap in Python
- flake8-datetimez (DTZ). I'm not sure if we want to warn about any of these. For example, I'm not sure if omitting a timezone in Python leads to an error when you try to put it into an
Instantin a table or if it auto adds UTC. - flake8-logging (LOG). Idk if this applies to user code in DH at all
- flake8-return (RET503). Not sure if we want to enforce this. Could catch potential issues
- pandas-vet (PD). Not sure if these are just style things or actual mistakes in pandas
- Pylint PLC2401, PLC2403. Prevent somehow getting a non-ascii character in a variable?
- NumPy-specific rules (NPY). Would numpy users care about these?
- Perflint (PERF). Some potential performance gains? I disagree with the rules preferring dict comprehension over a for loop claiming comprehensions are more readable. I think only simple comprehensions are more readable.
Probably should not enable, but individual users might want to use these
- pycodestyle E700 rules are stylistic, but might be something many people want warnings on
- pep8-naming (N). I think these are all stylistic. There's not actual errors or bugs caused by anything here
- flake8-comprehensions (C4). If you use lots of comprehensions I imagine this could point out some optimizations
Formatter settings we should leave as default I think with the following exceptions
indent-widthshould match our monaco width. I think both default to 4quote-style. Default isdouble. We could usepreserve, but probably just leave as double and let users configure
The default is ["E", "F"] which encompasses all the E and F rules, I would propose we only add rules to those, not remove any.
Many of the E rules outside 1xx/2xx/9xx are more stylistic to me. If @jnumainville wants to chime in I'd like his thoughts on the other E rules.
Also, the default is ["E4", "E7", "E9", "F"]. I don't think E4 and E7 (4xx and 7xx) are actual problems with code
To address some specific points -
On one hand E731 is certainly one I don't think we'd want considering the usefulness of lambdas in React style code.
On the other hand:
E402 could be a code problem if you've accidentally imported something after the fact.
E721 is one that could be a problem if you're subclassing.
E722 could definitely be a code problem by swallowing up exceptions that you shouldn't.
More broadly though, the E rules are PEP8 - these are rules that most people are going to be used to and reasonably expect. I don't think it's unreasonable to have those in a basic rule set, even though some are stylistic choices. Or at least E4 and E7 (besides E731) are reasonable. Seems like E1, E2, E3 rules are unstable? E3 are ones I wouldn't argue as being important.
Are we planning to provide an easy way to turn off formatting and turn off linting (likely, two different settings)?
I think @jmao-denver and @chipkent should take a look specifically. I'm hoping that the settings we choose to use for web UI can be adopted by deephaven-core python development, ala https://github.com/deephaven/deephaven-core/issues/3638
Yes they will be toggles either in the notebook settings (like mini-map) or the user settings (like shortcuts).
Linting config will be user customizable. Formatting has a few options as well, but I think it's only indent size and single/double quotes for strings.
For aligning w/ deephaven-core Python development, the web UI might be a subset of the rules you want there. There are some rules that are more preference/style that we might want in deephaven-core, but not necessarily applied to all users of DH
- isort I002 (missing-required-import) https://docs.astral.sh/ruff/rules/missing-required-import/ looks like it is adding imports to the file. I would be concerned about this happening without careful review.
- flake8-async -> yes
- flake8-bugbear B029 -> yes
- flake8-builtins -> would be good to turn on. We may get a bunch of alerts where we missed things.
- flake8-datetimez (DTZ) -> probably not. I think we wrote the code to be flexible with what users do, so I'm sure we have some test cases where this will fail.
- lake8-logging (LOG) -> probably yes.
- flake8-return (RET503) -> RET501-RET503 yes. RET504-RET508 probably not.
- pandas-vet (PD) -> probably yes. Confirm with @jmao-denver
- Pylint PLC2401, PLC2403 -> yes
- NumPy-specific rules (NPY) -> yes
- Perflint (PERF) -> yes, but we may have failures that need to be cleaned up
- pycodestyle -> I would make a style assessment a separate effort since it will have more arguing and contention
- pep8-naming (N) -> I'm very much in favor of turning this on. We have tried to make the API PEP8 compliant and should continue to do this. Having said that, we may get a bunch of alerts that need to be cleaned.
- flake8-comprehensions (C4) -> We use a lot of comprehensions, so this is probably good as a performance audit. Not sure how many alerts we'll get.
DM me if you want me to dig in beyond the list you have.
@chipkent just to clarify the rules list I proposed was for the web UI to show to users in notebooks/terminal. I think most of what you added would probably not be annoying to users, but pep8-naming might be something we leave up to users to add if they want it. Or we can enable it by default and users can disable if they don't care about it. Just depends on what we want the default experience to be in the web UI.
We can also set some rules to warning (yellow squiggly underline) and others to error (red squiggly underline). That would be up to us to determine the defaults as ruff does not assign any default levels to rules.
If we're adding linting rules to deephaven-core then we can definitely be as opinionated as we want and should add things like pep8-naming.
The required imports rule is something you can customize. I think by default it does nothing when on because you have to configure a list of required imports. https://docs.astral.sh/ruff/settings/#lint_isort_required-imports
I see. My bad. Let me reasses the list with that in mind.
- isort I002 (missing-required-import) -> not sure. I'm a bit fuzzy on what this does. You would need to play with it.
- flake8-async -> yes
- flake8-bugbear B029 -> yes
- flake8-builtins -> A001 & A002 yes; A003 maybe
- flake8-datetimez (DTZ) -> no
- lake8-logging (LOG) -> yes.
- flake8-return (RET503) -> RET501-RET503 yes. Others no.
- pandas-vet (PD) -> PD002 no; PD901 no; others maybe with a lean to no. It would depend on how much info warnings provide. Confirm with @jmao-denver
- Pylint -> PLC2401 & PLC2403 I would lean to yes, but I have some concerns that international users might want it off. These are both preview features, so that might make us want to lean to no.
- NumPy-specific rules (NPY) -> yes
- Perflint (PERF) -> I could go either way. I might lean to yes to potentially avoid some perf problems that DH has to track down.
- pycodestyle -> E101, E113, E999 yes. Others are probably too style specific. These should avoid us support costs.
- pep8-naming (N) -> no, but it would be good to have an option to turn this on.
- flake8-comprehensions (C4) -> I lean to yes. I think this could save support time chasing perf problems. Should get @jmao-denver thoughts.
DM me if you want me to dig in beyond the list you have.
I002 we don't need. It lets you configure a list of imports that must occur in every file. So you could require every file do from __future__ import typings or import deephaven.special_sauce if that were required for every file
I am in agreement with @chipkent's preference. During my discussion with @mattrunyon , I raised the issue of Python version compliance/compatibility check, it seems that that's currently not available with Ruff and the rulesets. But one can argue that it is more important to DH than to the users.
Will add to a feature branch first, starting with this PR #2043
@rbasralian would like to make sure there is an option to disable/enable auto format on save.