param
param copied to clipboard
Reformat the source code
I'd like to get feedback about reformatting entirely the source code which I think would be beneficial for these reasons:
- Quite often when a PR is submitted to Param it includes a few format changes (either because this is done automatically by a code editor or because the developer couldn't resists "fixing" the format of the code surrounding their changes). This is not ideal for the sake of git history and doesn't help code reviews.
- I've opened two issues about improving the docstrings (https://github.com/holoviz/param/issues/588) and adding type hints to the source code (https://github.com/holoviz/param/issues/589). These changes if they happen (specially the second one) would touch a lot of code lines, I'd prefer this to be done on a clean basis, after reformatting (and to avoid what is mentioned in 1.).
- Param has linting enabled except that it's pretty loose! Linting is of a good help when writing code (at least to me), reformatting the source code would allow to remove most (if not all!) of the ignored exceptions listed below. I'd also like to see the tests being reformatted and linting applied to them.
[flake8]
# TODO tests should not be excluded (one day...)
include = setup.py param numbergen
exclude = .git,__pycache__,.tox,.eggs,*.egg,doc,dist,build,_build,tests,.ipynb_checkpoints,run_test.py
ignore = E114,
E116,
E126,
E128,
E129,
E2,
E3,
E4,
E5,
E731,
E701,
E702,
E703,
E704,
E722,
E741,
E742,
E743,
W503,
W504,
And here are a few points against reformatting the source code, with my opinion:
- The maintainers of Param may be used to how the source code looks and feels and don't want to see it changed not to disturb them or slow them down. Well this is a tradeoff, potential contributors to Param could also be disturbed and slowed down by how the source code currently looks and feels.
- Parameters of a Parameterized class are usually declared following a common style (not described anywhere to my knowledge) that gets messed up when an automatic code formatter is executed. While I think this could be a problem for a large source code like Panel (although that's debatable) this would here affect only
numbergen
. I think the number of people who readnumbergen
's code is pretty low, even more so now that it is documented. - Reformatting the source code will touch a lot of code lines and thus have an impact on git history/blame. Yet I believe it is healthier if it happens in one go, instead of slowly at each new PR.
Are we talking reformatting using Black? Then I'm all for it. Would make my life a lot easier.
My vote goes for line length 100. But I can live with the default 80 as well.
I have no strong opinion about using black
or another tool. I think that if this is ultimately done, the code formatter could be added as a pre-commit
step to avoid drifting.
I also prefer a longer line length than the one recommended by PEP8.
I'm happy for the Param source code to be reformatted in some consistent style, and happy for that to be done using Black with line length 100. Black should work fine for Param's own source code, but for libraries using Param extensively (as you note for numbergen), we'd have to figure out how to trick Black into having a reasonable formatting for parameter declarations (perhaps by postprocessing?). Luckily that's a separate issue.
Regarding your anticipated objections,
- Current Param maintainers primarily work on other projects, only very rarely needing to change Param itself. So we don't have strong internal models of Param's formatting, and find it just as annoyingly inconsistent as you do.
- Numbergen changes almost never, so we could either not mess with the numbergen files (it's a separate module, after all) or else do a one-time automatic formatting of the numbergen module but manually leaving the parameter declarations formatted as they are now.
- Agreed; a single reformatting PR is better than reformatting changes in every PR.
Nice to hear this positive feedback! I know code formatting can be a sensitive topic among developers so I was just being careful ;)
As for formatting numbergen
I could apply black
once and then fix the Parameter formatting manually. Then, if black
ends up as part of Param's CI (run by pre-commit for instance) numbergen
would be excluded from the files collected by black
, since as you say its code almost never changes.
I've sampled some Parameter definitions from numbergen
below:
lbound = param.Number(default=0.0,doc="inclusive lower bound")
time_dependent = param.Boolean(default=False, doc="""
Whether the given time_fn should be used to constrain the
results generated.""")
random_generator = param.Parameter(
default=random.Random((500,500)), doc=
"""
Random state used by the object. This may may be an instance
of random.Random from the Python standard library or an
instance of numpy.random.RandomState.
This random state may be exclusively owned by the object or
may be shared by all instance of the same class. It is always
possible to give an object its own unique random state by
setting this parameter with a new random state instance.
""")
mean = param.Number(default=0.0, doc="""Mean value""")
choices = param.List(default=[0,1],
doc="List of items from which to select.")
Their formatting, in particular the doc
attribute, isn't consistent. What's the recommended format?
On black
, I'm not sure there's a way to trick it, from what I know it's not so configurable. We could have a preprocessor that adds fmt: skip
to each Parameter declaration to have it ignored by black
. Or as you say a postprocessor that reformats Parameter declaration as we want. If it does exist, maybe such tool could live in Param itself.
I don't think Black itself is configurable to do it; it's designed not to be, which is generally not a bad thing but in this case highly inconvenient. We don't currently have a postprocessor, but my preferred format if it's achievable would be, for these examples:
lbound = param.Number(default=0.0, doc="""
Inclusive lower bound.""")
time_dependent = param.Boolean(default=False, doc="""
Whether the given time_fn should be used to constrain the
results generated.""")
random_generator = param.Parameter(default=random.Random((500,500)), doc="""
Random state used by the object. This may may be an instance
of random.Random from the Python standard library or an
instance of numpy.random.RandomState.
This random state may be exclusively owned by the object or
may be shared by all instance of the same class. It is always
possible to give an object its own unique random state by
setting this parameter with a new random state instance.""")
mean = param.Number(default=0.0, doc="""
Mean value.""")
choices = param.List(default=[0,1], doc="""
List of items from which to select.""")
Here I didn't try reformatting anything for a particular line length, which would follow whatever is enforced on the rest of the file.
This formatting is designed to achieve several goals:
- Make it simple to scan the list of parameters by name (making the names stand out on the left)
- Be vertically compact, so that skimming through to find a parameter of interest will be quick
- Make the docstring clearly separate from the other arguments (always on a new line)
Other formatting approaches generally fail to satisfy one of these constraints.
Just saw that Github can now ignore commits, which could be relevant if a big reformat is done.
https://docs.github.com/en/repositories/working-with-files/using-files/viewing-a-file#ignore-commits-in-the-blame-view