param icon indicating copy to clipboard operation
param copied to clipboard

Reformat the source code

Open maximlt opened this issue 2 years ago • 6 comments

I'd like to get feedback about reformatting entirely the source code which I think would be beneficial for these reasons:

  1. Quite often when a PR is submitted to Param it includes a few format changes (either because this is done automatically by a code editor or because the developer couldn't resists "fixing" the format of the code surrounding their changes). This is not ideal for the sake of git history and doesn't help code reviews.
  2. I've opened two issues about improving the docstrings (https://github.com/holoviz/param/issues/588) and adding type hints to the source code (https://github.com/holoviz/param/issues/589). These changes if they happen (specially the second one) would touch a lot of code lines, I'd prefer this to be done on a clean basis, after reformatting (and to avoid what is mentioned in 1.).
  3. Param has linting enabled except that it's pretty loose! Linting is of a good help when writing code (at least to me), reformatting the source code would allow to remove most (if not all!) of the ignored exceptions listed below. I'd also like to see the tests being reformatted and linting applied to them.
[flake8]
# TODO tests should not be excluded (one day...)
include = setup.py param numbergen
exclude = .git,__pycache__,.tox,.eggs,*.egg,doc,dist,build,_build,tests,.ipynb_checkpoints,run_test.py
ignore = E114,
         E116,
         E126,
         E128,
         E129,
         E2,
         E3,
         E4,
         E5,
         E731,
         E701,
         E702,
         E703,
         E704,
         E722,
         E741,
         E742,
         E743,
         W503,
         W504,

And here are a few points against reformatting the source code, with my opinion:

  1. The maintainers of Param may be used to how the source code looks and feels and don't want to see it changed not to disturb them or slow them down. Well this is a tradeoff, potential contributors to Param could also be disturbed and slowed down by how the source code currently looks and feels.
  2. Parameters of a Parameterized class are usually declared following a common style (not described anywhere to my knowledge) that gets messed up when an automatic code formatter is executed. While I think this could be a problem for a large source code like Panel (although that's debatable) this would here affect only numbergen. I think the number of people who read numbergen's code is pretty low, even more so now that it is documented.
  3. Reformatting the source code will touch a lot of code lines and thus have an impact on git history/blame. Yet I believe it is healthier if it happens in one go, instead of slowly at each new PR.

maximlt avatar Jan 13 '22 10:01 maximlt

Are we talking reformatting using Black? Then I'm all for it. Would make my life a lot easier.

My vote goes for line length 100. But I can live with the default 80 as well.

MarcSkovMadsen avatar Jan 13 '22 10:01 MarcSkovMadsen

I have no strong opinion about using black or another tool. I think that if this is ultimately done, the code formatter could be added as a pre-commit step to avoid drifting.

I also prefer a longer line length than the one recommended by PEP8.

maximlt avatar Jan 13 '22 11:01 maximlt

I'm happy for the Param source code to be reformatted in some consistent style, and happy for that to be done using Black with line length 100. Black should work fine for Param's own source code, but for libraries using Param extensively (as you note for numbergen), we'd have to figure out how to trick Black into having a reasonable formatting for parameter declarations (perhaps by postprocessing?). Luckily that's a separate issue.

Regarding your anticipated objections,

  1. Current Param maintainers primarily work on other projects, only very rarely needing to change Param itself. So we don't have strong internal models of Param's formatting, and find it just as annoyingly inconsistent as you do.
  2. Numbergen changes almost never, so we could either not mess with the numbergen files (it's a separate module, after all) or else do a one-time automatic formatting of the numbergen module but manually leaving the parameter declarations formatted as they are now.
  3. Agreed; a single reformatting PR is better than reformatting changes in every PR.

jbednar avatar Jan 14 '22 17:01 jbednar

Nice to hear this positive feedback! I know code formatting can be a sensitive topic among developers so I was just being careful ;)

As for formatting numbergen I could apply black once and then fix the Parameter formatting manually. Then, if black ends up as part of Param's CI (run by pre-commit for instance) numbergen would be excluded from the files collected by black, since as you say its code almost never changes.

I've sampled some Parameter definitions from numbergen below:

    lbound = param.Number(default=0.0,doc="inclusive lower bound")
    
    time_dependent = param.Boolean(default=False,  doc="""
       Whether the given time_fn should be used to constrain the
       results generated.""")

    random_generator = param.Parameter(
        default=random.Random((500,500)), doc=
        """
        Random state used by the object. This may may be an instance
        of random.Random from the Python standard library or an
        instance of numpy.random.RandomState.
        This random state may be exclusively owned by the object or
        may be shared by all instance of the same class. It is always
        possible to give an object its own unique random state by
        setting this parameter with a new random state instance.
        """)

    mean = param.Number(default=0.0, doc="""Mean value""")

    choices = param.List(default=[0,1],
        doc="List of items from which to select.")

Their formatting, in particular the doc attribute, isn't consistent. What's the recommended format?

On black, I'm not sure there's a way to trick it, from what I know it's not so configurable. We could have a preprocessor that adds fmt: skip to each Parameter declaration to have it ignored by black. Or as you say a postprocessor that reformats Parameter declaration as we want. If it does exist, maybe such tool could live in Param itself.

maximlt avatar Jan 14 '22 21:01 maximlt

I don't think Black itself is configurable to do it; it's designed not to be, which is generally not a bad thing but in this case highly inconvenient. We don't currently have a postprocessor, but my preferred format if it's achievable would be, for these examples:

    lbound = param.Number(default=0.0, doc="""
        Inclusive lower bound.""")
    
    time_dependent = param.Boolean(default=False, doc="""
        Whether the given time_fn should be used to constrain the
        results generated.""")

    random_generator = param.Parameter(default=random.Random((500,500)), doc="""
        Random state used by the object. This may may be an instance
        of random.Random from the Python standard library or an
        instance of numpy.random.RandomState.
        This random state may be exclusively owned by the object or
        may be shared by all instance of the same class. It is always
        possible to give an object its own unique random state by
        setting this parameter with a new random state instance.""")

    mean = param.Number(default=0.0, doc="""
        Mean value.""")

    choices = param.List(default=[0,1], doc="""
        List of items from which to select.""")

Here I didn't try reformatting anything for a particular line length, which would follow whatever is enforced on the rest of the file.

This formatting is designed to achieve several goals:

  • Make it simple to scan the list of parameters by name (making the names stand out on the left)
  • Be vertically compact, so that skimming through to find a parameter of interest will be quick
  • Make the docstring clearly separate from the other arguments (always on a new line)

Other formatting approaches generally fail to satisfy one of these constraints.

jbednar avatar Jan 14 '22 22:01 jbednar

Just saw that Github can now ignore commits, which could be relevant if a big reformat is done.

https://docs.github.com/en/repositories/working-with-files/using-files/viewing-a-file#ignore-commits-in-the-blame-view

hoxbro avatar Mar 23 '22 15:03 hoxbro