Avoid empty lines in code
We want to avoid empty lines within functions and methods. E.g., in
    def func(x):
        y = x + 1

        print(y)
the empty line needs to be either removed:
    def func(x):
        y = x + 1
        print(y)
or filled in with a comment:
    def func(x):
        y = x + 1
        # Print the result.
        print(y)
An acceptable (but undesirable) filler is an empty comment:
    def func(x):
        y = x + 1
        #
        print(y)
We should add a Linter step to
- check for such empty lines (of course, it should ignore empty lines between the functions, etc)
- either (1) fix the problem, by (1.1) removing the empty line or (1.2) filling it with an empty comment
- or (2) issue a complaint so that the user can fix it themselves
The benefit of (1) is that we guarantee no empty lines in code as long as Linter is run. The benefit of (2) is that we give the user a chance to add a meaningful comment. @gpsaggese what do you think Linter should do here?
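Option (2), the "complain only" mode, could be sketched roughly as follows. This is a minimal illustration, not the existing Linter API: the function name `find_empty_lines_in_functions` is hypothetical, and it assumes Python 3.9+ (for `end_lineno` on AST nodes and built-in generics). It uses the standard `ast` module to find where each function starts and ends, so empty lines between functions are ignored.

```python
import ast


def find_empty_lines_in_functions(source: str) -> list[int]:
    """Return 1-based line numbers of empty lines inside function bodies."""
    lines = source.splitlines()
    flagged = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # `lineno` is the `def` line, `end_lineno` the last line of the body.
            for lineno in range(node.lineno + 1, node.end_lineno + 1):
                if not lines[lineno - 1].strip():
                    flagged.add(lineno)
    return sorted(flagged)


code = (
    "def func(x):\n"
    "    y = x + 1\n"
    "\n"
    "    print(y)\n"
    "\n"
    "\n"
    "def other():\n"
    "    pass\n"
)
# Only line 3 is reported; the two empty lines between functions are ignored.
print(find_empty_lines_in_functions(code))
```

Note that this sketch also flags empty lines inside docstrings and other multi-line strings, which a real step would probably want to skip.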
P.S. I would say, this is a good second/third issue for the interns once they have gotten their feet wet with something very simple first.
I would remove the lines for simplicity.
I'm experimenting with some LLM prompts to automatically add comments to chunks of code that need it. The goal is to get linter + LLMs to keep our code in nice shape, ofc a human should be in the loop. We can't let the machine run amok (yet).
@allenmatt10 this doc should tell you everything that you need for creating a new Linter step. As always, a good rule of thumb is looking at how the existing Linter scripts are structured and following their example.
Yes, let's do it in the linter without LLMs. After a few tries, LLMs seem to have problems understanding the concept of "empty line", in the same way that they are "not good" at math. Also, it's too expensive (and uncool) to pass all the code to the LLM for a simple operation that can be Python-ized.
@allenmatt10 pls make a proposal of how to solve the problem.
It's ok to make some simplifying assumptions, e.g.,
- Add a switch to check for empty lines (and assert) and one to remove them
- @sonniki we can use the check to make sure the code has been linted
- Look for a function (starting with `def`) and just remove all the empty lines
- IIRC the linter does a pass of `black` last, adding empty lines when needed (e.g., in functions inside other functions)
- Add an option to replace empty lines in a function with `#`. Then we can have an LLM go through and add comments for chunks of code instead of the empty line.
The goal is to remove 90% of the empty lines that people use to separate chunks of code. It doesn't have to be perfect.
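A rough sketch of the "look for `def` and drop the empty lines" idea, with a switch for the `#` filler, might look like the code below. The function name `process_empty_lines` and the `mode` values are made up for illustration. It is purely line-based and relies on indentation, so it makes the same kind of simplifying assumptions discussed above: it does not understand docstrings, multi-line strings, or `def` signatures that span several lines.

```python
def process_empty_lines(source: str, mode: str = "remove") -> str:
    """
    Remove empty lines inside functions, or replace them with `#`.

    :param mode: "remove" drops the empty lines, "comment" fills them with `#`
    """
    if mode not in ("remove", "comment"):
        raise ValueError(f"Unknown mode: {mode}")
    out = []
    # Indentation levels of the `def` lines we are currently inside.
    def_indents = []
    # Empty lines seen since the last non-empty line.
    pending = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            pending.append(line)
            continue
        indent = len(line) - len(line.lstrip())
        # A line indented at or below a `def` closes that function.
        while def_indents and indent <= def_indents[-1]:
            def_indents.pop()
        if def_indents:
            # The pending empty lines sit inside a function body.
            if mode == "comment":
                out.extend(" " * indent + "#" for _ in pending)
        else:
            # Empty lines between functions, classes, etc. are kept.
            out.extend(pending)
        pending = []
        if stripped.startswith(("def ", "async def ")):
            def_indents.append(indent)
        out.append(line)
    out.extend(pending)
    return "\n".join(out) + ("\n" if source.endswith("\n") else "")


code = "def func(x):\n    y = x + 1\n\n    print(y)\n\n\ndef other():\n    pass\n"
print(process_empty_lines(code, mode="remove"))
print(process_empty_lines(code, mode="comment"))
```

Since empty lines before a nested `def` are removed too, this relies on a later `black` pass to put them back where needed.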
Next steps:
- Propose a solution / plan of attack
- Create some unit tests to describe the desired approach in terms of TDD
- Do 2-3 PRs to add functions little by little
Effort: IMO ~1-2 days
Proposed solution:
- Implement a script that removes empty lines from inside function bodies by detecting function definitions (`def`).
- Empty lines outside functions and in the surrounding code will be preserved, as will empty lines within docstrings, for readability.
- Unit tests include processing functions nested inside classes.
- Will add an option to replace empty lines with `#`.
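One possible way to keep empty lines within docstrings, sketched under the assumption that the script can parse the file with the standard `ast` module (Python 3.9+): collect the line numbers covered by each function docstring and skip them when removing empty lines. The helper name `docstring_lines` is hypothetical.

```python
import ast


def docstring_lines(source: str) -> set[int]:
    """Return 1-based line numbers covered by function docstrings."""
    covered = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # A docstring is a bare string expression as the first body statement.
        first = node.body[0]
        is_docstring = (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        )
        if is_docstring:
            covered.update(range(first.lineno, first.end_lineno + 1))
    return covered


code = (
    "class A:\n"
    "    def method(self):\n"
    '        """\n'
    "        Summary.\n"
    "\n"
    "        Details.\n"
    '        """\n'
    "        return 1\n"
)
# Lines 3-7 belong to the docstring, so the empty line 5 would be preserved.
print(sorted(docstring_lines(code)))
```

Because `ast.walk` visits every node, methods nested inside classes are handled the same way as top-level functions.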
The plan is correct, although it doesn't really add much info to the specs.
The goal is to have a shared understanding of what the script will do and how it will do it.
Let's start with the "what": you can file a PR with a bunch of examples with interesting use cases (before and after) to make sure we agree on the I/O behavior. Makes sense?
Yes, that makes sense. Proceeding with drafting a few use cases in a PR now.
Done for now, follow-ups can be filed if needed