helpers Avoid empty lines in code

We want to avoid empty lines within functions and methods. E.g., in

def func(x):
    y = x + 1
    
    print(y)

the empty line needs to be either removed:

def func(x):
    y = x + 1
    print(y)

or filled in with a comment:

def func(x):
    y = x + 1
    # Print the result.
    print(y)

An acceptable (but undesirable) filler is an empty comment:

def func(x):
    y = x + 1
    #
    print(y)

We should add a Linter step to

check for such empty lines (of course, it should ignore empty lines between the functions, etc)
either (1) fix the problem, by (1.1) removing the empty line or (1.2) filling it with an empty comment
or (2) issue a complaint so that the user can fix it themselves

The benefit of (1) is that we guarantee no empty lines in code as long as Linter is run. The benefit of (2) is that we give the user a chance to add a meaningful comment. @gpsaggese what do you Linter should do here?

P.S. I would say, this is a good second/third issue for the interns once they have gotten their feet wet with something very simple first.

Mar 03 '25 13:03 sonniki

I would remove the lines for simplicity.

I'm experimenting with some LLM prompts to automatically add comments to chunks of codes that need it. The goal is to get linte + LLMs to keep our code in nice shape, ofc a human should be in the loop. We can't let the machine run amok (yet)

Mar 05 '25 23:03 gpsaggese

@allenmatt10 this doc should tell you everything that you need for creating a new Linter step. As always, a good rule of thumb is looking at how the existing Linter scripts are structured and following their example.

Apr 08 '25 19:04 sonniki

Yes, let's do it in the linter without LLMs. After a few tries, LLMs seem to have problems understanding the concept of "empty line," in the same way, they are "not good" at math. Also it's too expensive (and uncool) to pass all the code to the LLM for a simple operation that can be Python-ized.

@allenmatt10 pls make a proposal of how to solve the problem.

It's ok to make some simplifying assumptions, e.g.,

Add a switch to check for empty lines (and assert) and one to remove them
- @sonniki we can use the check to make sure the code has been linted
Look for a function (starting with def) and just remove all the empty lines
IIRC the linter does a pass of black last adding empty lines when needed (e.g., in functions inside other functions)
Add an option to replace empty lines in a function with #. Then we can have an LLM go through and add comments for chunks of code instead of the empty line.

The goal is to remove 90% of the empty lines that people use to separate chunks of codes. It doesn't have to be perfect.

Next steps:

Propose a solution / plan of attack
Create some unit tests to describe the desired approach in terms of TDD
Do 2-3 PRs to add functions little by little

Effort: IMO ~1-2 days

Apr 09 '25 16:04 gpsaggese

Proposed solution:

To implement a script to remove empty lines from inside function bodies by detecting function definitions def .
Empty lines outside functions and around surrounding code will be preserved, as will empty lines within docstrings for readability.
Unit tests include processing functions nested inside classes.
Will add option to replace empty lines with #.

Apr 09 '25 22:04 allenmatt10

The plan is correct although doesn't really add much info to the specs.

The goal is to have a shared understanding of what the script will do and how will do it.

Let's start with the "what": you can file a PR with a bunch of examples with interesting use cases (before and after) to make sure we agree on the I/O behavior. Makes sense?

Apr 09 '25 22:04 gpsaggese

Yes that makes sense. Proceeding with drafting few use cases in a PR now.

Apr 09 '25 22:04 allenmatt10

Done for now, follow-ups can be filed if needed

Apr 17 '25 11:04 sonniki