RAiDER icon indicating copy to clipboard operation
RAiDER copied to clipboard

Refactoring notes for cleaner code

Open Askaholic opened this issue 4 years ago • 2 comments

Some simple things which I'm noticing while reading the code. I think these things will go a long way in cleaning up the code.

  1. Place "high level" functions at the TOP of the file and "low level" functions towards the bottom. This should generally coincide with the order in which the functions are called, so for example:
def some_high_level_function():
    """Reads stuff from files and calls other functions"""

    data = read_from_files()
    some_process(data)


def read_from_files():
    ...


def some_process(data):
    ...

This allows someone who is reading the code to follow the flow a lot more clearly. Right now it's a bit of a "find in project" situation for every function.

  1. Never commit commented out code. Ever. Just delete it, it is saved in the git history if we ever do need to go back to it (chances are we wont). For experimental features, use a new branch and merge changes back to dev once they have been solidified.

  2. Keep "read from file" operations separate from data processing. This will also help with the "production readiness" of the project. As I illustrated in point 1, if some third party user had data stored in a weird format, they could write their own read_from_files function which loads the data into the correct data structure, and call some_process with that correctly formatted data. This will also help us write code to support multiple data formats out-of-the-box more easily.

Askaholic avatar Jun 24 '20 22:06 Askaholic

Sounds good.

@Askaholic @jlmaurer with respect to module dependencies that might change depending on the model. Would you say require everything and have the import a the top of the code or allow for function level imports too?

dbekaert avatar Jun 25 '20 13:06 dbekaert

I was thinking about point 3 overnight, and came to the conclusion that it might not always be possible. For instance when working with really large files, we might not be able to load them entirely into memory, and instead need to load them in chunks. In this case it's more appropriate to support only hdf5 format and load slices as needed. Otherwise it would be a bit of a software engineering task to write a wrapper interface to do lazy loading, which I think is overkill at this point.

Module imports should always be at the top of the file. If you want to have an optional dependency, the standard solution is to do something like this:

try:
    import my_optional_dependency
except ImportError:
    my_optional_dependency = None

If you want to be nice to your users, you will add an error message in the places where you use the dependency:

if my_optional_dependency is None:
    raise RuntimeError("You must install 'my_optional_dependency' to call this function!")

# Do something with my_optional_dependency...

Askaholic avatar Jun 25 '20 17:06 Askaholic