Python interface design

Open rafmudaf opened this issue 1 year ago • 7 comments

Design considerations for Python interface library + adjacent Python tooling

Most of the Python infrastructure to interact with OpenFAST was initially introduced as part of the regression testing suite in 2017. At that time, the Python infrastructure consisted of scripts to drive simulations by calling the executables through the os and sys modules, basic pre- and post-processing functions, and tools to evaluate results in the test suite. This slowly grew with the needs of the regression test infrastructure. Then, a Python interface was added to the FAST Library and additional Python interfaces were created for module-specific libraries. Recently, the openfast_io package to interact with input and output files was incorporated into the OpenFAST repository. Many of these tools were added ad hoc to meet a specific need, but I have not seen a high-level design for the Python tooling.

I'm opening this issue to propose design considerations and ideas for the Python infrastructure with the goal of moving toward a cohesive, consolidated, and effective Python front-end for OpenFAST. Please comment on this issue to discuss anything here, and I will update the issue post as decisions are made in the comments.

Purpose of Python infrastructure

Although the Python infrastructure within the OpenFAST repository was initially created to drive the regression tests, the potential impact goes far beyond that. In my opinion, the purpose of the Python infrastructure should be to provide a front end to OpenFAST so that users can interact with the compiled modules and glue code through Python scripts and integrate OpenFAST in-code with the Python ecosystem. This opens OpenFAST to users who are familiar with Python but less familiar with shell scripting. Beyond that, it enables tight integration with powerful libraries within the Python ecosystem such as NumPy, SciPy, and the AI/ML tools.

Design intent

Installation: Ultimately, the Python infrastructure should be consolidated into a single package that supports most of the user-level needs of OpenFAST. The package should be installed through typical Python methods. Currently, that means installation through pip either from the local clone of the repository or via PyPI and within a Python virtual environment like conda.

Components: The eventual consolidated Python package should include the following components:

  • Input / output parsing
  • Input generation
  • Simulation execution
  • Outer-level module coupling (similar to the current glue codes)

UI: As a Python package, the standard conventions of the Python ecosystem should be followed. I love the Zen of Python and refer to it often, and I encourage all developers of the OpenFAST Python infrastructure to read it. Especially with respect to the UI, readability and predictability are critical. Some additional considerations:

  • As much as practical, let Python handle errors and raise standard Python errors (for example, no sys.exit)
  • Avoid monkey patching and side effects within the OOP architecture
  • Design intuitive, user-friendly APIs that are easy to grok (read and understand) and use
  • Consider how parts of the API will work together
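As a minimal sketch of the error-handling point above: a library function can raise a standard exception (or a subclass of one) and leave the decision about what to do with it to the caller, instead of calling sys.exit. The names InputFileError and read_time_step are hypothetical and are not part of any existing OpenFAST package; the "value  KEY  - description" line layout is a simplified assumption about the input file format.

```python
class InputFileError(ValueError):
    """Hypothetical error type for invalid input file content."""


def read_time_step(text: str) -> float:
    """Find a 'value  DT  - description' line and return the value as a float."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1] == "DT":
            try:
                dt = float(parts[0])
            except ValueError as err:
                # Re-raise with context instead of terminating the interpreter.
                raise InputFileError(f"DT is not a number: {parts[0]!r}") from err
            if dt <= 0:
                raise InputFileError(f"DT must be positive, got {dt}")
            return dt
    raise InputFileError("No DT entry found")
```

Because InputFileError derives from ValueError, calling code can catch it with an ordinary try/except block and recover, log, or re-raise as it sees fit.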

Proposed architecture

  • pyOpenFAST
    • io
    • fast
    • fast.farm
    • aerodyn
      • __init__
      • init
      • step
      • end
    • moordyn
    • other modules...

A simple script might look something like this:

from pyOpenFAST.fast import FastLib
from pyOpenFAST.io import OutputHandler

fast_lib = FastLib("input file")
fast_lib.init()
fast_lib.step()
fast_lib.end()

outputs = OutputHandler(fast_lib.outputs)
outputs.do_some_post_processing()
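One possible way to make the init / step / end lifecycle harder to misuse is a context manager, so that end() runs even if a step raises. The sketch below is only an illustration under that assumption: FastLibSketch is a hypothetical stand-in that calls no compiled code, and its attributes (n_steps, outputs) are invented for the example.

```python
class FastLibSketch:
    """Hypothetical stand-in for the proposed FastLib; it calls no compiled code."""

    def __init__(self, input_file: str, n_steps: int = 3):
        self.input_file = input_file
        self.n_steps = n_steps
        self.outputs = []
        self._initialized = False

    def init(self):
        self._initialized = True

    def step(self):
        # Raise a standard error rather than exiting if called out of order.
        if not self._initialized:
            raise RuntimeError("init() must be called before step()")
        self.outputs.append(len(self.outputs))

    def end(self):
        self._initialized = False

    def __enter__(self):
        self.init()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always release resources; returning False lets exceptions propagate.
        self.end()
        return False


with FastLibSketch("model.fst") as fast_lib:
    for _ in range(fast_lib.n_steps):
        fast_lib.step()
```

The explicit init(), step(), and end() methods remain available, so the context manager is an optional convenience rather than a replacement for the proposed interface.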

rafmudaf avatar Apr 18 '25 15:04 rafmudaf

Responding to https://github.com/OpenFAST/openfast/pull/2735#issuecomment-2815413990:

@rafmudaf Could you update the documentation for running the regression tests that use the pyOpenFAST module? It took me a bit to figure out that GitHub Actions does pip install glue-codes/python/. and then executePythonRegressionCase.py imports it. Perhaps a better alternative would be to have executePythonRegressionCase.py add glue-codes/python/. to sys.path so it's always using the version directly from the repo. Sorry if this is the wrong place for this feedback; I figured this PR was related to https://github.com/OpenFAST/openfast/pull/2720.

@deslaughter No problem, I'm happy to update the docs. Is this already documented somewhere for me to update? Otherwise, where's a good place to put it? Some options are to add a README to the Python package, add it to the docstring of the script, or add it somewhere in the docs site.

As for importing the package to the regression test scripts, if you've installed the Python package with the editable install flag (pip install -e glue-codes/python/.), then it will create a link to the repository rather than copy the source files into the Python installation directory. This means it'll always use the version of the library that is in your repo.
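To confirm which copy of a package Python will actually import (for example, the repository clone after an editable install versus a copy in site-packages), the standard library can report where a package resolves from. This is a small sketch; package_location is a hypothetical helper name, and json is used in the example only because it is always available.

```python
import importlib.util
from pathlib import Path


def package_location(name: str):
    """Return the directory a top-level package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    # Built-in modules and namespace packages have no single file location.
    if spec is None or not spec.has_location:
        return None
    return Path(spec.origin).resolve().parent


location = package_location("json")
```

Calling the helper with the name of the installed OpenFAST Python package should, after an editable install, return a path inside the repository rather than inside the Python installation directory.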

In general, installing a package through an environment is preferable to manually modifying the Python system path since the latter is not portable (if the script moves, the path breaks). While most users will probably not move these particular execution scripts, it's likely that the patterns used in them will be copied to other places so I think it's worth demonstrating a good practice. That being said, let me know what you think is best for the sustainability and usability of OpenFAST, and I can make the change.

rafmudaf avatar Apr 18 '25 16:04 rafmudaf

@rafmudaf I agree that doing an editable install of the module makes sense; I didn't know this was possible. Maybe a note could be added in the test execution scripts describing what to do, or a link to the appropriate documentation added. The goal is to keep users from getting frustrated when they try to run the test suite and it produces a difficult-to-diagnose error.
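One way a test execution script could surface the fix directly is to wrap the import and re-raise with an install hint. This is a sketch only: require_package is a hypothetical helper, not something that exists in the test scripts today, and the hint text would need to match whatever install instructions are agreed on.

```python
import importlib


def require_package(name: str, install_hint: str):
    """Import a package by name, adding an actionable hint if it is missing."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as err:
        # Keep the standard exception type so callers can still catch it.
        raise ModuleNotFoundError(
            f"Could not import {name!r}. {install_hint}"
        ) from err
```

A script could call it as require_package("pyOpenFAST", "Run: pip install -e glue-codes/python/."), assuming that is the package's import name.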

Could this module be added as part of the main requirements.txt file in some way? Then we could just tell users to install that before running CTest.

deslaughter avatar Apr 22 '25 13:04 deslaughter

I don't think I fully understand what is being proposed here. Is this going to break the way I currently run tests locally (no Python environment, multiple build directories on multiple local clones)?

andrew-platt avatar Apr 22 '25 15:04 andrew-platt

I can adapt to this new method ;)

andrew-platt avatar Apr 22 '25 16:04 andrew-platt

I'm not completely sure I'm understanding exactly the goals of this conversation, but it appears that it is along the lines of integrating all the codes we have and having some sort of consolidated "toolbox", thus avoiding duplication of code (and effort). I've heard compelling reasons for pretty much all the possible scenarios, and I'm glad this topic has been brought up again. I'll add my 2c to the conversation.

I think most involved parties have a goal with these tools, and I think it's important that we all talk about our own goals first before considering a possible solution. For example, for me it would be nice to have everything from openfast_toolbox under openfast. Alignment of tags/releases would be straightforward. I also do not love the idea of having duplicate code, and I think having everything under a single repo could prevent it from happening. I understand other folks want only bits of the toolbox for their own needs, and I would imagine that some other people have thoughts and/or requirements in terms of how installable these tools are.

Finally, just a small personal preference: I don't like the name pyOpenFAST as it implies a full-featured OpenFAST that happens to be written in Python. We had an old discussion on naming here.

rthedin avatar May 01 '25 22:05 rthedin

Before jumping into naming considerations or scope, we should settle a more abstract question: what's the need?

As I see it, there are three primary user needs for Python infrastructure:

  • Preprocessing of OpenFAST input data to create a large set of cases or dynamically create cases based on some objective
  • Post-processing of OpenFAST output data to analyze results
  • Python-based access to specific subroutines within OpenFAST glue codes and modules to create new simulators and integrate OpenFAST into the Python ecosystem
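As a rough illustration of the first need, a large case set can be described as the Cartesian product of a few parameter sweeps. This sketch uses only the standard library; build_cases and the parameter names wind_speed and wave_height are hypothetical and do not correspond to a specific OpenFAST input.

```python
from itertools import product


def build_cases(parameters: dict) -> list:
    """Return one dict per combination of the swept parameter values."""
    names = list(parameters)
    return [dict(zip(names, values)) for values in product(*parameters.values())]


cases = build_cases({"wind_speed": [8.0, 12.0], "wave_height": [1.0, 2.0, 3.0]})
```

Each resulting dict could then be handed to whatever input-generation layer the package ends up providing to write a set of input files for that case.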

@rthedin Do you have anything to change, add, or take away from this?

rafmudaf avatar May 02 '25 15:05 rafmudaf

I'm not completely sure I'm understanding exactly the goals of this conversation

Just to clarify, the goal of this discussion thread is to collaboratively design the Python infrastructure within OpenFAST rather than independently adding a variety of features to a variety of Python projects that interact with OpenFAST. I wouldn't constrain thoughts to what already exists. Instead, this is an opportunity to discuss what's needed and what's possible and identify specific requirements. When we have that, we can design the scope and strategy and start moving toward an implementation that satisfies the need and potential.

I proposed my thoughts in the initial discussion post as a starting point. As things evolve in the comments, I'll update the OP so that we end up with a design document.

rafmudaf avatar May 02 '25 15:05 rafmudaf