PySolid icon indicating copy to clipboard operation
PySolid copied to clipboard

Python implementation

Open piyushrpt opened this issue 1 year ago • 11 comments

  1. pysolid.py_solid is the python only implementation of pysolid
  2. The entire folder should be a drop-in replacement of pysolid
  3. point_pyimpl.py and grid_pyimpl.py have been added to tests. These essentially are same as the original tests except they use py_solid and have a couple of minor changes related to the function signatures in py_solid.

Summary by Sourcery

Implement a Python-only version of the 'pysolid' library, named 'py_solid', to replace the original implementation. Update import statements to use relative paths and add new tests to verify the functionality of the new implementation.

New Features:

  • Introduce a Python-only implementation of the 'pysolid' library, named 'py_solid', which serves as a drop-in replacement for the original 'pysolid'.

Enhancements:

  • Refactor import statements in 'pysolid/init.py' to use relative imports for better modularity.

Tests:

  • Add new test files 'point_pyimpl.py' and 'grid_pyimpl.py' to validate the functionality of the 'py_solid' implementation, ensuring compatibility with existing tests.

piyushrpt avatar Nov 11 '24 05:11 piyushrpt

Reviewer's Guide by Sourcery

This pull request implements a pure Python version of the solid Earth tides calculation functionality, previously implemented in Fortran. The implementation includes core calculation functions and wrappers to maintain API compatibility with the existing codebase. The changes are organized into a new py_solid package that serves as a drop-in replacement for the original implementation.

Class diagram for the new py_solid module

classDiagram
    class py_solid {
        +calc_solid_earth_tides_grid(datetime, dict, float, bool, bool) np.ndarray
        +plot_solid_earth_tides_grid(np.ndarray, np.ndarray, np.ndarray, datetime, str, bool, bool)
        +calc_solid_earth_tides_point(float, float, datetime, datetime, int, bool, bool) tuple
        +plot_solid_earth_tides_point(np.ndarray, np.ndarray, np.ndarray, np.ndarray, list, str, bool, bool)
        +plot_power_spectral_density4tides(np.ndarray, float, str, int, int)
    }
    class solid {
        +solid_point(LLH, date, int) tuple
        +solid_grid(datetime, npt.ArrayLike, npt.ArrayLike) npt.ArrayLike
    }
    class LLH {
        +float lat
        +float lon
        +float hte
        +geoxyz() XYZ
    }
    class XYZ {
        +float x
        +float y
        +float z
        +enorm8() float
        +rot3(float) XYZ
        +rot1(float) XYZ
        +rge(LLH) XYZ
        +lhsaaz() XYZ
    }
    py_solid --> solid
    solid --> LLH
    solid --> XYZ
    LLH --> XYZ

File-Level Changes

Change Details Files
Added pure Python implementation of solid Earth tides calculation core functionality
  • Implemented core calculation functions for solid Earth tides in Python
  • Added data structures and constants required for calculations
  • Implemented utility functions for time system conversions
  • Added functions for coordinate system transformations
  • Implemented mathematical models for tidal force calculations
src/pysolid/py_solid/solid.py
Created Python wrappers for point-based calculations
  • Implemented functions for calculating tides at a single point
  • Added plotting utilities for point-based results
  • Defined tidal constituent data structures
  • Added power spectral density analysis functionality
src/pysolid/py_solid/point.py
Created Python wrappers for grid-based calculations
  • Implemented functions for calculating tides on a spatial grid
  • Added plotting utilities for grid-based results
  • Implemented grid interpolation functionality
src/pysolid/py_solid/grid.py
Added package initialization and integration with existing codebase
  • Created package initialization file
  • Updated imports to include new Python implementation
  • Maintained backward compatibility with existing API
src/pysolid/__init__.py
src/pysolid/py_solid/__init__.py
Added test files for the Python implementation
  • Created test for point-based calculations
  • Created test for grid-based calculations
  • Added reference data for validation
tests/point_pyimpl.py
tests/grid_pyimpl.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in the pull request body to generate a PR summary at any time. You can also use this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

  • Contact our support team for questions or feedback.
  • Visit our documentation for detailed guides and information.
  • Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai[bot] avatar Nov 11 '24 05:11 sourcery-ai[bot]

PR summary

This Pull Request introduces a Python-only implementation of the pysolid library, named py_solid, which serves as a drop-in replacement for the original Fortran-based implementation. The purpose of this change is to provide a more accessible and maintainable version of the library by eliminating the dependency on Fortran. The PR includes modifications to import statements to use relative paths, ensuring better modularity. Additionally, new test files (point_pyimpl.py and grid_pyimpl.py) are added to validate the functionality of the py_solid implementation, ensuring compatibility with existing tests. The impact of this PR is significant as it simplifies the installation process and enhances the maintainability of the library.

Suggestion

To further improve this PR, consider adding documentation that highlights the differences between the original and the new Python-only implementation, if any, and provide guidance on how users can transition to using py_solid. Additionally, ensure that performance benchmarks are conducted to compare the Python implementation with the original Fortran version, as this could be a concern for users relying on the library for computationally intensive tasks.

Disclaimer: This comment was entirely generated using AI. Be aware that the information provided may be incorrect.

Current plan usage: 0.00%

Have feedback or need help? Discord Documentation [email protected]

codeautopilot[bot] avatar Nov 11 '24 05:11 codeautopilot[bot]

Thank you for the PR, @piyushrpt! I am occupied by work this week, but I will try to get to this PR, mainly the testing, as said in https://github.com/insarlab/PySolid/discussions/99, at the weekend. @scottstanie, please feel free to step in whenever you got a chance.

yunjunz avatar Nov 11 '24 15:11 yunjunz

All the unit tests etc have been updated. All we now need is someone to setup a bunch of tests - random times, locations etc and compare the 2 sets of numbers. There is no rush to this.

The python code can also be further cleaned up as suggested by the bot here - but that would make the python implementation look significantly different from the original fortran code - which may / may not be desired. I will leave it up to you both - no strong preferences.

piyushrpt avatar Nov 11 '24 23:11 piyushrpt

I started to check it out and converted the script in tests/ to one using pytest and no plotting. One odd point on the relative speeds:


In [15]: %time tide_e, tide_n, tide_u = pysolid.calc_solid_earth_tides_grid( grid_ref_data["dt_obj"], grid_ref_data["atr"], verbose=False)
CPU times: user 36.6 ms, sys: 1.45 ms, total: 38.1 ms
Wall time: 37.1 ms

In [16]: %time tide_e, tide_n, tide_u = py_solid.calc_solid_earth_tides_grid( grid_ref_data["dt_obj"], grid_ref_data["atr"], verbose=False)
CPU times: user 415 ms, sys: 10.8 ms, total: 425 ms
Wall time: 419 ms

grid speed looks fine for python 👍 I can live with a 10x slow down if it's still just half a second for one date's output

In [13]: %time dt_out, tide_e, tide_n, tide_u = pysolid.calc_solid_earth_tides_point( point_ref_data["lat"], point_ref_data["lon"], point_ref_data["dt_obj0"], point_ref_data["dt_obj1"], verbose=False,)
PYSOLID: calculate solid Earth tides in east/north/up direction
PYSOLID: lot/lon: 34.0/-118.0 degree
PYSOLID: start UTC: 2020-11-05T12:00:00
PYSOLID: end   UTC: 2020-12-31T00:00:00
PYSOLID: time step: 60 seconds
CPU times: user 245 ms, sys: 2.58 ms, total: 248 ms
Wall time: 248 ms

In [14]: %time dt_out, tide_e, tide_n, tide_u = py_solid.calc_solid_earth_tides_point( point_ref_data["lat"], point_ref_data["lon"], point_ref_data["dt_obj0"], point_ref_data["dt_obj1"], verbose=False,)
PYSOLID: calculate solid Earth tides in east/north/up direction
PYSOLID: lot/lon: 34.0/-118.0 degree
PYSOLID: start UTC: 2020-11-05T12:00:00
PYSOLID: end   UTC: 2020-12-31T00:00:00
PYSOLID: time step: 60 seconds
CPU times: user 22.7 s, sys: 38 ms, total: 22.7 s
Wall time: 22.7 s

somehow the point one is much slower? haven't tried to see why, maybe the time loop is just slower...

scottstanie avatar Dec 18 '24 22:12 scottstanie

The slow down for single point is expected since it has to recompute sun / moon positions for each time epoch. When running stuff on a grid for a single epoch - these are done only once.

piyushrpt avatar Dec 18 '24 22:12 piyushrpt

ok gotcha, so for the insar application that only calculates ephemerides a handful times, it's not a real problem

scottstanie avatar Dec 18 '24 22:12 scottstanie

The python code could be used as a starting point for cython - and this speed could be gained back. The primary goal was to remove dependence on fortran

piyushrpt avatar Dec 18 '24 22:12 piyushrpt

Here's a first pass at a validation script. Takes about 15 minutes to run, seems to have agreement for a global run for a few times per year from 2010 to 2024.

image

There's the question Piyush raised in the discussion about best ways to handle the switch to not disrupt, but also have a pure-python version for easier installs (pip install PyPySolid?).

scottstanie avatar Dec 19 '24 00:12 scottstanie

Maybe path of least resistance

  1. Include python implementation as is in subfolder called py_solid or pyimpl or something that makes more sense.
  2. Change __init__ to import from python implementation if fortran one is not available. Use fortran one always when available.
  3. Add an option to install pysolid[pyonly] that can skip the fortran stuff during installation. Let default be the usual fortran stuff + python implementation but python stuff is not used. This will allow for side-by-side comparisons to be done easily as well with the default /full installation.

piyushrpt avatar Dec 19 '24 03:12 piyushrpt

Something like this in CMakeLists.txt in addition to just detecting the extension in __init__.py. Dont have a system with a fortran compiler to test this out

cmake_minimum_required(VERSION 3.17.2...3.29)
project(${SKBUILD_PROJECT_NAME} LANGUAGES C)

include(CheckLanguage)
check_language(Fortran)
if(CMAKE_Fortran_COMPILER)
  enable_language(Fortran)

...
else()
   message(STATUS "Skipping building Fortran extension")
endif()

piyushrpt avatar Dec 20 '24 05:12 piyushrpt