pymsis

Downloaded SW file is not refreshed if it is out of date

Open mananapr opened this issue 11 months ago • 6 comments

The package automatically downloads the SW-All file from Celestrak if it is not present in the package directory. However, it does not refresh the downloaded file when an updated version is available on Celestrak.

The existing condition could be updated as follows:

def _load_f107_ap_data() -> dict[str, npt.NDArray]:
    """Load data from disk, downloading it first if it is missing or over a day old."""
    one_day_ago = datetime.datetime.now() - datetime.timedelta(days=1)
    # Needs `import datetime`; _F107_AP_PATH is assumed to be a pathlib.Path.
    if not _F107_AP_PATH.exists() or (
        datetime.datetime.fromtimestamp(_F107_AP_PATH.stat().st_mtime) < one_day_ago
    ):
        download_f107_ap()

mananapr avatar Nov 21 '24 14:11 mananapr

I have some hesitations about this. If someone is only using the package for historical purposes, the refresh would be unnecessary and would trigger a lot of downloading even though they don't need the most recent data. I did document that a user can explicitly update the file themselves: https://swxtrec.github.io/pymsis/reference/generated/pymsis.utils.download_f107_ap.html#pymsis.utils.download_f107_ap You can call that yourself, but it isn't automatic. Is there a mode you're using this in where a single-day cadence would be helpful? Also, we are dropping predicted values. Would you prefer to get those values as well?

Some other suggestions/thoughts

  1. Only do this if the requested date in get_f107_ap is beyond the end of the current file; it shouldn't depend on the file download time. Also, I believe that file updates ~every 3 hours with new Ap values, so a 3-hour cadence might be better than a 24-hour one.
  2. We should probably have an "append" / update function so that, if possible, we aren't thrashing a 2.8 MB download every time.
  • There is a CSV file starting in 2019 that would reduce the download size (~300 kB): https://www.celestrak.org/SpaceData/
  • We might need to change to a numpy savez file or some other format rather than a text file internally, to make it easier to keep the file growing.
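
The first suggestion can be sketched as a small decision function. This is only an illustration, not pymsis code: `needs_refresh`, its parameters, and the bookkeeping of a "last attempt" time are all hypothetical, and the 3-hour default is simply the cadence mentioned above.

```python
import datetime as dt
from typing import Optional


def needs_refresh(
    requested: dt.datetime,
    last_in_file: dt.datetime,
    last_attempt: Optional[dt.datetime],
    now: dt.datetime,
    min_interval: dt.timedelta = dt.timedelta(hours=3),
) -> bool:
    """Return True if a new download is worth attempting (hypothetical helper)."""
    # Historical requests are already covered by the file on disk.
    if requested <= last_in_file:
        return False
    # Rate-limit attempts so that requests beyond the available data
    # do not trigger a download on every call.
    if last_attempt is not None and now - last_attempt < min_interval:
        return False
    return True
```

With this shape, the decision depends on the requested date and on how recently a download was tried, not on the age of the file alone, so purely historical use never touches the network.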

greglucas avatar Nov 22 '24 14:11 greglucas

Also, we are dropping predicted values. Would you prefer to get those values as well?

No, I am fine with them not being in the final file

Some other suggestions/thoughts

  1. Only do this if the requested date in get_f107_ap is beyond the end of the current file; it shouldn't depend on the file download time. Also, I believe that file updates ~every 3 hours with new Ap values, so a 3-hour cadence might be better than a 24-hour one.
  2. We should probably have an "append" / update function so that, if possible, we aren't thrashing a 2.8 MB download every time.
  • There is a CSV file starting in 2019 that would reduce the download size (~300 kB): https://www.celestrak.org/SpaceData/
  • We might need to change to a numpy savez file or some other format rather than a text file internally, to make it easier to keep the file growing.

Very fair suggestions. I'll see if I can update the PR to include these changes

mananapr avatar Nov 25 '24 06:11 mananapr

I have updated my code to call download_f107_ap when required. Thanks for helping with this
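
The user-side code is not shown in the thread. A minimal sketch of what such a wrapper might look like, assuming the caller knows where the SW-All file lives: `refresh_if_stale` and its parameters are hypothetical, and `download` is any zero-argument callable standing in for `pymsis.utils.download_f107_ap`.

```python
import time
from pathlib import Path
from typing import Callable


def refresh_if_stale(
    path: Path,
    download: Callable[[], None],
    max_age_seconds: float = 24 * 3600,
) -> bool:
    """Call download() if path is missing or older than max_age_seconds.

    Returns True if a download was triggered (hypothetical helper).
    """
    # A file that exists and was modified recently enough is left alone.
    if path.exists() and time.time() - path.stat().st_mtime <= max_age_seconds:
        return False
    download()
    return True
```

Passing the download function in as an argument keeps the staleness policy in user code, so the package default stays "download only when missing".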

mananapr avatar Nov 25 '24 07:11 mananapr

@mananapr, are you interested in opening a new PR(s) for some of those suggestions? I think there are some valid requests here, so I think we should reopen this issue.

greglucas avatar Dec 02 '24 16:12 greglucas

@mananapr, are you interested in opening a new PR(s) for some of those suggestions? I think there are some valid requests here, so I think we should reopen this issue.

Yes, I would like to work on these issues when I have some time

mananapr avatar Dec 04 '24 06:12 mananapr

Hi @greglucas - I've opened a PR to address the issues discussed. Let me know what you think

mananapr avatar Jan 06 '25 10:01 mananapr