earthkit-data

Invalidate cache files when they are corrupted

Open malmans2 opened this issue 2 years ago • 0 comments

Is your feature request related to a problem? Please describe.

Cache files can get corrupted, especially because their paths are exposed through public attributes of the source classes, so any user can overwrite them. (For EQC, we run on a shared VM where users have a broad range of expertise.)

Similarly to #202, I think cache files should be invalidated when they're modified.

Describe the solution you'd like

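In the snippet below, s2 should trigger a new request because the cached file behind s1 has been overwritten. Currently the corrupted file is reused and s2.to_xarray() fails.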

import earthkit.data

args = ("cds", "reanalysis-era5-single-levels")
kwargs = dict(
    variable=["2t", "msl"],
    product_type="reanalysis",
    area=[50, -10, 40, 10],  # N,W,S,E
    grid=[2, 2],
    date="2012-05-10",
    time="12:00",
)

s1 = earthkit.data.from_source(*args, **kwargs)

# Simulate corruption: opening in "w" mode truncates the cached file to zero bytes
with open(s1.path, "w") as f:
    pass

s2 = earthkit.data.from_source(*args, **kwargs)
s2.to_xarray()  # NotImplementedError: earthkit.data.readers.text.TextReader.to_xarray()
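
One possible way to detect this kind of modification is to record a fingerprint of each file when it enters the cache and compare it before the file is reused. The sketch below is only an illustration under that assumption: `file_fingerprint` and `is_cache_entry_valid` are hypothetical helpers, not part of the earthkit-data API, and the demo uses a temporary file rather than a real CDS request.

```python
import hashlib
import os
import tempfile


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size in bytes, SHA-256 hex digest) for the file at `path`.

    Hypothetical helper, not an earthkit-data function.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large cache files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def is_cache_entry_valid(path, recorded):
    """Return True only if the file exists and still matches `recorded`.

    Hypothetical helper: a cache could call this before reusing an entry
    and fall back to a new request when it returns False.
    """
    if not os.path.exists(path):
        return False
    return file_fingerprint(path) == recorded


# Demo with a stand-in file (assumed payload, no real download involved).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cached.grib")
    with open(path, "wb") as f:
        f.write(b"GRIB fake payload")

    # Fingerprint recorded at the time the file is cached.
    recorded = file_fingerprint(path)
    valid_before = is_cache_entry_valid(path, recorded)

    # Same corruption as in the snippet above: truncate the file.
    with open(path, "w"):
        pass
    valid_after = is_cache_entry_valid(path, recorded)
```

Hashing the full file is the most reliable check but costs a full read on every cache hit. Comparing only size and modification time would be cheaper, at the price of missing some modifications.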

Describe alternatives you've considered

No response

Additional context

No response

Organisation

B-Open / CADS-EQC

malmans2, Oct 17 '23 08:10