banyan-julia

Invalidating cached samples

Open calebwin opened this issue 2 years ago • 0 comments

When a read_* or write_* function (e.g., read_hdf5 or write_parquet) is called, we are simply assigning a Location to a Future. As part of constructing a Location, we take a random sample of the data at the location along with some metadata, and we cache this Location locally so that we don't have to recompute it.

Naturally, on a call to a write_* function, we invalidate the cached sample if one was cached for the location being written to. But we should also store the date and time at which a sample was taken so that we can determine whether the S3 data was written to more recently, which would invalidate the cache.
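A minimal sketch of the timestamp check described above, written in Python purely for illustration (it is not Banyan's Julia code). The names `CachedSample`, `SampleCache`, and the injected `last_modified` lookup are all hypothetical; for S3 the lookup could return the object's last-modified time, but how that is fetched is left out here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, Optional


@dataclass
class CachedSample:
    """Hypothetical cache entry: the sample plus the time it was taken."""
    sample: Any
    sampled_at: datetime


class SampleCache:
    """Hypothetical local cache keyed by location path."""

    def __init__(self, last_modified: Callable[[str], Optional[datetime]]):
        # last_modified is an assumed lookup that returns when the data at
        # a path was last written, or None if that is unknown.
        self._entries: Dict[str, CachedSample] = {}
        self._last_modified = last_modified

    def put(self, path: str, sample: Any, sampled_at: Optional[datetime] = None) -> None:
        when = sampled_at or datetime.now(timezone.utc)
        self._entries[path] = CachedSample(sample, when)

    def invalidate(self, path: str) -> None:
        # Called explicitly when a write_* function targets this path.
        self._entries.pop(path, None)

    def get(self, path: str) -> Optional[Any]:
        entry = self._entries.get(path)
        if entry is None:
            return None
        modified = self._last_modified(path)
        # The data was written after the sample was taken: the sample is stale.
        if modified is not None and modified > entry.sampled_at:
            self.invalidate(path)
            return None
        return entry.sample


# Usage: the sample stays valid until the data is written to again.
t0 = datetime(2021, 8, 13, 12, 0, tzinfo=timezone.utc)
path = "s3://bucket/data.h5"  # placeholder path
mtimes = {path: t0}
cache = SampleCache(mtimes.get)
cache.put(path, [1, 2, 3], sampled_at=t0 + timedelta(minutes=5))
fresh = cache.get(path)
mtimes[path] = t0 + timedelta(minutes=10)  # simulate a later external write
stale = cache.get(path)
```

Injecting the lookup keeps the staleness rule separate from the storage backend, so the same check would apply to local files or S3.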

Also, we should expose keyword arguments for Location constructors to callers of the read_* or write_* functions.
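One way the keyword pass-through could look, again sketched in Python rather than Julia. `make_location` and its keywords `sample_rate` and `invalidate_sample` are hypothetical stand-ins for whatever the Location constructors actually accept; only the name `read_hdf5` comes from this issue.

```python
from typing import Any, Dict


def make_location(path: str, *, sample_rate: int = 100,
                  invalidate_sample: bool = False) -> Dict[str, Any]:
    """Hypothetical Location constructor with keyword-only options."""
    return {
        "path": path,
        "sample_rate": sample_rate,
        "invalidate_sample": invalidate_sample,
    }


def read_hdf5(path: str, **location_kwargs: Any) -> Dict[str, Any]:
    """Stand-in for a read_* function that forwards keywords unchanged."""
    # Unknown keywords raise TypeError in make_location, so typos surface
    # at the call site instead of being silently ignored.
    location = make_location(path, **location_kwargs)
    return {"location": location}


future = read_hdf5("data.h5", invalidate_sample=True)
```

Forwarding the keywords unchanged means a new constructor option becomes available to callers without touching each read_* or write_* function.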

calebwin, Aug 13 '21 12:08