kartothek
WIP: use distributed.Queue as storage for metadata (PoC)
To improve resilience, we could stop keeping the intermediate results (partition metadata) on dask workers and instead submit them to a more central instance, such as an event bus or, even simpler, the dask scheduler. That way the jobs would be safe from worker failures.
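A rough sketch of the idea, not the actual kartothek code: each task pushes its partition metadata into a named `distributed.Queue`, which is managed by the scheduler, and the client drains the queue afterwards. The queue name, the `write_partition` helper, and the metadata layout are made up for illustration, and the in-process cluster is only there to keep the example self-contained.

```python
from dask.distributed import Client, Queue

QUEUE_NAME = "partition-metadata"  # hypothetical queue name


def write_partition(partition_id, queue_name=QUEUE_NAME):
    # Stand-in for the real write step; the metadata layout is an assumption.
    metadata = {
        "partition": partition_id,
        "files": [f"part-{partition_id}.parquet"],
    }
    # Inside a task, Queue(name) connects to the scheduler-managed queue,
    # so the metadata no longer lives only on the worker that produced it.
    Queue(queue_name).put(metadata)
    return partition_id


def collect_metadata(n_partitions):
    # In-process cluster purely for demonstration purposes.
    with Client(
        processes=False,
        n_workers=2,
        threads_per_worker=1,
        dashboard_address=None,
    ) as client:
        queue = Queue(QUEUE_NAME, client=client)
        futures = client.map(write_partition, range(n_partitions))
        client.gather(futures)  # re-raises any task error
        # One metadata entry per partition was put on the queue.
        collected = [queue.get(timeout=10) for _ in range(n_partitions)]
    return sorted(collected, key=lambda m: m["partition"])


if __name__ == "__main__":
    for entry in collect_metadata(3):
        print(entry)
```

Only small, msgpack-encodable payloads (dicts, lists, strings, ints) should go through the queue directly, since everything passes through the scheduler.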
cc @fjetter