distributed-dataset
Nicer API for writing a `Backend`
Currently, the API for writing a `Backend` is quite limited: it forces us to create a new process, and it is not pleasant to use. With a nicer API:
- We can use #5 to get rid of the Python wrapper we're currently using
- We can create a backend which uses threads and an in-memory shuffle store; it should be considerably faster than `localProcessBackend` and `localTmpShuffleStore`.
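
To make the second point concrete, here is a rough sketch of what a thread-based executor with an in-memory shuffle store could look like. It does not use the library's actual `Backend` or `ShuffleStore` types; `MemShuffleStore`, `putChunk`, `getChunks` and `spawn` are hypothetical names made up for illustration, and the chunks are plain `String`s to keep it self-contained (only `base` is needed).

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, try)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical in-memory shuffle store: partition key -> chunks written so far.
-- An association list keeps the sketch dependency-free; a real one would use a Map.
newtype MemShuffleStore = MemShuffleStore (IORef [(Int, [String])])

newMemShuffleStore :: IO MemShuffleStore
newMemShuffleStore = MemShuffleStore <$> newIORef []

-- Atomically add a chunk under the given key, so concurrent writers are safe.
putChunk :: MemShuffleStore -> Int -> String -> IO ()
putChunk (MemShuffleStore ref) key chunk =
  atomicModifyIORef' ref $ \m -> (insertChunk m, ())
  where
    insertChunk [] = [(key, [chunk])]
    insertChunk ((k, cs) : rest)
      | k == key = (k, chunk : cs) : rest
      | otherwise = (k, cs) : insertChunk rest

-- Read every chunk stored under a key (empty list if the key is unknown).
getChunks :: MemShuffleStore -> Int -> IO [String]
getChunks (MemShuffleStore ref) key = maybe [] id . lookup key <$> readIORef ref

-- Hypothetical "backend": run a task on a new thread instead of a new process.
-- The MVar is filled with the result, or with the exception the task threw.
spawn :: IO a -> IO (MVar (Either SomeException a))
spawn task = do
  done <- newEmptyMVar
  _ <- forkIO (try task >>= putMVar done)
  pure done

main :: IO ()
main = do
  store <- newMemShuffleStore
  -- Start four tasks concurrently; each writes one chunk keyed by parity.
  handles <- mapM (\i -> spawn (putChunk store (i `mod` 2) (show i))) [1 .. 4 :: Int]
  results <- mapM takeMVar handles
  evens <- getChunks store 0
  -- Number of tasks that succeeded, and number of chunks stored under key 0.
  print (length [() | Right () <- results], length evens)
```

Since nothing is serialized or written to disk here, this only illustrates the shape of the idea; whether the real executor closures can run in-process this way depends on how the new API is designed.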