tilequeue
Support writing to multiple S3 buckets
In order to be resilient to region failure, and to get tiles closer to the clients downloading them, it would be helpful to be able to write tiles into multiple buckets.
It would be simple to support a new type of store, perhaps called "multis3" or just switched on when `name` is a list, which wraps a list of `S3` objects and writes from first to last, reading from the last.
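A minimal sketch of that wrapper, to make the write and read order concrete. The names `MultiStore` and `InMemoryStore` and the two-argument `write_tile` / `read_tile` interface are assumptions for illustration, not tilequeue's actual store API; the in-memory class only stands in for a real S3-backed store.

```python
class InMemoryStore:
    """Stand-in for one S3-backed store (hypothetical interface)."""

    def __init__(self):
        self.tiles = {}

    def write_tile(self, tile_data, coord):
        self.tiles[coord] = tile_data

    def read_tile(self, coord):
        # Returns None when the tile is absent.
        return self.tiles.get(coord)


class MultiStore:
    """Writes to every wrapped store in order and reads from the last."""

    def __init__(self, stores):
        if not stores:
            raise ValueError("at least one store is required")
        self.stores = list(stores)

    def write_tile(self, tile_data, coord):
        # First to last: if a write raises part-way through, the last
        # store has not been written, so a later read sees no tile.
        for store in self.stores:
            store.write_tile(tile_data, coord)

    def read_tile(self, coord):
        # Only the last store is consulted, so a tile counts as present
        # only once the final write in the sequence has succeeded.
        return self.stores[-1].read_tile(coord)
```

With this shape, get-before-put keeps calling `read_tile` on one object and does not need to know how many buckets sit behind it.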
Reading from the last bucket is important so that get-before-put doesn't think that a tile written to only some of the buckets is okay. Alternatively, if we want to get more complex, we could do read repair by:
- Reading from the first bucket - if no tile, return `None`.
- Read from second through last buckets, if no tile then copy the one from the first bucket.
- Return the tile.
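The read-repair steps above could look roughly like the sketch below. `read_with_repair` and `DictStore` are hypothetical names, and the `read_tile` / `write_tile` interface is an assumed simplification rather than tilequeue's real store API.

```python
class DictStore:
    """Stand-in for one bucket (hypothetical interface)."""

    def __init__(self, tiles=None):
        self.tiles = dict(tiles or {})

    def read_tile(self, coord):
        # Returns None when the tile is absent.
        return self.tiles.get(coord)

    def write_tile(self, tile_data, coord):
        self.tiles[coord] = tile_data


def read_with_repair(stores, coord):
    """Read from the first store and copy the tile to any store missing it."""
    tile = stores[0].read_tile(coord)
    if tile is None:
        # Nothing in the first bucket: report the tile as missing.
        return None
    for store in stores[1:]:
        if store.read_tile(coord) is None:
            # Repair: copy the tile found in the first bucket.
            store.write_tile(tile, coord)
    return tile
```

Note the cost this implies: every read touches every bucket, which is part of the extra complexity being weighed against simply re-rendering the tile.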
I'm not sure whether this is worthwhile - it's a lot of extra complexity to save the work of re-rendering the tile. My feeling is that, while for some expensive tiles that would be worthwhile, the majority of tiles are so cheap to re-render that it's not worth the read repair...?
> I'm not sure whether this is worthwhile
I haven't thought through the details, but my gut reaction is to just do the simplest thing, which sounds like it's just doing the get-before-put check on the last location. IIRC we have some retries built in, i.e. if a write fails we will try n times before giving up. I'd imagine that this practically should cover nearly all common failures, and if there's an edge case it should be fine to re-render the odd tile. But I don't feel strongly about this, and can see the argument for optimizations too :)
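For reference, a generic sketch of the "try n times before giving up" behaviour mentioned above. This is not tilequeue's actual retry code (which this comment only recalls from memory); `write_with_retries` and its parameters are hypothetical.

```python
import time


def write_with_retries(write, attempts=3, delay_seconds=0.0):
    """Call write() up to `attempts` times, re-raising the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return write()
        except Exception as error:
            last_error = error
            # Pause between attempts, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    raise last_error
```

If each per-bucket write were wrapped this way, a tile would only end up missing from the last bucket after n consecutive failures, which is the edge case where re-rendering the odd tile seems acceptable.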