Scale out

Open itowlson opened this issue 2 years ago • 0 comments

The current storage providers are server filesystem (about to be removed) and local embedded database (which sits over a filesystem directory). This means that - barring some shared file system shenanigans - a Bindle server stands in isolation.

Shared file system shenanigans are certainly an option. For example, Azure allows virtual disks to be shared between virtual machines (https://docs.microsoft.com/en-us/azure/virtual-machines/disks-shared), though it looks like you need additional setup to support a shared file system (at least that was my impression; the docs weren't very clear).

We could also look at other providers. For example:

  • OCI. OCI already provides content addressable layers which might map nicely to parcels. But support for storing generic artifacts in OCI registries seems a bit patchy; I'm not sure of the situation and politics here.

  • AWS S3 / Azure blob storage. We would need to define a mapping from invoice and parcel IDs to blob URLs, for example by using the SHA as the blob name. (Using the SHA would also give good distribution characteristics for partitioning!) Operators would need to BYO storage account, but this would also allow them to configure it how they wanted (e.g. for availability).

    • A possible refinement of this is to store invoices in table storage, rather than as TOML blobs, which could enable auditing features such as establishing which invoices use a given parcel.
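
To make the blob mapping idea a bit more concrete, here is a minimal sketch. It assumes each invoice or parcel ID has already been reduced to a hex-encoded SHA-256 digest, and it invents a key layout (`<kind>/<first two hex chars>/<digest>`) purely for illustration. `BlobKind`, `blob_key` and `blob_url` are hypothetical names, not part of Bindle.

```rust
/// Which kind of Bindle object a blob holds (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlobKind {
    Invoice,
    Parcel,
}

/// Build a blob key from a hex-encoded SHA-256 digest.
/// Assumed layout: `<kind>/<first 2 hex chars>/<full digest>`.
/// Returns `None` if the input is not 64 hex characters.
pub fn blob_key(kind: BlobKind, sha_hex: &str) -> Option<String> {
    let is_sha256 = sha_hex.len() == 64 && sha_hex.bytes().all(|b| b.is_ascii_hexdigit());
    if !is_sha256 {
        return None;
    }
    // Normalise case so the same digest always maps to the same key.
    let sha = sha_hex.to_ascii_lowercase();
    let prefix = match kind {
        BlobKind::Invoice => "invoices",
        BlobKind::Parcel => "parcels",
    };
    Some(format!("{}/{}/{}", prefix, &sha[..2], sha))
}

/// Join a key onto an operator-supplied container or bucket URL.
pub fn blob_url(base_url: &str, key: &str) -> String {
    format!("{}/{}", base_url.trim_end_matches('/'), key)
}
```

Because the digest is effectively uniformly distributed, any prefix of it spreads blobs evenly across partitions. The base URL would come from whatever storage account the operator configures.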

All these remote stores are likely to be slow, so caching on the local filesystem (or even in memory for small blobs such as invoices) would be a must. It looks like we have an LRU cache, but the module is still under development.
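
For the in-memory case, the read-through behaviour I have in mind looks roughly like the sketch below. This is not the existing LRU cache module. It is a deliberately simple, standard-library-only illustration, and `LruCache`, `get_or_fetch` and the `fetch` closure (standing in for the slow remote store) are all hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// A small in-memory LRU cache for blobs keyed by ID (illustrative only).
pub struct LruCache {
    capacity: usize,
    entries: HashMap<String, Vec<u8>>,
    /// Keys ordered from least to most recently used.
    order: VecDeque<String>,
}

impl LruCache {
    pub fn new(capacity: usize) -> Self {
        LruCache { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Mark `key` as the most recently used entry.
    /// O(n) in the number of entries, which is fine for a sketch.
    fn touch(&mut self, key: &str) {
        self.order.retain(|k| k.as_str() != key);
        self.order.push_back(key.to_string());
    }

    /// Return the cached blob, or call `fetch` (the remote store) and cache the result.
    pub fn get_or_fetch<F>(&mut self, key: &str, fetch: F) -> Option<Vec<u8>>
    where
        F: FnOnce(&str) -> Option<Vec<u8>>,
    {
        if let Some(data) = self.entries.get(key).cloned() {
            self.touch(key);
            return Some(data);
        }
        // Cache miss: go to the remote store; a miss there is not cached.
        let data = fetch(key)?;
        if self.capacity == 0 {
            return Some(data);
        }
        // Evict the least recently used entry if the cache is full.
        if self.entries.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.entries.insert(key.to_string(), data.clone());
        self.touch(key);
        Some(data)
    }

    /// Whether `key` is currently cached.
    pub fn contains(&self, key: &str) -> bool {
        self.entries.contains_key(key)
    }
}
```

A filesystem-backed cache for larger parcels could follow the same shape, with the map values replaced by paths to cached files.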

itowlson, Dec 14 '21 03:12