
Chunk nginx cache

Open · willscott opened this issue · 3 comments

The current translation looks like this:

flowchart LR
    A[Caboose] -->|/file.big?bytes=0:1024| cr["car range"]
    subgraph L1
    cr -->|/file.big| nx["nginx caching"]
    nx -.->|"`*on cache miss*`"| l1["l1 shim"]
    l1 --> l["lassie"]
    end
    l --> o["Origin"]

We should have the nginx caching layer include a directive like:

slice 10m;
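
For context, a minimal sketch of how the slice directive is typically wired into an nginx proxy cache (the cache zone name and the shim upstream below are placeholders, not taken from the actual L1 config):

location / {
    # split upstream fetches into 10m sub-ranges, each cached separately
    slice              10m;
    proxy_cache        l1_cache;             # assumes a proxy_cache_path zone defined elsewhere
    proxy_cache_key    $uri$is_args$args$slice_range;
    proxy_set_header   Range $slice_range;   # forward the slice's byte range to the shim
    proxy_cache_valid  200 206 1h;           # 206 responses must be cacheable for slices to be reused
    proxy_pass         http://l1-shim;       # placeholder upstream for the l1 shim
}

With slicing enabled, nginx issues one ranged sub-request per 10m slice and caches the 206 responses independently, which is where the Range header on the nginx→shim hop in the updated diagram comes from.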

This will update the sequence to look like:

flowchart LR
    A[Caboose] -->|/file.big?bytes=0:1024| cr["car range"]
    subgraph L1
    cr -->|/file.big| nx["nginx caching"]
    nx -.-> |/file.big\nrange:0-10meg| l1["l1 shim"]
    l1 --> |/file.big?bytes=0:10m| l["lassie"]
    end
    l --> o["Origin"]

There are 3 things needed for this to land:

  • [ ] inclusion of the slice directive in nginx
  • [ ] translation from the range header to bytes argument in l1 shim
  • [ ] support for bytes in lassie

willscott · Jun 02 '23 06:06

There's actually another wrinkle here that's going to be tricky. The request from [nginx caching]->[l1 shim] is for a slice of a CAR file (an HTTP byte range), while the request from [l1 shim]->[lassie] will only support entity-bytes for selecting blocks from a CAR file.

That translation is not something we have code for, and I wonder if it is complexity we want the shim to have to handle. If we don't, we'd need different backing cache objects for the different chunks of origin-fetched files, and we will need to think more about how we manage the Saturn cache strategy.

willscott · Jun 16 '23 07:06

Hm.. if this gets too hairy (and it is already pretty hairy), it may be worth considering whether it would be less (or an equal amount of) glue code if the cache were built around blocks rather than HTTP responses.

If the nginx module always stored individual blocks, and assembled CARs and CAR slices on the fly based on some index, then CAR and fallback block requests would share the same block cache, and we would remove the DoS surface that comes from CARs containing duplicate blocks, or from storing the same blocks multiple times due to different CAR flavours.

lidel · Jun 16 '23 20:06

I think I agree that it's worth considering. I'm pretty reluctant to support that scope for a first iteration of rhea to launch, because it will take very substantial effort and complexity to get a cache that performs at the level of nginx.

willscott · Jun 18 '23 06:06