Added offset and count parameters to pathsend extension
Background Context
I read #469 and discussed with @gi0baro in this thread https://github.com/emmett-framework/granian/discussions/157 about trying to get this as part of the ASGI spec. The implementation here is just a straight copy of the discussion in the relevant thread. It didn't seem that people had too much of a disagreement of what it'd look like, and if we did we could hash it out in this PR.
Links:
- https://github.com/django/asgiref/issues/423: Issue describing problems with the current zerocopysend
- #469: Issue describing potential improvements to pathsend (what this PR implements)
- https://github.com/emmett-framework/granian/discussions/157: discussion that led me to making this PR in this repo
(draft)
This extends the http.response.pathsend extension to support partial file sends by adding optional offset and count parameters, enabling HTTP Range request support.
The parameters mirror the existing http.response.zerocopysend API:
offset: byte position to start reading from (default: beginning)count: number of bytes to send (default: to end of file)
This allows ASGI frameworks to implement range requests without loading the bytes themselves, offloading the operation to the server.
Servers may advertise range support capability by setting {"ranges": True} in the extension scope dict, allowing frameworks to detect this capability.
The application remains responsible for setting appropriate headers (Content-Range, Content-Length, Accept-Ranges) in the response.
Fixes #469
Implementation
This original comment also proposed an alternative that uses range instead of offset+count. I don't really have strong opinions one way or another about which we use, but I see the trade-offs as:
rangeis just one key, and its explicit about the start/end bytes. If we were to do this, we'd need to make a decision about being inclusive/exclusive with the ending byte (I think exclusive ending byte is common in most situations, but http range GETs are inclusive, so we should likely match that: docs)- offset/count matches what we have for zerocopysend, so we're staying consistent within ASGI
I'm also in the process of a branch on granian where I'm going to do a quick example implementation just to give an example. I'll likely use granian + starlette to showcase what this extended pathsend could look like
First time contributing to and submitting a PR to asgiref, so please feel free to badger me about bits that I'm missing!
Cc @Kludex
cc @gi0baro
My 2 cents: the 2 keys vs single tuple key won't really change that much, the latter just simplify the message parsing on the server side.
I would say the offset and count naming doesn't really sound appealing to me though: while I understand the consistency argument, I don't really feel we need to make different extensions consistent with each other in terms of messages format, in fact pathsend already differs from zerocopysend by the fact it has the additional boolean in the extension scope dict.
I could argue offset and count make sense for zerocopysend given the object in the message is a file descriptor, thus it's quite clear we're talking about bytes in regarding of that descriptor. On the other hand, I don't see such words to be super-meaningful when we're talking about a string path: it's not very straightforward "count" is the number of bytes that should be read from the file at that path. This might be even more confusing coupled with the "ranges" naming in the extension scope.
I'd say if we use "ranges" in scope extension, maybe "range_start" and "range_len" are more explicit keys to have in the message. But maybe this is just me, and the actual proposal is perfectly fine for the majority of people involved.
re: offset/count
I think I'm sold that offset/count aren't the greatest or most intuitive; they were probably originally chosen to match os.sendfile/sendfile
re: consistency argument
I do think it's a little confusing between the two extensions, but my reading of the linked threads is that zerocopysend is a bit "dead", so it's probably not worth matching?
re: ranges/range_start/range_len
I do like the idea that all of the fields are prefixed similarly (range*) to make it clearer they're related. I also think I prefer having a dict like range_start/range_len rather than a tuple[int, int], just because it's explicit what the field is doing.
Here's some more alternatives:
- start / length
- byte_start / byte_length
- start / size
- position / length
Or any permutations of the above, including with the range_* prefix.
Anyone else have strong opinions that wants to weigh in?
My personal preference is range_start and range_len as of now