RESUMABLE: proposal - requirements on tying HTTP requests together create a fragile design, and this should stop the document from leaving the WG for publication
@gstrauss provided the following feedback on https://lists.w3.org/Archives/Public/ietf-http-wg/2025JulSep/0094.html wrt Section 4.2.2.
Not only is this fragile, but it imposes a requirement that the server maintain state for exclusive access to the upload resource, and have a new request affect a different request -- which are big implementation requirements -- tying together multiple HTTP requests which are supposed to be independent in HTTP. ** This is a huge failure and should immediately disqualify this draft from proceeding past last call. **
Related is the fragile behavior specified in 4.3 Offset Retrieval: "The client MUST NOT perform offset retrieval while creation (Section 4.2) or appending (Section 4.4) is in progress as this can cause the previous request to be terminated by the server as described in Section 4.6."
More detail as to why this is fragile would help. Hyperbolic statements do not.
> More detail as to why this is fragile would help. Hyperbolic statements do not.

> Section 4.6 Concurrency
>
> The RECOMMENDED approach is as follows: If an upload resource receives a new request to retrieve the offset (Section 4.3), append representation data (Section 4.4), or cancel the upload (Section 4.5) while a previous request for creating the upload (Section 4.2) or appending representation data (Section 4.4) is still ongoing, the resource SHOULD prevent race conditions, data loss, and corruption by terminating the previous request before processing the new request.
I think I said it well in #3159
An idempotent HEAD request to retrieve the offset should not cause a non-idempotent PATCH request -- on the same server or on another server -- to be cancelled. Specifying such recommended behavior does not strike me as in the spirit of independent HTTP requests, nor in the spirit of idempotent HEAD requests.
This would not be necessary to recover from hung requests with my alternative proposal to remove the complexity and restrictions imposed by having the server report Upload-Offset to the client. Instead, the client should send reasonably-sized chunks and should resend the chunk (perhaps in multiple smaller chunk requests) if sending the chunk fails (or hangs).
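The resend-on-failure loop described above might look like the following sketch. `send_chunk` is a hypothetical transport callback (returning True on a 2xx response), and the policy of halving the chunk size on failure is purely illustrative, not part of the proposal:

```python
def upload(data, send_chunk, chunk_size=256 * 1024, min_chunk=16 * 1024):
    """Upload `data` in fixed-size chunks; on failure, halve the chunk
    size and resend the same range until it succeeds or the minimum
    chunk size also fails."""
    offset = 0
    size = chunk_size
    while offset < len(data):
        chunk = data[offset:offset + size]
        if send_chunk(offset, chunk):   # True on a 2xx response
            offset += len(chunk)
            size = chunk_size           # reset after a success
        else:
            if size <= min_chunk:
                raise IOError("upload failed at offset %d" % offset)
            size //= 2                  # retry the same range in smaller pieces
    return offset
```

With a flaky transport, only the range whose request failed is resent; successfully confirmed ranges are never retransmitted.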
The proposal to keep uploading chunks of data that might have been received and processed (in whole or in-part) sounds to me like it could create conditions where a client spends bandwidth on uploading duplicate data that is not necessary, and a server spends resources handling such requests.
Does your proposal offer any way to avoid such resource spending by allowing a client to get an accurate view of precisely how much data has been received and processed by the server at any point in time?
My proposal side-steps all the complexity by suggesting that the client adjust the data chunk size of the representation that it chooses to send. By not requiring serial upload, the client can choose how much data to pipeline in how many requests. Yes, failed requests will need to be resent in my proposal. Multiple requests can be sent with smaller chunk sizes.
So the answer to my question
> Does your proposal offer any way to avoid such resource spending by allowing a client to get an accurate view of precisely how much data has been received and processed by the server at any point in time?
Is no?
> So the answer to my question
>
> > Does your proposal offer any way to avoid such resource spending by allowing a client to get an accurate view of precisely how much data has been received and processed by the server at any point in time?
>
> Is no?
Incorrect: If the client is uploading sequentially, then sending a HEAD request to the server for the temporary resource gives the client an accurate view of precisely how much data has been received.
However, in my proposal, there is no need to ask the server with a separate request and RTT cost.
The client already knows how much data has been received by the server by the successful HTTP response(s) to each request containing a chunk of the representation sent by the client. It does not matter if the client is uploading sequentially, in parallel, and/or pipelining.
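As a sketch of that bookkeeping, a client could keep a ledger of the byte ranges confirmed by 2xx responses, independent of the order in which requests were sent; the class and method names here are hypothetical:

```python
class UploadLedger:
    """Client-side record of which byte ranges the server has confirmed
    (one entry per 2xx response), regardless of send order."""

    def __init__(self, total):
        self.total = total
        self.acked = []  # list of half-open (start, end) ranges

    def confirm(self, offset, length):
        self.acked.append((offset, offset + length))

    def confirmed_bytes(self):
        # Merge overlapping/adjacent ranges, then sum their lengths.
        merged = []
        for start, end in sorted(self.acked):
            if merged and start <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return sum(end - start for start, end in merged)

    def complete(self):
        return self.confirmed_bytes() == self.total
```

Sequential uploads are the degenerate case where the acknowledged ranges are always one contiguous prefix.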
What if the client is sending in parallel? Is there a way to query the precise ranges that have been processed and received?
A server can respond prior to receiving the full payload of a request. You would probably want to state a server MUST NOT do that in your design.
> What if the client is sending in parallel? Is there a way to query the precise ranges that have been processed and received?
>
> A server can respond prior to receiving the full payload of a request. You would probably want to state a server MUST NOT do that in your design.
That is an implementation detail and could be supported. If the server stores the temporary resource in a file on a filesystem that supports holes (sparse files), then it would be possible to send back to the client a list of ranges of received data, and by extension, the holes. In another possible implementation, if the temporary resource were stored in a database, the ranges could be reported from the database.
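A minimal sketch of such range reporting, assuming the server records each write as an (offset, length) pair (e.g. the extents of a sparse file); the function name is hypothetical:

```python
def ranges_and_holes(writes, total):
    """Given recorded writes as (offset, length) pairs and the expected
    total size, return the merged received ranges and the holes between
    them, both as half-open (start, end) pairs."""
    received = []
    for start, end in sorted((o, o + n) for o, n in writes):
        if received and start <= received[-1][1]:
            received[-1] = (received[-1][0], max(received[-1][1], end))
        else:
            received.append((start, end))
    holes, cursor = [], 0
    for start, end in received:
        if start > cursor:
            holes.append((cursor, start))  # gap before this extent
        cursor = end
    if cursor < total:
        holes.append((cursor, total))      # trailing gap
    return received, holes
```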
Aside: I do not understand what you mean by "processed" as in both resumable uploads and my temporary resource proposal, the temporary resource is collected until Upload-Complete: ?1 (or equivalent) before being submitted to the target resource for processing. I am responding to your question about portions of the representation "received".
> Aside: I do not understand what you mean by "processed" as in both resumable uploads and my temporary resource proposal, the temporary resource is collected until Upload-Complete: ?1 (or equivalent) before being submitted to the target resource for processing.
Your design, IIUC, requires a 2xx to confirm the chunks are received in whole. It's all or nothing. Draft 09 supports capturing whatever message content was received and processed by the server prior to the upload creation or append failing, no 104 or 200 is strictly necessary, since the offset is retrievable.
Client strategies for dealing with these situations are probably not too dissimilar: follow-up chunk size, potential for backtracking, etc.
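For illustration, a draft-09-style recovery loop might look like the sketch below; `patch_from` and `head_offset` are hypothetical transport callbacks standing in for the append request and the HEAD offset-retrieval request:

```python
def resume_upload(data, patch_from, head_offset, max_attempts=5):
    """Append sequentially; when an append fails mid-transfer, ask the
    server for its offset (one extra RTT) and resume from exactly the
    byte it reports, so partially received content is not resent."""
    offset = 0
    for _ in range(max_attempts):
        # patch_from returns the number of bytes accepted, or -1 on failure
        sent = patch_from(offset, data[offset:])
        if sent >= 0 and offset + sent == len(data):
            return len(data)
        offset = head_offset()  # server-confirmed offset after the failure
    raise IOError("upload did not complete after %d attempts" % max_attempts)
```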
> Your design, IIUC, requires a 2xx to confirm the chunks are received in whole. It's all or nothing.
One might say very HTTP-like.
The difference from all-or-nothing is that my temporary resource proposal allows the client to send the data in chunks in multiple requests. Partial-PUT is already an established means to do this. An intelligent client might measure the RTT for the temporary resource creation request and then use that timing, as well as subsequent partial-PUT requests to calculate BDP and adapt chunk sizes and number of requests.
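A rough sketch of such an adaptive heuristic, with entirely illustrative constants (the 4x multiplier and the clamps are not part of any specification):

```python
def next_chunk_size(rtt_s, throughput_bps,
                    min_size=64 * 1024, max_size=8 * 1024 * 1024):
    """Pick a chunk size near the bandwidth-delay product, so each
    request keeps the path busy for a few RTTs without risking too
    much data per failed request. Purely illustrative heuristic."""
    bdp = int(rtt_s * throughput_bps / 8)          # bits/s -> bytes in flight
    return max(min_size, min(max_size, 4 * bdp))   # aim for ~4 RTTs of data
```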
> Draft 09 supports capturing whatever message content was received and processed by the server prior to the upload creation or append failing, no 104 or 200 is strictly necessary, since the offset is retrievable.
When the offset is not updated by 104 (due to 104 not being passed), then a full RTT is required to recover from each and every request failure. Draft 09 requires sequential upload. Combined, in the face of unstable/flaky/changing connections, Draft 09 will likely exhibit much worse performance recovering. On top of that, the protocol is more complex than my proposal.
In the face of no network connectivity issues and low packet loss, there will not be a huge difference between the solutions, except that Draft 09 does have an advantage of fewer round trips for tiny request bodies. I conjecture that if an application can tolerate a resubmission (e.g. POST with the same request body is "safe"), then an intelligent client may try to send a tiny request body all at once, falling back to resumable uploads only on failure. In that case, neither Draft 09 nor my solution is used, and so the Draft 09 advantage for tiny requests is lessened. Additionally, my proposal could optionally be extended to accept an initial request body in the temporary resource creation request, further reducing the Draft 09 perceived advantage for tiny requests.
The issue with your proposed design is that it can consume additional bandwidth due to needing to resend duplicated data that was already delivered to the server but not confirmed as such by a 200 response. Eating an RTT to save on bandwidth is a reasonable tradeoff to make.
Client environments may not be in control of request dispatch to their connection pool, nor able to access information about the network (such as TCP or QUIC connection statistics). Parallelizing chunk uploads for the same resource can lead to all requests being pooled into the same connection, with little control over the client-side send scheduling. Sharing bandwidth over multiple chunks means that when a connection terminates, each chunk would have made slower parallel progress. If a client picks large chunks, the impact of a connection failure in this design is high, because the bandwidth spent to send those chunks is wasted until a server returns a 200.
Using fallback and/or smaller chunks increases the total count of requests required, which is also a tradeoff because some deployment models bill by request count. A server operator can limit this, to some extent, by declaring a minimum chunk size. However, there will then be tension between the operator and the client needs.
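For a concrete sense of the request-count tradeoff, ceiling division gives the number of upload requests per representation; e.g. a 1 GB (10^9 byte) upload takes 3815 requests at 256 KiB chunks versus 477 at 2 MiB:

```python
def request_count(total_bytes, chunk_size):
    """Number of upload requests needed for a given chunk size
    (ceiling division); useful for comparing per-request billing
    impact across chunk sizes."""
    return -(-total_bytes // chunk_size)
```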
It would be remiss not to mention that fetching the upload offset, as described in 09, is also a request that would count against total requests. However, the nature of the request is different (no request message content), which may factor into cost calculations on the server side.
> Client environments may not be in control of request dispatch to their connection pool, nor able to access information about the network (such as TCP or QUIC connection statistics). Parallelizing chunk uploads for the same resource can lead to all requests being pooled into the same connection, with little control over the client-side send scheduling. Sharing bandwidth over multiple chunks means that when a connection terminates, each chunk would have made slower parallel progress. If a client picks large chunks, the impact of a connection failure in this design is high, because the bandwidth spent to send those chunks is wasted until a server returns a 200.
I have no doubt that each individual point that you make is true somewhere, but together I am less certain. If parallelizing will lead to terrible behavior, don't parallelize. If pipelining will lead to terrible behavior, don't pipeline. Send sequentially. However, if a software stack is going to implement resumable uploads, then I hope it might recognize the pattern and not choose the most suboptimal steps. If not, the client should send those requests one at a time, which would effectively slow my proposal down to the speed of resumable uploads. However, without those constraints, my proposal could be much faster. If the parallelization behavior is desirable, it could utilize the network bandwidth with multiple requests of multiple chunks of data, potentially over multiple different network paths, and would still be able to recover if some of those requests failed.
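A sketch of that parallel-with-recovery pattern, resending only the chunks whose requests failed in each round; `send_chunk` is again a hypothetical transport callback returning True on a 2xx response:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_upload(data, send_chunk, chunk_size, workers=4, max_rounds=3):
    """Send all chunks in parallel; after each round, resend only the
    chunks whose requests failed. Returns True if every chunk was
    eventually confirmed within max_rounds."""
    pending = [(off, data[off:off + chunk_size])
               for off in range(0, len(data), chunk_size)]
    for _ in range(max_rounds):
        if not pending:
            return True
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(lambda c: send_chunk(*c), pending))
        # keep only the chunks that were not confirmed
        pending = [c for c, ok in zip(pending, results) if not ok]
    return not pending
```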
> Using fallback and/or smaller chunks increases the total count of requests required, which is also a tradeoff because some deployment models bill by request count. A server operator can limit this, to some extent, by declaring a minimum chunk size. However, there will then be tension between the operator and the client needs.
These sound like they heavily influenced the design choices. Are these documented somewhere? I'm sorry, I must admit that I have not followed every IETF meeting where Resumable Uploads was discussed, but I have tried to read the meeting notes from most of the meetings. If request count is billed in such a way that more, smaller requests would have a substantial impact, then that sounds like some businesses may want to change to a provider with a fairer resource-based billing.
How widespread is billing structured in such a way that there would be a substantial impact for larger requests which are split into, say, 256KB chunks versus 2MB chunks? While I can now see why the draft tries to get as many bytes written as possible, the billing problem is not a technical problem, and so the billing problem should not be addressed with a technical kludge. The billing problem is still a problem, but I think there are different pressures which might lead to better solutions.
It sounds like parallelization is an optimization that can be fragile.
There's a lot of open questions and tradeoffs here for both the current design and the proposal.
I suggest the way to make progress is to implement and gather data on the respective approaches under varying network conditions such as low/high RTT, low/high bandwidth, network loss / disconnects, client disconnects, and server disconnects.
There are several client and server implementations of draft 09. If we have one or more implementations of the proposed design, it would allow us to more concretely compare the two. The testbed can be a collaborative effort.