handler: support HTTP range requests in GET /:upload
Some discussion around API design in the quoted (since-closed) PR.
Out of interest, I did some searching and found #432 where you were against implementing
Rangesupport in tusd - have you changed your mind on this?
Yes, I did change my opinion here. We have recently been involved in the HTTP working group with the goal of making a IETF standard for resumable uploads. In the end, we hope that resumable uploads are one component in the HTTP toolbox that interacts well with the existing HTTP mechanism. Since then I am more interested in making sure that tusd works better with concepts, such as range requests.
In any case, we should probably move this conversation to an issue to sketch out API design if you're in favour of this in principle - closing this PR for now.
Great idea, feel free to open an issue if you are still interested in working on this :)
Originally posted by @Acconut in https://github.com/tus/tusd/issues/1048#issuecomment-1881180105
As mentioned in #1048, I prefer having explicit support for fetching ranged content inside the storage implementations, so that we can implement efficiently. My first idea was to add a new interface for storage, such as:
type Range struct {
start int64
end int64
}
type RangedDownloadsUpload interface {
GetRangedReaders(ctx context.Context, ranges []Range) ([]io.ReadCloser, error)
}
The slices are necessary because the Range header can include multiple ranges.
However, after looking at the source code of Go's implementation for ranged GET requests (see https://cs.opensource.google/go/go/+/refs/tags/go1.21.6:src/net/http/fs.go;l=169-355), I am more hesitant with this approach. Go implements additional logic for preventing abuse and ensuring wide compatibility with even odd clients. If we go with my above proposal, we would have to duplicate this logic inside tusd as well, which I would like to avoid.
As an alternative, we could go with a more lazy route and basically pass the request and response into the storage to let it handle the ranged request as efficiently as possible:
type ServableUpload interface {
ServeContent(ctx context.Context, w ResponseWriter, r *Request) (error)
}
The filestore could simply use http.ServeContent internally for this. The s3store (and similar cloud storages) could implement the interface by relaying the Range header to S3 and then just piping the response from S3 directly to the end-user's client. The logic for handling ranges is then outsourced to the cloud storage. A downside would be that this way, we also inherit the limitations of the cloud storages. For example, S3 does not support multiple ranges in one request. This limitation would then also be passed to the end-user. However, overall I like this approach for now.
Let me know what you think!
KISS. I am unsure if multiple range requests are really common in a request, and I already had to deal with http.ServeContent myself and would rather not duplicate/have core code be borrowed.
If needed, revisit when someone asks, and then/or make a 3rd interface.
Let the backends deal with the problem, and don't try to be too smart/know better than the storage systems without a good reason.
Agreed. I am not sure when I will have time to work on this. If anybody is interested in starting this, let me know and I can help :)
@Acconut im looking at your suggestion and would like to know how ServableUpload would connect to the Upload interface given it has a GetReader.
The data stores could implement an additional interface that turns a regular Upload into a ServableUpload, similar to how we already have TerminaterDataStore;
type ServableUpload interface {
ServeContent(ctx context.Context, w ResponseWriter, r *Request) (error)
}
type ServerDataStore interface {
AsServableUpload(upload Upload) ServableUpload
}
If a data store does not implement ServerDataStore, the handler can always fall back to the existing GetReader function. Does that make sense?
The data stores could implement an additional interface that turns a regular Upload into a ServableUpload, similar to how we already have TerminaterDataStore;
type ServableUpload interface { ServeContent(ctx context.Context, w ResponseWriter, r *Request) (error) } type ServerDataStore interface { AsServableUpload(upload Upload) ServableUpload }If a data store does not implement
ServerDataStore, the handler can always fall back to the existingGetReaderfunction. Does that make sense?
Do we then want
func (store *StoreComposer) UseServableUpload(ext ServableUpload) {
store.UsesServableUpload = ext != nil
store.ServableUpload = ext
}
as well?
then call it conditionally like the others?
Yes, the composer needs to be adjusted as well, but take the data store interface instead of the additional interface for the upload:
func (store *StoreComposer) UseServer(ext ServerDataStore) {
store.UsesServer = ext != nil
store.Server = ext
}
(although we would have to think about the use of "server". While it's custom in Go to use -er suffix for interfaces, the use of "server" can easily be misunderstood here)
Support for ranged GET requests in filestore and s3store has just landed.