msgraph-sdk-dotnet
msgraph-sdk-dotnet copied to clipboard
Allow writing to request stream for uploading content
Is your feature request related to a problem? Please describe. Developers should be able to write to the request stream for uploads so they can generate content 'on the fly' without needing to buffer it in memory.
Describe the solution you'd like
An overload for ContentRequestBuilder.PutAsync
which accepts a Stream -> ()
delegate . That is:
public async Task PutAsync(Action<Stream> write, Action<ContentRequestBuilderPutRequestConfiguration>? requestConfiguration = default) { }
Describe alternatives you've considered
Buffering content in-memory by using a MemoryStream
and using the existing overload which accepts a Stream
.
Additional context
There is an overload for PutAsync
which accepts an existing stream which is useful for pull-based streaming where the content already exists. For example, from a file.
https://github.com/microsoftgraph/msgraph-sdk-dotnet/blob/7a2be45d2cf37f18a32cc9a60d0edf441fd23a08/src/Microsoft.Graph/Generated/Drives/Item/Items/Item/Content/ContentRequestBuilder.cs#L67-L79
The overload I'm proposing would be better for push-based streaming where you're generating content because it doesn't require you to hold it in-memory. For example:
- constructing a zip archive from many files[^1]
- serializing an object graph[^2]
I understand this SDK is generated. I'm guessing this kind of overload will either need to be supported by Kiota or added as an extension manually (if Kiota's interfaces actually allow access to the request stream.)
[^1]: Streaming Zip on ASP.NET Core https://blog.stephencleary.com/2016/11/streaming-zip-on-aspnet-core.html An article which demonstrates the push-pull distinction with zip archives and ASP.NET Core. [^2]: Uploading data with HttpClient using a "push" model https://thomaslevesque.com/2013/11/30/uploading-data-with-httpclient-using-a-push-model/ An article which demonstrates the push-pull distinction with serialization ASP.NET.
Transferred to https://github.com/microsoft/kiota/issues/2413 as this would require the change from Kiota.
@andrueastman, I don’t think this should be closed just because it requires a change in Kiota because:
- making that change in Kiota is but a step in delivering this enhancement
- I’d still want this enhancement if this library is rewritten yet again using some other generator
Hey @NickDarvey.
I hear you on this. I don't there are plans to move from Kiota anytime soon though. Will re-open for now and keep it dependent on https://github.com/microsoft/kiota/issues/2413
Hi @NickDarvey, Thanks for bringing this up and for using the dotnet SDK. I'd be curious to know more about your scenario.
You'd effectively like to have an inversion of control for the stream argument so in case your application "is handling something large, that's not structured and that's not coming from network/storage/... it doesn't need to be buffered in memory".
- structured payloads (JSON/text/form/and soon multipart in case of Graph) will have a modeled representation, which means the put method accepts a model instead.
- unstructured large payloads are likely to come from storage/network which will provide a stream and avoids to load the whole thing in memory.
- unstructured payloads that are the results of another lib (let's say your app is calling a lib generating PDFs or CSVs) will likely already provide you a stream that's in memory.
Can you come up with such an example please?
Additionally, I'm curious about the choice of an Action<Stream> and not Action<Task<Stream>> or Task<Stream> can you expand on that topic too please?
Hi @baywet,
Can you come up with such an example please?
Here's an example, in code, paraphrased from actual code in our application.
open System.IO
open System.IO.Compression
let forExample (client : Microsoft.Graph.GraphServiceClient) = task {
// A function which generates content on-the-fly.
// Here it takes strings as input, but really it might be taking stream
// which, in turn, are also generated on-the-fly.
let zip (texts : (string * string) seq) (stream : Stream) =
use archive = new ZipArchive (stream, ZipArchiveMode.Create, leaveOpen = true)
for (name, text) in texts do
let entry = archive.CreateEntry name
use stream = entry.Open ()
use writer = new StreamWriter (stream)
writer.Write text
// Some example content.
let examples = [
"a", "The Coulibiac of Salmon from Guy Savoy"
"b", "Vissa D'arte from Tosca"
"c", "Cote du Rhone Chateauneuf de Pape '47"
]
let item = client.Drives["abc"].Items["xyz"].Content
// What I have to do now.
use stream = new MemoryStream ()
do zip examples stream
stream.Position <- 0
do! item.PutAsync stream
// What I want to do (this enhancement request).
do! item.PutAsync (fun stream -> zip examples stream)
}
These two articles describe the value of the push model of streaming and give further examples in more detail:
-
Streaming Zip on ASP.NET Core https://blog.stephencleary.com/2016/11/streaming-zip-on-aspnet-core.html An article which demonstrates the push-pull distinction with zip archives and ASP.NET Core.
-
Uploading data with HttpClient using a "push" model https://thomaslevesque.com/2013/11/30/uploading-data-with-httpclient-using-a-push-model/ An article which demonstrates the push-pull distinction with serialization ASP.NET.
Additionally, I'm curious about the choice of an Action and not Action<Task> or Task can you expand on that topic too please?
Accepting a Stream -> Task
would make more sense actually so one could use the async methods on Stream. (I believe that's a Func<Stream,Task>
delegate C# world.)
Thanks for providing the additional context here. It does feel like an edge case because in the vast majority of cases the application would already have a stream coming from the outside (network/storage) or from another library (document generation, etc...) which could be piped in the compression stream. And the compression stream reference would be passed to the generated API surface.
If we wanted to enable an inversion of control for that scenario, we'd need to revamp the whole API surface, from the generation, but including the request information property and the serialization writer interface which would represent a breaking change. Additionally the request adapter implementation would need to be updated to use a PushContentStream instead. While this would save memory pressure for all scenarios, it'd make things like retry handling more complex as the content length would not be known in advance anymore
This is something we might do for kiota v2 and a later major version of this SDK, but for the time being this scenario will be limited to the current implementation you've provided.
@andrueastman when you have a couple of minutes, would you mind exploring the buffering aspect for retry handling an put together a POC with the PushStream trying to access the content multiple times please?
Just to follow up @baywet
The retry handler will not retry on requests that are not buffered. This check is used to validate if a clone of a request can be made so that a replay of the request is made.
https://github.com/microsoft/kiota-http-dotnet/blob/58fb5f4ad6edbaf54dfe184fa2d3b8538bc558f2/src/Extensions/HttpRequestMessageExtensions.cs#L81
Thanks for confirming. This means we'd effectively loose all retry capabilities. And it'd only benefit the case of aggregating streams as in any other case a stream is either already directly coming from the source or already in memory.
After more réflexion, we're unlikely to do this at all at this point.
It doesn't have to be implemented in a way that would make you lose retry capabilities everywhere, it could just mean that you can't have retry handling when you're using push-based streams. It's a reasonable trade-off if you care about memory usage. If you care about retry, you have to buffer it first and provide the buffered stream.
It would benefit the case of [push] streams yes, but it doesn't have to come at the cost of pull-based streams. (The existing pull-based stream overload should still exist, and work as-is).
The challenge with implementing both is that we'd be duplicating a lot of the infrastructure:
- the Body property on the request information
- the request adapter implementation
- some of the generated code, which would come at a binary size trade-off
I'll let @maisarissi our PM talk to that, but it's unlikely we'd invest for this edge case at this point. Not unless we see significant customer demand for it.
Maybe there ought to be a general “escape hatch” so that when a developer encounters a limitation like this they can always fallback to adding content to the HttpRequestMessage themselves.
This is already possible today with the core part of the SDK. See the readme of this repository. https://github.com/microsoftgraph/msgraph-sdk-dotnet-core
This is already possible today with the core part of the SDK
Ah, but that doesn't have the RequestInformation-request-building bits.