PowerPlatform-DataverseServiceClient
High memory usage when doing a RetrieveMultiple for 4 minutes
Hi!
I'm trying to figure out why there is a high memory usage peak that lasts for 4 minutes when doing a RetrieveMultiple.
LOH memory usage is approximately 400 MB, and it disappears after 4 minutes. See image 1 (before) and image 2 (after):
I took a snapshot to see what's actually going on with the memory usage, and I found that 385 MB of Byte[] stays stuck there for those 4 minutes (see image 3 and image 4).
The program uses RetrieveMultiple with PagingInfo to retrieve 30,173 records in pages of 5,000 records. That's why there are 6 Byte[] of approximately 60 MB each. So I'm guessing the LOH objects that stay there for 4 minutes are the responses from those queries.
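The paging arithmetic above can be checked with a small sketch. The class and method names are hypothetical, and the 385 MB figure is the approximate total from the snapshot. Note that 30,173 records at 5,000 per page need seven requests, not six; the seventh returns only 173 records, so its buffer would presumably be far smaller than the six large ones.

```csharp
using System;

// Hypothetical helper, not part of the Dataverse client.
public static class PagingEstimate
{
    // Number of requests needed to fetch totalRecords at pageSize per page
    // (integer ceiling division).
    public static int PageCount(int totalRecords, int pageSize) =>
        (totalRecords + pageSize - 1) / pageSize;

    // Number of records returned by the final request.
    public static int LastPageSize(int totalRecords, int pageSize)
    {
        if (totalRecords == 0) return 0;
        int remainder = totalRecords % pageSize;
        return remainder == 0 ? pageSize : remainder;
    }

    public static void Main()
    {
        const int totalRecords = 30_173;
        const int pageSize = 5_000;
        const double snapshotMb = 385.0;   // approximate Byte[] total from the snapshot

        int fullPages = totalRecords / pageSize;
        Console.WriteLine($"Requests needed: {PageCount(totalRecords, pageSize)}");
        Console.WriteLine($"Full pages: {fullPages}");
        Console.WriteLine($"Records on last page: {LastPageSize(totalRecords, pageSize)}");
        Console.WriteLine($"Approx. MB per full page: {snapshotMb / fullPages:F1}");
    }
}
```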
I've checked the code around these lines and I haven't seen any Dispose() on the HttpResponseMessage: https://github.com/microsoft/PowerPlatform-DataverseServiceClient/blob/dc278e33f6ad4e1b13335d70bb1ee53c6f8d9235/src/GeneralTools/DataverseClient/Client/ConnectionService.cs#L1964
Wouldn't it be necessary in order to tell the GC that the memory allocated for it can be released?
Thanks
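For reference, a minimal sketch of what disposing an HttpResponseMessage looks like. This is not the client's code: StubHandler and DisposeSketch are hypothetical names, and the stub replaces a real endpoint so the example runs offline. Dispose releases the resources held by the response deterministically; it does not by itself tell the GC anything, because the GC reclaims a buffer only once nothing references it.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a real endpoint so the sketch runs offline.
public sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("{\"value\":[]}")
        };
        return Task.FromResult(response);
    }
}

public static class DisposeSketch
{
    public static async Task<string> ReadJsonAsync(HttpClient client, Uri uri)
    {
        using var request = new HttpRequestMessage(HttpMethod.Get, uri);

        // 'using' disposes the response (and its content) when the method exits,
        // even if an exception is thrown while reading.
        using HttpResponseMessage response =
            await client.SendAsync(request).ConfigureAwait(false);
        response.EnsureSuccessStatusCode();

        return await response.Content.ReadAsStringAsync().ConfigureAwait(false);
    }

    public static async Task Main()
    {
        using var client = new HttpClient(new StubHandler());
        string json = await ReadJsonAsync(client, new Uri("https://example.invalid/api"));
        Console.WriteLine(json);
    }
}
```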
Hi @Ivan-Colomer, what you're seeing is the GC moving through its process. The only operations in the 1.0.9 and older client that use the WebAPI (and thus the response payload you call out) are Create, Update, and Delete. All other operations, save for a few specific messages, currently go through a WCF protocol at a lower level, which does have the GC collects invoked. That code is mostly managed by the underlying Xrm.Sdk.dll bits, some of which we are moving into the client directly in preparation for a protocol shift.
We will see if we can set this up and confirm that this is just the GC doing 'its thing'.
A few specific questions to help us get your repro correct: what host OS and .NET version are you using? (It matters, unfortunately.) And is the pattern you're using around 30 records at 5 MB-ish a group?
Hi @MattB-msft ,
Although I see some GC collects when doing the RetrieveMultiple, these Byte[] arrays don't get collected until some minutes have passed (and I then do a GC collect manually). My main guess was that something still holds a reference to them and doesn't get disposed properly.
I'm very concerned about this because I plan to publish an application to an Azure App Service, where RAM usage is quite important.
Currently, the host OS is Windows 10 Pro x64 and the .NET version is .NET 6.0.8, but I'm planning to run this app on Linux on Azure App Service.
Regarding your last question, the pattern I'm seeing is around 67 MB for each RetrieveMultiple of 5,000 records.
Thanks again,
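One way to test the guess that something still references the arrays is to hold only a WeakReference to a large buffer, force a full blocking collection, and check whether the buffer survived. The sketch below is a standalone illustration with hypothetical names (LohSketch and its methods), not code from the client. If a buffer is still alive after a forced collection, some strong reference is keeping it reachable, which is what a profiler's retention path would show.

```csharp
using System;
using System.Runtime;
using System.Runtime.CompilerServices;

// Hypothetical demo class, not part of the Dataverse client.
public static class LohSketch
{
    // Allocate in a separate, non-inlined method so that no local variable
    // in the caller keeps the array reachable after this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateLarge(int bytes, out int generation)
    {
        var buffer = new byte[bytes];
        // Arrays of 85,000 bytes or more go to the large object heap,
        // which GC.GetGeneration reports as the highest generation.
        generation = GC.GetGeneration(buffer);
        return new WeakReference(buffer);
    }

    // Forced, blocking, compacting full collection, including the LOH.
    public static void ForceFullCollection()
    {
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced,
                   blocking: true, compacting: true);
        GC.WaitForPendingFinalizers();
    }

    public static void Main()
    {
        WeakReference weak = AllocateLarge(60 * 1024 * 1024, out int generation);
        Console.WriteLine($"Generation reported for the buffer: {generation}");

        ForceFullCollection();

        // With no strong reference left, the buffer is expected to be gone.
        Console.WriteLine($"Still alive after full GC: {weak.IsAlive}");
    }
}
```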
Thanks, we will have a look at it and see if we can run down what's going on there.
Hi @Ivan-Colomer , just curious, which tool did you use to track memory usage?
Thanks!
@Ivan-Colomer The line:
var json = await sResp.Content.ReadAsStringAsync().ConfigureAwait(false);
already reads all the content and buffers it into memory, so there is no need to dispose the HttpResponseMessage.
For large HttpClient responses, it is recommended to use HttpCompletionOption.ResponseHeadersRead to avoid buffering the content. You can then handle the HttpContent yourself, for example by reusing a pool of small byte arrays:
https://docs.microsoft.com/en-us/dotnet/api/system.net.http.httpcompletionoption?view=net-6.0
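A minimal sketch of that suggestion, assuming a plain HttpClient call rather than the Dataverse client's internals. LargeBodyStubHandler and StreamingSketch are hypothetical names, and the stub simulates a server so the example runs offline (the stub's own array stands in for data that would normally arrive over the network). The body is read through one rented 16 KB buffer instead of being copied into a single large array. With ResponseHeadersRead, disposing the response does matter, because it releases the underlying connection.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a server that returns a large body.
public sealed class LargeBodyStubHandler : HttpMessageHandler
{
    private readonly int _bodyBytes;
    public LargeBodyStubHandler(int bodyBytes) => _bodyBytes = bodyBytes;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StreamContent(new MemoryStream(new byte[_bodyBytes]))
        };
        return Task.FromResult(response);
    }
}

public static class StreamingSketch
{
    // Reads the response body in small chunks and returns the total byte count.
    public static async Task<long> CountBytesAsync(HttpClient client, Uri uri)
    {
        // ResponseHeadersRead: return as soon as headers arrive, without
        // buffering the whole body in memory first.
        using HttpResponseMessage response = await client
            .GetAsync(uri, HttpCompletionOption.ResponseHeadersRead)
            .ConfigureAwait(false);
        response.EnsureSuccessStatusCode();

        using Stream body = await response.Content.ReadAsStreamAsync().ConfigureAwait(false);

        // Rent a small reusable buffer; it stays well below the LOH threshold.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(16 * 1024);
        try
        {
            long total = 0;
            int read;
            while ((read = await body.ReadAsync(buffer, 0, buffer.Length)
                                     .ConfigureAwait(false)) > 0)
            {
                total += read;   // a real consumer would parse the chunk here
            }
            return total;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    public static async Task Main()
    {
        using var client = new HttpClient(new LargeBodyStubHandler(1_000_000));
        long bytes = await CountBytesAsync(client, new Uri("https://example.invalid/data"));
        Console.WriteLine($"Streamed {bytes} bytes");
    }
}
```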
Hi, I used JetBrains dotMemory 2022.2.1.
You're right, the data would have been buffered automatically, so disposing the HttpResponseMessage can be skipped, although it would be good practice to always dispose the HttpResponseMessage once you have finished using it.
As @MattB-msft said, the RetrieveMultiple operation is sent through WCF, so the problem must be somewhere else.
@Ivan-Colomer, when calling retrieve multiple, are you using RetrieveMultipleAsync or just RetrieveMultiple?
I can confirm the memory issue. It doesn't take minutes: it happens from the first query performed and grows proportionally with each query. The issue occurs when using RetrieveMultipleAsync or ExecuteAsync using a RetrieveMultipleResponse. It does not occur when using their synchronous counterparts.
I do not know what causes the issue, but it leaves one uncollected UTF8BufferedMessageData hanging around per executed query. Not ideal for an Azure environment where you pay for memory consumption :)
You're right! It happens when using RetrieveMultipleAsync(). In my case it gets cleaned up after some minutes (probably by the GC).
Yes, I'm using RetrieveMultipleAsync(). Hope you can find where the issue is.
There is a somewhat active discussion going on about this subject in the .NET team, as it relates to the use of the underlying XML type buffer. A fix was checked into .NET 7 that may impact this (we have not tested specifically for it yet). See: https://github.com/dotnet/runtime/commit/d964b638f1ad47a729ab2e6cf4d6822f76ea3e4f
Can you recompile for .NET 7 (the GA version) and see whether the issue persists? I did not mention this before because .NET 7 had not been released.
Thank you, I have now tested with 7.0.100. The issue remains identical.
Maybe my test was insufficient? I created a new .NET 7 console application, referenced the latest NuGet package for DataServiceClient, and tested RetrieveMultipleAsync.
The fix was in the framework itself, so it should have been picked up. Thanks for validating what we were seeing here.
We will continue to poke at this and see if we can lock it down.
OK, an update on this: we think we have finally got this locked down; the .NET 6 codebase made it a bit easier to spot. The next update for the Dataverse Service Client should resolve this, and we will close this bug with that update.