
High memory usage when doing a RetrieveMultiple for 4 minutes

Open Ivan-Colomer opened this issue 2 years ago • 6 comments

Hi!

I'm trying to figure out why there is a high memory usage peak that lasts for 4 minutes when doing RetrieveMultiple.

There's a LOH memory usage of approximately 400 MB that disappears after 4 minutes. See image 1 (before) and image 2 (after).

I took a snapshot to see what's actually going on with the memory usage and found 385 MB of Byte[] that is somehow stuck there for those 4 minutes (see image 3 and image 4).

The program uses RetrieveMultiple with PagingInfo to retrieve 30,173 records in pages of 5,000 records. That's why there are 6 Byte[] of approximately 60 MB each. So I'm guessing the LOH objects that stay there for 4 minutes are the responses from those queries.
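For reference, a minimal sketch of the kind of paging loop described above. It assumes the synchronous `IOrganizationService.RetrieveMultiple` call and a `QueryExpression`; the helper name `RetrieveAllPages` and the `account` entity in the usage comment are hypothetical placeholders, not taken from the actual program. With 30,173 records and a page size of 5,000, the loop issues seven requests: six full pages plus a final page of 173 records.

```csharp
using System.Collections.Generic;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

public static class PagedRetrieve
{
    // Hypothetical helper: yields records one page at a time so that only the
    // current page's EntityCollection has to stay referenced by the caller.
    public static IEnumerable<Entity> RetrieveAllPages(
        IOrganizationService service, QueryExpression query, int pageSize = 5000)
    {
        query.PageInfo = new PagingInfo { Count = pageSize, PageNumber = 1 };

        while (true)
        {
            EntityCollection page = service.RetrieveMultiple(query);

            foreach (Entity record in page.Entities)
            {
                yield return record;
            }

            if (!page.MoreRecords)
            {
                yield break;
            }

            // Move to the next page, passing back the cookie the server returned.
            query.PageInfo.PageNumber++;
            query.PageInfo.PagingCookie = page.PagingCookie;
        }
    }
}

// Usage (assumes an already connected ServiceClient named "client"):
//   var query = new QueryExpression("account") { ColumnSet = new ColumnSet("name") };
//   foreach (Entity e in PagedRetrieve.RetrieveAllPages(client, query)) { /* process e */ }
```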

I've checked the code around these lines and I haven't seen any Dispose() on the HttpResponseMessage: https://github.com/microsoft/PowerPlatform-DataverseServiceClient/blob/dc278e33f6ad4e1b13335d70bb1ee53c6f8d9235/src/GeneralTools/DataverseClient/Client/ConnectionService.cs#L1964

Wouldn't it be necessary in order to tell the GC that the memory allocated for it can be released?

Thanks

Ivan-Colomer avatar Aug 17 '22 09:08 Ivan-Colomer

Hi, @Ivan-Colomer. What you're seeing is the GC moving through its process. The only operations in the 1.0.9 and older client that use the WebAPI (and thus the response payload you call out) are Create, Update, and Delete. All other operations, save for a few specific messages, currently go through a WCF protocol at a lower level, which does have the GC collects invoked. That code is mostly managed by the underlying Xrm.Sdk.dll bits, some of which we are moving into the client directly in preparation for a protocol shift.

We will see if we can set this up and confirm that this is just the GC doing 'its thing'.

A few specific questions to help us get your repro correct: what host OS and .NET version are you using? (It matters, unfortunately.) And is the pattern you're using around 30 records at 5 MB-ish a group?


MattB-msft avatar Aug 21 '22 01:08 MattB-msft

Hi @MattB-msft ,

Although I see some GC collects when doing the RetrieveMultiple, these Byte[] arrays don't get collected until some minutes have passed and a GC collect is triggered manually. My main guess was that a reference to them is held somewhere and doesn't get released properly.
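One way to tell "the GC has not run yet" apart from "something still references the buffers" is to force a blocking full collection around the suspect code and compare heap sizes. The sketch below uses only `System.GC` and `System.Runtime.GCSettings`; the class name `RetainedMemoryProbe` and its methods are hypothetical helpers, and the numbers it returns are approximate because other allocations in the process also count.

```csharp
using System;
using System.Runtime;

public static class RetainedMemoryProbe
{
    // Forces a blocking, compacting gen-2 collection, including the large object heap.
    public static void ForceFullCollection()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true, compacting: true);
        GC.WaitForPendingFinalizers();
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true, compacting: true);
    }

    // Runs "work" and returns roughly how many bytes are still reachable afterwards.
    public static long RetainedBytesAfter(Action work)
    {
        ForceFullCollection();
        long before = GC.GetTotalMemory(forceFullCollection: false);

        work();

        ForceFullCollection();
        long after = GC.GetTotalMemory(forceFullCollection: false);
        return after - before;
    }
}
```

If the value returned for a delegate that runs the RetrieveMultiple loop stays in the hundreds of megabytes, the buffers are still rooted somewhere; if it drops close to zero, the memory was only waiting for a collection.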

I'm very concerned about this because I plan to publish an application to an Azure App Service, where RAM usage is quite important.

Currently, the host OS is Windows 10 Pro x64 and the .NET version is .NET 6.0.8, but I'm planning to run this app on Linux on Azure App Service.

Regarding your last question, the pattern I'm seeing is around 67 MB for each RetrieveMultiple of 5,000 records.

Thanks again,

Ivan-Colomer avatar Aug 22 '22 05:08 Ivan-Colomer

Thanks, we will have a look at it and see if we can run down what's going on there.

MattB-msft avatar Aug 23 '22 01:08 MattB-msft

Hi @Ivan-Colomer , just curious, which tool did you use to track memory usage?

Thanks!

jordimontana82 avatar Aug 23 '22 10:08 jordimontana82

@Ivan-Colomer The line:

var json = await sResp.Content.ReadAsStringAsync().ConfigureAwait(false); 

already reads all the content and buffers it into memory, so there is no need to dispose the HttpResponseMessage.

For large HttpClient responses, it is recommended to use HttpCompletionOption.ResponseHeadersRead to prevent buffering the content; you can then handle the HttpContent yourself, for example by reusing a pool of small byte arrays.

https://docs.microsoft.com/en-us/dotnet/api/system.net.http.httpcompletionoption?view=net-6.0
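A minimal sketch of that approach, using only `HttpClient`, `HttpCompletionOption.ResponseHeadersRead`, and `ArrayPool<byte>.Shared` from the base class library. The class `StreamingDownload` and its method names are hypothetical, and this is not how the Dataverse client itself reads responses. Note that `Rent` may hand back a larger array than requested, and that with `ResponseHeadersRead` the response does need to be disposed, because the connection stays tied to the unread body.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class StreamingDownload
{
    // Copies source to destination through a single rented buffer and returns the byte count.
    public static async Task<long> CopyWithPooledBufferAsync(
        Stream source, Stream destination, int bufferSize = 64 * 1024,
        CancellationToken ct = default)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferSize);
        long total = 0;
        try
        {
            int read;
            while ((read = await source.ReadAsync(buffer.AsMemory(), ct).ConfigureAwait(false)) > 0)
            {
                await destination.WriteAsync(buffer.AsMemory(0, read), ct).ConfigureAwait(false);
                total += read;
            }
        }
        finally
        {
            // Hand the buffer back so later calls can reuse it.
            ArrayPool<byte>.Shared.Return(buffer);
        }
        return total;
    }

    // Starts processing as soon as the headers arrive instead of buffering the whole body.
    public static async Task<long> DownloadAsync(
        HttpClient client, Uri uri, Stream destination, CancellationToken ct = default)
    {
        using HttpResponseMessage response = await client
            .GetAsync(uri, HttpCompletionOption.ResponseHeadersRead, ct)
            .ConfigureAwait(false);
        response.EnsureSuccessStatusCode();

        await using Stream body = await response.Content.ReadAsStreamAsync(ct).ConfigureAwait(false);
        return await CopyWithPooledBufferAsync(body, destination, ct: ct).ConfigureAwait(false);
    }
}
```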

8ggmaker avatar Aug 23 '22 15:08 8ggmaker

> Hi @Ivan-Colomer , just curious, which tool did you use to track memory usage?
>
> Thanks!

Hi, I used JetBrains dotMemory 2022.2.1.

> @Ivan-Colomer The line:
>
> var json = await sResp.Content.ReadAsStringAsync().ConfigureAwait(false);
>
> already reads all the content and buffers it into memory, so there is no need to dispose the HttpResponseMessage.
>
> For large HttpClient responses, it is recommended to use HttpCompletionOption.ResponseHeadersRead to prevent buffering the content; you can then handle the HttpContent yourself, for example by reusing a pool of small byte arrays.
>
> https://docs.microsoft.com/en-us/dotnet/api/system.net.http.httpcompletionoption?view=net-6.0

You're right, the data will already have been buffered, so disposing the HttpResponseMessage can be skipped, even though it would be nice to always dispose the HttpResponseMessage once you have finished using it.

As @MattB-msft said, the RetrieveMultiple operation is sent through WCF, so the problem must be somewhere else.

Ivan-Colomer avatar Aug 24 '22 13:08 Ivan-Colomer

@Ivan-Colomer, when calling retrieve multiple, are you using RetrieveMultipleAsync or just RetrieveMultiple?

MattB-msft avatar Oct 21 '22 22:10 MattB-msft

> @Ivan-Colomer, when calling retrieve multiple, are you using RetrieveMultipleAsync or just RetrieveMultiple?

I can confirm the memory issue. It doesn't take minutes; it happens from the first query performed and grows proportionally with each query. The issue occurs when using RetrieveMultipleAsync, or ExecuteAsync with a RetrieveMultipleRequest. It does not occur when using their synchronous counterparts.

I do not know what causes the issue, but it leaves one uncollected UTF8BufferedMessageData hanging around per executed query. Not ideal for an Azure environment where you pay for memory consumption :)

petersk80 avatar Nov 10 '22 22:11 petersk80

> > @Ivan-Colomer, when calling retrieve multiple, are you using RetrieveMultipleAsync or just RetrieveMultiple?
>
> I can confirm the memory issue. It doesn't take minutes; it happens from the first query performed and grows proportionally with each query. The issue occurs when using RetrieveMultipleAsync, or ExecuteAsync with a RetrieveMultipleRequest. It does not occur when using their synchronous counterparts.
>
> I do not know what causes the issue, but it leaves one uncollected UTF8BufferedMessageData hanging around per executed query. Not ideal for an Azure environment where you pay for memory consumption :)

You're right! It happens when using RetrieveMultipleAsync(). In my case it gets cleaned up after some minutes (probably due to the GC).

> @Ivan-Colomer, when calling retrieve multiple, are you using RetrieveMultipleAsync or just RetrieveMultiple?

Yes, I'm using RetrieveMultipleAsync(). Hope you can find where the issue is.

Ivan-Colomer avatar Nov 11 '22 09:11 Ivan-Colomer

There has been a somewhat active discussion on this subject within the .NET team, as it relates to use of the underlying XML type buffer. A fix was checked into .NET 7 that may impact this (we have not tested specifically for it yet). See: https://github.com/dotnet/runtime/commit/d964b638f1ad47a729ab2e6cf4d6822f76ea3e4f

Can you recompile for .NET 7 (the GA version) and see whether the issue persists? I did not mention this before because .NET 7 had not been released.

MattB-msft avatar Nov 17 '22 21:11 MattB-msft

Thank you, I have now tested with 7.0.100. The issue remains identical.

Maybe my test was insufficient? I created a new .NET 7 console application, referenced the latest NuGet package for DataServiceClient, and tested RetrieveMultipleAsync.

petersk80 avatar Nov 18 '22 09:11 petersk80

The fix was in the framework itself, so it should have been picked up. Thanks for validating what we were seeing here.

We will continue to poke at this and see if we can lock it down.

MattB-msft avatar Nov 18 '22 21:11 MattB-msft

OK, an update on this. We think we have finally got this locked down; the .NET 6 codebase made it a bit easier to spot. The next update for the Dataverse Service Client should resolve this, and we will close this bug with that update.

MattB-msft avatar Feb 14 '23 17:02 MattB-msft