SimpleIdServer icon indicating copy to clipboard operation
SimpleIdServer copied to clipboard

SCIM mongodb performance issue

Open gabrielemilan opened this issue 3 years ago • 16 comments

Hello guys, I noticed performance issue in case of a lot of users assigned to specific group (500 for example). It takes 8 min to add a new user to this group calling PATCH

This piece of code makes an update foreach user inside the group: image

So delete update many many times. I have a question on that, I saw that now there is a collection representationAttributes, is this collection just temporary? Because in my case now is empty.

Thanks.

gabrielemilan avatar Aug 18 '21 17:08 gabrielemilan

Hello,

I made the following changes in the branch "release/2.0.0" :

  1. The collection "SCIMRepresentationAttribute" has been removed.
  2. The class "RepresentationReferenceSync" has been updated in order to return only the modified entities (not the 500 users).
  3. The SCIMRepresentationCommandRepository/Update operation is using "ReplaceOneAsync" instruction.

I'm going to make some investigation next week to check if the performance can be improved. Anyway, with the modifications the performance is better :)

simpleidserver avatar Aug 18 '21 20:08 simpleidserver

Hello, thank you very much.

Meanwhile, I have a question, I saw that the lifetime of SCIMRepresentationCommandRepository is singleton. Could there be problems with _session added last time? I mean conflict problem or something similar. Is it safe? Or is better to set lifetime of SCIMRepresentationCommandRepository as scoped?

Thanks.

gabrielemilan avatar Aug 19 '21 06:08 gabrielemilan

Hello,

You're right, the configuration present in the "MongoDB" project is not ideal and can be a risk. In the current implementation, there is no issue with the "_session" because it is created by each HTTP REQUEST via the instruction : "using (var transaction = await _scimRepresentationCommandRepository.StartTransaction())". The class "ServiceCollectionExtensions" has been updated in the branch "release/2.0.0".

Kind regards,

SimpleIdServer

simpleidserver avatar Aug 19 '21 12:08 simpleidserver

Ok :-)

Now the performance is better.

I noticed that this query takes a lot of time in case of multiple ids:

image

I don't know if can help.

Thanks

gabrielemilan avatar Aug 19 '21 13:08 gabrielemilan

Ok :-)

Now the performance is better.

I noticed that this query takes a lot of time in case of multiple ids:

image

I don't know if can help.

Thanks

Moreover the foreach below run a query for each user already present in a group. Every time to get the same User schema document. I suppose this is the reason why the response time of a patch call, to add a user in a group, is proportional to the number of users already inside the group.

image

With a little modification just for test

image

the results are:

image

Hope this test can help

Thanks in advance Francesco

francesco-scullino avatar Aug 19 '21 15:08 francesco-scullino

Indeed the issue could be on the foreach:

image

gabrielemilan avatar Aug 19 '21 15:08 gabrielemilan

Hello,

Thank you for your feedback ! I already noticed this issue, i'm going to fix it next week.

Kind regards,

Simpleidserver

simpleidserver avatar Aug 19 '21 15:08 simpleidserver

Hello,

First of all, thank you about your feedback :) We made some the following changes in the branch "release/2.0.0" in order to fix the performance issue :

  • Return the representations concerned by the changes. The FindSCIMRepresentationByIds function returns only the representations impacted by the HTTP request.
  • Update the FindSCIMRepresentationByIds in order to improve the performance.

simpleidserver avatar Aug 24 '21 12:08 simpleidserver

Hello @simpleidserver team as usual, thanks for your reactivity.

I tested the new version and the performance increased a lot. :) You can compare the following diagram with the one in the previous comment:

image

Anyway is there an ascending trend of response time based on the number of users already in the group. I cannot evaluate the reason this time. This could be a problem in cases like mine, where a script add thousand of users in a group.

Thanks for your great job Francesco

francesco-scullino avatar Sep 08 '21 12:09 francesco-scullino

Hello,

We discovered a performance issue in the RepresentationReferenceSync class. When a group is patched with a new user, the following instruction takes more than 806ms to execute newIds = newIds.Where(i => !oldIds.Contains(i)).ToList();. This instruction has been replaced by newIds = newIds.Except(oldIds).ToList(); and it takes 10ms to execute.

The changes have been committed in the branch "release/2.0.1" can-you please check if the performance are better ?

https://github.com/simpleidserver/SimpleIdServer/commit/0b5bedeb636f0dc9d8e446bbff348c3f1a13d096

Kind regards,

SimpleIdServer

simpleidserver avatar Sep 08 '21 13:09 simpleidserver

Hi, you are right, the performace increased again, but the ascending trend is in place again:

image

I suppose some logic optimization it needs to be more scalable.

Kind regards, Francesco

francesco-scullino avatar Sep 08 '21 16:09 francesco-scullino

This problem occurs when the MongoDb document (Group) becomes larger. In fact it takes more time to load and update the document. Until we find a solution to fix this problem, the recommendation consists to execute a PATCH operation on the User to assign to a group. We execute 300 PATCH operations on a user & we didn't notice performance problem :

PatchUser

Patch operation:

{
 "schemas": [ "urn:ietf:params:scim:api:messages:2.0:PatchOp"],
 "Operations": [ 
 	{ "op": "add", "path": "groups", "value": [ { "value" : "b3c7c7c7-f192-41ca-bbeb-66b294085278" } ] }
 ]
}

simpleidserver avatar Sep 09 '21 14:09 simpleidserver

Hello, is there any update on this? Unfortunately execute a PATCH operation on the User to assign to a group is not feasible in our use case.

Thank you very much Francesco

francesco-scullino avatar Sep 21 '21 07:09 francesco-scullino

Hello,

I didn't find a valid solution to improve the performance of the PATCH operation. This operation is made of several blocks & some of them can have an impact on the performance :

  • Get group : the group becomes larger when there are more users assigned to this group. Therefore the operation will take more time to execute.
Nb Users Execution time
1 2ms
300 17ms
  • ApplyPatch : this operation applies changes on the group like : AddUser, UpdateTitle etc... The existing members must be fetched to check if there is no duplicate. When this list becomes larger then it will take more time to execute.
Nb Users Execution time
1 2ms
300 14ms
  • Sync : Assign the group to the user.
  • SaveChanges : When data is larger then it will take more time to execute.
Nb Users Execution time
1 1ms
300 17ms

I made some modifications in the "Sync" operation in order to improve the performance. https://github.com/simpleidserver/SimpleIdServer/commit/ffeb31b5133daac8c060bda91b8664675b4f57f0

Nb Users Execution time
300 40ms (before)
300 14ms (after)

Even if the operation will take more time to execute, the performance is acceptable. As you can see in the graph below, it takes less than 150ms for each operation to assign a user.

SimpleIdServer Scim Benchmark ScimBenchmark-AddUserToGroup-Job-EEXIWH-cummean

Can-you please fetch the latest changes of the branch "release/2.0.1" & identify where you have performance problem in the class "PatchRepresentationCommandHandler. I used the project "SimpleIdServer.Scim.Benchmark" to add one group & assign 300 users & add "Stopwatch" to calculate the execution time of each block.

simpleidserver avatar Sep 21 '21 13:09 simpleidserver

Hello, I think we do a different test. Using the last changes from branch release/2.0.1 we have better performance but not comparable with yours.

We measure performance from a client perspective measuring the response time of the patch endpoint.

francesco-scullino avatar Oct 14 '21 15:10 francesco-scullino

Hello,

In fact, I'm a bit surprised about the performance problem you have. The previous benchmark image, I sent, has been generated by using this algorithm.

Can-you please check if this algorithm is correctly reproducing your scenario, if it's the case, can-you please execute the benchmark "SimpleIdServer.Scim.Benchmark.csproj", compress the directory "BenchmarkDotNet.Artifacts" into a zip file and send it ?

Kind regards,

SimpleIdServer

simpleidserver avatar Oct 15 '21 13:10 simpleidserver