metacatui
metacatui copied to clipboard
Truncated group when adding a user to a large group
Describe the bug We have a group with ~263 members that takes a long time to load in the group page. We had a case where a user was added to the group before it was fully loaded in the UI. This cause the group to lose about 1/3 of the users because it was submitted before it was done laoding
To Reproduce
Expected behavior The group should not be saved until it has been fully loaded in the UI
Thanks @vchendrix
Even better, making an addition to the group should not require loading the whole group in the client, and should be idempotent.
@robyngit Any initial thoughts on where this originates?
@vchendrix and @mbjones, I've taken some time to investigate the issue, and here's what I've found:
- Despite the appearance of pagination in the
GroupListView
, all data is loaded and parsed at once. - Parsing and displaying the group info, even with groups much larger than 263 members, seems pretty performant. The long load time might be related to the speed of the response from the CN or the internet connection.
- The
UserGroup
collection currently employs theCNIdentity.getSubjectInfo
method.getSubjectInfo
returns information about all people within the group, in addition info about all associated groups (potentially all other groups that the group members are part of?). - It makes sense that if the group were somehow only partially loaded, and a user was able to update the group, then the group would lose all the members that were not yet loaded. This is because we use
NIdentity.updateGroup
to add new members. This method necessitates sending the entire group, not just the new members to be added - it completely replaces the given group info with whatever new info is sent. There doesn't appear to be an API for adding a single new member. - Also, in MetacatUI, if a group is marked as "pending", then a new group is created. This could also be a part of the issue.
Here's what I propose we do:
- Explore switching to using
CNIdentity.listSubjects
to paginate results and load user info only as required. However, we would still need to load the entire group before sendingupdateGroup
. - Prevent sending an
updateGroup
request until we have loaded all group information from the accounts service. If a user tries to add a group member before the group has loaded, we can keep track of all group members to be added, and send anupdateGroup
request with these new members as soon as the group is loaded.
In regards to:
and should be idempotent
@mbjones, is your suggestion that we additionally ensure that we do not attempt to add a user if that user is already a member of the group?
Also, @vchendrix, if it would be possible to share the response from getSubjectInfo
for this particular problem group, that would help me figure out what's going on in this case.
Great summary, thanks @robyngit
| is your suggestion that we additionally ensure that we do not attempt to add a user if that user is already a member of the group?
I was suggesting that adding a user to a group should not require sending the whole group list -- I should be able to call add(Matt)
repeatedly, and it would be idempotent in the sense that, if Matt is not a member of the group, he gets added, but if he is already a member, it just stays the same and sends back success. It sounds like that is not how our group API works though, so probably worth a discussion.
@robyngit Thanks for the thorough description! The group in question is https://cn.dataone.org/cn/v2/accounts/CN%3Dess-dive-users%2CDC%3Ddataone%2CDC%3Dorg. When loading the groups page it is the last of my groups to load (The reponse took 28 seconds to load for this url). Let me know if you need further details.
Javascript console for loading the group page sorted by the time
Thank you @vchendrix, this is clearly the root of the issue... It also took 10-30 seconds for that link to load for me and a few people on the dev team. What we need to do is:
- [ ] Implement a DataONE API for adding members to a group by sending only the new members (e.g. the new
<person>
info), instead of needing to send the entire updated group. - [ ] Look into whether we can use CNIdentity.listSubjects to get paginated results, if so, use that.
For the time being we can:
- [ ] Prevent users from updating groups before the entire group has loaded (and indicate that the group info is being loaded).
@robyngit We reviewed this issue on the ESS-DIVE call today, and concluded that there are deeper problems than just incomplete loading of the subject list. Madison demonstrated the page load problem, and showed that the group editing widget does not show up at all until the full /accounts
request has returned with a 200 success. So, the API call returned fully, and you can page through the results in the UI, and yet the update of the group still causes data loss. So there may be something deeper in the MetacatUI model that is causing loss when submitting a large group.
One thing we did test was substituting CNIdentity.listSubjects for CNIdentity.getSubjectInfo as you proposed above. In testing, getSubjectInfo takes more than 45seconds to return, whereas a query using
listSubjects` returns the whole list in < 0.9s. Here's the query I ran:
curl -s https://cn.dataone.org/cn/v2/accounts?query="CN=ess-dive-users,DC=dataone,DC=org"
The problem with this query is it only returns the subject identifier of the group members (their ORCIDs), and not their name or email etc. So, to use this, we discussed a multi-stage process to get a group to be updated:
- query
listSubjects
to get the whole list of members fast (possibly using paging, but that may not be needed) - for the current "page" of users to be displayed, issue a call to
getSubjectInfo
for just that small set of users. you now should have everything you need to display the first page of users in the group - when the user navigates to a new page in the group list, fetch the name and email info needed with
getSubjectInfo
. Also consider pre-fetching more of that in a background thread (this could be done if we cache user data in a model and only refresh it if we don't have it or it changes). - if the user wants to add a new person, they type the orcid and we fetch the subjectInfo for them and cache it in the model, then we submit the group update call
With this approach, you still have to retrieve the whole group list to submit changes, but this is now fast because you don't have to retrieve all of the subjectInfo for every user -- all you need is the list of ORCIDs to add a user to that list. Let's discuss please.
@mbjones thanks for detailing this so clearly. This sounds like a good plan for displaying large groups and I think we should proceed with this fix. We could even display the orcids immediately after getting a response from CNIdentity.listSubjects
and then update user names and details as we get information from getSubjectInfo
.
It still leaves the question of how we handle updating a large group.
So, the API call returned fully, and you can page through the results in the UI, and yet the update of the group still causes data loss. So there may be something deeper in the MetacatUI model that is causing loss when submitting a large group.
I'll look into what is going on here.