dataverse 10116 incomplete matadata label setting

What this PR does / why we need it: Fixes the bug where the incomplete metadata label was shown on a published dataset and was visible to everybody. The label is now shown only for draft datasets or, when the new dataverse.api.show-label-for-incomplete-when-published feature is enabled, for published datasets that the user can edit (e.g., when you are logged in as a contributor for a published dataset with incomplete metadata).
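The visibility rule described above can be sketched as a single predicate. This is only an illustration of the stated behavior: the class name, method name, and parameters are hypothetical and are not taken from the Dataverse code base.

```java
// Hypothetical sketch of the label rule described in this PR.
// None of these names come from the actual Dataverse source.
final class IncompleteLabelSketch {

    /**
     * Decides whether the "incomplete metadata" label should be displayed.
     *
     * @param isDraft           true if the dataset version is a draft
     * @param metadataValid     true if all required metadata is present
     * @param showWhenPublished value of the feature flag for published datasets
     * @param userCanEdit       true if the current user may edit the dataset
     */
    static boolean showIncompleteLabel(boolean isDraft, boolean metadataValid,
                                       boolean showWhenPublished, boolean userCanEdit) {
        if (metadataValid) {
            return false;            // nothing to flag
        }
        if (isDraft) {
            return true;             // drafts always show the label
        }
        // Published: only when the feature is on and the user can edit.
        return showWhenPublished && userCanEdit;
    }

    public static void main(String[] args) {
        System.out.println(showIncompleteLabel(true, false, false, false));  // true
        System.out.println(showIncompleteLabel(false, false, true, false));  // false
        System.out.println(showIncompleteLabel(false, false, true, true));   // true
    }
}
```

With the flag disabled, a published dataset never shows the label, regardless of who is looking at it.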

Which issue(s) this PR closes:

Closes #10116

Is there a release notes update needed for this change?: Yes

Dec 07 '23 15:12 ErykKul

coverage: 20.59% (-0.01%) from 20.603% when pulling c40838c6b5005596cb284a4ae2f9f5bba59f6560 on ErykKul:10116_incomplete_matadata_label_setting into a329f293f3e198074b71dce794444d3b3850cdf9 on IQSS:develop.

Dec 07 '23 15:12 coveralls

Hello and thank you for this fix. We also observed this behavior in v5.14 because the required metadata has changed over time.

Looking at how this ticket has developed, I see that an administrator cannot get an overview of the affected published datasets (although this is specified in the doc). Perhaps this is normal?

Indeed, "my Data" will not display the dataset if the administrator is not the depositor. In addition, it is not possible to do a global search (for example with datasetValid:false) because the published datasets are automatically indexed with datasetValid = true

https://github.com/IQSS/dataverse/blob/7a3ee97362942aa45a102d13bba6fc8777f44651/src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java#L787-L798

Perhaps we need to modify the condition on the draft versions to obtain a consistent search?
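To make the reported search behavior concrete, here is a minimal sketch contrasting the two indexing rules under discussion. It is not the code at the link above; the class and method names are hypothetical, and it only models the value that would end up in the datasetValid field.

```java
// Hypothetical model of the datasetValid value written to the search index.
// This does not reproduce IndexServiceBean; it only illustrates the report.
final class IndexedValiditySketch {

    /** Behavior as reported: published versions are always indexed as valid. */
    static boolean indexedValidityAsReported(boolean isDraft, boolean metadataValid) {
        return isDraft ? metadataValid : true;
    }

    /** Consistent alternative: index the real validity for every version. */
    static boolean indexedValidityConsistent(boolean isDraft, boolean metadataValid) {
        return metadataValid;
    }

    public static void main(String[] args) {
        // A published dataset with incomplete metadata:
        System.out.println(indexedValidityAsReported(false, false));  // true
        System.out.println(indexedValidityConsistent(false, false));  // false
    }
}
```

Under the first rule a query such as datasetValid:false can never match a published dataset, which is the gap described here. Under the second rule, hiding the label from regular users would have to happen at display time instead of at indexing time.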

Thanks a lot Steven.

Mar 25 '24 16:03 stevenferey

@stevenferey Good catch on the indexing part! Thanks! I think I over-fixed it while making sure that published datasets never show as incomplete to regular users. I will fix it and retest it to see if it works as intended.

As for the filters part and the datasets you see as administrator, it might be because of the roles that are assigned to the administrator account. You could have accounts with different roles, even one specific to detecting the incomplete datasets. I think it might be the Curator role that shows all datasets in the my data tab and lets you edit them, but I am not sure. In our installation these are the roles assigned to the admin account (and I can see all datasets and edit them in the my data tab while logged in as admin):

The filter does work too, but indeed I see only draft datasets with incomplete metadata. I overlooked that it does not show any published datasets with incomplete metadata as we do not have any. I will create some on my test installation and fix the problem.

Mar 26 '24 09:03 ErykKul

Thank you for the feedback and future adjustments,

Indeed, an administrator with "contributor" rights on a dataverse sees the draft datasets with incomplete metadata on the "my data" page, but no published datasets with incomplete metadata, because it is currently impossible to identify them.

Mar 26 '24 13:03 stevenferey

@ErykKul - I assigned you since it looks like you were going to make additional changes. If that's not true or that's a separate PR, let me know so we can move this to Ready For QA. Otherwise, just assign me when you've made the changes and I'll re-review.

Apr 04 '24 17:04 qqmyers

Yes, I need to make some changes first. They are easy to do, but then I need to test them thoroughly. I will do it after my vacation, after the 15th. This PR is not urgent; once it is ready, I will let you know.

Apr 06 '24 11:04 ErykKul

In addition, we identified bad behavior with the "my Data" page and the active "metadata validity" filters:

[Screenshot: my_data]

In this case, the user's dataverse is not displayed. Shouldn't the display of dataverses be unaffected by whether the "metadata validity" filters are active? Thanks

Apr 10 '24 15:04 stevenferey

@qqmyers I think it is working as it should now. I had to remove the permission wrapper from the mydata bean, but that seems OK: it is your data, so if the incomplete metadata labels on published datasets are turned on, you can see them. Also, validating a published dataset after its metadata was changed was tricky, but I think it now works as it should. I think that this PR is ready for QA.

May 06 '24 13:05 ErykKul

@qqmyers Thanks for reviewing! I did some fixes and gave the explanation on the strange part. Can you re-review?

May 07 '24 12:05 ErykKul

It turns out that I was the only one using the "isValid" method in DatasetVersion, so I changed it to keep the logic centralized. I added comments to make it more clear what is happening there.

May 07 '24 15:05 ErykKul

I retested it: collections now do show up in my data, incomplete and complete (draft and published) datasets have correct labels when "dataverse.ui.show-validity-label-when-published" is enabled, and published datasets do not have incomplete labels when the labels are disabled for published datasets.

May 07 '24 15:05 ErykKul

You need to reindex for the changes to take effect.

May 07 '24 15:05 ErykKul

@ErykKul This looks good. The only concern I have is that the show-validity-label defaults to false. I think if I have edit rights, I would want to know if I have a published dataset that needs attention, especially since it should be a rare occurrence. I would argue for a default of 'true'; then, if it gets to be too much, you could shut it off with false.

May 14 '24 13:05 sekmiller

@sekmiller Sounds good, I changed the default to true.
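As an illustration of what a default of true means for a boolean setting, here is a minimal sketch using plain java.util.Properties. It assumes the setting key quoted earlier in this thread; the class and method names are hypothetical, and the way Dataverse itself looks up settings is not shown here.

```java
import java.util.Properties;

// Hypothetical sketch: a boolean setting that is on unless explicitly disabled.
// Dataverse's own settings lookup is not reproduced here.
final class ValidityLabelSettingSketch {

    // Key as quoted earlier in this thread.
    static final String KEY = "dataverse.ui.show-validity-label-when-published";

    /** Returns the configured value, falling back to true when the key is unset. */
    static boolean showValidityLabelWhenPublished(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, "true"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(showValidityLabelWhenPublished(props));  // true (unset)

        props.setProperty(KEY, "false");
        System.out.println(showValidityLabelWhenPublished(props));  // false (opted out)
    }
}
```

Note that Boolean.parseBoolean treats any value other than "true" (ignoring case) as false, so a typo in the configured value would switch the label off rather than fall back to the default.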

May 14 '24 14:05 ErykKul