dataverse icon indicating copy to clipboard operation
dataverse copied to clipboard

GDCC/8924 - Check archival status

Open qqmyers opened this issue 2 years ago • 1 comments

What this PR does / why we need it: Remove any html tags from supplied archival status (new APIs being added in 5.12). Also updates the Jsoup class we use in general.

Which issue(s) this PR closes:

Closes #8924

Special notes for your reviewer:

FWIW: The Jsoup update appears to just be a classname change but the docs do say the old class will be retired soon.

Suggestions on how to test this: Use the archival status API (must be superuser) to put an HTML message and confirm the result from doing a get (or what's displayed) does not include html.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

qqmyers avatar Aug 19 '22 16:08 qqmyers

Coverage Status

Coverage decreased (-0.006%) to 20.042% when pulling 0909f7e0c89d4c13335f6c637575363f6b9ae790 on GlobalDataverseCommunityConsortium:GDCC/8605-add-archival-status into 6ca1c9d873218cb1e28f8443e7e242912ae09c8f on IQSS:develop.

coveralls avatar Aug 19 '22 16:08 coveralls

[My first foray into QA]

It looks good overall. the plain vanilla path does what it purports to do. If there are normal exceptions - dataset not found, version not available, user not found, user without proper permissions there are good error messages returned.

The only thing that I can see that needs to be addressed is that a JsonParsingException is not being caught and so the response code is 500 and the user only gets empty braces in the console. Looking at the logs reveals the exception.

sekmiller avatar Sep 02 '22 13:09 sekmiller

Is that new? I would guess it is from https://github.com/IQSS/dataverse/blob/094c90a074d7bea4d884123690b5bb26f85d5284/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java#L3326 and the framework returning 500 before our code is invoked. If so, I could change to having the method start with a string so we can parse the initial json ourselves and return a better error.

So - basically the question is - is the use case sending bad json in or sending in valid json that, after html tags are stripped, becomes bad?

qqmyers avatar Sep 02 '22 14:09 qqmyers

You're right. It's not new. The other methods in Datasets that catch the parsing exceptions are taking in strings and parsing them inside the method. Let me think about having a case where stripping the tags would yield a parsing exception - if that could happen.

sekmiller avatar Sep 02 '22 14:09 sekmiller

My guess is that's hard to do, but I can add a change to use string and catch the parse exception post sanitization. (It probably makes more sense to avoid converting to Json twice anyway.)

qqmyers avatar Sep 02 '22 14:09 qqmyers