dataverse icon indicating copy to clipboard operation
dataverse copied to clipboard

dataverse_json with "metadataLanguage":"undefined" cannot be imported

Open landreev opened this issue 2 years ago • 6 comments

Starting v5.7, it’s possible to define metadataLanguage for a dataset. This value also gets exported in dataverse_json. When it’s not defined, "metadataLanguage":"undefined" is exported. However, it appears that this “undefined” value is not accepted on import: Specified metadatalanguage not allowed. As an example: None of the datasets in our own prod. have the metadata language defined; and all the datasets have been re-exported since 5.7. So the impact of this is that none of our prod. dataverse_json exports are importable.

It appears that the code on the import side that's actually throwing the exception has only been added in 5.11; so that must be the reason nobody has noticed/complained yet:

if(mdl==null || settingsService.getBaseMetadataLanguageMap(new HashMap<String,String>(), true).containsKey(mdl)) {
          dataset.setMetadataLanguage(mdl);
        }else {
            throw new JsonParseException("Specified metadatalanguage not allowed.");
        }

landreev avatar Jul 29 '22 15:07 landreev

When the metadatalanguage value was switched from null meaning undefined to 'undefined' meaning undefined, the DDI export was changed to check that , e.g. https://github.com/IQSS/dataverse/blob/751a0088fff3e486359335042615b63b1a03c1f1/src/main/java/edu/harvard/iq/dataverse/export/ddi/DdiExportUtil.java#L166-L171 but it looks like the json wasn't. (ORE works as well, but doesn't use that util method). One possible fix would be to suppress the output in the json via this method (refactored out of the DDI util class).

qqmyers avatar Jul 29 '22 16:07 qqmyers

This will impact exports. This is something that is important to get into 5.12. Leonid would like to see this in.

mreekie avatar Aug 03 '22 18:08 mreekie

@landreev @qqmyers There isn't a util for the json exporter, so does it make sense to update the printer so that the metdataLanguage field is only added if it's something other than undefined or do we want/need it to be undefined for purposes other than export? In other words is this a possible solution (in the printer)?

public static JsonObjectBuilder json(Dataset ds) {
    JsonObjectBuilder bld = jsonObjectBuilder()
            .add("id", ds.getId())
            .add("identifier", ds.getIdentifier())
            .add("persistentUrl", ds.getPersistentURL())
            .add("protocol", ds.getProtocol())
            .add("authority", ds.getAuthority())
            .add("publisher", BrandingUtil.getInstallationBrandName())
            .add("publicationDate", ds.getPublicationDateFormattedYYYYMMDD())
            .add("storageIdentifier", ds.getStorageIdentifier());
    String mdLang = ds.getMetadataLanguage();
    if (mdLang != null && !mdLang.equals(DvObjectContainer.UNDEFINED_METADATA_LANGUAGE_CODE)) {
        bld.add("metadataLanguage", ds.getMetadataLanguage());
    }
    return bld;
}

`

sekmiller avatar Aug 10 '22 19:08 sekmiller

~yeah - I would suggest moving the Util method to some other class and calling it from both DDI and the json printer, but I don't know what ~Util class it should go in (just StringUtil? DvObjectContainer where UNDEFINED_METADATA_LANGUAGE_CODE is defined?). (It's not a very big method but the bug is because we/I had different code in two places.) In any case, I think/agree that not putting it in the export in the first place is a good idea (we don't list all the other undefined fields either).

qqmyers avatar Aug 10 '22 20:08 qqmyers

This looks good. Once you make a PR I can review and/or QA it quickly.

landreev avatar Aug 22 '22 15:08 landreev

Ok. I was just rerunning the integration tests after getting the latest from Dev.

sekmiller avatar Aug 22 '22 15:08 sekmiller