metacatui
500 error when updating valid EML 2.1.1 documents with missing schemaLocation
Describe the bug
After upgrading ESS-DIVE's underlying MetacatUI to 2.14.0 on data-stage.ess-dive.lbl.gov, we tested updates to older data packages and found that MetacatUI fails to update an EML 2.1.1 document that is missing the schemaLocation attribute. For example, an EML document that starts with the following node:
<?xml version="1.0" ?>
<eml:eml packageId="ess-dive-23ad472d099a6f3-20200903T192228119072" system="ess-dive" xmlns:cit="eml://ecoinformatics.org/literature-2.1.1" xmlns:doc="eml://ecoinformatics.org/documentation-2.1.1" xmlns:ds="eml://ecoinformatics.org/dataset-2.1.1" xmlns:eml="eml://ecoinformatics.org/eml-2.1.1" xmlns:prot="eml://ecoinformatics.org/protocol-2.1.1" xmlns:res="eml://ecoinformatics.org/resource-2.1.1" xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.1" xmlns:sw="eml://ecoinformatics.org/software-2.1.1" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
MetacatUI shows the following error:
Error inserting or updating document: ess-dive-27a90361bad844c-20200903T192229411859 since <?xml version="1.0"?><error>SchemaLocation: schemaLocation value = 'https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd' must have even number of URI's.</error>
To Reproduce
Steps to reproduce the behavior:
- Go to this data package on ESS-DIVE stage: https://data-stage.ess-dive.lbl.gov/view/ess-dive-27a90361bad844c-20200903T192229411859
- Try to update it through the UI
Expected behavior MetacatUI should
Screenshots

Desktop (please complete the following information):
- Version: 2.14.0
Thanks @helbashandy. That is a weird bug, as schemaLocation is entirely optional in XML, and should be totally ignored by our tools, including MetacatUI. The error you quoted shows a malformed schemaLocation attribute with three component URIs rather than the expected two. So I think the issue is in determining where that schemaLocation is being injected if it's not in your original document. I don't have permission to view the document so I couldn't check it myself. @gothub can you take a quick look at the document and verify there is no schemaLocation in it, and also see if MetacatUI somehow injects schemaLocation when it is missing? If so, this could be a very quick fix.
@gothub Have you gotten a chance to take a look at this?
@gothub have you looked at this "Priority Critical" issue yet to at least deduce where the malformed schemaLocation is being introduced?
This was happening because MetacatUI was setting the schema location to both the values that are set in editorSerializationFormat and editorSchemaLocation in the appModel, where the values default to:
editorSerializationFormat: "https://eml.ecoinformatics.org/eml-2.2.0",
editorSchemaLocation:
"https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd",
So the schema location would be: https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd. That is three values, and schemaLocation must hold an even number of URIs (namespace and location pairs), so it is invalid.
When the document already had a schemaLocation, these three values were appended to it.
It wasn't happening for new EML docs, because we hardcoded the schemaLocation rather than setting it based on app config values. Fixed in the linked PR.
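To make the failure mode concrete, here is a minimal sketch of how joining the two config values yields three URIs, along with the even-count check that the Metacat error describes. This is not the actual MetacatUI code or the change in the linked PR: `appConfig`, `schemaLocationTokens`, `isValidSchemaLocation`, `buggySchemaLocation`, and `fixedSchemaLocation` are hypothetical names used only for illustration, and using `editorSchemaLocation` as-is is just one plausible way to fix it.

```javascript
// Hypothetical stand-in for the two appModel settings quoted above.
const appConfig = {
  editorSerializationFormat: "https://eml.ecoinformatics.org/eml-2.2.0",
  editorSchemaLocation:
    "https://eml.ecoinformatics.org/eml-2.2.0 https://eml.ecoinformatics.org/eml-2.2.0/eml.xsd",
};

// Split an xsi:schemaLocation value into its whitespace-separated URIs.
function schemaLocationTokens(value) {
  return value.trim().split(/\s+/).filter(Boolean);
}

// A schemaLocation value is a list of (namespace, location) pairs,
// so it must contain an even, non-zero number of URIs.
function isValidSchemaLocation(value) {
  const count = schemaLocationTokens(value).length;
  return count > 0 && count % 2 === 0;
}

// Illustration of the bug: the namespace is prepended to a value that
// already starts with that namespace, giving three URIs.
function buggySchemaLocation(config) {
  return `${config.editorSerializationFormat} ${config.editorSchemaLocation}`;
}

// Assumed fix for illustration: use the configured pair unchanged.
function fixedSchemaLocation(config) {
  return config.editorSchemaLocation;
}

console.log(schemaLocationTokens(buggySchemaLocation(appConfig)).length); // 3
console.log(isValidSchemaLocation(buggySchemaLocation(appConfig))); // false
console.log(isValidSchemaLocation(fixedSchemaLocation(appConfig))); // true
```

A check like `isValidSchemaLocation` could also run before serialization, so that a malformed value is caught in the editor rather than surfacing as a 500 from the server.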
Thanks, @robyngit -- glad to see this finally resolved. If schemaLocation is entirely missing from an incoming document, does everything still work properly after your fix? That field is both optional and should generally get ignored, so it would be good to know that it can be completely omitted.