Problems with the new import configuration
I have some problems transferring the current XML-based configuration to the new import configuration (https://github.com/kitodo/kitodo-production/pull/5038). Right now I cannot get some catalogs to work that worked before. It seems that some of the features that originally (https://github.com/kitodo/kitodo-production/pull/3374) allowed the use of arbitrary search interfaces got lost. It would be good if those features could be ported to the new interface as well, so that catalogs which worked before can still be used.
I have some questions in this context:
- The XML-based configuration had a way to specify custom URL parameters. This made it possible to add parameters to the URL in addition to those given by the user query. How can something like that be done in the new interface?
<urlParameters>
<param name="format" value="oai_mods" />
</urlParameters>
- What does Query delimiter actually stand for and what has to go there?

- Is it ensured that the mapping files are actually applied in the right order if I want to chain multiple transformations?

- How can a prefix for the identifier be specified:
<identifierParameter prefix="prefix:" value="id" />
- The XML-based configuration had a way to specify custom URL parameters. This made it possible to add parameters to the URL in addition to those given by the user query. How can something like that be done in the new interface?
<urlParameters> <param name="format" value="oai_mods" /> </urlParameters>
When the urlParameters element was first added to the kitodo_opac.xml catalog configurations, it was only intended for standard URL parameters like version, operation and recordSchema for SRU interfaces and verb and metadataPrefix for OAI interfaces. For this reason, I removed the option to add arbitrary URL parameters in the ImportConfiguration object and instead replaced it with a defined set of parameters for the individual search interface types (SRU, OAI etc.).
But I see now that having the option to add custom URL parameters is not only useful but even required in some cases, so I think we should try to re-enable this option.
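To illustrate what re-enabling custom URL parameters could mean in practice, here is a minimal Python sketch. It is not the Kitodo implementation; the function name `build_search_url` and the endpoint are hypothetical, and only the `format=oai_mods` parameter is taken from the example above. It assumes that fixed custom parameters are simply appended to the parameters coming from the user query.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query_params, custom_params=None):
    """Combine the user's query parameters with fixed custom parameters.

    Hypothetical helper: custom parameters are added after the query
    parameters and never override a parameter the query already set.
    """
    params = dict(query_params)
    for name, value in (custom_params or {}).items():
        params.setdefault(name, value)
    return base_url + "?" + urlencode(params)


url = build_search_url(
    "https://catalog.example.org/search",  # hypothetical endpoint
    {"id": "12345"},                       # parameter from the user query
    {"format": "oai_mods"},                # custom parameter from the config
)
print(url)
```

Whether a custom parameter should be allowed to override a standard one is a design decision; this sketch assumes it should not.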
- What does Query delimiter actually stand for and what has to go there?
The query delimiter is an optional character in which the query part of the URL can be enclosed. This was necessary for some SRU interfaces like the "LfULG - DiGAS" where the query would need to be enclosed in " characters in order for the interface to process the query successfully.
In the kitodo_opac.xml configuration file, this optional delimiter could be configured using the following element:
<queryDelimiter>"</queryDelimiter>
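As a rough sketch of the effect of the delimiter (an illustration only, not the actual Kitodo code), the following Python snippet encloses the whole query in the optional delimiter before URL-encoding it. The function name `build_sru_query` is hypothetical; the field and value are taken from the ZDB request discussed in this thread.

```python
from urllib.parse import quote


def build_sru_query(field, value, delimiter=""):
    """Return the URL-encoded query, optionally enclosed in a delimiter.

    Assumption: the delimiter wraps the complete query string, as described
    above; an empty delimiter leaves the query unchanged.
    """
    query = f"{field}={value}"
    return quote(f"{delimiter}{query}{delimiter}", safe="")


plain = build_sru_query("zdbid", "2825456-9")
delimited = build_sru_query("zdbid", "2825456-9", delimiter='"')
print(plain)      # zdbid%3D2825456-9
print(delimited)  # %22zdbid%3D2825456-9%22
```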
- Is it ensured that the mapping files are actually applied in the right order if I want to chain multiple transformations?
Yes, the mapping files are saved in a list internally, preserving the order in which they have been assigned to the import configuration and ensuring they are applied in the correct order to the imported metadata file. In fact, an exception will be thrown if you try to save an ImportConfiguration where
- the input metadata format of the first of a sequence of mapping files does not correspond to the metadata format of the ImportConfiguration, or
- the output metadata format of the last file in the sequence of assigned mapping files is not "Kitodo", or
- the output format of one mapping file in the sequence of mapping files does not correspond to the input format of the next mapping file
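The three conditions above can be sketched as a small validation routine. This is a hedged Python illustration, not the Java code used in Kitodo.Production; the `MappingFile` class, the function name, the format labels and the handling of an empty list are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MappingFile:
    title: str
    input_format: str
    output_format: str


def validate_mapping_chain(config_format, mapping_files):
    """Raise ValueError if the ordered mapping files do not form a valid chain."""
    if not mapping_files:
        return  # nothing to check in this sketch
    # 1. The first mapping file must accept the configuration's metadata format.
    if mapping_files[0].input_format != config_format:
        raise ValueError("first mapping file does not match the import format")
    # 2. The last mapping file must produce the internal Kitodo format.
    if mapping_files[-1].output_format != "KITODO":
        raise ValueError("last mapping file does not output Kitodo")
    # 3. Each output format must match the next file's input format.
    for current, following in zip(mapping_files, mapping_files[1:]):
        if current.output_format != following.input_format:
            raise ValueError(
                f"{current.title} output does not match {following.title} input"
            )


marc2mods = MappingFile("marc2mods", "MARC", "MODS")
mods2kitodo = MappingFile("mods2kitodo", "MODS", "KITODO")

validate_mapping_chain("MARC", [marc2mods, mods2kitodo])  # valid order, no error
```

Note that a check like this only helps if the list is loaded from the database in the same order in which it was saved.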
- How can a prefix for the identifier be specified:
<identifierParameter prefix="prefix:" value="id" />
You are right, I missed this feature when implementing the new ImportConfiguration class. I will try to re-add it before the next release.
Thank you for your prompt replies. My question about the order of the mapping files came from my test with the SRU interface of the ZDB: https://services.dnb.de/sru/zdb?version=1.1&operation=searchRetrieve&query=zdbid%3D2825456-9&recordSchema=MARC21-xml
In the XML-based configuration I specified two mapping files:
- the marc2mods mapping of the Library of Congress (https://www.loc.gov/standards/mods/v3/MARC21slim2MODS3-7.xsl; you probably also need a local copy of https://www.loc.gov/standards/marcxml/xslt/MARC21slimUtils.xsl for that to work)
- the standard mods2kitodo mapping
XML config:

This worked. I cannot make it work with the new configuration. The URL is constructed correctly, but at some point a parser exception is thrown:
ConfigException / XPathException / SAXParseException: Premature end of file.
I therefore assume that something is not working correctly with the mappings.
I checked again and the problem seems to be caused by the missing ordering of the entries in the database table importconfiguration_x_mappingfile.
I changed the following by hand in mappingfile:

to:

so that the MARC converter comes first. Now everything works.
After that change, the MARC mapping comes first in importconfiguration_x_mappingfile and the transformation works correctly:

We probably need an "order"-column in importconfiguration_x_mappingfile to ensure that the mappings are applied in the correct order at runtime.
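A minimal sketch of that idea, using an in-memory SQLite database from Python. The table name importconfiguration_x_mappingfile is taken from this thread; the column names (`importconfiguration_id`, `mappingfile_id`, `sorting`) and the simplified `mappingfile` table are assumptions for illustration and do not reflect the actual Kitodo schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mappingfile (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    """CREATE TABLE importconfiguration_x_mappingfile (
           importconfiguration_id INTEGER,
           mappingfile_id INTEGER,
           sorting INTEGER  -- hypothetical explicit order column
       )"""
)

# mods2kitodo was created first, so it has the lower id.
conn.executemany(
    "INSERT INTO mappingfile VALUES (?, ?)",
    [(1, "mods2kitodo"), (2, "marc2mods")],
)
# The join rows are inserted out of order; 'sorting' records the intended order.
conn.executemany(
    "INSERT INTO importconfiguration_x_mappingfile VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 0)],
)

rows = conn.execute(
    """SELECT m.title
       FROM importconfiguration_x_mappingfile x
       JOIN mappingfile m ON m.id = x.mappingfile_id
       WHERE x.importconfiguration_id = ?
       ORDER BY x.sorting""",
    (1,),
).fetchall()
ordered_titles = [row[0] for row in rows]
print(ordered_titles)  # ['marc2mods', 'mods2kitodo']
```

With an explicit order column, the application order no longer depends on the ids of the mapping files or on the order in which the rows happen to be returned.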
@solth Another question:
What is the purpose of the XPath configuration for the parent element?

I had assumed that the ruleset determines which element in the XML returned from the catalogue holds the parent element's identifier (by setting the higherlevelIdentifier).

What is the specific purpose of giving an XPath to the parent element here? Is this used in a different context than the higherlevelIdentifier?
What is the specific purpose of giving an XPath to the parent element here? Is this used in a different context than the higherlevelIdentifier?
The Parent element - XPath setting is used to extract the catalog ID of the parent record of the imported record from the imported XML document.
The metadata configured as higherLevelIdentifier in the ruleset defines the internal metadata field in which the parent ID is saved.
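The difference between the two settings can be sketched in a few lines of Python. This is only an illustration under assumptions: the XML record, the XPath expression and the field name `ParentID` are hypothetical and not taken from any real catalog or ruleset.

```python
import xml.etree.ElementTree as ET

# Hypothetical record as it might be returned by a catalog.
RECORD = """<record>
  <id>child-001</id>
  <relatedItem type="host">
    <identifier>parent-999</identifier>
  </relatedItem>
</record>"""

# Import configuration side: where to FIND the parent ID in the imported XML.
PARENT_XPATH = "./relatedItem[@type='host']/identifier"
# Ruleset side: the internal metadata field in which the parent ID is STORED.
HIGHER_LEVEL_IDENTIFIER = "ParentID"


def extract_parent_id(xml_text, parent_xpath):
    """Return the text of the first node matching the XPath, or None."""
    node = ET.fromstring(xml_text).find(parent_xpath)
    return node.text if node is not None else None


parent_id = extract_parent_id(RECORD, PARENT_XPATH)
metadata = {HIGHER_LEVEL_IDENTIFIER: parent_id}
print(metadata)  # {'ParentID': 'parent-999'}
```

In this sketch the XPath operates on the catalog's XML before any mapping, while the higherLevelIdentifier names a field of the internal metadata, which is why both settings are needed.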