OAI-PMH oddities in dim metadataformat

Open kosarko opened this issue 4 months ago • 5 comments

curl --no-progress-meter "https://lindat.mff.cuni.cz/repository/server/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:lindat.mff.cuni.cz:20.500.12801/3901488-03" | xmllint --format -

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="static/style.xsl"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2025-08-04T14:24:17Z</responseDate>
  <request verb="GetRecord" identifier="oai:lindat.mff.cuni.cz:20.500.12801/3901488-03" metadataPrefix="dim">https://lindat.mff.cuni.cz/repository/server/oai/request</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:lindat.mff.cuni.cz:20.500.12801/3901488-03</identifier>
        <datestamp>2025-06-24T19:43:00Z</datestamp>
        <setSpec>com_20.500.12801_1</setSpec>
        <setSpec>com_20.500.12800_1</setSpec>
        <setSpec>col_20.500.12801_3</setSpec>
      </header>
      <metadata>
        <dim:dim xmlns:dim="http://www.dspace.org/xmlns/dspace/dim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.dspace.org/xmlns/dspace/dim http://www.dspace.org/schema/dim.xsd"><dim:field mdschema="dc" element="contributor" qualifier="author">Veselý, Bohumil</dim:field><dim:field mdschema="dc" element="date" qualifier="accessioned">2021-10-18T20:39:15Z</dim:field><dim:field mdschema="dc" element="date" qualifier="available">2021-10-18T20:39:15Z</dim:field><dim:field mdschema="dc" element="date" qualifier="issued" lang="*">0000</dim:field><dim:field mdschema="dc" element="identifier" qualifier="other">3901488-03</dim:field><dim:field mdschema="dc" element="identifier" qualifier="uri">http://hdl.handle.net/20.500.12801/3901488-03</dim:field><dim:field mdschema="dc" element="description">Librarian Bohumír Lifka on Bohumil Veselý's balcony.</dim:field><dim:field mdschema="dc" element="language" qualifier="iso">zxx</dim:field><dim:field mdschema="dc" element="publisher">Národní filmový archiv</dim:field><dim:field mdschema="dc" element="rights" lang="*">Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)</dim:field><dim:field mdschema="dc" element="rights" qualifier="uri" lang="*">http://creativecommons.org/licenses/by-nc-nd/4.0/</dim:field><dim:field mdschema="dc" element="rights" qualifier="label" lang="*">PUB</dim:field><dim:field mdschema="dc" element="subject">klobouk smeknutí</dim:field><dim:field mdschema="dc" element="subject">Galerie osobností</dim:field><dim:field mdschema="dc" element="subject">Places::Praha::Nové Město::Školská::pavlač Bohumila Veselého</dim:field><dim:field mdschema="dc" element="subject">People::Lifka Bohumír (1900-1987)</dim:field><dim:field mdschema="dc" element="title">Bohumír Lifka (librarian)</dim:field><dim:field mdschema="dc" element="type">clip</dim:field><dim:field mdschema="local" element="approximateDate" 
qualifier="issued">cca 1930-1965</dim:field><dim:field mdschema="local" element="contact" qualifier="person">Technical;Contact;[email protected];Národní filmový archiv</dim:field><dim:field mdschema="local" element="has" qualifier="files" lang="*">yes</dim:field><dim:field mdschema="local" element="branding">NFA</dim:field><dim:field mdschema="local" element="refbox" qualifier="format">{title}, {publisher}, {repository}, {pid}.</dim:field><dim:field mdschema="local" element="language" qualifier="name">Nolinguistic content</dim:field><dim:field mdschema="local" element="files" qualifier="size" lang="*">241650171</dim:field><dim:field mdschema="local" element="files" qualifier="count" lang="*">2</dim:field><dim:field mdschema="metashare" element="ResourceInfo#ContentInfo" qualifier="mediaType">video</dim:field>restricted</dim:dim>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>

especially:

qualifier="mediaType">video</dim:field>restricted</dim:dim>
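A stray value like this can also be detected mechanically. The following is a minimal sketch, assuming the dim:dim element has already been extracted from the OAI-PMH response as a string; the helper name `stray_text` and the shortened `SAMPLE` document are illustrative, not part of DSpace.

```python
import xml.etree.ElementTree as ET
from typing import List

DIM_NS = "http://www.dspace.org/xmlns/dspace/dim"


def stray_text(dim_xml: str) -> List[str]:
    """Return non-whitespace text found directly inside dim:dim,
    i.e. text that is not wrapped in a dim:field element."""
    root = ET.fromstring(dim_xml)
    found = []
    # Text before the first child element.
    if root.text and root.text.strip():
        found.append(root.text.strip())
    # Text following each child element (ElementTree calls this the "tail").
    for child in root:
        if child.tail and child.tail.strip():
            found.append(child.tail.strip())
    return found


# Shortened version of the record above (assumption: two fields are enough
# to show the problem).
SAMPLE = (
    f'<dim:dim xmlns:dim="{DIM_NS}">'
    '<dim:field mdschema="dc" element="type">clip</dim:field>'
    '<dim:field mdschema="metashare" element="ResourceInfo#ContentInfo" '
    'qualifier="mediaType">video</dim:field>'
    'restricted</dim:dim>'
)

print(stray_text(SAMPLE))  # ['restricted']
```

An empty list would mean that every value in the record is wrapped in a dim:field element.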

This actually seems to be two issues:

1. for some reason the xoai metadataFormat (the source for the xslt transformations) has:

            <element name="access-status">
              <field name="value">restricted</field>
            </element>

restricted doesn't feel right for this item; the collection this item is in has Anonymous under DEFAULT READ, along with other groups:

Image

(investigate from: https://github.com/ufal/clarin-dspace/blob/f88207792ad2bde02b28a9b7a834823f5ffc71dd/dspace-oai/src/main/java/org/dspace/xoai/app/plugins/AccessStatusElementItemCompilePlugin.java#L55)

2. the apply-templates in the dim XSLT (https://github.com/ufal/clarin-dspace/blob/54805a83181cca4d214d1cbc3ae3fd772cdf2ec3/dspace/config/crosswalks/oai/metadataFormats/dim.xsl#L11), presumably combined with the default text template, makes the access-status value appear in dim:dim (this seems similar to what was happening in https://github.com/DSpace/DSpace/issues/6556)

kosarko avatar Aug 04 '25 14:08 kosarko

As for (1):

<element name="access-status">
  <field name="value">restricted</field>
</element>

Surprisingly, this refers to the bitstream resource policy, which in this case restricts access for "anonymous" users.

Available values are:

metadata.only - item has no bitstreams
open.access - bitstream has no READ restriction (available for anonymous user)
restricted - bitstream READ access is set for some user group(s) - not for anonymous user
embargo - bitstream READ access is set for anonymous user but with embargo (valid until some date)
unknown - bitstream READ access is set for some explicit user only (not for any group)

In the case of an embargo, the embargo date is also available in the XOAI metadata format:

<element name="access-status">
    <field name="value">embargo</field>
    <field name="embargo">2025-08-08</field>
</element>

Note that the "access-status" element is related only to the first bitstream from the item's ORIGINAL bundle.
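The rules above can be sketched as a small decision function. This is only an illustration of the described behaviour, not the DSpace implementation: the `ReadPolicy` class, the `access_status` function and the way policies are passed in are all hypothetical, and returning "unknown" for a bitstream with no READ policy at all is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence, Tuple

ANONYMOUS = "Anonymous"


@dataclass(frozen=True)
class ReadPolicy:
    """Hypothetical stand-in for one READ resource policy on a bitstream."""
    group: Optional[str] = None        # group name, None for a per-user policy
    user: Optional[str] = None         # explicit user, None for a group policy
    start_date: Optional[date] = None  # access starts on this date (embargo)


def access_status(
    original_bundle: Sequence[Sequence[ReadPolicy]], today: date
) -> Tuple[str, Optional[date]]:
    """Return (status, embargo_date) for an item.

    `original_bundle` holds the READ policies of each bitstream in the
    ORIGINAL bundle; only the first bitstream is inspected, as noted above.
    """
    if not original_bundle:
        return ("metadata.only", None)
    policies = original_bundle[0]
    anonymous = [p for p in policies if p.group == ANONYMOUS]
    # Anonymous READ that is already in effect means open access.
    if any(p.start_date is None or p.start_date <= today for p in anonymous):
        return ("open.access", None)
    # Anonymous READ exists, but only from a future date: embargo.
    if anonymous:
        return ("embargo", min(p.start_date for p in anonymous))
    # READ granted to some group(s), but not to Anonymous.
    if any(p.group is not None for p in policies):
        return ("restricted", None)
    # Only explicit users (or nothing at all, by assumption).
    return ("unknown", None)


today = date(2025, 8, 6)
print(access_status([[ReadPolicy(group="Staff")]], today))
print(access_status([[ReadPolicy(group=ANONYMOUS, start_date=date(2025, 8, 8))]], today))
```

Under this sketch an item whose first ORIGINAL bitstream is readable only by a non-anonymous group comes out as "restricted", whatever the DEFAULT READ setting of its collection is.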

As for (2) (you are right, @kosarko): this is the result of the default xsl:template, which is

<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>

In the fix, the default template is overridden with an EMPTY template that produces no output:

<xsl:template match="text()|@*"/>

This template has the lowest priority, and is applied only when other templates are not applied.
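To make the effect concrete, here is a toy model in Python of how unmatched nodes are processed. It is not an XSLT processor and does not reproduce dim.xsl: the function `apply_templates`, the flag `copy_unmatched_text` and the simplified XOAI fragment are assumptions made for illustration. The only explicit "template" handles the dc element; everything else falls through to the default rules, where elements recurse into their children and text is either copied (default template) or dropped (empty template).

```python
import xml.etree.ElementTree as ET
from typing import List

# Simplified, hypothetical XOAI fragment: one handled schema ("dc") and
# one element nobody wrote a template for ("others"/"access-status").
XOAI = (
    '<metadata>'
    '<element name="dc"><element name="type">'
    '<field name="value">clip</field></element></element>'
    '<element name="others"><element name="access-status">'
    '<field name="value">restricted</field></element></element>'
    '</metadata>'
)


def apply_templates(node: ET.Element, copy_unmatched_text: bool) -> List[str]:
    """Walk the tree the way the default rules do.

    copy_unmatched_text=True  models the default text template.
    copy_unmatched_text=False models the empty override from the fix.
    """
    out: List[str] = []
    if node.get("name") == "dc":
        # Stand-in for an explicit template: wrap each value in dim:field.
        for field in node.iter("field"):
            out.append(f"<dim:field>{field.text}</dim:field>")
        return out
    # Default rule for elements: process child nodes in document order.
    if node.text and copy_unmatched_text:
        out.append(node.text)
    for child in node:
        out.extend(apply_templates(child, copy_unmatched_text))
        if child.tail and copy_unmatched_text:
            out.append(child.tail)
    return out


root = ET.fromstring(XOAI)
print(apply_templates(root, copy_unmatched_text=True))
# ['<dim:field>clip</dim:field>', 'restricted']
print(apply_templates(root, copy_unmatched_text=False))
# ['<dim:field>clip</dim:field>']
```

The first call shows the bare "restricted" leaking into the output, the second shows it suppressed once unmatched text is no longer copied.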

See the Default Templates and also the Template Priorities

kuchtiak-ufal avatar Aug 06 '25 14:08 kuchtiak-ufal

Pull request: https://github.com/ufal/clarin-dspace/pull/1254

kuchtiak-ufal avatar Aug 06 '25 14:08 kuchtiak-ufal

I've found out this is also tracked in upstream https://github.com/DSpace/DSpace/issues/10397

kosarko avatar Aug 11 '25 13:08 kosarko

Hi Ondrej (@kosarko)

That upstream fix creates the following dim:field element:

<dim:field mdschema="others" element="access-status">restricted</dim:field>

Our fix simply ignores all elements that have no xsl:template specified.

Maybe we should combine the two fixes. What do you think?

kuchtiak-ufal avatar Aug 12 '25 08:08 kuchtiak-ufal

Waiting for the DSpace fix to be merged into clarin-dspace

kuchtiak-ufal avatar Sep 01 '25 09:09 kuchtiak-ufal