Support for parsing metadata using the `a11y` prefix
While casually comparing the output of the Go Toolkit with an OPF from Standard Ebooks, I noticed that the metadata using the a11y prefix do not seem to be properly parsed:
- OPF: https://github.com/standardebooks/nathaniel-hawthorne_the-house-of-the-seven-gables/blob/master/src/epub/content.opf
- RWPM: https://publication-server.readium.org/bmF0aGFuaWVsLWhhd3Rob3JuZV90aGUtaG91c2Utb2YtdGhlLXNldmVuLWdhYmxlc19hZHZhbmNlZC5lcHVi/manifest.json
In this specific case, it's a11y:certifiedBy which doesn't seem to be properly parsed: https://github.com/standardebooks/nathaniel-hawthorne_the-house-of-the-seven-gables/blob/master/src/epub/content.opf#L19
Since we're currently working on supporting all key accessibility metadata for EAA, this is in scope with our work.
I thought it could be because the a11y prefix is not declared in the OPF, but that's also the case in our test cases that cover the a11y prefix. Maybe it's the refines attribute?
I thought about that too when I first encountered this example, but the spec is pretty clear that there's no need to declare the prefix.
Watch out for things like a11y:pageBreakSource vs pageBreakSource, the specification examples were inconsistent:
https://www.w3.org/publishing/a11y/page-source-id/#examples
https://www.w3.org/TR/epub-a11y-tech-11/#pageSource
(should not have the prefix, I believe it's fixed now)
yeah https://github.com/w3c/epub-specs/commit/444a3e661611a550303a1ec9d4f1d80dfe451750
@HadrienGardeur The reason is that the parser (which closely followed the kotlin-toolkit at the time of implementation) parses the a11y:certifiedBy tag in the OPF you provided and adds it internally as a "child" of the conformance-statement tag, because it has refines="#conformance-statement" set. I could fix this by checking the children as per the screenshot below (green part):
Is this, however, a potential issue with all the other accessibility tags as well, not just certifiedBy? If so, some more significant changes might be necessary.
Relevant XML for everyone:
<meta id="conformance-statement" property="dcterms:conformsTo">EPUB Accessibility 1.1 - WCAG 2.2 Level AA</meta>
<meta property="a11y:certifiedBy" refines="#conformance-statement">Standard Ebooks</meta>
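To make the behaviour described above concrete, here is a minimal sketch of how a `refines` attribute can turn a11y:certifiedBy into a child of the conformance statement, so that a top-level lookup misses it. This is not the Go Toolkit's actual code: the `Meta` type and the `buildTree`, `find` and `parseMetas` helpers are hypothetical names made up for illustration, and the sample wraps the two `<meta>` elements in a bare `<metadata>` element without namespaces.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Meta mirrors an EPUB 3 <meta> element (hypothetical type, not the toolkit's).
// Children is not read from the XML; it is filled in by buildTree.
type Meta struct {
	ID       string  `xml:"id,attr"`
	Property string  `xml:"property,attr"`
	Refines  string  `xml:"refines,attr"`
	Value    string  `xml:",chardata"`
	Children []*Meta `xml:"-"`
}

type metadata struct {
	Metas []*Meta `xml:"meta"`
}

// buildTree attaches every meta whose refines attribute points at a known id
// to that target, and returns the remaining top-level metas.
func buildTree(metas []*Meta) []*Meta {
	byID := make(map[string]*Meta)
	for _, m := range metas {
		if m.ID != "" {
			byID[m.ID] = m
		}
	}
	var roots []*Meta
	for _, m := range metas {
		parent, ok := byID[strings.TrimPrefix(m.Refines, "#")]
		if m.Refines != "" && ok {
			parent.Children = append(parent.Children, m)
			continue
		}
		roots = append(roots, m)
	}
	return roots
}

// find returns the first meta with the given property, without descending
// into children.
func find(metas []*Meta, property string) *Meta {
	for _, m := range metas {
		if m.Property == property {
			return m
		}
	}
	return nil
}

// parseMetas decodes the <meta> elements and groups refinements under their target.
func parseMetas(opf string) ([]*Meta, error) {
	var md metadata
	if err := xml.Unmarshal([]byte(opf), &md); err != nil {
		return nil, err
	}
	return buildTree(md.Metas), nil
}

const sample = `<metadata>
  <meta id="conformance-statement" property="dcterms:conformsTo">EPUB Accessibility 1.1 - WCAG 2.2 Level AA</meta>
  <meta property="a11y:certifiedBy" refines="#conformance-statement">Standard Ebooks</meta>
</metadata>`

func main() {
	roots, err := parseMetas(sample)
	if err != nil {
		panic(err)
	}
	// A lookup among top-level metas does not see the refinement.
	fmt.Println("top-level match:", find(roots, "a11y:certifiedBy") != nil)
	// Checking the children of the conformance statement does.
	if c := find(roots, "dcterms:conformsTo"); c != nil {
		if by := find(c.Children, "a11y:certifiedBy"); by != nil {
			fmt.Println("found under conformsTo:", by.Value)
		}
	}
}
```

The same grouping would apply to any other property that carries a `refines` attribute, which is why the question of whether only certifiedBy is affected matters.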
Thank you for the explanation @chocolatkey.
This could indeed be an issue with other metadata that are natively supported in RWPM (vs URI based extensions) and we should be aware of it.
In this specific case:
- we support multiple values for conformance (`conformsTo`)
- but we don't link these values to certification (the `certification` object stands on its own in `accessibility`, it has no relationship whatsoever to `conformsTo`)
- and we only support a single object in `certification`, we don't allow for multiple values/objects
This means that we should IMO:
- use the first value that either stands on its own in the OPF or refines an accessibility conformance statement
- and use our built-in extensibility (URLs) for everything else
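The rule proposed above could be sketched roughly as follows. This is only an illustration under assumptions: `Meta`, `certifiers` and `pickCertifier` are hypothetical names, the tree is assumed to have refinements already grouped under the metadata they refine, "first" means first in the order the tree is walked, and "Another Certifier" is a made-up value. How the leftover values would be expressed through URL-based extensibility is deliberately left out.

```go
package main

import "fmt"

// Meta is a hypothetical, already-grouped metadata node: refinements of a
// meta are stored in its Children.
type Meta struct {
	Property string
	Value    string
	Children []Meta
}

const (
	conformsTo  = "dcterms:conformsTo"
	certifiedBy = "a11y:certifiedBy"
)

// certifiers collects every a11y:certifiedBy value that either stands on its
// own at the top level or refines a dcterms:conformsTo statement, in walk order.
func certifiers(roots []Meta) []string {
	var out []string
	for _, m := range roots {
		switch m.Property {
		case certifiedBy:
			out = append(out, m.Value)
		case conformsTo:
			for _, c := range m.Children {
				if c.Property == certifiedBy {
					out = append(out, c.Value)
				}
			}
		}
	}
	return out
}

// pickCertifier returns the first value, intended for the single
// certification object, and the remaining values that would need another
// representation.
func pickCertifier(roots []Meta) (first string, rest []string) {
	all := certifiers(roots)
	if len(all) == 0 {
		return "", nil
	}
	return all[0], all[1:]
}

func main() {
	roots := []Meta{
		{
			Property: conformsTo,
			Value:    "EPUB Accessibility 1.1 - WCAG 2.2 Level AA",
			Children: []Meta{{Property: certifiedBy, Value: "Standard Ebooks"}},
		},
		{Property: certifiedBy, Value: "Another Certifier"},
	}
	first, rest := pickCertifier(roots)
	fmt.Printf("certification: %q, left for extensions: %q\n", first, rest)
}
```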
@HadrienGardeur So should this logic of checking the children of conformsTo apply just to the certifiedBy property, or others as well?
How do we currently parse metadata using a refine statement? Do you keep the relationship between both metadata somehow?
@HadrienGardeur If I understand what you're saying correctly, the answer is yes. Any tags that refine another tag are "children" of that tag. That's why one quick, potentially naive fix is to check the children of the conformsTo tag.
How do we deal with that in the RWPM?
Do we use something like that?
"parent": {
"value": "123",
"child": "456"
}
With unknown metadata, it looks like this:
<package prefix="myPrefix: http://my.url/#">
  <metadata>
    <meta id="customProperty" property="myPrefix:customProperty">Custom property</meta>
    <meta refines="#customProperty" property="myPrefix:refine1">Refine 1</meta>
    <meta refines="#customProperty" property="myPrefix:refine2">Refine 2</meta>
  </metadata>
</package>

{
  "metadata": {
    "http://my.url/#customProperty": {
      "@value": "Custom property",
      "http://my.url/#refine1": "Refine 1",
      "http://my.url/#refine2": "Refine 2"
    }
  }
}
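For reference, the mapping in this example can be reproduced with a small sketch like the one below. It only mirrors the JSON shape shown here and is not taken from any toolkit: `Meta`, `expand` and `toJSONValue` are hypothetical names, and the prefix map is assumed to have been read from the `prefix` attribute of `<package>` beforehand.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Meta is a hypothetical metadata node with its refinements as Children.
type Meta struct {
	Property string
	Value    string
	Children []Meta
}

// expand turns "prefix:name" into a full URI using the declared prefixes.
// Properties with an unknown or missing prefix are returned unchanged.
func expand(property string, prefixes map[string]string) string {
	prefix, name, ok := strings.Cut(property, ":")
	if !ok {
		return property
	}
	if base, known := prefixes[prefix]; known {
		return base + name
	}
	return property
}

// toJSONValue returns a plain string for a meta without refinements, and an
// object with "@value" plus one key per refinement otherwise. If two
// refinements share a property, the later one overwrites the earlier one.
func toJSONValue(m Meta, prefixes map[string]string) any {
	if len(m.Children) == 0 {
		return m.Value
	}
	obj := map[string]any{"@value": m.Value}
	for _, c := range m.Children {
		obj[expand(c.Property, prefixes)] = toJSONValue(c, prefixes)
	}
	return obj
}

func main() {
	prefixes := map[string]string{"myPrefix": "http://my.url/#"}
	custom := Meta{
		Property: "myPrefix:customProperty",
		Value:    "Custom property",
		Children: []Meta{
			{Property: "myPrefix:refine1", Value: "Refine 1"},
			{Property: "myPrefix:refine2", Value: "Refine 2"},
		},
	}
	doc := map[string]any{
		"metadata": map[string]any{
			expand(custom.Property, prefixes): toJSONValue(custom, prefixes),
		},
	}
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the refined value becomes an object with `@value`, the relationship between a property and its refinements is preserved for URI-based extensions, which is the part that has no equivalent for "native" properties.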
Thanks @mickael-menu, it's been a while so it helps to refresh my memory.
Overall, this means that our support for extensibility should already work as expected. The problem seems to be with properties that are "native" to RWPM, where we might skip such refinements.
For all "native" properties, we need to make sure that they're parsed properly and I think that we can live with the lack of an equivalent of the refine statements.
Can you focus on these a11y properties for now @chocolatkey ?
We can also file a separate issue somewhere (architecture, since it affects all toolkits?) to discuss how we should handle refine statements on our "native" properties. With properties that use an object representation, it should be straightforward; with strings, integers/numbers and booleans, less so.