Galicaster
Mediapackage metadata unable to contain multiple elements with the same 'name'
Currently, if Galicaster generates the mediapackage episode.xml from Opencast and it contains multiple elements with the same name, only one of them will be used.
Example episode.xml from Opencast:
<dublincore><dcterms:isPartOf>example</dcterms:isPartOf><dcterms:temporal xsi:type="dcterms:Period">start=2017-02-06T09:00:00Z; end=2017-02-06T09:55:00Z; scheme=W3C-DTF;</dcterms:temporal><dcterms:audience>restricted</dcterms:audience><dcterms:audience>enrolled</dcterms:audience><dcterms:title>example - 06-Feb</dcterms:title><dcterms:available xsi:type="dcterms:Period">start=2017-02-07T10:00Z; scheme=W3C-DTF;</dcterms:available><dcterms:created xsi:type="dcterms:W3CDTF">2017-02-06T09:00:00Z</dcterms:created><dcterms:source>example</dcterms:source><dcterms:spatial>example</dcterms:spatial><dcterms:identifier>example</dcterms:identifier></dublincore>
When Galicaster processes this XML, it will only write out one 'audience' element in the above case.
I've traced the issue to marshalDublincore, line 1438 of:
https://github.com/teltek/Galicaster/blob/2.0.x/galicaster/mediapackage/mediapackage.py
This method processes the XML DOM and puts all the metadata into a dict keyed by 'name', self.metadata_episode[name], so if two elements have the same 'name' the value gets overwritten.
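To illustrate the overwrite, here is a minimal standalone sketch. It is not Galicaster's code: the function names, the trimmed sample XML and the `xmlns:dcterms` declaration are assumptions added so the snippet parses on its own. It compares a dict keyed by name with a dict of lists.

```python
from collections import defaultdict
from xml.dom import minidom

# Trimmed sample; the namespace declaration is an assumption added for parsing.
EPISODE_XML = (
    '<dublincore xmlns:dcterms="http://purl.org/dc/terms/">'
    '<dcterms:audience>restricted</dcterms:audience>'
    '<dcterms:audience>enrolled</dcterms:audience>'
    '<dcterms:title>example - 06-Feb</dcterms:title>'
    '</dublincore>'
)


def _child_elements(xml_text):
    """Yield (name, text) for each child element of the root."""
    root = minidom.parseString(xml_text).documentElement
    for node in root.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            # Drop the 'dcterms:' prefix to get the bare name.
            name = node.tagName.split(":", 1)[-1]
            yield name, node.firstChild.data


def read_overwriting(xml_text):
    """One value per name: a repeated name replaces the stored value."""
    metadata = {}
    for name, value in _child_elements(xml_text):
        metadata[name] = value
    return metadata


def read_multivalued(xml_text):
    """Keep every value for a name, in document order."""
    metadata = defaultdict(list)
    for name, value in _child_elements(xml_text):
        metadata[name].append(value)
    return dict(metadata)
```

With the sample above, `read_overwriting` keeps a single 'audience' value while `read_multivalued` keeps both.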
Also, _checknget, line 59 of:
https://github.com/teltek/Galicaster/blob/2.0.x/galicaster/mediapackage/utils.py
This, along with _checkget, looks up the element by name but always uses [0]. If multiple elements are assigned to the named tag then you'd need to look beyond the zeroth element: _checkget(archive.getElementsByTagName(name)[???])
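A rough sketch of the difference between reading only the zeroth element and reading all of them. The helpers `first_value` and `all_values` are hypothetical and only stand in for the lookup pattern described above; they are not the actual `_checkget` / `_checknget` implementations, and the sample XML with its namespace declaration is an assumption.

```python
from xml.dom import minidom

SAMPLE = (
    '<dublincore xmlns:dcterms="http://purl.org/dc/terms/">'
    '<dcterms:audience>restricted</dcterms:audience>'
    '<dcterms:audience>enrolled</dcterms:audience>'
    '</dublincore>'
)


def node_text(node):
    """Concatenate the text children of a DOM node."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE)
    )


def first_value(archive, name):
    """Hypothetical single-value lookup: only element [0] is read."""
    nodes = archive.getElementsByTagName(name)
    return node_text(nodes[0]) if nodes else None


def all_values(archive, name):
    """Hypothetical multi-value lookup: every matching element is read."""
    return [node_text(node) for node in archive.getElementsByTagName(name)]


archive = minidom.parseString(SAMPLE)
single = first_value(archive, "dcterms:audience")
multiple = all_values(archive, "dcterms:audience")
```

Here `single` holds only the first 'audience' value, while `multiple` holds both, which is what a multi-valued field needs.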
I've quickly hacked this to work on our current 1.3.x install, but it isn't pull-request worthy. Instead, I'd suggest that the metadata handling is rewritten to be a bit more flexible: it should represent 1:1 the mediapackage XML sent to it from Opencast.
Here's the commit. Just to reiterate, it's not very good, but it's worth a look to understand the issue: https://github.com/UoM-Podcast/Galicaster/commit/b4b0752d0e4c4d77464f25caba4db7788e27664a
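As one possible direction for a more 1:1 representation (only a sketch, not what the commit above does), the metadata could be held as an ordered list of (name, value) pairs and written back out element by element. The function name `write_dublincore`, the pair-list layout and the namespace URI are assumptions for illustration.

```python
from xml.dom import minidom

DCTERMS_NS = "http://purl.org/dc/terms/"  # assumed namespace declaration


def write_dublincore(entries):
    """Build a dublincore document from ordered (name, value) pairs.

    Repeated names produce repeated elements, so nothing is collapsed.
    """
    doc = minidom.Document()
    root = doc.createElement("dublincore")
    root.setAttribute("xmlns:dcterms", DCTERMS_NS)
    doc.appendChild(root)
    for name, value in entries:
        elem = doc.createElement("dcterms:" + name)
        elem.appendChild(doc.createTextNode(value))
        root.appendChild(elem)
    return doc


entries = [
    ("audience", "restricted"),
    ("audience", "enrolled"),
    ("title", "example - 06-Feb"),
]
doc = write_dublincore(entries)

# Round trip: serialise, parse again, and read the pairs back.
reparsed = minidom.parseString(doc.toxml())
round_trip = [
    (node.tagName.split(":", 1)[-1], node.firstChild.data)
    for node in reparsed.documentElement.childNodes
    if node.nodeType == node.ELEMENT_NODE
]
```

A list of pairs preserves both duplicates and element order, at the cost of needing a scan (or a secondary index) to look up a field by name.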
I brought this up at the Opencast conference and have requested that the XML handling be rewritten. I've created a separate issue that discusses the wider problems:
https://github.com/teltek/Galicaster/issues/470