pyDataverse
Mapping DDI XML
Implement mapping from and to DDI XML.
Requirements
- DDI XML from NESSTAR mapping
- DDI XML from OAI-PMH endpoint mapping
- DDI XML from frontend download mapping
- import from all DDI XML versions
- import from all DDI XML versions (lower priority)
- validate data against schema
- XML schemas
ACTIONS
0. Pre-Requisites
- [ ] Part of re-factor models module #102
- [ ] Is there already a Python module (or other code, or a developer) that works with DDI XML?
- [ ] Check out DANS service
1. Research
- [ ]
2. Plan
- [ ] Define requirements
3. Implement
- [ ] Write tests
- [ ] Create Mapping File
- [ ] Write code
- [ ] Update Docs
- [ ] Basic Usage
- [ ] Advanced Usage
- [ ] Quickstart
- [ ] Write Docstrings
- [ ] Run pytest
- [ ] Run tox
- [ ] Run pylint
- [ ] Run mypy
4. Follow Ups
- [ ] Review
- [ ] Code
- [ ] Tests
- [ ] Docs
Follow-Ups
- [ ] Re-factor models module #102
Here are some resources:
Functionalities:
- Data types: Datasets, Datafiles
- Validate data before export and import
- Import to DVObjects Dataset and Datafile
- Export to DVObjects Dataset and Datafile
- Include NESSTAR Dialect
- add custom mapping with custom XSLT
- Use or create XSLT stylesheets: CESSDA has some, the DANS ddi-converter uses one, and there is another one on GitHub.
Mapping for DDI 2.5. Some attributes mapped:
abstract -> dsDescriptionValue
notes -> notesText
titl -> title
subTitl -> subtitle
altTitl -> alternativeTitle
grantNo -> grantNumberValue
grantNo (agency) -> grantNumberAgency
timePrd (event="start") -> timePeriodCoveredStart
timePrd (event="end") -> timePeriodCoveredEnd
collDate (event="start") -> dateOfCollectionStart
collDate (event="end") -> dateOfCollectionEnd
dataKind -> kindOfData
serName -> seriesName
serInfo -> seriesInformation
relMat -> relatedMaterial
relStdy -> relatedDatasets
othRefs -> otherReferences
srcOrig -> originOfSources
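To make the mapping above more concrete, here is a minimal import sketch using only the Python standard library. It is not pyDataverse code: the function name `ddi_to_fields`, the two mapping dicts, and the sample document are hypothetical and only illustrate the idea. It assumes the DDI Codebook 2.5 namespace `ddi:codebook:2_5`, assumes that dates are carried in the `date` attribute of `timePrd` and `collDate` (falling back to the element text), and only covers part of the list.

```python
import xml.etree.ElementTree as ET

# Assumption: DDI Codebook 2.5 documents use this namespace.
DDI_NS = {"ddi": "ddi:codebook:2_5"}

# Hypothetical mapping tables: DDI element -> Dataverse field name.
SIMPLE_MAPPING = {
    "abstract": "dsDescriptionValue",
    "titl": "title",
    "subTitl": "subtitle",
    "altTitl": "alternativeTitle",
    "dataKind": "kindOfData",
    "serName": "seriesName",
    "serInfo": "seriesInformation",
}

# Elements that are distinguished by their "event" attribute.
EVENT_MAPPING = {
    ("timePrd", "start"): "timePeriodCoveredStart",
    ("timePrd", "end"): "timePeriodCoveredEnd",
    ("collDate", "start"): "dateOfCollectionStart",
    ("collDate", "end"): "dateOfCollectionEnd",
}


def ddi_to_fields(xml_text):
    """Return a flat dict of Dataverse field names extracted from DDI XML."""
    root = ET.fromstring(xml_text)
    # Restrict the search to the study description, so that a title in
    # the document description is not picked up by accident.
    study = root.find("ddi:stdyDscr", DDI_NS)
    if study is None:
        study = root

    fields = {}
    for tag, field in SIMPLE_MAPPING.items():
        elem = study.find(f".//ddi:{tag}", DDI_NS)
        if elem is not None and elem.text and elem.text.strip():
            fields[field] = elem.text.strip()

    for (tag, event), field in EVENT_MAPPING.items():
        elem = study.find(f".//ddi:{tag}[@event='{event}']", DDI_NS)
        if elem is not None:
            value = elem.get("date") or (elem.text or "").strip()
            if value:
                fields[field] = value

    # grantNo maps to two fields: its text and its "agency" attribute.
    grant = study.find(".//ddi:grantNo", DDI_NS)
    if grant is not None:
        if grant.text and grant.text.strip():
            fields["grantNumberValue"] = grant.text.strip()
        if grant.get("agency"):
            fields["grantNumberAgency"] = grant.get("agency")

    return fields


# Made-up sample document, for illustration only.
SAMPLE_DDI = """<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr>
    <citation>
      <titlStmt><titl>Example Study</titl><subTitl>Wave 1</subTitl></titlStmt>
      <prodStmt><grantNo agency="NSF">12345</grantNo></prodStmt>
    </citation>
    <stdyInfo>
      <abstract>Short description.</abstract>
      <sumDscr>
        <timePrd event="start" date="2019-01-01">2019-01-01</timePrd>
        <timePrd event="end" date="2019-12-31">2019-12-31</timePrd>
        <dataKind>survey data</dataKind>
      </sumDscr>
    </stdyInfo>
  </stdyDscr>
</codeBook>"""

if __name__ == "__main__":
    for name, value in ddi_to_fields(SAMPLE_DDI).items():
        print(f"{name}: {value}")
```

Keeping the mapping in plain dicts rather than in code would make it easy to move it into a separate mapping file later, and a custom mapping could be supported by merging a user-supplied dict over the defaults.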
Notes to myself:
- https://wiki.selfhtml.org/wiki/XML/XSL/XSLT
- https://de.wikipedia.org/wiki/XSL_Transformation
- https://www.torsten-horn.de/techdocs/java-xsd.htm
- https://www.w3schools.com/xml/schema_intro.asp
- https://de.wikipedia.org/wiki/Dokumenttypdefinition
- https://de.wikipedia.org/wiki/XPath
- Harvard course videos
- https://www.youtube.com/watch?v=x8kMELlNaYg&list=WL&index=67&t=2s
- https://www.youtube.com/watch?v=-Wft5dD-1ig&list=WL&index=68&t=14s
- https://www.youtube.com/watch?v=YkAZlQgPXG4&list=WL&index=69&t=30s
- https://www.youtube.com/watch?v=hOR132CodOU&list=WL&index=70&t=31s
- https://www.youtube.com/watch?v=qY2Ezw786ko&list=WL&index=71&t=12s
- https://www.youtube.com/watch?v=6Zvw3kmJ0KA&list=WL&index=72&t=0s
- https://de.wikipedia.org/wiki/Extensible_Stylesheet_Language
As discussed during the 2024-02-14 meeting of the pyDataverse working group, we are closing old milestones in favor of a new project board at https://github.com/orgs/gdcc/projects/1 and removing issues (like this one) from those old milestones. Please feel free to join the working group! You can find us at https://py.gdcc.io and https://dataverse.zulipchat.com/#narrow/stream/377090-python