metadata export to e.g. JSON-LD, DDI

Open rubenarslan opened this issue 6 years ago • 9 comments

  • [x] a suggested code or documentation change, improvement to the code, or feature request

Hi. I had forgotten about rio, but will now use it (and its wonderfully consistent interface for importing files with metadata) for the webapp of https://github.com/rubenarslan/codebook. I was wondering whether you had any plans for exporting to metadata-only formats. I wrote a very basic first attempt at exporting to JSON-LD, but I was also asked about DDI.
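
To make the JSON-LD idea concrete, here is a minimal sketch of what a metadata-only export could look like. It is in Python rather than R purely for illustration, and it is not the codebook package's actual implementation. The function name `metadata_to_jsonld` and the input layout are hypothetical. It assumes the schema.org `Dataset` type with `variableMeasured` entries typed as `PropertyValue`.

```python
import json


def metadata_to_jsonld(name, description, variables):
    """Build a schema.org Dataset description from variable-level metadata.

    `variables` maps a column name to a dict with an optional "label"
    and an optional "value_labels" dict (code -> label). This input
    layout is an assumption made for this sketch.
    """
    measured = []
    for var_name, meta in variables.items():
        entry = {"@type": "PropertyValue", "name": var_name}
        if meta.get("label"):
            entry["description"] = meta["label"]
        if meta.get("value_labels"):
            # Assumption: value labels are flattened into one readable
            # string, since this sketch does not model them separately.
            entry["value"] = ", ".join(
                f"{code} = {label}"
                for code, label in meta["value_labels"].items()
            )
        measured.append(entry)
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "variableMeasured": measured,
    }


# Hypothetical two-variable survey; no data values are exported.
example = metadata_to_jsonld(
    name="Example survey",
    description="Toy dataset used to illustrate a metadata-only export.",
    variables={
        "sex": {"label": "Sex of respondent",
                "value_labels": {"1": "female", "2": "male"}},
        "age": {"label": "Age in years"},
    },
)
print(json.dumps(example, indent=2))
```

Only labels and value labels are written out, which is what makes the result suitable for embedding in a codebook page.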

Since you write about making a FOSS replacement for Stat/Transfer and Sledgehammer, I was wondering whether you consider exporting to metadata-only formats within scope. I'd love to be able to embed metadata in various formats in my codebooks, but I don't think I'll tackle writing to DDI in R on my own. State of my own research: ipumsr is a package that currently reads DDI, and r2ddi is the only attempt at writing DDI that I know of (and the developer has abandoned it).

rubenarslan avatar Mar 23 '18 13:03 rubenarslan

I've wanted to support DDI (briefly mentioned it here: https://github.com/leeper/rio/issues/12) but don't have the ambition to write a full DDI package. It's unfortunately too complex to bootstrap and there don't seem to be any existing tools that we could draw on directly: https://www.ddialliance.org/resources/tools

My preference would be for a separate DDI package that handles import/export that we could then use here rather than incorporating it directly into rio.

PS - fantastic re: codebook!

leeper avatar Mar 23 '18 14:03 leeper

I think the DDI spec is good at crushing ambitions 😄 I asked @gergness (ipumsr imports DDI) what IPUMS uses; he'll ask.

rubenarslan avatar Mar 23 '18 15:03 rubenarslan

Sadly, I don't think we'll be able to make our tools for DDI writing available any time soon. If helpful, ipumsr has code to read the DDIs generated by our site. However, the spec is huge and I only implement a minimal set of features that allowed me to read our extracts, so I'm not sure it will be.

gergness avatar Mar 28 '18 14:03 gergness

@gergness In your honest opinion, do you think DDI has the potential to spread to the FOSS world? The FOSS ecosystem seems so limited, and it was so much easier to get going with JSON-LD...

rubenarslan avatar Mar 28 '18 14:03 rubenarslan

Ha, I can only think of this: https://xkcd.com/927/

I don't really think there's anything that special about DDI. If you can get a data import/export round trip working for JSON-LD faster than with DDI, I'd go with that.

gergness avatar Mar 28 '18 14:03 gergness

@gergness I guess the killer app for research dataset metadata is good search. I don't know of anything that uses DDI for search across platforms. Do you know if anything is in the works, or did I miss something? Because JSON-LD for dataset search is also just a promise without a timeline, I suppose.

rubenarslan avatar Mar 28 '18 14:03 rubenarslan

Nope, I'm not aware of anything either.

gergness avatar Mar 28 '18 14:03 gergness

I think the main issue with DDI is that there's no low-level library that implements it fully, so anyone who wants to use it has to start from scratch (see, for example, the Dataverse implementation: https://github.com/IQSS/dataverse/blob/3c7d647cbb2b5cf33b9c40276c4a69de73da16a5/src/main/java/edu/harvard/iq/dataverse/export/ddi/DdiExportUtil.java). That ultimately undermines its utility, as it's too complex to really justify investing time in for a niche project.

leeper avatar Mar 28 '18 14:03 leeper

See https://github.com/dusadrian/DDIwR/ btw.

rubenarslan avatar Jan 08 '19 13:01 rubenarslan