ProvToolbox
Unmodified PROV-JSON documents cannot be stored in MongoDB due to field name restrictions
Field names in MongoDB cannot start with a dollar sign, so PROV-JSON documents cannot be stored without first being modified. This limitation is normally overcome by escaping the dollar sign (e.g. using the Unicode full-width equivalent).
It's very likely that similar limitations could be found in other document-oriented databases where field names usually have semantic meaning. I would be interested in having a general mechanism to make prov-json documents compatible with existing database systems.
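For illustration, the escaping workaround mentioned above could be applied as a pre-processing step before insertion. This is only a minimal sketch, not something ProvToolbox provides: the helper names `escape_keys` and `unescape_keys` are hypothetical, and the sample document is a small PROV-JSON-style fragment made up for the example.

```python
# Hypothetical helpers (not part of ProvToolbox): swap a leading "$" in
# field names for the full-width dollar sign U+FF04 and back again.
FULLWIDTH_DOLLAR = "\uff04"


def _rename_keys(node, old_prefix, new_prefix):
    """Recursively replace a leading old_prefix in dict keys with new_prefix."""
    if isinstance(node, dict):
        renamed = {}
        for key, value in node.items():
            if key.startswith(old_prefix):
                key = new_prefix + key[len(old_prefix):]
            renamed[key] = _rename_keys(value, old_prefix, new_prefix)
        return renamed
    if isinstance(node, list):
        return [_rename_keys(item, old_prefix, new_prefix) for item in node]
    return node  # strings, numbers, booleans, None are left untouched


def escape_keys(document):
    """Return a copy whose field names no longer start with '$'."""
    return _rename_keys(document, "$", FULLWIDTH_DOLLAR)


def unescape_keys(document):
    """Reverse escape_keys after reading the document back."""
    return _rename_keys(document, FULLWIDTH_DOLLAR, "$")


# Made-up fragment with a typed literal that uses the "$" key.
doc = {"entity": {"ex:e1": {"ex:size": {"$": "42", "type": "xsd:int"}}}}
escaped = escape_keys(doc)
restored = unescape_keys(escaped)
```

The round trip is only lossless if no original field name already starts with the full-width dollar sign, so a real implementation would need to pick an escape that cannot occur in its own data.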
Additional References:
Hi @etorres,
Thank you for bringing this to our attention. We'd love to support the escaping you suggested as an output option.
We're currently using the gson library for JSON serialization. Allow me some time to investigate how gson can support this kind of escaping.
Has this been addressed at all in the last few years? We are facing the same issue and would love to be able to use the default output directly in MongoDB.
We are routinely using the more recent prov-jsonld serialisation with MongoDB without problems.
@lucmoreau Thank you for your response. I have tried it and it looks like it should work. Is there a way to control which keys are added to the output though?
For example:
...
"@graph" : [ {
"@type" : "prov:Agent",
"@id" : "pfx:entity::CompanyName",
"efg:resourceType" : [ {
"@value" : "entity"
} ],
"prov:role" : [ ],
"prov:value" : [ ],
"prov:type" : [ "prov:Organization" ],
"prov:label" : [ ],
"prov:location" : [ ]
}, {
...
I don't need all of the empty fields to be present, and I don't believe that they are required either.
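One possible workaround, independent of the serialiser, is to strip the empty arrays after serialisation and before inserting into the database. The sketch below is an assumption-laden illustration rather than a ProvToolbox feature: `drop_empty_arrays` is a hypothetical helper, and it assumes that an empty array carries no meaning for whatever later reads the document back.

```python
# Hypothetical post-processing step (not a ProvToolbox option): remove
# every field whose value is an empty list from a parsed JSON-LD document.
def drop_empty_arrays(node):
    """Return a copy of node without dict entries whose value is []."""
    if isinstance(node, dict):
        cleaned = {}
        for key, value in node.items():
            value = drop_empty_arrays(value)
            if isinstance(value, list) and not value:
                continue  # skip fields such as "prov:role": []
            cleaned[key] = value
        return cleaned
    if isinstance(node, list):
        return [drop_empty_arrays(item) for item in node]
    return node


# The node from the example above, wrapped in a minimal "@graph".
document = {
    "@graph": [
        {
            "@type": "prov:Agent",
            "@id": "pfx:entity::CompanyName",
            "efg:resourceType": [{"@value": "entity"}],
            "prov:role": [],
            "prov:value": [],
            "prov:type": ["prov:Organization"],
            "prov:label": [],
            "prov:location": [],
        }
    ]
}
compact = drop_empty_arrays(document)
```

Here `compact["@graph"][0]` keeps only `@type`, `@id`, `efg:resourceType` and `prov:type`. Whether the trimmed document still deserialises correctly with ProvToolbox would need to be checked.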
Sorry, I have not been able to find how to control this in the serialisation, but suggestions most welcome 😉