python-odml
Bug in the rdflib graph representation while using turtle format
The bug appears in the rdflib graph representation when serializing to the Turtle format. Depending on the first character of the string (in our case the id) that is appended to the namespace when building the URI, the Turtle serializer prints the node either as a full URI or as a plain prefixed string. If the id starts with a digit, the full URI is printed as expected; if it starts with a letter, the node is printed as a string (odml:<id>) instead of the URI. The problem occurs only with Turtle; RDF/XML output is fine. (See the code snippet below.)
We also tested whether this affects the graph itself when a file previously saved in Turtle format is loaded again. It does not: only the printed representation is affected, while the parsed graph still contains the correct URIs. (See the test code below.)
The issue was created for tutorial or documentation purposes.
import uuid

from rdflib import Graph, Namespace, RDF, URIRef
g = Graph()
ns = Namespace("http://g-node.org/odml-rdf#")
g.bind('odml', ns)
hubNode = URIRef(ns + uuid.uuid4().urn[9:])
docNode = URIRef(ns + uuid.uuid4().urn[9:])
g.add((hubNode, RDF.type, ns.Hub))
g.add((docNode, RDF.type, ns.Document))
g.add((hubNode, ns.hasDocument, docNode))
# serialize() returns bytes in rdflib < 6; in rdflib >= 6 it returns str, so drop .decode() there.
print(g.serialize(format='turtle').decode("utf-8"))
Graph outputs for various ids:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<http://g-node.org/odml-rdf#173dac73-2ffc-46d5-9635-52b3d08c1e9a> a odml:Hub ;
odml:hasDocument odml:e818923e-7765-428e-88cd-4b5eb5b5b06d .
odml:e818923e-7765-428e-88cd-4b5eb5b5b06d a odml:Document .
###############################
odml:e7888eee-5041-4cea-aea1-b95a32a71c85 a odml:Hub ;
odml:hasDocument <http://g-node.org/odml-rdf#4e5fdd78-5d4f-4f88-ac82-bd22ca4eb1ce> .
<http://g-node.org/odml-rdf#4e5fdd78-5d4f-4f88-ac82-bd22ca4eb1ce> a odml:Document .
#################################
odml:e4871192-282c-4f2b-882c-67e5f690d550 a odml:Hub ;
odml:hasDocument odml:ac5a6f45-2f9e-4439-a84d-b2f1c23f2c37 .
odml:ac5a6f45-2f9e-4439-a84d-b2f1c23f2c37 a odml:Document .
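Which spelling a node gets in the outputs above depends on the first character of its id. The following standard-library sketch shows the id form used in the snippet and classifies the leading character; `new_id` and `starts_with_digit` are hypothetical helper names introduced here for illustration only.

```python
import uuid


def new_id() -> str:
    # Same id form as in the snippet: the URN "urn:uuid:<id>"
    # with its 9-character "urn:uuid:" prefix sliced off.
    return uuid.uuid4().urn[9:]


def starts_with_digit(identifier: str) -> bool:
    # Hypothetical helper, not part of rdflib or python-odml.
    return identifier[0].isdigit()


# Generate a few ids and report how each one starts.
ids = [new_id() for _ in range(5)]
for identifier in ids:
    kind = "digit" if starts_with_digit(identifier) else "letter"
    print(identifier, "->", kind)
```

UUID version 4 ids are hexadecimal, so the first character is one of 0-9 or a-f, and both spellings will show up in practice.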
The test that was used to analyze the problem:
import uuid

from rdflib import Graph, Namespace, RDF, URIRef
g = Graph()
ns = Namespace("http://g-node.org/odml-rdf#")
g.bind('odml', ns)
hubNode = URIRef(ns + uuid.uuid4().urn[9:])
docNode = URIRef(ns + uuid.uuid4().urn[9:])
g.add((hubNode, RDF.type, ns.Hub))
g.add((docNode, RDF.type, ns.Document))
g.add((hubNode, ns.hasDocument, docNode))
print(g.serialize(format='turtle').decode("utf-8"))
# Save the Turtle serialization to disk.
data = g.serialize(format='turtle').decode("utf-8")
with open("./python-odml/rdf_dev/example_odmls/test1.xml", "w") as f:
    f.write(data)

# Load the saved file into a fresh graph and serialize it again.
g = Graph()
g.parse(source="./python-odml/rdf_dev/example_odmls/test1.xml", format="turtle")
data = g.serialize(format='turtle').decode("utf-8")
print(data)
# Check which terms the hasDocument triple actually holds after the round trip.
print(list(g.subject_objects(predicate=ns.hasDocument)))
with open("./python-odml/rdf_dev/example_odmls/test2.xml", "w") as f:
    f.write(data)
Contents of test2.xml (identical to test1.xml):
<http://g-node.org/odml-rdf#0e06bd34-2670-4a19-ba59-6a3118a6c8ed> a odml:Hub ;
odml:hasDocument odml:c7c57f18-7642-4344-bc49-eb1b9903cb7d .
odml:c7c57f18-7642-4344-bc49-eb1b9903cb7d a odml:Document .
## Output: subject and object in the triple with hasDocument predicate
[(rdflib.term.URIRef('http://g-node.org/odml-rdf#0e06bd34-2670-4a19-ba59-6a3118a6c8ed'),
rdflib.term.URIRef('http://g-node.org/odml-rdf#c7c57f18-7642-4344-bc49-eb1b9903cb7d'))]