node-solid-server
test RDF in a data island in the HTML
@melvincarvalho How can I help you ?
- I can create a branch
- I can review your code. The problem is I don't know how a data island is defined in HTML. Which tag? I suppose we need to parse the HTML content.
- Just as a reminder, a container/ with index.html automatically serves index.html. How do you expect to render RDF? With an Accept header?
An HTML file with a data island example may help me to understand.
@bourgeoa thanks for looking at this
A structured data island is simply:
<script type="application/ld+json" id="data">
{
"json-ld": "goes here"
}
</script>
Then the content of the script would be whatever the JSON-LD is for that resource
So in the case where we have mashlib in a script an extra script tag is inserted with the data too
This would make all of the different mime types give back consistent RDF
Does that make sense?
It's probably worth noting that such data islands may be in additional formats (media types), either in parallel with or instead of JSON-LD. At OpenLink Software, we commonly inject data islands in both JSON-LD and Turtle. Other media types have also been used in experiments but are not commonly parsed, so are not commonly injected.
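To make the structure concrete, here is a minimal sketch of how data islands in more than one media type could be located in a page. The helper name findDataIslands, the sample page and the two-entry type list are assumptions made for illustration; nothing here is taken from NSS or OpenLink code, and a regular expression stands in for a real HTML parser.

```javascript
// Media types treated as RDF in this sketch (an assumption, not an official list).
const RDF_TYPES = ['application/ld+json', 'text/turtle']

// Hypothetical helper: collect every <script> whose type is an RDF media type.
// A regular expression is enough for a sketch; production code should use an HTML parser.
function findDataIslands (html) {
  const islands = []
  const scriptRe = /<script\b([^>]*)>([\s\S]*?)<\/script>/gi
  for (const [, attrs, content] of html.matchAll(scriptRe)) {
    const type = (/\btype\s*=\s*["']([^"']+)["']/i.exec(attrs) || [])[1]
    const id = (/\bid\s*=\s*["']([^"']+)["']/i.exec(attrs) || [])[1]
    if (type && RDF_TYPES.includes(type.toLowerCase())) {
      islands.push({ type: type.toLowerCase(), id, content: content.trim() })
    }
  }
  return islands
}

// Sample page with one JSON-LD island, one Turtle island and one ordinary script.
const page = `<html><head>
<script type="application/ld+json" id="data">{ "@id": "#me" }</script>
<script type="text/turtle" id="data-ttl"><#me> a <#Person> .</script>
<script>console.log('not a data island')</script>
</head><body></body></html>`

for (const island of findDataIslands(page)) {
  console.log(island.type, island.id, island.content)
}
```

The ordinary script is skipped because it has no RDF type attribute, so the same page can carry both mashlib and the data.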
@TallTed, do you have an example we could use as a template for learning?
This article should be a help.
So my understanding is the following:
For a file example.html
- a data island is defined by these 2 elements:
  - a script tag. (Can there be multiple occurrences? Consider only one for now.)
  - a type, which can be any RDF contentType: Turtle, JSON-LD, or RDF/XML.
  Note: if multiple occurrences of a data island are allowed, then a third parameter is needed. The id must be unique within the HTML document. It is not specific to the script tag. It may be a 'data-*' attribute.
- text/html shall be considered RDF by the server.
What should be the result of:
- 'GET' on an HTML document with an Accept header contentType:
  - return the content of a script tag with type=any RDF contentType, converted to the requested contentType
- 'POST', 'PUT', 'DELETE' act on the HTML document, including the data island
- 'PATCH' acts on the data island.
PR #1715 implements the following:
- a script block <script type="RDF contentType" id="data">RDF content</script>.
  - the id is not a MUST and is not used by NSS
  - a data island can be discovered anywhere in the HTML resource
  - both the opening <script> tag and the closing </script> tag are needed
  - the created or modified data island script is always inserted just before the closing </head> tag
The data island is fetched with:
- GET, which returns an RDF resource depending on the RDF contentType in the Accept header:
  - text/turtle, text/n3, application/ld+json, application/rdf+xml
  - no Accept header, or text/html, returns the usual HTML resource
- PATCH creates or modifies the HTML resource:
  - by default a new data island is created with a text/turtle contentType
  - an existing data island is modified using the existing data island type parameter
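The create-or-modify rule described for PATCH can be sketched as follows. This is not the code from the PR: upsertDataIsland is a hypothetical helper, the regular expression is a stand-in for real HTML parsing, and writing id="data" on the new tag is an assumption. It only illustrates the stated behaviour: keep the existing type if an island exists, default to text/turtle otherwise, and always place the script just before the closing head tag.

```javascript
const RDF_TYPES = ['text/turtle', 'text/n3', 'application/ld+json', 'application/rdf+xml']

// Hypothetical helper: replace the first data island, or create one, before </head>.
function upsertDataIsland (html, rdf, defaultType = 'text/turtle') {
  const re = /<script\b[^>]*\btype\s*=\s*["']([^"']+)["'][^>]*>[\s\S]*?<\/script>/gi
  let existing = null
  for (const m of html.matchAll(re)) {
    if (RDF_TYPES.includes(m[1].toLowerCase())) { existing = m; break }
  }
  // An existing island keeps its type; a new one gets the default.
  const type = existing ? existing[1] : defaultType
  // Remove the old island (wherever it was found) before inserting the new one.
  const rest = existing
    ? html.slice(0, existing.index) + html.slice(existing.index + existing[0].length)
    : html
  const headEnd = rest.search(/<\/head>/i)
  if (headEnd === -1) throw new Error('document has no closing </head> tag')
  const island = `<script type="${type}" id="data">\n${rdf}\n</script>\n`
  return rest.slice(0, headEnd) + island + rest.slice(headEnd)
}

const original = '<html><head><title>t</title></head><body>test</body></html>'
const created = upsertDataIsland(original, '<> a <#Test> .')
const updated = upsertDataIsland(created, '<> a <#Other> .')
console.log(created)
console.log(updated)
```

Note that the sketch stores the RDF text as given. Converting a patch result into the island's existing type would need an RDF library and is left out.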
Question: should PATCH allow choosing the type in which the data island is stored through the Accept header? Is using the Accept header this way Solid compliant?
This is fantastic!
Could the default be application/ld+json or configurable, say, in the NSS config? Reason being that parsing JSON is native to the browser and easy
Unsure about the PATCH operation, isn't that server wide?
I tried running the dataIsland branch locally and managed to log in. But I was unable to see a data island in the webid profile that was created. Will have a look to see if there's anything obvious that can be fixed
@melvincarvalho
Could the default be application/ld+json or configurable
- The default is only used in PATCH. You can always use PUT to create an HTML resource with a JSON-LD data island.
- Yes, it is possible to default to JSON-LD, but I was under the impression that Solid usually defaults to Turtle.
- Making it configurable implies passing a parameter, in a header I suppose. Nothing is available for that in the current N3 patch (Solid v0.9).
But I was unable to see a data island in the webid profile that was created
Well, a WebID is not an HTML resource. Data islands are created client side, in HTML documents.
@bourgeoa ah, i see, thank you
re: patch, yes turtle I think is best default in that case
Would it be possible to generate data islands on the server side?
I'm not sure if there are many benefits to making changes on the client side
Would it be possible to generate data islands on the server side? I'm not sure if there are many benefits to making changes on the client side
I'm not sure I understand what you are looking for. Create an HTML document server side? At pod creation? In other situations? When? Why?
A data island is just a way to store RDF data in an HTML document. Dokieli is another way. If you want to produce a data island together with the HTML body, you need to create a specification. I haven't seen any.
Create an html document server side
No, just the same way it's done today
When we get an HTML file it contains mashlib, and that file is given to the browser by node solid server
What I'm saying is that, as well as adding mashlib, give back a data island in the RDF so that it's consistent with the other mime types
The way to test this would be to run curl against the file and see if the data island is there. I'm trying to write a test for this in the test suite, to explain it better
So NSS when it has a GET request, and gives back HTML, also pulls in the JSON-LD and puts it in a script tag
So NSS when it has a GET request, and gives back HTML, also pulls in the JSON-LD and puts it in a script tag
Where is the JSON-LD located ? can you give an html content with JSONLD content ? Is this what you are at https://www.w3.org/2012/sde/ ?
can you give an html content with JSONLD content ?
Yes you put the RDF in a SCRIPT tag inside the HTML
This is how most of the semantic web works today, outside of Solid. Having RDF in HTML would bring solid up to par with the majority of existing semantic web
Example
Alice has a <webid>
curl <webid>
Gives back:
- html page
- mashlib script tag
- script tag with RDF in JSON-LD
The RDF for the webID is stored on the server, but returned by node solid server
So exactly as we have today, but now HTML files also have RDF, just like the other mime types
Is this what you are at https://www.w3.org/2012/sde/
Yes, this would be an excellent tool for testing
Side thought: It might be possible that the JSON-LD returned from NSS and the HTML returned from NSS could be almost identical
JSON-LD returned by NSS
{
JSON-LD-HERE
}
HTML returned by NSS
<html>
...
...
<script>
{
JSON-LD-HERE
}
</script>
... mashlib here
<body> here
</html>
This might be relatively easy to code if the same view is copied from JSON-LD to HTML, and some scaffolding added. If I get some cycles free, I might give this a try in a local branch
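A rough sketch of that idea, assuming the JSON-LD for the resource is already available as a JavaScript object and the databrowser page is available as a string. The names toDataIsland and injectIsland, the shell HTML and the sample data are all hypothetical; this is not how get.js is written today. One detail worth showing is the escaping of "<", since a literal closing script tag inside the data would otherwise end the island early.

```javascript
// Serialise JSON-LD for embedding in HTML. Escaping "<" as \u003c keeps a
// literal "</script>" inside the data from closing the script element.
function toDataIsland (jsonld) {
  const json = JSON.stringify(jsonld, null, 2).replace(/</g, '\\u003c')
  return `<script type="application/ld+json" id="data">\n${json}\n</script>`
}

// Hypothetical scaffolding: add the island to an HTML shell just before </head>.
// A replacer function is used so "$" sequences in the data are not interpreted.
function injectIsland (shellHtml, jsonld) {
  return shellHtml.replace(/<\/head>/i, () => toDataIsland(jsonld) + '\n</head>')
}

// Stand-in for the databrowser page; the real page has more markup.
const shell = '<html><head><script defer="defer" src="/mashlib.min.js"></script></head><body></body></html>'
const data = { '@id': '#me', 'http://xmlns.com/foaf/0.1/name': 'Alice </script>' }

console.log(injectIsland(shell, data))
```

Everything already in the shell stays in place; the only change is one extra script tag. JSON.parse on the island's text gives back the original object, because \u003c is an ordinary JSON string escape.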
I think I have isolated the code that does this:
https://github.com/nodeSolidServer/node-solid-server/blob/main/lib/handlers/get.js#L84
I might be able to change the resource mapper a bit so that it brings back JSON-LD then put that into the HTML with the databrowser config setting
Mashlib is an app running in the browser that lets you browse documents in one or more pods, giving different representations depending on RDF data, content negotiation or actions (create/edit ...).
An HTML document doc.html (text/html) was always returned as doc.html (text/html), containing all the original HTML content, including head/script/body with all scripts, be they JavaScript or data islands.
My PR only adds content negotiation.
If doc.html contains a data island script, then you can ask for it with a GET and the Accept header application/ld+json, and receive the document doc.html as application/ld+json. When there is no data island, GET returns 404.
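The decision described here can be sketched as a pure function. This is an illustration under assumptions, not the PR's code: negotiate and firstDataIsland are hypothetical names, q-values in the Accept header are ignored, and the conversion from the island's type to the requested type is left out (the sketch only reports both types).

```javascript
const RDF_TYPES = ['text/turtle', 'text/n3', 'application/ld+json', 'application/rdf+xml']

// Hypothetical helper: first RDF data island in the document, or null.
function firstDataIsland (html) {
  const re = /<script\b[^>]*\btype\s*=\s*["']([^"']+)["'][^>]*>([\s\S]*?)<\/script>/gi
  for (const [, type, content] of html.matchAll(re)) {
    if (RDF_TYPES.includes(type.toLowerCase())) {
      return { type: type.toLowerCase(), content: content.trim() }
    }
  }
  return null
}

// Decide what a GET on an HTML resource would answer for a given Accept header.
function negotiate (html, accept) {
  const wanted = (accept || 'text/html')
    .split(',')
    .map(t => t.split(';')[0].trim().toLowerCase())
  const rdfType = wanted.find(t => RDF_TYPES.includes(t))
  // No Accept header, text/html, or no RDF type requested: the usual HTML resource.
  if (!rdfType || wanted.includes('text/html')) {
    return { status: 200, contentType: 'text/html', body: html }
  }
  const island = firstDataIsland(html)
  if (!island) return { status: 404, contentType: null, body: '' }
  // When sourceType differs from contentType the body still needs converting
  // with an RDF library; that step is omitted in this sketch.
  return { status: 200, contentType: rdfType, sourceType: island.type, body: island.content }
}

const doc = '<html><head><script type="text/turtle" id="data"><> a <#Test> .</script></head><body></body></html>'
console.log(negotiate(doc, 'text/turtle'))
console.log(negotiate(doc, undefined).contentType)
console.log(negotiate('<html><head></head><body></body></html>', 'application/ld+json').status)
```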
https://github.com/nodeSolidServer/node-solid-server/blob/main/lib/handlers/get.js#L84
This line just tells if that URL can be displayed using mashlib app. If that URL is an entry point for mashlib app.
A pod URL pointing to an html document is not displayed with mashlib but directly by the browser and contains all the html including the data island if any.
@bourgeoa the content type text/html should return RDF. Right now it doesn't
The way to fix this is to put the JSON-LD inside a script tag in the HTML as shown above
Doing it client side does not fix the issue. It can be tested here:
https://www.w3.org/2012/sde/
I believe it can be fixed here:
https://github.com/nodeSolidServer/node-solid-server/blob/main/lib/handlers/get.js#L84
By changing the content pulled in by the resource mapper. If I get time I'll have a go locally and a proof of concept
@melvincarvalho As you can see the data island is there for this URL https://bourgeoa.solidcommunity.net/public/alain.html
Exactly what mashlib give in the source-pane
@bourgeoa that looks beautiful!
I can confirm it works with curl:
curl https://bourgeoa.solidcommunity.net/public/alain.html
<html>
<script type="text/turtle" id="data">
<> a "test".
</script>
<body>test data island</body>
Fantastic!
A few things:
- the closing </html> tag seems missing
- body should be empty
- I can't see the mashlib script (maybe I'm wrong)
- Is there some way that the type can be set (say in a config) to json-ld?
In short, everything should be exactly how it was before. With html / head / body / mashlib. The only difference is one extra script tag containing RDF. So the change to the page should be quite minor.
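As a sketch of how those points could be checked automatically on a response body (for example the output of curl), the function below reports whether the mashlib script, a data island and the closing html tag are present. The name checkHtmlResponse and the two sample strings are assumptions for illustration; the samples are shortened versions of responses quoted in this thread.

```javascript
// Hypothetical check for an HTML response body.
function checkHtmlResponse (html) {
  return {
    // A script tag whose src mentions mashlib.
    hasMashlib: /<script\b[^>]*\bsrc\s*=\s*["'][^"']*mashlib[^"']*["']/i.test(html),
    // A script tag whose type is one of the RDF media types.
    hasDataIsland: /<script\b[^>]*\btype\s*=\s*["'](text\/turtle|text\/n3|application\/ld\+json|application\/rdf\+xml)["']/i.test(html),
    // The document ends with a closing html tag.
    hasClosingHtml: /<\/html>\s*$/i.test(html)
  }
}

const withMashlib = '<html><head><script defer="defer" src="/mashlib.min.js"></script></head><body id="PageBody"></body></html>'
const withIsland = '<html>\n<script type="text/turtle" id="data">\n<> a "test".\n</script>\n<body>test data island</body>'

console.log(checkHtmlResponse(withMashlib))
console.log(checkHtmlResponse(withIsland))
```

A response that meets the goal stated above would report true for all three fields.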
Another point, while RDF is being returned in this one file, RDF needs to be returned by NSS for every file
Example Resource
https://bourgeoa.solidcommunity.net/public/approxlocation.ttl
HTTP GET with Curl
curl -H "Accept: text/html" https://bourgeoa.solidcommunity.net/public/approxlocation.ttl
What is returned (no RDF)
<html><head><meta charset="utf-8"/><title>SolidOS Web App</title><script>document.addEventListener('DOMContentLoaded', function() {
panes.runDataBrowser()
})</script><script defer="defer" src="/mashlib.min.js"></script><link href="/mash.css" rel="stylesheet"></head><body id="PageBody"><header id="PageHeader"></header><div class="TabulatorOutline" id="DummyUUID" role="main"><table id="outline"></table><div id="GlobalDashboard"></div></div><footer id="PageFooter"></footer></body></html>
What SHOULD be returned (includes RDF)
<html><head><meta charset="utf-8"/><title>SolidOS Web App</title>
<!--- DATA ISLAND SHOULD GO IN HERE -->
<script>document.addEventListener('DOMContentLoaded', function() {
panes.runDataBrowser()
})</script><script defer="defer" src="/mashlib.min.js"></script><link href="/mash.css" rel="stylesheet"></head><body id="PageBody"><header id="PageHeader"></header><div class="TabulatorOutline" id="DummyUUID" role="main"><table id="outline"></table><div id="GlobalDashboard"></div></div><footer id="PageFooter"></footer></body></html>
It is bad HTML, but it is only an example. I mistyped the closing html tag. There is no mashlib script because, as I explained, a URL pointing to an HTML document does not use mashlib.
But https://bourgeoa.solidcommunity.net/public/, which is a container URL, does use the mashlib app. (Containers have a Turtle representation.)
There is no mashlib script
Mashlib is needed. It is there today. It should not be removed.
Nothing should be removed, only a script tag added, which contains RDF
A good example to test would be: https://bourgeoa.solidcommunity.net/public/approxlocation.ttl
Another point, while RDF is being returned in this one file, RDF needs to be returned by NSS for every file
@timbl do you agree with that ? This seems a very interesting point.
@bourgeoa wrote:
a script tag. (Can there be multiple occurrences ? consider only one for now)
Yes, there may be multiple script tag occurrences. It's often best to have one occurrence per media type, but as long as each occurrence has a unique id (or no id), this limit need not be observed.
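The "unique id or no id" condition can be checked with a small sketch like the one below. The helper name duplicateIslandIds and the sample markup are hypothetical, and the check only looks at data island script tags; in a full HTML document an id has to be unique across all elements, which this sketch does not verify.

```javascript
const RDF_TYPES = ['text/turtle', 'text/n3', 'application/ld+json', 'application/rdf+xml']

// Hypothetical helper: ids that are used by more than one data island.
function duplicateIslandIds (html) {
  const seen = new Set()
  const duplicates = new Set()
  for (const [, attrs] of html.matchAll(/<script\b([^>]*)>/gi)) {
    const type = (/\btype\s*=\s*["']([^"']+)["']/i.exec(attrs) || [])[1]
    if (!type || !RDF_TYPES.includes(type.toLowerCase())) continue
    const id = (/\bid\s*=\s*["']([^"']+)["']/i.exec(attrs) || [])[1]
    if (id === undefined) continue // islands without an id never collide
    if (seen.has(id)) duplicates.add(id)
    seen.add(id)
  }
  return [...duplicates]
}

const twoTypes = '<script type="application/ld+json" id="data-jsonld">{}</script>' +
  '<script type="text/turtle" id="data-ttl"></script>' +
  '<script type="text/turtle"></script>'
const clash = '<script type="application/ld+json" id="data">{}</script>' +
  '<script type="text/turtle" id="data"></script>'

console.log(duplicateIslandIds(twoTypes))
console.log(duplicateIslandIds(clash))
```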