solid-spec
Remove globbing
This PR implements the (pending!) proposal to remove globbing at https://github.com/solid/solid-spec/issues/145
It bypasses clarifying the definition of globbing (https://github.com/solid/solid-spec/pull/148) by removing it altogether, given that this currently seems to be what the majority wants.
Just putting it out here as a possible option, no rush.
Client-side globbing alternative implemented in https://github.com/solid/ldp-glob; live demo at https://solid.github.io/ldp-glob/demo.html?https://drive.verborgh.org/public/
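For readers unfamiliar with the idea, client-side globbing amounts to listing a container, filtering its members on the client, and fetching each match individually. The following is a minimal Python sketch of that idea only, not the implementation in ldp-glob. The function name `client_side_glob`, its parameters, and the injected `fetch` callable are all hypothetical, and the member names are assumed to have already been extracted from the container listing.

```python
from concurrent.futures import ThreadPoolExecutor
from fnmatch import fnmatchcase
from typing import Callable, Dict, Iterable
from urllib.parse import urljoin


def client_side_glob(
    container_url: str,
    member_names: Iterable[str],
    fetch: Callable[[str], str],
    pattern: str = "*",
    max_workers: int = 8,
) -> Dict[str, str]:
    """Fetch every container member whose name matches `pattern`.

    Assumptions (hypothetical, for illustration only):
    - `member_names` was already read from the container listing;
      parsing that listing is out of scope here.
    - `fetch` is any callable that maps a URL to a response body.
    """
    # Filter on the client instead of asking the server to expand a glob.
    urls = [
        urljoin(container_url, name)
        for name in member_names
        if fnmatchcase(name, pattern)
    ]
    # A bounded pool limits how many requests are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        bodies = list(pool.map(fetch, urls))
    return dict(zip(urls, bodies))
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client; the server only ever sees ordinary single-resource GET requests.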
This would be for version 0.8 of the spec then. We need to discuss the timeline for that. I agree with Ruben about removal of globbing in the next spec version, but I agree with Melvin about moving slowly and not breaking things every few weeks. About timeline, my gut says let's do a next spec version 0.8 in December, and not rock the boat before that. But let's discuss that in the next weekly meeting!
Discussed out of band with @melvincarvalho; this should be on hold until he and @timbl can discuss.
@michielbdejong Yes, but we should keep people from starting to implement globbing if it is going to be removed, so a note or label in the spec would be useful. And of course there is https://github.com/solid/solid-spec/pull/148, which aligns the spec with the actual situation.
I'm using globbing to retrieve hundreds to thousands of metafiles https://github.com/linonetwo/solid-tiddlywiki-syncadaptor/issues/4#issuecomment-491118236
I can't afford to do this client side, because there will be hundreds to thousands of wiki pages in that container, so there would be a huge number of client-side fetches running concurrently.
@linonetwo have you run benchmarks comparing the globbing approach to https://github.com/solid/ldp-glob with HTTP/2 enabled?
@elf-pavlik Do I need to start solid-server as a library, and use spdy to enable HTTP/2 in my server?
I think you could also just run it behind nginx and enable HTTP/2 in your nginx config
> I'm using globbing to retrieve hundreds to thousands of metafiles linonetwo/solid-tiddlywiki-syncadaptor#4 (comment)
Thanks for sharing this use case, it's good to know what's out there.
May I ask for a bit more detail here?
What you seem to be using is `.meta.*`; however, this kind of pattern is not supported across Solid servers (see #147). The only kind of globbing currently in use is `/*`, that is, all files in a directory.
How does this affect your use case? (For instance, could you put your files in a `meta` subfolder?)
Another question I have is about the necessity of this design choice: could you give us some insights into the motivation for splitting data across this many files? (There might very well exist a more generic motivation, so eager to learn about it.)
A concern I do have is that, even for the server, thousands of files would turn this into a very expensive request, which ties into my DDOS worries regarding globbing (#145).
> I can't afford to do this client side, because there will be hundreds to thousands of wiki pages in that container, so there would be a huge number of client-side fetches running concurrently.

Point taken, except for "concurrently": the browser will take care of this, and with HTTP/2 there should only be a very limited overhead. Emphasis on should, because the per-request cost of NSS is currently too high, so it will be significantly slower with the number of files you are naming.
That said, whatever design decision we make, having thousands of files in a single folder is bound to cause trouble one way or another. Not just for Solid, but for *nix or Windows systems too. So I believe the information architecture here can likely be improved. But please feel free to further expand on your use case, so we understand where the scale comes from.
Well, I've reconsidered it:
- I won't use `xxx.meta` to store "metadata (like tags) generated by the user and my application" anymore, because of https://github.com/solid/solid-spec/issues/168: I can't GET or DELETE `xxx.meta`.
- I will use SPARQL to update and read a single `index.metafile.ttl` instead, and create all files using `Link: <http://www.w3.org/ns/ldp#Resource>; rel="type", <index.metafile.ttl>; rel="describedby"`.
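
To make the plan above concrete, here is a rough Python sketch of the two requests it implies: creating a file with the `Link` header, and adding a tag to the shared index with a SPARQL update. It only builds the requests and does not send them. The helper names, the `text/plain` content type, and the `dcterms:subject` predicate are hypothetical choices for illustration, and whether a given server honors `rel="describedby"` on creation or accepts `application/sparql-update` via PATCH on this file is an assumption that would need testing.

```python
from urllib.request import Request

LDP_RESOURCE = "http://www.w3.org/ns/ldp#Resource"


def build_create_request(url: str, body: bytes, index_url: str) -> Request:
    """Build (but do not send) a PUT that creates a file linked to the index."""
    link = f'<{LDP_RESOURCE}>; rel="type", <{index_url}>; rel="describedby"'
    return Request(
        url,
        data=body,
        method="PUT",
        # The content type is a placeholder; use whatever fits the wiki page.
        headers={"Content-Type": "text/plain", "Link": link},
    )


def build_tag_update(index_url: str, page_url: str, tag: str) -> Request:
    """Build (but do not send) a PATCH that records one tag in the index."""
    # Escape backslashes and double quotes for the SPARQL string literal.
    # Other characters (such as newlines) are not handled in this sketch.
    literal = tag.replace("\\", "\\\\").replace('"', '\\"')
    # dcterms:subject is a hypothetical predicate chosen for illustration.
    query = (
        f"INSERT DATA {{ <{page_url}> "
        f'<http://purl.org/dc/terms/subject> "{literal}" . }}'
    )
    return Request(
        index_url,
        data=query.encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/sparql-update"},
    )
```

With this layout, reading all metadata is a single GET on the index file rather than one request per page, which is what globbing was being used for.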
I'm not sure whether `./meta/index.ttl` or `./index.metafile.ttl` is a good name; see https://github.com/linonetwo/solid-tiddlywiki-syncadaptor/issues/4#issuecomment-491519312.
The reason I chose globbing was that "it's the easier way to get my POC app working, and the documentation is simple and clear". I could actually use SPARQL instead, though I'm not sure whether it will work.
I drew a picture while thinking about this; it may describe the motivation better. I'm creating a saver plugin for TiddlyWiki, which is a semantic wiki: