unstorage feat(drivers): webdav driver

🔗 Linked issue

Issue #275

❓ Type of change

[x] 📖 Documentation (updates to the documentation, readme, or JSdoc annotations)
[ ] 🐞 Bug fix (a non-breaking change that fixes an issue)
[ ] 👌 Enhancement (improving an existing functionality like performance)
[x] ✨ New feature (a non-breaking change that adds functionality)
[ ] 🧹 Chore (updates to the build process or auxiliary tools and libraries)
[ ] ⚠️ Breaking change (fix or feature that would cause existing functionality to change)

📚 Description

Webdav driver for unstorage powered by the webdav NPM package.

📝 Checklist

[x] I have linked an issue or discussion.
[x] I have updated the documentation accordingly.

Considerations

The driver currently only implements read-only functions.

To-do

[ ] Implement setItem and removeItem, leveraging putFileContents and deleteFile

Jul 23 '23 22:07 shareable-vision

Live Preview ready!

Name	Edit	Preview	Latest Commit
unstorage	Edit on Studio ↗︎	View Live Preview	a01ef02ab91e19f5d89b471fea1df143c207a6f1

Jul 23 '23 22:07 nuxt-studio[bot]

Driver (read-only) is tested and working with Nuxt Content.

Jul 24 '23 10:07 shareable-vision

Hi dear @shareable-vision i love the idea of supporting webdav as a driver!

Checking the webdav npm package used, it seems little bit overcomplex. I think we probably can directly implement a webdav client only using native fetch API.

An (old) example to inspire: https://github.com/ai-zelenin/webdav-client-fetch/blob/master/ts/index.ts

Do you think you can try to reimplement it with fetch only and no dependencies? 🙏🏼

Jul 24 '23 14:07 pi0

Do you think you can try to reimplement it with fetch only and no dependencies? 🙏🏼

That's a great idea. I will work on it.

Jul 24 '23 20:07 shareable-vision

WebDav server sends XML data. Although testing with NextCloud, I'm cross-referencing with other authorities and avoiding any namespaced properties, keeping it minimal and hopefully generally compatible.

Native fetch is great . .. but there is not a Node equivalent of browser-native DOMParser.

If we can use xml2js, then I'm almost done re-implementing read-only driver. Installed in an empty repo, xml2js clocks in at about 250kb. Or xpath.

Any idea for reliably parsing XML with no dependencies? —folks highly discourage parsing XML with regex. What do you think?

—I'm really glad you put me onto this!

Jul 25 '23 02:07 shareable-vision

After taking time to consider, I don't think we should use a manual regex approach for parsing the XML for the sake of having zero dependencies, while I really appreciate that you got me to thinking about the efficiency of the package that we're leveraging.

WebDav is a standardization of XML —the parsing of the XML responses is key to the function of the driver, and it is reasonable to require that a user install an additional peer dependency.

fast-xml-parser looks like it will be a good fit.

Jul 25 '23 20:07 shareable-vision

Hi dear @shareable-vision. Sorry for answering late and thanks for willing to continue working on this.

Nice point about XML parser lacking. Looking at webdav dependencies, they use fast-xml-parser i would recommend to try it.

Usng bundlephobia.com, fast-xml-parser is 25KB vs JSDom 2.4 MB.

We might find something even smaller but i think that's a good start.

Jul 25 '23 20:07 pi0

Just pushed re-implementation of read-only version of the Webdav driver using 'fast-xml-parser'.

I will work on implementing setItem() and removeItem() next.

Will have to study Webdav resource locking.

Jul 27 '23 04:07 shareable-vision

I want to work on this a little longer and make some improvements. One very major improvement that I see now will be to use the Depth header —currently implementation is recursively re-fetching subdirectories and awaiting a Promise.all. I tried the header with a faulty value ('Infinity') —now I see that a value of 'infinity' will eliminate the re-fetching and speed up getKeys() —specs say that server should support "infinity requests" —could implement re-fetching as a fall-back mechanism, but it's probably not necessary.

I'm also mulling over the advantages of polling. Will study the other drivers more about this . . I think where I'm landing is that it would be good to provide optional caching and polling, which are each disabled with a default value of 0.

WebDAV uses 'etag' to version files. When re-syncing does need to happen, the etag can be referenced to quickly validate the cache.

Jul 27 '23 04:07 shareable-vision

Thanks for updates and nice work on this @shareable-vision

I have marked PR as draft in meantime please ping when felt ready

I will also check locally in coming days and try to help if fine for you 👍🏼

Jul 27 '23 15:07 pi0

Perfect :+1:

Did implement Depth: "infinity" header as default with config option to fallback if unsupported —much better.

Will do write functions as best as I can without getting into the weeds about locked resources.

Will touch base soon.

Jul 27 '23 22:07 shareable-vision

Almost ready for review. Open to feedback.

setItem() and removeItem() are implemented, honoring all WebDAV specs requirements.

setItem() will ensure that the target parent directory exists by making MKCOL requests on each non-existent ancestor directory. If 'type' key on opts is set to "directory", setItem() will make a directory on WebDAV storage instead of uploading a file.

WebDAV defines etag as a versioning mechanism that can be used for cache validation. When getKeys() updates meta for an item, if the etag changes, then the item.body is deleted, thus is no longer "cached".

Concerns:

Can optimize PUT request?: I'm not sure if the PUT requests may be optimized by using a ReadableStream. Probably because they are already utf8-encoded, this won't make any difference . .

What should be the desired behavior of the ttl key and caching? In my mind, the most optimal solution for WebDAV is for the consumer of the driver to receive a Webhook from the WebDAV service and then intentionally re-fetch keys/meta by calling getKeys(). Unstorage itself is the "cache". getItem() and getMeta() should immediately return available data, fetching only as needed. There should be no reason to repeatedly call getKeys() unless the consumer is expecting the values to have changed, or else are unsure and are polling the source.

My thinking previously was that an undefined ttl might signify ("Do not cache"), however another reasonable interpretation would be (TTL of zero or undefined => Always re-fetch), which would be the opposite of my intention.

Polling? I haven't implemented any polling (which might be configurable via an 'interval' key). I'm not sure if it's a good idea. It seems like the driver consumer can do that if they wish simply by re-calling getKeys() as needed. If polling were implemented, the consumer would have to provide a callback function to be called when source changes. Actually configuring an interval, or manually polling may be more efficient than using TTL —when polling, the fetching is happening in the background, not in the middle of a user's request.

Jul 30 '23 01:07 shareable-vision

Driver is ready for review.

Jul 30 '23 16:07 shareable-vision

unstorage unstorage copied to clipboard

feat(drivers): webdav driver

🔗 Linked issue

❓ Type of change

📚 Description

📝 Checklist

Considerations

To-do

✅ Live Preview ready!

Concerns:

unstorage
unstorage copied to clipboard

Live Preview ready!