hyperdrive
readdir not waiting for metadata to be downloaded before firing callback
drive.readFile() will wait for its metadata to finish downloading before firing its callback. drive.readdir() does not.
Here's an example:
const hyperdrive = require('hyperdrive')
const swarm = require('hyperdiscovery')
const ram = require('random-access-memory')
var a1 = hyperdrive(ram)
var a2
a1.ready(function () {
  swarm(a1)
  a2 = hyperdrive(ram, a1.key)
  a1.writeFile('/foo', 'bar', function () {
    a2.ready(function () {
      swarm(a2)
      // readdir and readFile have different behavior
      a2.readdir('/', function (err, data) {
        console.log('readdir', data) // === []
      })
      a2.readFile('/foo', function (err, data) {
        console.log('readfile', data) // buffer
      })
      // you can see metadata is still being downloaded when the readdir callback is fired
      a2.metadata.on('download', (idx, data) => console.log('download', idx, data))
    })
  })
})
Looks like readdir() and other similar methods need to wait for _ensureContent(), just like createReadStream()?
@mafintosh mentioned that he's been using a trick where you use archive.metadata.update({ifAvailable: true}, () => archive.readdir()) so that it tries to download some available blocks before reading.
However this doesn't work unless you have a peer connected.
At the moment I've been experimenting with doing something like the following:
- use readdir('/')
- If it isn't empty, we're good to go
- else, wait for a peer to join, and use archive.metadata.update
- have a timeout to account for peers never joining
It's pretty messed up tbh. 😅 Gonna keep iterating to see if I can simplify it.
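The steps listed above could be sketched roughly as follows. This is only an illustration, not code from hyperdrive or the SDK: `waitForListing` and `timeoutMs` are made-up names, and the sketch assumes `archive.metadata` is an event emitter exposing `peers`, a `peer-add` event and `update({ ifAvailable: true }, cb)`, as used elsewhere in this thread.

```javascript
// Sketch only: `waitForListing` and `timeoutMs` are hypothetical names.
// Assumes `archive.readdir(path, cb)` and an `archive.metadata` feed with
// `peers`, a 'peer-add' event and `update({ ifAvailable: true }, cb)`.
function waitForListing (archive, timeoutMs, cb) {
  let done = false
  let timer = null

  // Make sure the callback fires exactly once and clean up after ourselves
  function finish (err, list) {
    if (done) return
    done = true
    clearTimeout(timer)
    archive.metadata.removeListener('peer-add', onPeer)
    cb(err, list)
  }

  // A peer is connected: fetch whatever metadata is available, then list again.
  // Any error from update() is ignored here; we just read what we have.
  function onPeer () {
    archive.metadata.update({ ifAvailable: true }, function () {
      archive.readdir('/', finish)
    })
  }

  // Step 1: try the local listing first
  archive.readdir('/', function (err, list) {
    if (err) return finish(err)
    // Step 2: a non-empty listing means metadata is already there
    if (list.length) return finish(null, list)
    // Step 4: stop waiting if no peer ever joins
    timer = setTimeout(function () {
      finish(new Error('Timed out waiting for a peer'))
    }, timeoutMs)
    // Step 3: wait for a peer (or use one we already have), then update
    if (archive.metadata.peers.length) onPeer()
    else archive.metadata.once('peer-add', onPeer)
  })
}
```

One caveat with this approach: a drive that is genuinely empty is indistinguishable from one whose metadata has not arrived yet, so it would always wait for a peer or the timeout.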
Actually, this code seems to be working okay-ish:
const someArchive = Hyperdrive(SOME_URL)
reallyReady(someArchive, () => {
  someArchive.readdir('/', console.log)
})
function reallyReady (archive, cb) {
  let wasReady = false
  archive.metadata.once('sync', tryReady)
  archive.readdir('/', function (e, d) {
    if (e) return
    if (!d.length) return
    console.log('Already loaded metadata?')
    wasReady = true
    cb()
  })
  function tryReady () {
    if (wasReady) return
    console.log('Got a sync event so it must be loaded')
    wasReady = true
    cb()
  }
}
This is something that interests me as well! Glad you're tracking it @RangerMauve
Yeah, it's been a pain point for @serapath 's work with the SDK so I'm looking to see how to alleviate it. :)
Waiting for append instead of sync is prob a lot faster. The optimal flow is using the update method though, as that makes it update beyond the first load, i.e. you’ll always get the latest update.
If you want it to wait forever, I’d suggest hooking up the peer-add event and retrying with ifAvailable then.
K, check this out:
const someArchive = Hyperdrive(SOME_URL)
reallyReady(someArchive, () => {
  someArchive.readdir('/', console.log)
})
function reallyReady (archive, cb) {
  if (archive.metadata.peers.length) {
    // already connected to a peer, so update right away
    archive.metadata.update({ifAvailable: true}, cb)
  } else {
    // otherwise wait for the first peer before updating
    archive.metadata.once('peer-add', () => {
      archive.metadata.update({ifAvailable: true}, cb)
    })
  }
}
A timeout should be wrapped around the call rather than built in, since that's a bit more opinionated.
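For illustration, wrapping a timeout around the call might look something like this. `withTimeout` is a hypothetical helper, not something provided by hyperdrive or the SDK. It only guards the callback, so whatever it wraps (for example the `peer-add` listener in `reallyReady`) stays registered after the timeout fires.

```javascript
// Hypothetical helper, not part of hyperdrive or the SDK.
// Wraps a callback so it fires exactly once: either with the real result,
// or with a timeout error after `ms` milliseconds, whichever comes first.
function withTimeout (ms, cb) {
  let called = false
  const timer = setTimeout(function () {
    if (called) return
    called = true
    cb(new Error('Timed out after ' + ms + 'ms'))
  }, ms)
  return function (...args) {
    if (called) return
    called = true
    clearTimeout(timer)
    cb(...args)
  }
}

// Possible usage with the reallyReady function from this comment
// (the 5000ms value is an arbitrary choice):
//
// reallyReady(someArchive, withTimeout(5000, function (err) {
//   if (err) console.log('No update arrived in time, using local data')
//   someArchive.readdir('/', console.log)
// }))
```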
Also, this shouldn't be invoked if the application somehow knows it's offline. (Does hyperswarm provide this?)
I'm going to have this in the SDK docs for now, maybe later on we can figure out if this is something we can integrate directly into append-tree or hypertrie.
This pattern should be incorporated into the daemon, I think. Exposing enough stuff through the RPC API sounds like it'll be a PITA.
CC @andrewosh
The daemon already does ifAvailable updates before returning any calls :)
Perfect! :D
And so does hyperdrive 10 in general through the trie btw
How does hypertrie avoid the situation where you don't have any local peers to wait for updates from?
It doesn’t, but it always does an ifAvailable update.
You are touching on an interesting point though, as there is no perfect solution to the situation you describe, which is why the stack doesn’t magically do it for you except for updating if available.
You have to “play the map”.
What are your requirements? Do you want to “block” until a peer appears? Are you in an offline environment? Do you want to fully sync? Do you want to wait for a bit and then return an old snapshot? It all depends on what you are trying to build.
We can expose primitives and options to help guide you, but at the end of the day the stack can’t solve this for everyone, so it only does the least opinionated thing it can: update ifAvailable.