How do you tell when a dat download has finished?

Open ryuheiyokokawa opened this issue 5 years ago • 4 comments

I am reporting:

  • [x] a bug or unexpected behavior
  • [ ] general feedback
  • [x] feature request
  • [ ] security issue

Bug Report

  • Operating system: macOS High Sierra / Fedora 27 (both)
  • Node Version: 8.9.3
  • dat-node Version: 3.5.12 (whatever is pulling from npm as latest at the moment)

Expected behavior/methods

We can't tell definitively and safely when the dat has finished downloading. Maybe have something like dat.on('finished-download') or similar?

Actual behavior

There is no documentation on how we should detect a finished download.

const stats = dat.trackStats()
stats.on('update', () => {
    const newStats = stats.get()
    // length and downloaded are block counts (numbers), so compare them directly
    if (newStats.length === newStats.downloaded) {
        // Is it done?
    }
})
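
The check above can be wrapped in a small predicate. This is a minimal sketch, assuming stats.get() returns numeric block counts in `length` and `downloaded`; `isDownloadComplete` is a hypothetical helper name, not part of dat-node. It also guards against the case where both counts are still zero because nothing is known about the archive yet, which a plain equality check would report as finished.

```javascript
// Hypothetical helper: not part of dat-node.
// Assumes `stats` looks like { length: <total blocks>, downloaded: <blocks fetched> }.
function isDownloadComplete (stats) {
  if (!stats || typeof stats.length !== 'number' || typeof stats.downloaded !== 'number') {
    return false
  }
  // Zero total blocks may simply mean "nothing known yet", so it is not treated as done.
  return stats.length > 0 && stats.downloaded >= stats.length
}

// Plain objects stand in for the result of stats.get():
console.log(isDownloadComplete({ length: 0, downloaded: 0 })) // false
console.log(isDownloadComplete({ length: 4, downloaded: 2 })) // false
console.log(isDownloadComplete({ length: 4, downloaded: 4 })) // true
```

Even when this returns true, it only says the blocks were fetched. It says nothing about whether the files have been written to disk yet.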

Also looked into this:

dat.archive.stat(newDatPath, { wait: true }, () => {})

This seems closer, since it looks at the hyperdrive archive to see whether it has finished. It didn't make the next issue go away, though: there appears to be a delay between the time the blocks are downloaded and the time they are written to disk. Is this why you use RAM and mirror the files over, as a way of knowing whether the files have finished downloading? We saw that in the example here.

We weren't sure we'd want to use RAM for something that could be gigs in size. Thoughts?

ryuheiyokokawa, Sep 14 '18 23:09

Hey! There are a few approaches to this, and unfortunately it can get complicated quickly depending on the state of your archive.

The most basic is to either:

  • Use the replication option live: false and wait for the replication stream to end (see the docs on replication).
  • If you are only fetching the latest version, you can listen for archive.on('sync'). Unfortunately this is not documented, but you can see it in the code. See how we use it in the command line for tips.

Beyond that, it can get more complicated depending on whether there are partial downloads, etc. Happy to fill in more there if the above does not work.
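
Both basic approaches come down to waiting for a single event. The following is a minimal sketch, not dat-node's own code: `onFirstSync` and `onReplicationEnd` are hypothetical helper names, and it assumes the archive and the replication stream behave like Node.js event emitters, with the stream emitting `end` (or `error`) the way ordinary Node streams do. Plain `EventEmitter` instances stand in for the real objects so the snippet runs on its own.

```javascript
const { EventEmitter } = require('events')

// Hypothetical helper: call `done` the first time the archive reports 'sync'.
// `once` removes the listener after the first event, so `done` runs at most once.
function onFirstSync (archive, done) {
  archive.once('sync', () => done(null))
}

// Hypothetical helper: call `done` when a non-live replication stream ends or fails.
function onReplicationEnd (stream, done) {
  let called = false
  const finish = (err) => {
    if (called) return // never report completion twice
    called = true
    done(err || null)
  }
  stream.once('end', () => finish(null))
  stream.once('error', finish)
}

// Stand-ins for dat.archive and a replication stream (assumptions for this demo).
const fakeArchive = new EventEmitter()
const fakeStream = new EventEmitter()

const log = []
onFirstSync(fakeArchive, () => log.push('synced'))
onReplicationEnd(fakeStream, (err) => log.push(err ? 'failed' : 'replicated'))

fakeArchive.emit('sync')
fakeArchive.emit('sync') // ignored: the listener was removed after the first event
fakeStream.emit('end')

console.log(log) // [ 'synced', 'replicated' ]
```

With a real archive, the first helper would presumably be called as onFirstSync(dat.archive, callback). Registering with `once` rather than `on` keeps the callback from running again if the event is emitted more than one time.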

joehand, Sep 17 '18 19:09

@joehand Thanks for the tricks! We're using archive.on('sync') and that seems to do it. We're currently not doing sparse or versions so should be okay for now. Thanks!

ryuheiyokokawa, Sep 17 '18 20:09

archive.on('sync') seems to fire an increasing number of times the longer I listen to it, if that makes sense. If I do dat.archive.on('sync', () => console.dir(JSON.parse(fs.readFileSync('path/to/file')))) I can reliably make it fire once on the first sync, twice on the second, etc.

jedahan, Mar 14 '19 00:03

@jedahan yep, seems like that may be related to #233.

joehand, Mar 15 '19 19:03