NotFoundError when an indexed file is removed
After WebDB has indexed a file, if that file is then removed (outside of WebDB), a NotFoundError is thrown. The record for that file seems to remain in the database.
This may be a "works as designed" situation, as I may not be thinking of all use cases. However, it seems like a good idea to handle this error and unindex the file when an indexed file is removed from the archive. This would probably be a catch (maybe emitting an 'index-file-missing' event or something) on archive.download() in the following line:
archive.fileEvents.addEventListener('invalidated', ({path}) => archive.download(path))
https://github.com/beakerbrowser/webdb/blob/master/lib/indexer.js#L48
I'm happy to submit a PR, but thought I'd open an issue to be sure I'm not missing something.
Hmm! A file deletion is intended to be interpreted as a deletion of the record and the data should be unindexed. See this code. Maybe this is a bug?
So it looks like deletion is indeed happening. When I initially looked, there was apparently some lingering cruft in the database. I cleared the site data, and things have been running smoothly since. Further, I added some debug output at the location you linked, ran some tests, and verified that unindexFile does fire and the record is deleted. However, the NotFoundError is still thrown on this line. I'll continue testing to try to understand what's happening here.
Ok sounds like two bugs:
1. Indexed data got out of sync with the source data
2. Calling .download() on invalidate is triggering an error when the file is deleted
Not sure if we should bother fixing 2, or just suppress the error, but 1 is something I've noticed myself when there's a connectivity issue. I think the issue is that WebDB doesn't properly account for download failures. I'm not sure why deleting data wouldn't get handled, though, so I'll need to poke at it a bit.
So, at least in my case, I think the index got out of sync as I was developing the source archive. Data changed a little bit, and I did a lot of starting and stopping of the archive. I have a feeling that's where things diverged. After clearing the db in beaker two days ago I've yet to see lingering records re-emerge. I'll keep my eye out to see if I can catch it happening, if it ever happens again.
I've looked into issue 2 with a variety of setups, and it seems to happen every time. The error does not seem to hinder any function, and is mostly an annoyance. Would it be best to catch the error right there at the invalidate .download() call? If so I can take a stab at a PR.
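For reference, here is a rough sketch of what catching at that call could look like. It is not WebDB's actual code: `downloadOnInvalidate` and the `onMissing` callback (standing in for the 'index-file-missing' event floated earlier) are hypothetical names, and it assumes the rejected error exposes `name === 'NotFoundError'`. Only the missing-file case is suppressed, so other download failures (e.g. connectivity problems) still surface.

```javascript
// Hypothetical helper, not part of WebDB's API.
// `archive` is any object with a download(path) method that returns a promise.
// `onMissing` is a hypothetical callback standing in for an 'index-file-missing' event.
function downloadOnInvalidate (archive, path, onMissing) {
  return archive.download(path).catch(err => {
    // Assumption: the deletion case rejects with an error named 'NotFoundError'.
    if (err && err.name === 'NotFoundError') {
      // The file was deleted from the archive, so there is nothing to download.
      if (onMissing) onMissing({ url: archive.url, path })
      return
    }
    // Anything else is still a real download failure and is passed on.
    throw err
  })
}

// Possible wiring, mirroring the line quoted from lib/indexer.js:
// archive.fileEvents.addEventListener('invalidated', ({path}) =>
//   downloadOnInvalidate(archive, path, evt => console.debug('index-file-missing', evt)))
```

Filtering on the error name rather than swallowing every rejection keeps this from hiding the separate out-of-sync problem, where a failed download may be the only visible symptom.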
Thanks, but I'll write some tests and figure out how to handle download failures first.