absurd-sql icon indicating copy to clipboard operation
absurd-sql copied to clipboard

IDB benchmark would probably be faster with `getAll()`

Open nolanlawson opened this issue 3 years ago • 4 comments

Hey, this is an awesome project and really technically impressive. Hats off!

Just wanted to point out that the IDB benchmark for summing each item would probably be faster using getAll() rather than cursors. It avoids the back-and-forth of iterating through each item in the object store with a cursor.

Now of course, for the 100,000 case, you probably don't want to read 100,000 items into memory at once. But even in that case, you can use batching with getAll() to implement a faster "cursor" (e.g. fetch 100 items at a time instead of just 1).

I did this years ago for PouchDB (https://github.com/pouchdb/pouchdb/pull/6033); it was a ~50% speedup at the time. So absurd-sql would probably still win! :slightly_smiling_face: But it might be interesting to see a comparison with a more "optimized" IDB usage.

nolanlawson avatar Aug 12 '21 23:08 nolanlawson

That's really interesting! I'll try that out tomorrow. Thanks for pointing it out. I would have thought that cursors would have been the fastest possible (they do seem to be that way in Firefox; fetching the next item with a cursor there only takes ~0.02ms, while Chrome is an order of magnitude slower).

I'll play with it!

jlongster avatar Aug 13 '21 02:08 jlongster

@nolanlawson you are correct! Switching to getAll and summing 400,000 items takes ~3100ms instead of a wild ~8100ms. That's Chrome btw, with cursors Firefox only takes ~5100ms (I've found cursors to be a lot faster in Chrome). I've been using 400,000 items as my stress test.

I really expected cursors to be the fastest possible. I'll have to think about this. The problem is that when we are at the read level, we don't know which blocks to expect in the future. We could "prefetch" 10 blocks at a time or something like that, but we'd easily overfetch as SQLite sometimes jumps around and does random access.

Not sure what the right heuristic is. It's probably worth exposing options so apps can fine-tine it for their app.

And yeah, absurd-sql still only takes ~1200ms to sum 400,000 items :)

It's hard to know what's a fair benchmark. In SQLite's case, it's not having to keep everything in memory, so it's not really fair to call getAll and shows those numbers side-by-side. Not sure what the right call is.

jlongster avatar Aug 14 '21 03:08 jlongster

I'll do some more digging, but I think cursors might be find in absurd-sql's case. The problem with cursors seems to be the jumping back and forth between IDB and JS. But because we are effectively batching reads, we're not jumping back and forth nearly as much. If I sum 400,000 items, there's not nearly the same amount of jumping back and forth, there's just the occasional block read when it needs more data. Using a cursor is definitely faster than using individual get requests, so the only other option would be calling getAll with a range that load the next 5 blocks or something. But I'm skeptical that will show much improvement; it might be slightly faster, but there's probably enough cases where it overfetches that voids out the perf benefit.

(I know this isn't what you're saying, just thinking about the right heuristics we should use)

jlongster avatar Aug 14 '21 03:08 jlongster

That's interesting, yeah. If you know what kind of data you're working with, then you can probably choose a reasonable batch size. But for a generic database like absurd-sql it's kind of hard to make that call on behalf of the user. In that PouchDB issue I mentioned overfetching but said I thought it was worth the tradeoff (maybe it was; can't recall :sweat_smile:).

As for what's a fair comparison, I think it's fine either way. You used IDB cursors because that's the idiomatic approach – most IDB users are probably not implementing batched cursors unless they run into perf problems and even know it's a solution. The browser vendors should make cursors faster!

nolanlawson avatar Aug 18 '21 14:08 nolanlawson