
Is there any way to speedup count({}) performance?

Open ElderOrb opened this issue 8 years ago • 4 comments

... for example by creating additional indexes or using some other approach?

ElderOrb avatar Mar 10 '16 21:03 ElderOrb

I do believe that the current method already works as fast as possible, especially if you specify an empty query or no query at all; see https://github.com/sergeyksv/tingodb/blob/master/lib/tcoll.js#L620. It basically just adds to the operation queue a function that checks the collection's size directly. If it is slow, it can only mean that other operations (reads, writes) are running on the collection in parallel. By the way, how slow is it for you? Do you have any measurements?

sergeyksv avatar Mar 11 '16 08:03 sergeyksv
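
To illustrate the point about the operation queue, here is a minimal sketch, not tingodb's actual code. The class `QueuedCollection` and all of its members are hypothetical names made up for this example. It assumes operations on a collection run one at a time, so a count that only reads the size of the in-memory index is cheap in itself but still has to wait for whatever was queued before it.

```javascript
// Hypothetical sketch, NOT tingodb's real implementation: a collection that
// serializes operations through a queue and answers count() from its index size.
class QueuedCollection {
  constructor() {
    this.keys = new Map();  // in-memory index: _id -> document
    this.queue = [];        // pending operations, executed one at a time
    this.running = false;
  }

  // Append an operation; it starts only after all earlier ones have finished.
  enqueue(op) {
    this.queue.push(op);
    if (!this.running) this.drain();
  }

  // Run the next queued operation, if any.
  drain() {
    const op = this.queue.shift();
    if (!op) { this.running = false; return; }
    this.running = true;
    op(() => this.drain());
  }

  insert(doc, cb) {
    this.enqueue((done) => {
      // The timer stands in for slow disk I/O.
      setTimeout(() => { this.keys.set(doc._id, doc); cb(null); done(); }, 20);
    });
  }

  count(cb) {
    this.enqueue((done) => {
      // No documents are read here, only the size of the index.
      cb(null, this.keys.size);
      done();
    });
  }
}

const coll = new QueuedCollection();
coll.insert({ _id: 1 }, () => {});
// This count is cheap, but it completes only after the insert above is done.
coll.count((err, n) => console.log('count:', n));
```

In this sketch the count callback is delayed by the pending insert, which matches the explanation above: a slow count would point at other work queued on the same collection.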

Here are the numbers for 125734 items (~157 MB):

51 s 9.08 s 14 s 9.12 s

// prettyHrtime is assumed to come from the pretty-hrtime package.
// Run count({}) several times in sequence and print how long each call took.
var start = process.hrtime();
Entry.countQ({}).then(function(count) {
  console.log(prettyHrtime(process.hrtime(start)));
  start = process.hrtime();
  return Entry.countQ({});
}).then(function(count) {
  console.log(prettyHrtime(process.hrtime(start)));
  start = process.hrtime();
  return Entry.countQ({});
}).then(function(count) {
  console.log(prettyHrtime(process.hrtime(start)));
  start = process.hrtime();
  return Entry.countQ({});
}).then(function(count) {
  console.log(prettyHrtime(process.hrtime(start)));
  start = process.hrtime();
  return Entry.countQ({});
}).then(function(count) {
  // The fifth call is not timed; it only prints the result.
  console.log('count: ', count);
});

The first run is much slower than the subsequent ones (51 s vs 9-14 s), which is probably to be expected because the in-memory indexes have not been created yet. But why is it still slow after the first run? Can it be made faster?

By the way, is there any way to store the indexes in another file so they can be reused after tingodb is re-initialized?

ElderOrb avatar Mar 11 '16 11:03 ElderOrb

The first time is to be expected, because a full collection scan happens on the first access to a collection (in order to rebuild the indexes). However, subsequent calls should take just milliseconds, because no data access should happen. It looks like you are using some wrapper (mongoose, ...)? I don't recognize the countQ function. Non-persistent indexes were one of the initial design decisions, made to keep things simple.

sergeyksv avatar Mar 11 '16 12:03 sergeyksv

Yes, I'm using mongoose, and countQ is nothing more than 'count' wrapped so that it returns a Q promise.

entrySchema.index({ hash: 1 }, { unique: true })
entrySchema.set('autoIndex', true);

entrySchema.index({ uploaded: 2}, { unique: false })
Entry = mongoose.model('Entry', entrySchema);
Entry.countQ = Q.nfbind(Entry.count.bind(Entry));
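
For readers unfamiliar with Q.nfbind, the wrapping can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Node's built-in `util.promisify` instead of Q (so it returns a native promise, not a Q promise), and `FakeModel` is a hypothetical stand-in for the mongoose model so the snippet runs on its own.

```javascript
const util = require('util');

// Hypothetical stand-in for a mongoose model; only count(query, callback) is modeled.
const FakeModel = {
  docs: [{ hash: 'a' }, { hash: 'b' }, { hash: 'c' }],
  count(query, callback) {
    // `this` must be the model, which is why the method is bound before wrapping.
    const n = this.docs.length;
    setImmediate(() => callback(null, n));
  }
};

// Same idea as Q.nfbind(Entry.count.bind(Entry)): turn a Node-style
// callback function into a function that returns a promise.
FakeModel.countQ = util.promisify(FakeModel.count.bind(FakeModel));

FakeModel.countQ({}).then((count) => console.log('count:', count));
```

The wrapper itself adds no meaningful work per call, so in a setup like this the time measured for countQ should be dominated by whatever the underlying count does.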

I'm OK with non-persistent indexes (although it would be nice to have an API for dumping them and then reloading them manually), but if subsequent calls should take just milliseconds, then something is definitely wrong with my code, the mongoose wrapper, or something else in my setup.

I need to check the performance without any wrappers.

ElderOrb avatar Mar 11 '16 12:03 ElderOrb
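
A check without wrappers could be sketched like this. The helper `timeCalls` and the stand-in `fakeCount` are hypothetical names introduced for this example; `fakeCount` only simulates a callback-style count so the snippet runs without tingodb installed.

```javascript
// Time `runs` sequential calls of a Node-style function fn(query, callback).
// Returns a promise for an array of elapsed times in milliseconds.
function timeCalls(fn, runs) {
  const timings = [];
  function once(i) {
    if (i === runs) return Promise.resolve(timings);
    return new Promise((resolve, reject) => {
      const start = process.hrtime();
      fn({}, (err) => {
        if (err) return reject(err);
        const [sec, nano] = process.hrtime(start);
        timings.push(sec * 1e3 + nano / 1e6);
        resolve();
      });
    }).then(() => once(i + 1));
  }
  return once(0);
}

// Stand-in for a real count call; replace it with the function under test.
function fakeCount(query, callback) {
  setTimeout(() => callback(null, 125734), 5);
}

timeCalls(fakeCount, 4).then((ms) => {
  ms.forEach((t, i) => console.log('run ' + (i + 1) + ': ' + t.toFixed(2) + ' ms'));
});
```

To measure tingodb directly, one would presumably pass something like `collection.count.bind(collection)` in place of `fakeCount`, where `collection` is obtained from the tingodb `Db` object; that part is an assumption and is not exercised here. Comparing those timings with the mongoose numbers would show whether the 9-14 s comes from the wrapper or from the database.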