
.ToArray() Performance

Open yazarloo opened this issue 8 years ago • 8 comments

Thank you, dear David Fahlander. I found Dexie to be the best solution for storing and retrieving data on the client side. I implemented a solution mixing Dexie.js and Linq.js. It's amazing :) . But I think there is a bottleneck: when I select all the data from a table with around 100k records and convert it to an array, there is a delay of about 5 seconds. That is an issue I don't know how to overcome. Best regards.

yazarloo avatar Nov 05 '17 06:11 yazarloo

Thanks for reporting. 100,000 records is a lot, and whether 5 seconds is a long time to retrieve 100k records depends on the size of each record. It also depends on which browser and version is being used: most modern browsers support IDBObjectStore.getAll(), which performs the entire query natively without any JS callbacks involved. Since Dexie uses that method on browsers where it's available and applicable (which it is when querying an entire table), Dexie should not be the bottleneck here, unless you are using IE, Edge, or an old version of another browser.
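One way to check where the time actually goes is to time the full-table read in isolation. A minimal sketch, assuming a Dexie instance `db` with a hypothetical populated table `people` (the helper name and table name are invented for illustration):

```javascript
// Minimal timing sketch. On modern browsers Dexie backs a whole-table
// toArray() with IDBObjectStore.getAll(), so this measures the native path.
async function timeFullRead(db) {
  const t0 = performance.now();
  const rows = await db.people.toArray();
  const t1 = performance.now();
  return { count: rows.length, ms: t1 - t0 };
}
```

If this alone accounts for most of the 5 seconds, the cost is in deserializing the records themselves rather than in Dexie's query layer.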

What is the use case exactly? Would it be doable to use some kind of paging?

dfahlander avatar Nov 05 '17 22:11 dfahlander

So my questions are:

  • browser version?
  • approx size of records?
  • your use case explained.

dfahlander avatar Nov 05 '17 22:11 dfahlander

Thank you, dear @dfahlander. Browser version: latest version of Chrome. Approx size of records: 20 MB. Each record has 4 columns (Name, ID, Province, City).

I have a multi-select dropdown that should be filled with ID and Name from the mentioned table, such that the Province and City of each record must be compared against two arrays, Provinces[] and Cities[], that exist in the code. Handling the comparison is possible with linq.js and there is no problem there, but the time-consuming part (about 5 seconds) is the db.table("TableName").toArray() request. You mentioned IDBObjectStore.getAll(); I should test it :) Best regards
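The comparison described in this use case could likely be pushed into IndexedDB itself rather than done in linq.js after a full toArray(). A hedged sketch, assuming 'Province' is declared as an index in the schema; the table and column names are taken from the comment above, and the helper name is invented:

```javascript
// Sketch: let the Province index narrow the result set instead of
// loading all ~100k rows with toArray() and filtering in JS.
// Assumes 'Province' is listed as an index in db.version(...).stores().
async function loadDropdownItems(db, provinces) {
  const rows = await db.table('TableName')
    .where('Province').anyOf(provinces) // index-based, no full scan
    .toArray();
  return rows.map((r) => ({ id: r.ID, name: r.Name }));
}
```

anyOf() is a standard Dexie WhereClause method, so only the rows matching one of the given provinces ever leave IndexedDB.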

yazarloo avatar Nov 06 '17 11:11 yazarloo

Same issue for me. With 5000 elements, I read 2000 with:

let records = db.mycollection.where({...}) // it takes 1 ms
records.toArray(arr => { records = arr }) // it takes 5 s

Skeinet avatar Mar 17 '20 11:03 Skeinet

This is still a major issue in 2024, years after it was originally posted. In my tests, even when a table has no data and toArray() is supposed to return an empty array, it still takes 2 s and above; when there are 20 items in the table, it still takes more than 2 s. But if I remove the toArray() call, performance indicates the query happens within 600 ms. I understand the asynchronous nature of IndexedDB may be to blame: opening a transaction, creating a cursor, and iterating over the potential entries may carry some overhead. But is there no way around it? Like checking if the table is empty before iterating, or minimizing promises in valueMapper, maybe? Just some thoughts.
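One of the ideas floated here, checking whether the table is empty before doing heavier work, can be done cheaply with Dexie's count(), which maps to IDBObjectStore.count() and never materializes any record values. A minimal sketch (note also that the very first query on a connection pays the one-time cost of opening the database, which Dexie does lazily, and that can dominate such small measurements):

```javascript
// Cheap emptiness probe: count() avoids reading any record values,
// so it stays fast regardless of record size.
async function tableIsEmpty(table) {
  return (await table.count()) === 0;
}
```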

DK013 avatar Sep 26 '24 14:09 DK013


There's a big difference in query performance depending on whether an index is used for the query or Dexie has to go through every record and compare "manually". A query such as db.items.filter(x => /foo/.test(x.bar)).toArray() may take a long time if the table contains many records, even though the end result would be an empty array.

On the other hand, a query that uses an index, such as db.items.where('bar').startsWith('foo').toArray(), would be extremely fast no matter the size of the table.

Queries that are fast are index-based queries that do not require a cursor. These are:

  • equals
  • startsWith
  • below
  • above
  • between

Adding limit(x) is also fast, but using offset(x) will force Dexie to use cursors. Cursor-based queries should also be fast unless the result is large or there's a filter attached.
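The distinction above can be sketched as two paging helpers (the table and field names 'items' and 'bar' are illustrative, carried over from the examples earlier in this reply):

```javascript
// Fast: pure index range plus limit(); Dexie can satisfy this with a
// native getAll() call bounded by a key range and a count.
const fastPage = (db) =>
  db.items.where('bar').startsWith('foo').limit(50).toArray();

// Slower: offset() forces cursor iteration to skip the first n records
// one by one before collecting results.
const slowPage = (db, n) =>
  db.items.where('bar').startsWith('foo').offset(n).limit(50).toArray();
```

For deep pagination, a common alternative to offset() is keyset paging: remember the last key of the previous page and continue the range query from there.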

dfahlander avatar Sep 26 '24 15:09 dfahlander


Hi, @dfahlander ,

I am on a similar performance issue. In a table with >165k records, I apply

where('[idx1+idx2+idx3]').equals([idx1Value, idx2Value, idx3Value]).and((r) => r.time >= time)

Is this and() requiring a cursor and slowing down my query? Because with >165k records, it takes more than 5 seconds to get 0 records.

Is there another way to filter by the indexes and by time together? time is my PK.

josemarcilio avatar Feb 11 '25 13:02 josemarcilio

Yes, include time with the existing index and use between() as such:

const db = new Dexie('dbname');
db.version(1).stores({
  myTable: 'primaryKey, [idx1+idx2+idx3+time]'
});

function doTheQuery(idx1Value, idx2Value, idx3Value, time) {
  return db.myTable.where('[idx1+idx2+idx3+time]').between(
    [idx1Value, idx2Value, idx3Value, time],
    [idx1Value, idx2Value, idx3Value, Dexie.maxKey]
  ).toArray();
}

This will allow querying in an optimized way (using getAll() under the hood) and utilize a single index for all the criteria, including time.
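The key construction in the snippet above can be isolated as a small pure helper to show why the query matches exactly the wanted range. A sketch; Infinity stands in for Dexie.maxKey here under the assumption that time is a plain number, and the helper name is invented:

```javascript
// The lower bound pins time to the requested start; the upper bound uses
// a sentinel greater than any stored time. With between()'s default
// inclusive-lower/exclusive-upper semantics this covers [time, ∞) for
// the fixed (idx1, idx2, idx3) prefix.
function betweenBounds(idx1, idx2, idx3, time) {
  return {
    lower: [idx1, idx2, idx3, time],
    upper: [idx1, idx2, idx3, Infinity], // Dexie.maxKey in real usage
  };
}
```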

dfahlander avatar Feb 11 '25 15:02 dfahlander