Dexie.js
Dexie.js copied to clipboard
Bulk query - performance report
Hi there. I've noticed that you've requested real-life performance numbers, so I'd like to share my research when hunting for the problem of our slow queries.
DATABASE SETUP
1 Database loaded in browser, multiple tables with data items ranging from 10 items to 20000. WebPack compiling to ES2017 target. Our web app is built on React + Redux stack, using Sagas, hooks and async selectors. I can share more info if requested, but it's not very important as I've tested just the dexie performance itself (see below)
Selector tested on a table with 447 items. Item type:
export type RawCodingItem = {
name: string,
codes_to_meanings: CodesToMeanings, // [code: meaning]
codes_to_concepts?: CodesToMeanings, // [code: concept]
display?: DisplayHierarchy // Display order
}
HW & SW
- Chrome Version 87.0.4280.67 (Official Build) (x86_64) running anonymous window
- MacBook Pro (13-inch, 2019, Four Thunderbolt 3 ports)
- Node v12.7.0
- Dexie 3.0.0-rc7
USE CASE
Select multiple coding items via the primary key. We're selecting about 90% of coding items in our case.
CONTROLLED ENVIRONMENT
- All re-runs were done with exactly the same state of application, with the same database
- all API calls were finished
- no JS was processing in the background
- no repaints or any other DOM manipulation took place
- time measured in the Dexie selector itself (so it's not affected by rest of the stack) with
new Date().getTime()
before and after the selector itself (using await)
RESULTS
SINGLE RUN
(only one selector fired)
Promise.all(name => db.codings.where({name})) - 6836 ms -
const codingObject = await Promise.all(names.map(name => db.codings.where({name}).toArray()));
const filteredCodings = codingObject.flat();
- 6689 ms
- 7183 ms
- 6582 ms
- 5972 ms
- 7758 ms
bulkGet.toArray().filter() - 2318 ms -
const codingObject = await db.codings.bulkGet(names);
const filteredCodings = codingObject.filter(({name}) => names.includes(name));
- 2076 ms
- 2165 ms
- 3083 ms
- 2085 ms
- 2183 ms
Collection.where().anyOf() - 167 ms -
const filteredCodings = await db.codings
.where("name")
.anyOf(names)
.toArray();
- 134 ms
- 244 ms
- 142 ms
- 173 ms
- 141 ms
Collection.filter().toArray() - 140 ms -
const filteredCodings = await db.codings.filter(({name}) => names.includes(name)).toArray();
- 181 ms
- 121 ms
- 158 ms
- 123 ms
- 116 ms
Collection.toArray().filter() - 92 ms -
const codingObject = await db.codings.toArray();
const filteredCodings = codingObject.filter(({name}) => names.includes(name));
- 83 ms
- 114 ms
- 83 ms
- 84 ms
- 96 ms
CONCURRENT RUN
(2 instances of the same selector fired at the same time)
Promise.all(name => db.codings.where({name}))
- 14580 / 10851 ms -
- 13934 / 10402 ms
- 15146 / 11473 ms
- 14660 / 10678 ms
bulkGet.toArray().filter()
- 5030 / 4262 ms -
- 4523 / 4220 ms
- 5948 / 4166 ms
- 4620 / 4399 ms
Collection.where().anyOf()
- 295 / 968 ms -
- 245 / 835 ms
- 266 / 974 ms
- 351 / 1018 ms
- 358 / 1008 ms
- 255 / 1005 ms
Collection.filter().toArray()
- 263 / 879 ms -
- 312 / 899 ms
- 259 / 854 ms
- 296 / 948 ms
- 211 / 773 ms
- 240 / 910 ms
Collection.toArray().filter()
- 138 / 774 ms -
- 85 / 735 ms
- 124 / 768 ms
- 182 / 823 ms
- 206 / 813 ms
- 94 / 732 ms
CONCLUSION
Getting all the items from dexie with toArray()
and then filtering them with JS .filter
function was the fastest in both single and double instance selector. Second fastest in both cases was the Collection.filter().toArray()
method.
Interesting observation is that running two selectors on the same table at the same time causes huge performance problems, so this might be a good start for everyone struggling with UI freeze. Thanks to this I've actually discovered that we were doing exactly that by accident :D
Note: I will add one more performance report tommorow. We have a selector querying 20000 objects and then filtering them. Currently it takes ~500ms for DB query and another ~500ms for JS parsing.
Will look into this further
Thanks a lot @Ruskyj! It would be a great benefit if you could share a link to the source code also. I'd like to trace down the reasons of performance issues with Promise.all(name => db.codings.where({name}))
. In the long term I'd also like to have this test scriptable and make performance reports for both old and future releases.
Access to the source code won't be possible unfortunately, as this is a client's project behind NDA. I can share some extra snippets if needed (our Dexie -> Redux selectors for example).
@dfahlander I've run another set of tests with much bigger table. I've also tried to upgrade to Dexie 3.0.3, with no measurable effect on performance. If you're interested, I can prepare an example of our selector stack (so I won't break the NDA).
DEXIE Fields table run (19627)
Query for 19559 fields (99%), Querying by secondary key
Same app, same database, same environment and testing riles
Table item:
export type RawFieldItem = {
// Field detail
is_hierarchical: boolean,
is_multi_select: boolean,
is_primary_key: boolean,
is_sparse_coding: boolean,
name: string,
primary_table_name: string,
serial_type: FIELD_TYPE,
title: string,
type: FIELD_TYPE,
// Optional keys
coding_name?: string,
concept?: string,
description?: string,
folder_path?: string[],
linkout?: string,
longitudinal_axis_type?: "primary" | "secondary"
units?: string,
mapping?: {
column: string,
column_sql_type: string,
database_id: string,
database_name: string,
database_unique_name: string,
table: string,
},
referenced_entity_field?: {
entity: string,
field: string,
database_id: string,
database_name: string,
database_unique_name: string,
table: string,
column: string,
}
stats?: {
min: number,
max: number,
avg: number,
null_count: number,
}
}
Results
Promise.all(name => db.codings.where({name}))
- N/A ms -
- Not applicable, as we’re filtering by one key value only
bulkGet.toArray().filter()
- N/A ms -
- Not applicable, as required filter is not a primary key
Collection.where().equals()
- 423 ms -
- 420 ms
- 426 ms
- 417 ms
- 423 ms
- 431 ms
Collection.filter().toArray()
- 1614 ms -
- 1457 ms
- 1386 ms
- 1745 ms
- 1854 ms
- 1627 ms
Collection.toArray().filter()
- 339 ms -
- 351 ms
- 339 ms
- 342 ms
- 330 ms
- 335 ms
CONCLUSION
Similar results as with smaller table. Querying everything and then filtering it with vanilla JS is still the fastest.
Interesting change was a big performance drop of Collection.filter
method when dealing with bigger table.
Number for Collection.where().equals() surprice a bit. But it doesn't say so much without knowing how the code is executed, awaited and whether it's in a transaction or not, how large the queried indexes are, etc.
Thanks anyway for sharing the numbers! We will need to create a performance test for Dexie where we have the entire preconditionset - source and data, and continue from there.
@dfahlander I've tried to put our whole selector stack (from DB to React) in this example. Business logic was removed as it's not important for tests. All performance results were performed in the DB selector itself (see comments).
Let me know if this helps you, or I'm doing something really bad here :D
EXAMPLE STACK
// Database constructor
class DeepStateDB extends Dexie {
public fields!: Dexie.Table<FieldItem, string>;
public folders!: Dexie.Table<FolderHierachicalItem, DATASET_FIELD_TYPES>;
public codings!: Dexie.Table<RawCodingItem, string>;
public config!: Dexie.Table<ConfigItem, CONFIG_TYPES>;
public entities!: Dexie.Table<EntityItem, string>;
constructor(recordID) {
super(recordID);
const db = this;
db.version(VERSIONS.DATABASE).stores({
fields: "[entity+name], database, entity, name",
folders: "type",
codings: "name",
config: "type",
entities: "name",
});
this.fields = this.table("fields");
this.folders = this.table("folders");
this.codings = this.table("codings");
this.config = this.table("config");
this.entities = this.table("entities");
}
}
// Database acessor
export const DeepStateRecord = (recordID: string): DeepStateDB => new DeepStateDB(recordID);
// Unviversal selector
async function createDBselector<S extends(...args: any[]) => any>(
recordID: string | undefined,
selectorFunc: S,
...args
): Promise<ReturnType<S>> {
if (!recordID) throw new Error("Can't open a database connection without record-id!");
const db = DeepStateRecord(recordID);
const selectorResult = await selectorFunc(db, ...args);
return selectorResult;
}
// Specific combined selector
export async function asyncSelectPrimaryPhenoField(
db: DeepStateDB,
): Promise<FieldItem | undefined> {
/* ---------------- PERFORMANCE WAS MEASURED HERE ---------------- */
const primaryFieldPath = await asyncSelectPrimaryPhenoFieldPath(db);
const primaryField = await dbSelectSingleField(db, primaryFieldPath);
return primaryField;
}
// DB selector
export async function dbSelectSingleField(
db: DeepStateDB,
fieldPath: FieldID,
): Promise<FieldItem | undefined> {
const singleField = await db.fields
.where(["entity+name"])
.equals(fieldPath)
.first();
return singleField;
}
/* ********* EXAMPLES OF USAGE ********* */
// Sagas
function* exampleSaga(recordID) {
const result = yield call(createDBselector, recordID, asyncSelectPrimaryPhenoField);
return result;
}
// Classic async
async function exampleAsync(recordID) {
const result = await createDBselector(recordID, asyncSelectPrimaryPhenoField);
return result;
}
// Using custom fork of createSyncSelector from "async-selector-kit" with small changes to redux dispatch
export const [selectResult, selectResultLoading] = createAsyncSelectorWithDispatch(
{
async: async(recordID) => createDBselector(recordID, asyncSelectPrimaryPhenoField),
id: "componentSelectorExample",
},
[
selectRecordID,
],
);
Is a new Dexie instance created for every query? If so it could explain a longer response time due to the time it takes for the browser to open the database. It is recommended to have a single instance of Dexie in your app so that it only needs to open on page load. It would also explain why parallel queries would be slow as database opening can only be done one at a time.
I believe we only create one instance per page mount (as we define it once and then use that instance to access any database selector).
Is there an easy way how to test it/count the instances?
The number of instances can be counted using Dexie.connections.length
. It's strange that a plain equals- request takes seconds and not milliseconds. In this fiddle we get a completely different performance, so something must be very different.
It's possible that our abstraction layer is creating some sort of bottleneck. I'll test the number of connections and report back.
Whole implementation is outlined in the "example" stack comment. Please let me know if you see any possible issues straight away.
Any updates on this?
I guess this issue is related.