"this._changes is undefined" when adding records to auto-loaded db & collection
I'm submitting a...
[ ] Regression (a behavior that used to work and stopped working in a new release)
[X] Bug report
[ ] Feature request
[ ] Documentation issue or request
[ ] Support request
Current behavior
This error shows up after saving 500,000 rows, refreshing the browser, and then trying to save the next set of rows. The code to replicate it is simple.
Error Code
Uncaught (in promise) TypeError: this._changes is undefined
_createChange collection.ts:1148
_createInsertChange collection.ts:1158
_insertMetaWithChange collection.ts:1167
insertOne collection.ts:726
insert collection.ts:665
Expected behavior
No error, rows inserted into collection.
Minimal reproduction of the problem with instructions
https://jsbin.com/nejijiberu (copy to a local dev server if needed; it's hard to make LokiDB work on JSBin). Same code as below:
// Assumes the Loki, IndexedStorage, and PartitioningAdapter globals from the LokiDB bundles.
const dbName = 'Test.db'; // any database name works here
let testCollection;       // we will assign it after persistence
let wdb;

IndexedStorage.register();
PartitioningAdapter.register();

let idbAdapter = new IndexedStorage(dbName);
let paAdapter = new PartitioningAdapter(idbAdapter, {
  paging: true,
  pageSize: 5 * 1024 * 1024
});

wdb = new Loki(dbName);
let persistenceOptions = {
  adapter: paAdapter,
  autosave: window.self === window.top,
  autosaveInterval: 1000,
  autoload: window.self === window.top,
  throttledSaves: true,
  persistenceMethod: 'indexed-storage',
};

wdb.initializePersistence(persistenceOptions)
  .then(() => {
    console.log('Loki Persistence initialized');
    // Initialize collection
    testCollection = wdb.getCollection('Test');
    if (!testCollection) testCollection = wdb.addCollection('Test');
    // Display the number of rows in the collection
    const oldRecords = testCollection.find();
    console.log('Number of rows loaded from IndexedDB collection:', oldRecords.length);
    // Use a verbose method to get the maximum id of existing records.
    const oldIds = oldRecords.map((r) => parseInt(r.id, 10));
    const oldIdsLength = oldIds.length;
    let maxId = oldIds[0] || 0;
    for (let i = 1; i < oldIdsLength; ++i) {
      if (oldIds[i] > maxId) {
        maxId = oldIds[i];
      }
    }
    // Fetch placeholder data
    fetch('https://jsonplaceholder.typicode.com/photos')
      .then(response => response.json())
      .then(json => {
        console.log('Enlarging result set from jsonplaceholder...');
        let bSet = json;
        // Make the set 100x larger (5,000 photos -> 500,000 records)
        bSet = [
          ...bSet, ...bSet, ...bSet, ...bSet, ...bSet,
          ...bSet, ...bSet, ...bSet, ...bSet, ...bSet,
        ];
        bSet = [
          ...bSet, ...bSet, ...bSet, ...bSet, ...bSet,
          ...bSet, ...bSet, ...bSet, ...bSet, ...bSet,
        ];
        // Uncomment for another 10x:
        // bSet = [
        //   ...bSet, ...bSet, ...bSet, ...bSet, ...bSet,
        //   ...bSet, ...bSet, ...bSet, ...bSet, ...bSet,
        // ];
        console.log('Cloning', bSet.length, 'records...');
        bSet = JSON.parse(JSON.stringify(bSet)); // clone so we can reset the id field
        console.log('Resetting', bSet.length, 'id fields, starting at', maxId + 1);
        for (let i = 0; i < bSet.length; i += 1) bSet[i].id = i + maxId + 1; // reset id field
        console.log('Adding', bSet.length, 'rows from jsonplaceholder into LokiDB...');
        testCollection.insert(bSet);
        console.log('Added rows from jsonplaceholder. Refresh to see if they stick around.');
      });
  })
  .catch((e) => console.error('initializePersistence error', e.message));
What is the motivation / use case for changing the behavior?
I need to save large numbers of records (about 2 million) across 80 or so collections. The above bug was encountered while I was attempting to build an easy way to replicate a secondary bug.
The secondary bug: With 80 or so collections and ~2M records saved (and showing up in both IndexedDB and in memory), reloading the browser would cause the collections to vanish, even though I can still see the data sitting in IndexedDB. I figured it'd be best to focus on the primary bug first, then try to figure out why something between LokiDB, IndexedStorage, and PartitioningAdapter was failing to autoload the 80+ collections.
Environment
LokiDB version: 2.1.0
Browsers: exact same error in Firefox 81.0 and Chrome 85.0.4183.121.
My current workaround for the main bug is to put
if (!collection.getChanges()) collection.flushChanges();
just before the collection.insert() call, so that the _changes property is created before .insert() runs. Thanks for accepting this as a bug!
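In the repro above, the workaround looks like this:
// Make sure the change log exists before inserting, so that _createChange
// has a this._changes array to push into (a workaround, not a proper fix):
if (!testCollection.getChanges()) {
  testCollection.flushChanges();
}
testCollection.insert(bSet);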
...and kudos for the speed of LokiDB! In my tests, it outperforms LokiJS by a huge factor!
So the problem is somewhere inside the PartitioningAdapter, but I am not sure where; I have never used this one. I will take a deeper look over the weekend.
Thank you so much! I'm looking through the partitioning adapter code and can't find any references to _changes, so I was not sure where else to go.
I think I've run into another part of the glitch with the 2nd bug as well, and pinpointed the difference. Sometimes, after loading my own data into it and refreshing the browser (making sure to wait until there are no _dirty collections by watching wdb._persistenceAdapter._dirtyPartitions.length), there is a difference between these two data sets (a quick console check follows):
- a) wdb._collections (empty)
- b) wdb._persistenceAdapter._dbref.collections (has the first few collections, then lots of empty ones; it always stops at the same collection)
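A quick way to see the mismatch from the DevTools console (both fields are private internals, so this is for debugging only):
// compare the in-memory collections against the adapter's own reference
console.log('in memory:', wdb._collections.length);
console.log('in adapter:', wdb._persistenceAdapter._dbref.collections.length);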
Flow:
- Copy data from the external source into Loki, and let it autosave to IndexedDB.
- Wait for wdb._persistenceAdapter._dirtyPartitions.length to reach 0 (a polling helper is sketched after this list).
- Refresh the browser.
- Observe the (a)/(b) mismatch described above.
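For the waiting step, a minimal polling helper (it leans on the adapter's private _dirtyPartitions field, so it is strictly a debugging aid, not production code):
function waitForClean(db, intervalMs = 250) {
  return new Promise((resolve) => {
    const timer = setInterval(() => {
      // resolve once the adapter reports no partitions waiting to be saved
      if (db._persistenceAdapter._dirtyPartitions.length === 0) {
        clearInterval(timer);
        resolve();
      }
    }, intervalMs);
  });
}
// usage: waitForClean(wdb).then(() => console.log('safe to refresh'));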
Oddly, right before refreshing the browser, the count of wdb._persistenceAdapter._dbref.collections entries where _dirty is true is always 1, and it's always the LAST collection that has the flag set. :-/ Even if I wait 5-10 minutes, that last collection remains marked _dirty, yet it is not mentioned in wdb._persistenceAdapter._dirtyPartitions. It could just be the way the loops unmark the collections, but I find that _dirty flag an oddity.
The real problem is the (a)(b) issue above, where the adapter seems to silently stop loading data from IndexedDB. No errors thrown.
I'm going to try several different splitting characters to see if one is colliding with something in my data, and also try to pinpoint the exact record that makes it drop the ball; I'll report back here.
2nd Issue Updates
Common results:
- wdb._collections is empty after a browser refresh (I refreshed the browser after loading data from the external source, then waiting for all _dirty flags to clear, then waiting one more minute).
Tests & Results:
- Raising 'pageSize' to 24 MB increases the number of collections in wdb._persistenceAdapter._dbref.collections that have _data to between 11 and 17 (the options being varied are sketched after this list).
- Lowering 'pageSize' to 5 MB results in 5-7 collections with _data.
- Changing the 'delimiter' does not change the results much (I tried all sorts of characters and combinations that I know are not in the data).
- Turning paging OFF: results are about the same.
- Refreshing 10 seconds after reaching 0 _dirty collections gives about the same results as refreshing two minutes after.
- Even with the exact same settings, the number of wdb._persistenceAdapter._dbref.collections with _data varies by 3 or 4 each time (that looks like it is just down to the order the collections are saved/loaded in; not all are equal size, and the order shifts by 4 or 5 slots).
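For reference, this is roughly how the adapter options were varied in the tests above (one example configuration shown; the delimiter string here is arbitrary):
let paAdapter = new PartitioningAdapter(idbAdapter, {
  paging: true,                // also tested with paging off
  pageSize: 24 * 1024 * 1024,  // also tested at 5 MB
  delimiter: '$$$'             // several delimiter strings tried
});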
Testing Procedure:
- Chrome DevTools: Application tab: Delete Database (confirm)
- Shift-Refresh page
- Sync data from external source into LokiDB
- Wait for the _dirty flag count to be zero (and then additional time as described above).
- Shift-Refresh page
- Chrome DevTools console:
wdb._persistenceAdapter._dbref._collections.map((c) => `${c.name}: ${c._data.length}`) to view collections and data count.
Conclusion so far: somewhere between IndexedDB and the Persistence Adapter, things are failing silently.
Is there a command I can use to 'Reload data from iDB into memory, just as Autoload does'? That way I could test without refreshing the browser. I've tried various combinations of all the loadDatabase() calls I could find, but nothing seemed to trigger the same condition as refreshing the browser.
Further Updates
Something looks off with the keys in IndexedDB, right after loading the database from the external source (before refreshing the browser):
And funnily enough, after I refresh the browser and peer into wdb._persistenceAdapter._dbref._collections, that is right where the data stops.
I dropped a console.warn('dirty:', JSON.stringify(this._dirtyPartitions)); into the end of exportDatabase(dbname, dbref). (Marked up where missing)
...and it looks like those 'missing partition numbers' are never being marked as dirty, and thus never being saved to iDB.
As _loadNextPartition() expects the partition numbers to be contiguous, it seems the bug has to do with the collection/partition not being marked dirty.
So, I changed the 'autosaveInterval' to 5000 instead of 1000, and got this:
And as you can see, it's still not catching that some of the collections are dirty. It is as if the save to iDB happens while other data is still being added to the in-memory DB, and then _autosaveClearFlags is called, which resets ALL _dirty flags to false instead of just the ones that were actually saved. Do I need to cache ALL the incoming data somewhere else first, then dump it into the LokiDB database at once (all collections) so it all fits within one 'autosaveInterval' window? That still seems risky, as data might be added while a background save is happening...
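A minimal sketch of the suspected race, assuming an async save that clears every flag on completion (markDirty and writePartitions are illustrative names, not LokiDB's actual API):
const dirty = new Set();

function markDirty(name) {
  dirty.add(name); // an insert marks its collection dirty
}

async function autosave(writePartitions) {
  const snapshot = [...dirty];     // partitions dirty when the save starts
  await writePartitions(snapshot); // async write to IndexedDB
  dirty.clear();                   // suspected bug: this also clears partitions
                                   // dirtied while the write was in flight
}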
Wow, great analysis!
I found the first bug. I don't know what possessed me to rename the serialization output (removing the underscore)... with the correct naming it works.
Please have a look here: https://github.com/LokiJS-Forge/LokiDB/pull/184
Could you provide a small code example for the second bug? Maybe it is gone with the fix.
#184 fixed the first bug, thank you!
The 2nd bug remains. I'll see if I can work up some code with timeouts or intervals to repeat it. :-)
2nd Issue Repeatable Code
After running the below code, take a look at the IndexedDB storage, and you should notice some keys missing. On mine it likes to skip mm.db.4, 9, 14, and 19 consistently.
It helps if you stick console.warn('dirty:', JSON.stringify(this._dirtyPartitions)); just above the return statement in exportDatabase() in the partitioning-adapter source file.
function showAndAddTestData(db) {
  const numCollections = 20;
  // Show what loaded, then clear everything out
  console.log('Data loaded before reset:');
  db._collections.forEach((c) => {
    console.log(c.find().length + ' rows in ' + c.name);
    db.removeCollection(c.name);
  });
  console.log('Collections cleared. Now fetching...');
  setTimeout(() => { // timer to make sure the CLEAR command cleans out iDB too
    // do the request just once...
    fetch('https://jsonplaceholder.typicode.com/photos')
      .then(response => response.json())
      .then(json => {
        console.log('JSON fetched, saving...');
        let collection, i = 0;
        const it = setInterval(() => {
          // only do this numCollections times
          if (i >= numCollections) { clearInterval(it); return; }
          // Create the collection
          collection = db.addCollection(`Test_${i}`);
          // Insert a clone of the fetched data into it
          collection.insert(JSON.parse(JSON.stringify(json)));
          i += 1;
        }, 200); // END OF interval
      });
  }, 1000); // END OF timer to make sure the CLEAR command is echoed to iDB
}

let idbAdapter = new IndexedStorage('mm.db');
let paAdapter = new PartitioningAdapter(idbAdapter);
let twdb = new Loki('mm.db');
let persistenceOptions = {
  adapter: paAdapter,
  autosave: window.self === window.top,
  autosaveInterval: 500,
  autoload: window.self === window.top,
  throttledSaves: false,
  persistenceMethod: 'indexed-storage',
};
console.log('Loki started. Initializing Persistence...');
twdb.initializePersistence(persistenceOptions)
  .then(() => {
    console.log('Loki Persistence is Initialized');
    showAndAddTestData(twdb);
    return true;
  })
  .catch((e) => console.error('initializePersistence error', e.message));
I took a stab at letting partitioning-adapter.js unmark the collections as _dirty after saving (instead of depending on the main Loki code to do it all at once), but its _dirty flag unsetting never made it back up to the main database. Is this an architectural issue? Is there a way to make the _dirty flag changes echo back up to the main database?
As a workaround in my code, I added a callback into the PartitioningAdapter code that is called whenever a partition is saved. Then I tied into that in my application code and reset the _dirty flag for that partition, and it all works. (I also commented out the this._autosaveClearFlags(); call within LokiDB's _saveDatabase function for the "reference"-mode adapters.) But that feels terribly dirty; I'd love to hear what solution you'd recommend.
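A sketch of that workaround, assuming partition index i maps to collection index i (onPartitionSaved is not a LokiDB API; it is the callback I patched into my local copy of the adapter):
paAdapter.onPartitionSaved = (partitionIndex) => {
  const collection = wdb._collections[partitionIndex];
  if (collection) {
    collection._dirty = false; // clear only the collection actually written
  }
};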
It looks like wdb._collections.forEach((c) => { wdb.removeCollection(c.name); }); only deletes HALF of the collections?? When you run my code above several times, take a close look at the quantity of _data rows: they should always be exactly 5,000 per collection, but for some reason the first few seem to keep growing when you refresh the browser.
That forEach deletes half of wdb._collections each time, while wdb._persistenceAdapter._dbref._collections stays at the full count.
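One likely explanation for the HALF behavior: removeCollection() presumably splices entries out of wdb._collections while forEach is still walking that same array, so the loop index skips every other element. Iterating over a copy removes them all:
// snapshot the array first, then remove each collection by name
[...wdb._collections].forEach((c) => wdb.removeCollection(c.name));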
I've split the bug of db.removeCollection(name) not triggering a database autosave into a separate issue.