Performance issues with KVS holding lots of items locally

Open B4nan opened this issue 1 year ago • 1 comments

1 - Extract the attached file and run npm install.
2 - Run it once: npm start
3 - It will generate a KV store with 100k keys. Notice that the KV store initialises in no time on the first run.
4 - Once complete, run it again with npm start.
5 - Notice that it freezes at the store initialisation. Ideally, that line should not depend on the size of the data at all.

bottleneck-poc.zip

import { Actor } from 'apify';

await Actor.init();

// On the second run this call is where the process appears to freeze.
console.log('store initialisation started. It will freeze here when you run this POC second time.');
const store = await Actor.openKeyValueStore('100k-keys');
console.log('store initialised successfully');

// Populate the store with one small record per key.
for (let i = 0; i < 100001; i++) {
    await store.setValue(`number-${i}`, '1');
    console.log(`storing ${i}`);
}

console.log('100k KV stored');

await Actor.exit();
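
To put a number on the freeze instead of eyeballing it, the open call could be wrapped in a small timer. This is only a minimal sketch: `timed` is a hypothetical helper, not part of apify or crawlee, and it assumes a Node.js version where `performance` is available as a global.

```javascript
// Hypothetical helper (not an apify/crawlee API): awaits fn() and reports how long it took.
async function timed(label, fn) {
    const start = performance.now();
    const result = await fn();
    const ms = performance.now() - start;
    console.log(`${label} took ${ms.toFixed(1)} ms`);
    return { result, ms };
}
```

In the POC above it would be used as `const { result: store } = await timed('openKeyValueStore', () => Actor.openKeyValueStore('100k-keys'));`, so the first and second runs can be compared directly.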

Originally posted by @dhrumil4u360 in https://github.com/apify/crawlee/discussions/2722#discussioncomment-11053044

B4nan, Oct 25 '24 13:10

A very painful dupe of https://github.com/apify/crawlee/issues/2248 ... It's really starting to bite our ass more and more, because we cannot just preload everything without scanning the whole dir (or making a stored metadata file mandatory)

vladfrangu, Oct 25 '24 15:10
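
For readers following along, the two options mentioned in the comment above (scanning the whole directory versus keeping a mandatory metadata file) can be sketched roughly as below. This is NOT Crawlee's actual storage code: the one-file-per-key layout, the `.txt` extension, the `__keys__.ndjson` file name and all three function names are assumptions made purely for illustration, and the target directory is assumed to exist already.

```javascript
// Hypothetical sketch, not Crawlee's implementation: one file per key in a local directory.
// Dynamic import() is used so the snippet runs unchanged as CommonJS or as an ES module.
const INDEX_FILE = '__keys__.ndjson'; // assumed name for the metadata file

// Write a value and record its key in the append-only metadata file.
async function setValue(dir, key, value) {
    const { writeFile, appendFile } = await import('node:fs/promises');
    const { join } = await import('node:path');
    await writeFile(join(dir, `${key}.txt`), value);
    await appendFile(join(dir, INDEX_FILE), `${JSON.stringify(key)}\n`);
}

// Option A: rebuild the key list by listing the whole directory (one entry per key).
async function listKeysByScanning(dir) {
    const { readdir } = await import('node:fs/promises');
    const { parse } = await import('node:path');
    const entries = await readdir(dir);
    return entries
        .filter((name) => name !== INDEX_FILE)
        .map((name) => parse(name).name); // strip the file extension to recover the key
}

// Option B: read the single metadata file; a Set drops keys that were written more than once.
async function listKeysFromIndex(dir) {
    const { readFile } = await import('node:fs/promises');
    const { join } = await import('node:path');
    const raw = await readFile(join(dir, INDEX_FILE), 'utf8');
    const keys = raw.split('\n').filter(Boolean).map((line) => JSON.parse(line));
    return [...new Set(keys)];
}
```

The trade-off in this sketch is that option B only works if every writer keeps the metadata file up to date, which is why it would have to be mandatory, while option A needs no extra file but has to touch one directory entry per key on every open.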