Multiple crawler instances share `useState` state

Open barjin opened this issue 6 months ago • 1 comment

Which package is this bug report for? If unsure which one to select, leave blank

@crawlee/basic (BasicCrawler)

Issue description

When multiple crawler instances are created in the same process, their `useState` methods (both `crawler.useState()` on the instance and `useState` in the `requestHandler` context param) always resolve to the same state object.

This is not what the API suggests: `crawler.useState` reads as if it should resolve to state internal to that crawler. If the sharing is intended, it IMO requires better docs.
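To make the suspected mechanism concrete, here is a simplified model of the behavior. It is a sketch under assumptions, not Crawlee's actual code: `ModelCrawler`, `sharedStore`, and the `STATE_KEY` value are hypothetical stand-ins for "every crawler looks up its state in one process-wide store under one fixed key".

```typescript
// Simplified model, NOT Crawlee's implementation.
// Assumption: all crawlers resolve state through one shared store and one fixed key.
const sharedStore = new Map<string, unknown>();
const STATE_KEY = 'CRAWLEE_STATE'; // assumed key name, for illustration only

class ModelCrawler {
    // Returns the stored value, initializing it with defaultValue on first use.
    async useState<T>(defaultValue: T): Promise<T> {
        if (!sharedStore.has(STATE_KEY)) {
            sharedStore.set(STATE_KEY, defaultValue);
        }
        return sharedStore.get(STATE_KEY) as T;
    }
}

async function demoSharedState(): Promise<{ same: boolean; urls: string[] }> {
    const a = new ModelCrawler();
    const b = new ModelCrawler();

    (await a.useState<string[]>([])).push('https://example.com');
    // b's default value is ignored because the key is already populated.
    (await b.useState<string[]>([])).push('https://example.org');

    const stateA = await a.useState<string[]>([]);
    const stateB = await b.useState<string[]>([]);
    return { same: stateA === stateB, urls: stateA };
}
```

In this model the instance identity never enters the lookup, which would explain why two distinct crawlers end up holding the same array.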

Code sample

import { CheerioCrawler } from '@crawlee/cheerio';

async function main() {
    function createCrawler() {
        return new CheerioCrawler({
            requestHandler: async ({ request, useState }) => {
                const state = await useState<string[]>([]);
                state.push(request.url);
            },
        });
    }

    const [crawler1, crawler2] = [createCrawler(), createCrawler()];

    await crawler1.run(['https://example.com']);
    await crawler2.run(['https://example.org']);

    console.log(crawler1 === crawler2); // false
    console.log(await crawler1.useState() === await crawler2.useState()); // true
    console.log(await crawler1.useState()); // ['https://example.com', 'https://example.org']
}

main();
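A possible workaround until this is addressed, sketched without any Crawlee imports so it stays self-contained: keep the per-crawler state in a closure created alongside each handler instead of going through `useState`. The names `HandlerContext`, `createIsolatedHandler`, and `visited` are hypothetical, and the context type is reduced to the single field the handler reads.

```typescript
// Hypothetical minimal context shape; only `request.url` is used here.
interface HandlerContext {
    request: { url: string };
}

// Each call creates a fresh array, so handlers never share state.
function createIsolatedHandler() {
    const visited: string[] = [];

    const requestHandler = async ({ request }: HandlerContext): Promise<void> => {
        visited.push(request.url);
    };

    return { requestHandler, visited };
}

async function demoIsolatedState(): Promise<[string[], string[]]> {
    const first = createIsolatedHandler();
    const second = createIsolatedHandler();

    await first.requestHandler({ request: { url: 'https://example.com' } });
    await second.requestHandler({ request: { url: 'https://example.org' } });

    return [first.visited, second.visited];
}
```

In the reproduction above, `requestHandler` from each factory call would be passed to its own `new CheerioCrawler({ requestHandler })`. The trade-off, as far as I understand it, is that closure state lives only in memory and is not persisted the way `useState` state is.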

Package version

3.13.8

Node.js version

Node 22

Operating system

Linux

Apify platform

  • [ ] Tick me if you encountered this issue on the Apify platform

I have tested this on the next release

No response

Other context

No response

barjin, Jun 20 '25 12:06

Closed by #3309 in Crawlee v4

barjin, Dec 18 '25 13:12