BasicCrawler status logging
- configurable interval
- configurable status message callback (constructor parameter, property or decorator?)
- we periodically set the crawler status via storage client
- in JavaScript Crawlee, this does nothing when MemoryStorage is being used
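
To make the bullet points above more concrete, here is a rough sketch of what a periodic status reporter with a configurable interval and message callback might look like. Every name in it (`StatusReporter`, `CrawlerState`, `callback`, `set_status`) is hypothetical and only illustrates the idea; none of it is an existing Crawlee API. The 10-second default is likewise an assumption.

```python
import asyncio
import logging
from dataclasses import dataclass
from datetime import timedelta
from typing import Awaitable, Callable, Optional

logger = logging.getLogger("status_sketch")


@dataclass
class CrawlerState:
    """Hypothetical snapshot of crawler progress."""
    requests_finished: int = 0
    requests_failed: int = 0


# Hypothetical callback type: receives the state and the default message,
# returns the message to report, or None to skip this round.
StatusCallback = Callable[[CrawlerState, str], Awaitable[Optional[str]]]
# Hypothetical sink for the message (e.g. something that calls a platform API).
StatusSink = Callable[[str], Awaitable[None]]


class StatusReporter:
    def __init__(
        self,
        state: CrawlerState,
        interval: timedelta = timedelta(seconds=10),  # assumed default
        callback: Optional[StatusCallback] = None,
        set_status: Optional[StatusSink] = None,
    ) -> None:
        self._state = state
        self._interval = interval
        self._callback = callback
        self._set_status = set_status

    async def report_once(self) -> Optional[str]:
        message = (
            f"Finished {self._state.requests_finished} requests, "
            f"failed {self._state.requests_failed}"
        )
        if self._callback is not None:
            # Let the user-supplied callback rewrite or suppress the message.
            message = await self._callback(self._state, message)
        if message is None:
            return None
        # Debug level, because this runs often.
        logger.debug(message)
        if self._set_status is not None:
            # Only call the sink when one is configured (nothing to do locally).
            await self._set_status(message)
        return message

    async def run(self) -> None:
        # Runs until the surrounding task is cancelled.
        while True:
            await self.report_once()
            await asyncio.sleep(self._interval.total_seconds())
```

In this sketch the callback is passed as a constructor parameter; a property or decorator would work with the same `report_once` logic.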
> in JavaScript Crawlee, this does nothing when MemoryStorage is being used
What's the reason behind this? I believe this could be useful for local execution as well.
We also log the message at the BasicCrawler level (at debug level, since it runs quite often, every 10 seconds IIRC). What gets skipped is the API call that this method represents, since there is no equivalent for it when running locally.
https://github.com/apify/crawlee/blob/ad6fcd4a6c7ddd59afb48e763c0523992d263b54/packages/basic-crawler/src/internals/basic-crawler.ts#L797-L799
I remember that @tobice and I were also confused by the "set status message" method being part of the StorageClient - it has nothing to do with storage, except that it invokes an Apify API call - I suppose that is why it got slapped on top of that.
Could the crawler emit a "crawler status" event via EventManager instead? Then the SDK could hook into that and set the status message using its apify client instance.
Yeah that sounds good to me.
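
For illustration, the event-based approach proposed above could look roughly like this. It is only a sketch under assumptions: `SketchEventManager`, `CrawlerStatusEvent`, the `"crawler_status"` event name and `FakePlatformClient` are all made up for this example and do not correspond to the real EventManager or Apify client interfaces.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class CrawlerStatusEvent:
    """Hypothetical payload of the proposed "crawler status" event."""
    message: str


Listener = Callable[[CrawlerStatusEvent], None]


class SketchEventManager:
    """Minimal stand-in for an event manager (not the real one)."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Listener]] = defaultdict(list)

    def on(self, event: str, listener: Listener) -> None:
        self._listeners[event].append(listener)

    def emit(self, event: str, data: CrawlerStatusEvent) -> None:
        # With no listeners registered (plain local run), this is a no-op.
        for listener in list(self._listeners[event]):
            listener(data)


class FakePlatformClient:
    """Stand-in for an API client; it only records messages."""

    def __init__(self) -> None:
        self.status_messages: List[str] = []

    def set_status_message(self, message: str) -> None:
        self.status_messages.append(message)


def emit_crawler_status(
    event_manager: SketchEventManager, finished: int, total: int
) -> None:
    # Crawler side: knows nothing about storage or the platform API.
    event_manager.emit(
        "crawler_status",
        CrawlerStatusEvent(message=f"Crawled {finished}/{total} requests"),
    )


# SDK side: hook into the event and forward it using its own client.
event_manager = SketchEventManager()
client = FakePlatformClient()
event_manager.on(
    "crawler_status",
    lambda event: client.set_status_message(event.message),
)
emit_crawler_status(event_manager, finished=3, total=10)
```

The point of this design is that the crawler only emits the event, so the status message no longer has to go through the StorageClient, and local runs simply have no listener attached.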