Max crawl depth exceeded
Hi all,
On virtually all longer crawl sessions, the max crawl depth is regularly exceeded. I am currently running a crawl that has been going for 2 hours and has found 1100 states (60 with candidates) and counting. The max depth of 5 is being significantly exceeded, with states showing up at depth 10+. So far I have only encountered this issue on longer crawls of a couple of hundred states or more.
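For context, the relevant part of my setup looks roughly like the sketch below. This is reconstructed from memory against the standard configuration builder, so the URL is a placeholder and the real plugin list is longer:

import com.crawljax.core.CrawljaxRunner;
import com.crawljax.core.configuration.CrawljaxConfiguration;
import com.crawljax.core.configuration.CrawljaxConfiguration.CrawljaxConfigurationBuilder;
import com.crawljax.plugins.crawloverview.CrawlOverview;

public class DepthLimitedCrawl {
    public static void main(String[] args) {
        // Placeholder URL; the real target is an internal application.
        CrawljaxConfigurationBuilder builder =
                CrawljaxConfiguration.builderFor("http://example.com");

        // The depth limit that keeps being exceeded (states appear at depth 10+).
        builder.setMaximumDepth(5);

        // No cap on states or runtime, hence the 1100+ states after 2 hours.
        builder.setUnlimitedStates();
        builder.setUnlimitedRuntime();

        // CrawlOverview is where the heap error surfaced on the 2500-state crawl.
        // (Output directory setup omitted here.)
        builder.addPlugin(new CrawlOverview());

        new CrawljaxRunner(builder.build()).call();
    }
}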
Possibly related: browsers often end up hanging on longer crawls, which usually leaves Crawljax unable to crawl the queued states and results in a prematurely "exhausted" crawl. Could this be caused by memory issues on longer crawls?
On one such longer crawl I got a heap space error in the CrawlOverview plugin when manually ending the crawl, due to the high number of states (2500), so I increased the heap to 3GB for the current crawl.
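(The heap bump is just the standard JVM flag when launching the crawl; the jar and main class below are placeholders:

java -Xmx3g -cp my-crawl.jar DepthLimitedCrawl
)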
Let me know what kind of data you need to debug this. I currently log only at INFO level, but could run crawls at TRACE or whatever is useful.
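If TRACE would help, I can raise the level for the Crawljax loggers before starting the runner. Assuming logback is the slf4j backend (as in my setup), something along these lines should do it, or the equivalent logger entry in logback.xml:

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.Logger;
import org.slf4j.LoggerFactory;

public class TraceLogging {
    // Call this before building/starting the CrawljaxRunner.
    public static void enableCrawljaxTrace() {
        // Raise only the Crawljax loggers to TRACE to keep log volume manageable.
        Logger crawljaxLogger = (Logger) LoggerFactory.getLogger("com.crawljax");
        crawljaxLogger.setLevel(Level.TRACE);
    }
}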
– Ivan