xxl-crawler
[issue] In multi-threaded runs, tryFinish() has a small chance of misjudging the current running state
- issue description:
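In multi-threaded runs, tryFinish() can misjudge the running state of the CrawlerThread instances and stop the crawler too early. I reproduced this by running XxlCrawlerTest with 3 threads and logging enabled.
The probability is low: it shows up roughly once in 10 runs. The likely cause is as follows:
thread-3 calls tryFinish() and first reads isRunning from all 3 CrawlerThread instances, and every one of them is false. At exactly that moment thread-1 calls crawler.getRunData().getUrl() and sets running to true, but thread-3 can no longer see this. Finally thread-3 evaluates runData.getUrlNum()==0 as true, so isEnd becomes true and the crawler is wrongly judged to be finished.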
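The interleaving described above can be replayed in a fixed order on a single thread. This is a minimal sketch and not the actual xxl-crawler code: the class TryFinishRaceDemo, the urls queue, and the running array are hypothetical stand-ins for runData and the three CrawlerThread flags.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Hypothetical stand-in for the crawler state; NOT the real xxl-crawler classes.
class TryFinishRaceDemo {
    static final Queue<String> urls = new ArrayDeque<>();   // plays the role of runData
    static final boolean[] running = new boolean[3];        // one flag per CrawlerThread

    /** Replays the suspected interleaving in a fixed order and returns isEnd. */
    static boolean replayOriginalOrder() {
        urls.clear();
        urls.add("https://example.com/last-page");  // one URL is still waiting
        Arrays.fill(running, false);                // all threads are idle

        // thread-3, tryFinish() part 1: snapshot the running flags (all false)
        boolean isRunning = false;
        for (boolean flag : running) {
            isRunning |= flag;
        }

        // thread-1: takes the last URL and marks itself as running
        urls.poll();
        running[0] = true;

        // thread-3, tryFinish() part 2: the queue is empty by now
        boolean isEnd = urls.isEmpty() && !isRunning;
        return isEnd;
    }

    public static void main(String[] args) {
        // thread-1 is still working, yet the check reports "finished"
        // prints: isEnd = true, thread-1 running = true
        System.out.println("isEnd = " + replayOriginalOrder()
                + ", thread-1 running = " + running[0]);
    }
}
```

The snapshot of the flags is already stale by the time the queue size is read, so the two reads describe two different moments in time.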
- solution:
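- Rewrite tryFinish() so that it checks runData.getUrlNum()==0 first and only then reads the state of each CrawlerThread, so that a concurrent call to crawler.getRunData().getUrl() cannot leave tryFinish() with a stale view of running: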
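    public void tryFinish() {
        // Read the queue size first, before looking at any thread state.
        boolean isEnd = runData.getUrlNum() == 0;

        // Then check whether any crawler thread is still working.
        boolean isRunning = false;
        for (CrawlerThread crawlerThread : crawlerThreads) {
            if (crawlerThread.isRunning()) {
                isRunning = true;
                break;
            }
        }

        // Finished only if the queue was empty and no thread is running.
        isEnd = isEnd && !isRunning;
        if (isEnd) {
            logger.info(">>>>>>>>>>> xxl crawler is finished.");
            stop();
        }
    }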
- Add the volatile keyword to the running field of CrawlerThread to guarantee visibility across threads:
private volatile boolean running;
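Further testing showed that even with the changes above, the problem still occurs with very low probability. My guess is that a CrawlerThread's time slice expires right after it calls crawler.getRunData().getUrl(); and before it gets to execute running = true;, so other threads still misjudge the state.
If that is the case, the only option is to rework the whole tryFinish() logic.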
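One possible direction, offered only as a sketch and not as the project's actual fix, is to stop combining two separate reads (queue size and running flags) and instead keep a single atomic counter of URLs that have been added but not yet fully processed. A URL that a thread has taken but not finished still counts, so there is no window between taking a URL and setting a flag. Everything below (PendingCounterDemo, addUrl, worker, crawl, and the synthetic "pages" that each produce two child links) is hypothetical and does not use the xxl-crawler API.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of counter-based termination detection; NOT xxl-crawler code.
class PendingCounterDemo {
    static final ConcurrentLinkedQueue<String> unvisited = new ConcurrentLinkedQueue<>();
    // URLs that were added but whose processing has not completed yet
    static final AtomicInteger pending = new AtomicInteger();
    static final AtomicInteger processed = new AtomicInteger();

    static void addUrl(String url) {
        pending.incrementAndGet();  // count first, so a queued URL is never uncounted
        unvisited.add(url);
    }

    static void worker() {
        // pending == 0 means: nothing queued AND nothing in flight
        while (pending.get() != 0) {
            String url = unvisited.poll();
            if (url == null) {
                Thread.yield();     // another thread is still processing a page
                continue;
            }
            try {
                // Synthetic "page": URLs shorter than 4 chars yield two child links.
                if (url.length() < 4) {
                    addUrl(url + "a");
                    addUrl(url + "b");
                }
                processed.incrementAndGet();
            } finally {
                // Decrement only after the children were added.
                pending.decrementAndGet();
            }
        }
    }

    /** Crawls the synthetic link tree and returns the number of processed pages. */
    static int crawl(int threadCount) {
        unvisited.clear();
        pending.set(0);
        processed.set(0);
        addUrl("r");                // seed URL, added before any worker starts
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(PendingCounterDemo::worker);
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        // 1 + 2 + 4 + 8 pages in the synthetic tree
        System.out.println(crawl(3)); // prints 15
    }
}
```

Because child links are added before the parent's decrement, the counter can only reach 0 when no URL is queued and no page is being processed. The trade-off is that every code path that takes a URL must reach the decrement, which is why it sits in a finally block here.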