Crawler-Parallel
A parallel web crawler written in C (epoll-based). It crawls 160,000 valid pages from a server, runs deterministic-finite-automaton matching over the fetched page source and deduplicates URLs with a Bloom filter, numbers the links and writes them to the url.txt file, then uses intermediate files and a ternary tree to drop link relations whose HTTP status code is not 200, writing the correct links...
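Of the techniques listed above, the Bloom-filter deduplication is the easiest to sketch in isolation. The sketch below is illustrative only: the filter size, the seeded FNV-1a hash, and the four probes per URL are assumptions for this example, not parameters taken from this project's source.

```c
#include <stdint.h>

/* Hypothetical filter size (16 Mbit); the project's real size is not shown. */
#define BLOOM_BITS (1u << 24)
static uint8_t bloom[BLOOM_BITS / 8];

/* FNV-1a hash, xor-seeded so one function yields several independent hashes. */
static uint32_t fnv1a(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    while (*s) {
        h ^= (uint8_t)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 if `url` was (probably) seen before; otherwise records it.
 * Each URL sets 4 bits; it counts as seen only if all 4 were already set. */
static int bloom_seen(const char *url)
{
    int seen = 1;
    for (uint32_t k = 0; k < 4; k++) {
        uint32_t bit = fnv1a(url, k) % BLOOM_BITS;
        if (!(bloom[bit / 8] & (1u << (bit % 8))))
            seen = 0;
        bloom[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
    return seen;
}
```

A Bloom filter can report a URL as seen when it was not (a false positive, so a page may occasionally be skipped), but never the reverse, which is why it is a common fit for crawl deduplication at this scale.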
Crawler-Parallel issues (2 results)
Running the following command under bash:

./crawler 124.127.207.5 80 url.txt

prints:

renyajie@ryj:/mnt/e/c-workspace/lab1$ ./crawler 124.127.207.5 80 url.txt
Starting crawler:
Crawler finished!
Starting tree build:
Tree build finished!
Starting write:
Segmentation fault (core dumped)
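The crash occurs after "Starting write:", i.e., while the tree contents are being written out. Without the source at hand, a frequent cause of a fault at exactly this stage is dereferencing a NULL pointer during the tree traversal (an empty tree, or a child link that was never set). The sketch below shows one defensive traversal; the node layout and every name in it are hypothetical, not taken from Crawler-Parallel.

```c
#include <stdio.h>

/* Hypothetical ternary-search-tree node; field names are assumptions,
 * not the Crawler-Parallel layout. */
typedef struct TstNode {
    char splitchar;
    struct TstNode *lo, *eq, *hi;
    const char *url;   /* non-NULL only at word-terminating nodes */
} TstNode;

/* Defensive in-order traversal that writes stored URLs to `out`.
 * Checking every pointer before dereferencing it avoids the classic
 * segmentation fault on an empty tree or a missing child. */
static void tst_write(const TstNode *node, FILE *out)
{
    if (node == NULL || out == NULL)   /* guard: empty subtree or bad file */
        return;
    tst_write(node->lo, out);
    if (node->url != NULL)
        fprintf(out, "%s\n", node->url);
    tst_write(node->eq, out);
    tst_write(node->hi, out);
}
```

Rebuilding the project with gcc -g and running the binary under gdb (then issuing bt after the crash) would show the exact line that faults and confirm or rule out this hypothesis.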