webmagic

Published 20 hours ago •


Metadata

A scalable web crawler framework for Java.

Readme
Issues

Results 147 webmagic issues


Why is PriorityScheduler written in such a complicated way? It needs three queues working together

2 comments

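The PriorityScheduler source is shown in this screenshot: ![image](https://user-images.githubusercontent.com/22490427/117938486-aeaccc80-b339-11eb-9114-5ae4a479dd12.png) Question: why does it need three queues? Wouldn't it be enough to replace the queue in QueueScheduler with a PriorityBlockingQueue? Also, the count of remaining requests looks wrong, because it only counts one of the queues. Could the author please take a look? The QueueScheduler source is shown in this screenshot: ![image](https://user-images.githubusercontent.com/22490427/117938791-fcc1d000-b339-11eb-9b13-de356cf68421.png) I would appreciate the author's guidance, thank you!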
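For illustration, the single-queue alternative raised in this issue could look roughly like the sketch below. It uses only JDK classes; `SingleQueueScheduler`, `Entry`, `push`, `poll`, and `size` are hypothetical names, not WebMagic's API, and the ordering rule (higher priority first, FIFO among equal priorities) is an assumption rather than the framework's documented behavior.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; these are NOT WebMagic's Request or Scheduler classes.
final class SingleQueueScheduler {

    // One queued URL together with its priority and arrival order.
    private static final class Entry {
        final String url;
        final long priority;
        final long sequence;

        Entry(String url, long priority, long sequence) {
            this.url = url;
            this.priority = priority;
            this.sequence = sequence;
        }
    }

    // Monotonic counter used to keep FIFO order among equal priorities.
    private final AtomicLong sequence = new AtomicLong();

    // Assumption: a higher priority value is polled first; ties fall back to FIFO.
    private final PriorityBlockingQueue<Entry> queue = new PriorityBlockingQueue<>(
            11,
            Comparator.comparingLong((Entry e) -> e.priority)
                    .reversed()
                    .thenComparingLong(e -> e.sequence));

    void push(String url, long priority) {
        queue.add(new Entry(url, priority, sequence.getAndIncrement()));
    }

    // Returns the next URL, or null when nothing is queued.
    String poll() {
        Entry e = queue.poll();
        return e == null ? null : e.url;
    }

    // With a single queue, the remaining count is simply the queue size.
    int size() {
        return queue.size();
    }
}
```

The explicit sequence number is there because `PriorityBlockingQueue` makes no ordering guarantee for elements that compare as equal. Whether one queue can fully replace the existing three depends on semantics the screenshots would have to confirm.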

There is no crawling depth property

I can't find any crawler policy or property to restrict crawling depth. Is it missing, and is the only way to restrict depth to choose a suitable selector in the PageProcessor?
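As a point of reference for what a depth restriction means, here is a minimal, framework-independent sketch: each URL is tagged with the depth at which it was discovered, and links found on pages at the maximum depth are not followed. `DepthLimitedCrawl`, `crawl`, and the in-memory `links` map are hypothetical stand-ins for real page fetching; nothing here is WebMagic API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of depth-limited breadth-first crawling over an in-memory link graph.
final class DepthLimitedCrawl {

    // Returns the URLs visited from 'start' without going deeper than maxDepth.
    static List<String> crawl(String start, int maxDepth, Map<String, List<String>> links) {
        Map<String, Integer> depthOf = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        List<String> visited = new ArrayList<>();

        depthOf.put(start, 0);
        queue.add(start);

        while (!queue.isEmpty()) {
            String url = queue.poll();
            visited.add(url);
            int depth = depthOf.get(url);
            if (depth >= maxDepth) {
                continue; // do not follow links found on pages at the depth limit
            }
            for (String next : links.getOrDefault(url, Collections.<String>emptyList())) {
                // putIfAbsent returns null only the first time a URL is seen
                if (depthOf.putIfAbsent(next, depth + 1) == null) {
                    queue.add(next);
                }
            }
        }
        return visited;
    }
}
```

In a real crawler the depth would have to travel with each request, for example as a field attached to the request object; how that maps onto the framework is left open here.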

Could the onError method of the AbstractDownloader class be extended to pass the exception information along as well?

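For example, some downloads fail because a network fluctuation causes a read timeout. Only when onError receives the exception information can such cases be handled properly and promptly.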

Fix await return value exception

Throws an exception when the waiting time detectably elapsed before returning from the method.
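The pull request's diff is not shown here, so the following is only a sketch of the general pattern it describes: check the result of a timed wait on a `java.util.concurrent.locks.Condition` and throw once the time budget is used up. `AwaitCheck`, `markAvailable`, and `waitUntilAvailable` are hypothetical names. The sketch uses `awaitNanos`, which returns an estimate of the remaining time, so the wait can be retried after a spurious wakeup.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; not the code changed by the pull request.
final class AwaitCheck {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition ready = lock.newCondition();
    private boolean available = false;

    // Marks the resource as available and wakes up all waiting threads.
    void markAvailable() {
        lock.lock();
        try {
            available = true;
            ready.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Blocks until available, or throws TimeoutException once the timeout has elapsed.
    void waitUntilAvailable(long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        long nanosLeft = unit.toNanos(timeout);
        lock.lock();
        try {
            while (!available) {
                if (nanosLeft <= 0L) {
                    throw new TimeoutException("resource not available within timeout");
                }
                // awaitNanos returns the estimated remaining time; <= 0 means it ran out
                nanosLeft = ready.awaitNanos(nanosLeft);
            }
        } finally {
            lock.unlock();
        }
    }
}
```

The alternative `Condition.await(long, TimeUnit)` reports the same situation by returning `false`; ignoring that return value is the kind of bug the title suggests.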

VivianDelannoyEtu

fixing smells

- diamond operator (available since Java 7)
- naming conventions
- duplication

VivianDelannoyEtu

Fix unclosed file in webDriverPool.java

Adding a try-catch-finally clause to properly close the configFileReader file
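The pull request uses try-catch-finally. As a comparison, the same guarantee can be written with try-with-resources (Java 7 and later), which closes the reader automatically. The sketch below is an assumption-laden illustration: `ConfigLoader` and `load` are hypothetical names, and it assumes the config file is in `java.util.Properties` format.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical sketch; not the actual WebDriverPool code.
final class ConfigLoader {

    // Loads a properties file; the reader is closed even if load() throws.
    static Properties load(Path configFile) throws IOException {
        Properties props = new Properties();
        try (BufferedReader reader =
                     Files.newBufferedReader(configFile, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }
}
```

Compared with an explicit finally block, this form needs no null check on the reader and does not lose the original exception if closing also fails.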

VivianDelannoyEtu

little fix smells deprecated decorator in Spider.java

Found a code smell caused by a missing decorator. This pull request fixes it; I am trying out pull requests for school.

VivianDelannoyEtu

Reduce code smells for the framework quality improvement

1. Add the @Deprecated annotation together with the @deprecated Javadoc tag, so that tools such as IDEs can warn about references to deprecated elements and highlight to the user when the element...
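A minimal sketch of the pairing described in item 1 follows. `LegacyApi`, `oldFetch`, and `fetch` are hypothetical names used only for illustration: the annotation is what the compiler and IDEs act on, and the Javadoc tag tells readers what to use instead.

```java
// Hypothetical sketch; not a class from WebMagic.
final class LegacyApi {

    /**
     * Fetches a page.
     *
     * @deprecated use {@link #fetch(String)} instead.
     */
    @Deprecated
    static String oldFetch(String url) {
        // Delegate to the replacement so both methods behave identically.
        return fetch(url);
    }

    /** Replacement for the deprecated method. */
    static String fetch(String url) {
        return "fetched:" + url;
    }
}
```

With both in place, code that calls `oldFetch` compiles with a deprecation warning, and the generated Javadoc points to the replacement.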

Why can't the group 542327088 be found in search?

Could someone knowledgeable explain?

Code smells

Hello! In this pull request I'm correcting and deleting code smells, and also refactoring some methods for better readability.


About

A scalable web crawler framework for Java.

framework

java

crawler

scraping

11.3k

Stars

4.2k

Forks

Watchers

Owner
