aduana
Frontera backend to guide a crawl using PageRank, HITS, or other ranking algorithms based on the link structure of the web graph, even for large crawls (on the order of one billion pages).
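To make the description concrete, here is a minimal, self-contained sketch (not aduana's actual implementation, which is written in C) of the kind of link-structure score such a backend can use to prioritize pages: PageRank computed by power iteration over a tiny hypothetical link graph.

```python
# Illustrative PageRank by power iteration (pure Python, hypothetical graph).
# A Frontera backend like aduana can use scores of this kind to decide
# which discovered pages to crawl next.

def pagerank(links, damping=0.85, iters=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = sorted(set(links) | {p for outs in links.values() for p in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Base teleportation mass given to every page.
        new = {p: (1.0 - damping) / n for p in pages}
        for src, outs in links.items():
            if outs:
                # Each page splits its rank evenly among its out-links.
                share = damping * rank[src] / len(outs)
                for dst in outs:
                    new[dst] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for p in pages:
                    new[p] += damping * rank[src] / n
        rank = new
    return rank

# Hypothetical 3-page web: a -> b, c; b -> c; c -> a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
best = max(scores, key=scores.get)  # page "c" receives the most link mass
```

A crawl frontier ordered by such scores fetches well-linked pages first instead of following discovery order.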
Bumps the pip group with 3 updates in the / directory: [gevent](https://github.com/gevent/gevent), [requests](https://github.com/psf/requests), and [pyopenssl](https://github.com/pyca/pyopenssl). Updates `gevent` from 1.0.2 to 23.9.0. Release notes sourced from gevent's releases. 1.2.2 No release...
Hi, I am trying to run the Aduana example as given in the documentation. But when I run the command `scrapy crawl example`, I am getting this error. Please help how...
Hi, I am finding this issue while running the Aduana example. Please help how to resolve this issue. `scrapy crawl example` Unhandled error in Deferred: 2017-05-08 12:22:20 [twisted] CRITICAL: Unhandled...
It would be nice to offer users a flexible scoring concept; to achieve that, we need to be able to dump the top N scores from Aduana (can be 10K or...
Hi, I didn't find any documentation on how the link scores affect/influence their scheduling. It would be nice to understand the relation between: - the spider-defined score for the links...
Hi, I ran aduana version 0.2.1 from PyPI and everything was fine. But just after cloning the master branch I started to get the following error: ``` 2015-11-12...
It's 5x slower than the Best-First strategy and I don't know why.
It's not clear what happens to the data already present in the array if a resize happens, especially if we are using MAP_PRIVATE, since there is no true file backing...
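The concern above is the copy-on-write semantics of private mappings: writes through a MAP_PRIVATE mapping never reach the underlying file, so any modified pages are simply lost if the mapping is torn down and recreated at a new size. A minimal sketch, using Python's `mmap` module (`ACCESS_COPY` gives MAP_PRIVATE semantics; the temp file here is just for illustration):

```python
# Demonstrates that a copy-on-write (MAP_PRIVATE / ACCESS_COPY) mapping
# keeps writes private to the process: they never reach the backing file,
# which is why remapping/resizing such an array can silently drop data.

import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 4096)  # file starts as one page of zeros

m = mmap.mmap(fd, 4096, access=mmap.ACCESS_COPY)  # MAP_PRIVATE semantics
m[:5] = b"hello"  # write goes to the process-private copy only
m.close()         # private dirty pages are discarded here

with open(path, "rb") as f:
    on_disk = f.read(5)  # still zeros: the write never hit the file

os.close(fd)
os.unlink(path)
```

With a shared mapping (`ACCESS_WRITE` / MAP_SHARED) the bytes would have been flushed to the file; with a private one, a resize-by-remap starts from the untouched file contents.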