Alexander Sibiryakov

124 comments by Alexander Sibiryakov

`count 82257 avg 794.1626245 median 644 90% 1144` for GURL. So the Yandex one is 1.25 to 2x faster. Maybe this is connected with more efficient memory allocation...
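The statistics reported above (count, average, median, 90th percentile of per-parse timings) can be reproduced with a minimal sketch. This is a Python stand-in, timing stdlib `urlparse` instead of the C++ GURL/Yandex parsers from the actual benchmark; the helper names and the sample URLs are assumptions for illustration.

```python
import time
import statistics
from urllib.parse import urlparse

def bench(urls, parse=urlparse):
    """Parse each URL once, returning per-call timings in nanoseconds."""
    timings = []
    for url in urls:
        start = time.perf_counter_ns()
        parse(url)
        timings.append(time.perf_counter_ns() - start)
    return timings

def summarize(timings):
    """count / avg / median / 90th percentile, like the benchmark output."""
    p90 = statistics.quantiles(timings, n=10)[-1]  # last decile cut = 90th pct
    return {
        "count": len(timings),
        "avg": statistics.fmean(timings),
        "median": statistics.median(timings),
        "90%": p90,
    }

urls = ["https://example.com/path?q=%d" % i for i in range(1000)]
stats = summarize(bench(urls))
print(stats)
```

Reporting the median and 90th percentile alongside the average, as the original output does, guards against a few slow outliers (allocator stalls, cache misses) skewing the comparison.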

I've got an idea. Let's create a library supporting batch operations on URL parsing. For Scrapy it should be a common use case. Let me know what you think!

I had made a wrong conclusion about the Yandex parser being 1000 times faster, and have updated the comment.

A batch of URLs as input, and the response is a vector of results.
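The proposed API shape can be sketched in a few lines. This is only an illustration in Python using stdlib `urlparse` as the underlying parser; the function name and signature are assumptions, and a real implementation would presumably be native code that amortizes allocations across the batch, which is where the speed-up discussed above would come from.

```python
from urllib.parse import urlparse, ParseResult
from typing import List

def parse_batch(urls: List[str]) -> List[ParseResult]:
    """Parse a batch of URLs in one call, returning a vector of results.

    One call per batch (rather than per URL) lets a native backend
    reuse buffers and cut per-call overhead.
    """
    return [urlparse(u) for u in urls]

results = parse_batch([
    "https://scrapy.org/download/",
    "http://example.com:8080/a/b?q=1#frag",
])
print(results[1].hostname, results[1].port)
```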

Here is the testing code https://github.com/sibiryakov/balancer/blob/urlbench/tools/urlbench/main.cpp

Hi @liho00, your seeds weren't injected because the strategy worker was unable to create the table `crawler:queue`. Check that it can connect to the HBase Thrift server and that the namespace `crawler` exists.

@Gallaecio it should be a tiny PR https://github.com/scrapinghub/frontera/issues/371#issuecomment-500197551

Hi @DiscipleOfOne, the right approach is to follow this guide, https://github.com/scrapy-plugins/scrapy-splash#configuration , and use `scrapy.Request` with the `splash` meta key.
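After the settings from the linked configuration guide are in place, the per-request side boils down to a `splash` dict in the request meta. A minimal sketch of that dict, built as plain data so it can be shown without a spider; the `wait` value is illustrative, and the middleware wiring (SPLASH_URL, downloader middlewares) is assumed to follow the README.

```python
def splash_meta(wait=0.5):
    """Meta dict asking the scrapy-splash middleware to render the page.

    In a spider you would pass it as:
        yield scrapy.Request(url, self.parse, meta=splash_meta())
    """
    return {
        "splash": {
            "args": {"wait": wait},     # arguments forwarded to Splash's HTTP API
            "endpoint": "render.html",  # which Splash endpoint to call
        }
    }

meta = splash_meta()
print(meta["splash"]["endpoint"])
```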

Hey @amitsing89, I think you should rebase it onto the latest master; it seems to me your code is based on an outdated version.