Adrien Barbaresi

Results: 412 comments by Adrien Barbaresi

@AdamQuadmon Are you still working on the PR?

Hi @felipehertzer, I cannot reproduce the issue: I get results for your example with the latest version of the code (from the GitHub repository). Did you make other changes?

I still cannot reproduce it: `probe_alternative_homepage()` works as expected and returns the HTML code, `https://www.australiandefence.com.au/news/news` and `https://www.australiandefence.com.au`. Besides, the lines `if response.url not in homepage and response.url != "/":` you're...

Thanks for the details. This is tricky; it may be a bug in urllib3. How do you think we can solve this?

Hi @eyupcanakman, the idea looks good, but as it stands your code isn't actually used during the extraction, so it's hard to tell what the benefit would be.

@eyupcanakman Your PR doesn't change anything in the way documents are processed, I will close it if you don't integrate it into the actual code.

@eyupcanakman It works but it doesn't make much sense to keep both conversions active, or am I getting it wrong?
- `for elem in tree.iter(CONVERSIONS.keys()):`
- `for elem in tree.iter(*_ALL_TAGS_TO_CONVERT):`...

@eyupcanakman The last change looks good but I still need to think about the PR. There is a small negative impact on the benchmark. You get more coverage if you...

Hi @hitesh1997, there was such a timeout function but the underlying `signal` library prevents use of the extract function in certain contexts, see #202 for details. You can write a...
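For context, Python's `signal` handlers can only be installed from the main thread, which is one reason a `signal`-based timeout cannot be used in every context. The following is a minimal sketch of a timeout wrapper that avoids `signal`, using only the standard library. It is an illustration and not necessarily the approach the comment above goes on to suggest; `run_with_timeout` and `slow_extract` are hypothetical names introduced here.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(func, *args, timeout=5.0, default=None, **kwargs):
    """Hypothetical helper: call func(*args, **kwargs) in a worker thread
    and return `default` if no result arrives within `timeout` seconds."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        return default
    finally:
        # Do not wait for the worker: a thread cannot be killed, so a
        # timed-out call keeps running in the background until it finishes.
        executor.shutdown(wait=False)


def slow_extract(html, delay=0.5):
    """Stand-in for a slow extraction call (assumption, for illustration only)."""
    time.sleep(delay)
    return html.upper()
```

Assuming the extraction function is `trafilatura.extract`, a call could look like `run_with_timeout(trafilatura.extract, html, timeout=10)`. The trade-off is that the worker thread is abandoned rather than stopped, so this bounds the waiting time but not the CPU time; a separate process would be needed to actually terminate a runaway call.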