Adrien Barbaresi

Results: 412 comments by Adrien Barbaresi

Hi @tkapias, as you say Gooey is not actively maintained. Easily adapting the command-line to a guided user interface is challenging and I lack time to fully test it. I...

Could you please be more specific about the tables? As for the rest you probably need to import from the corresponding submodule, e.g. `trafilatura.core`.

Hi @shivanker, extraction of main content from what is actually a summary page is tricky, but there is a bug here indeed.

I understand your point, but in this case it's code within a `p` element, which affects its processing. The software has reached a balance and although improvements are still possible...

This issue is now solved.

Sorry @drammock, my bad, it's `sphinx-rtd-theme` indeed. I'm not sure how to implement it but I'll have a look.

Note: exclude 'javascript:', 'mailto:', and 'tel:'
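For illustration only, a minimal sketch of what such an exclusion could look like, assuming the note is about dropping link targets by scheme. The helper name `is_excluded_link` and the sample links are hypothetical and not taken from the project's code.

```python
# Schemes named in the note above; matching is done on the start of the link.
EXCLUDED_SCHEMES = ("javascript:", "mailto:", "tel:")


def is_excluded_link(href: str) -> bool:
    """Return True if the link target starts with one of the excluded schemes.

    Hypothetical helper: leading whitespace and letter case are ignored.
    """
    return href.strip().lower().startswith(EXCLUDED_SCHEMES)


# Made-up sample links, only to show the filter in use.
links = [
    "https://example.org/page/2",
    "mailto:someone@example.org",
    "JavaScript:void(0)",
    "tel:+123456",
]
kept = [link for link in links if not is_excluded_link(link)]
```

Relative links such as `/page/2` carry no scheme and pass through this filter unchanged.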

It's not working, let's stick to `urllib3`.

Hi @sbusso, I'm not sure what you mean regarding the `/page/` pattern, maybe it's a documentation issue. I added tests, could you please look at the commit above and see...

I see, let's keep an eye on that.