Adrien Barbaresi
Hi @tkapias, as you say, Gooey is not actively maintained. Easily adapting the command-line to a guided user interface is challenging and I lack time to fully test it. I...
Could you please be more specific about the tables? As for the rest, you probably need to import from the corresponding submodule, e.g. `trafilatura.core`.
Hi @shivanker, extracting the main content from what is actually a summary page is tricky, but there is indeed a bug here.
I understand your point, but in this case it's code within a `p` element, which affects its processing. The software has reached a balance and although improvements are still possible...
This issue is now solved.
Sorry @drammock, my bad, it's `sphinx-rtd-theme` indeed. I'm not sure how to implement it but I'll have a look.
Note: exclude 'javascript:', 'mailto:', and 'tel:'
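A minimal sketch of what such an exclusion could look like, assuming links arrive as plain strings. The names `EXCLUDED_PREFIXES` and `is_excluded` are hypothetical and only illustrate the idea, they are not taken from the library:

```python
# Hypothetical constant: the three prefixes named in the note above.
EXCLUDED_PREFIXES = ("javascript:", "mailto:", "tel:")


def is_excluded(link: str) -> bool:
    """Return True if the link starts with one of the excluded prefixes.

    Leading whitespace and letter case are ignored (an assumption made
    for this sketch, since schemes are case-insensitive).
    """
    return link.strip().lower().startswith(EXCLUDED_PREFIXES)


links = [
    "https://example.org/page/2",
    "mailto:someone@example.org",
    "tel:+123456789",
    "JavaScript:void(0)",
    "/relative/path",
]

# Keep only the links that do not match an excluded prefix.
kept = [link for link in links if not is_excluded(link)]
```

A prefix check is enough here because all three cases are identified by the scheme at the very start of the link; relative paths and regular `http(s)` URLs pass through unchanged.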
It's not working, let's stick to `urllib3`.
Hi @sbusso, I'm not sure what you mean regarding the `/page/` pattern, maybe it's a documentation issue. I added tests, could you please look at the commit above and see...
I see, let's keep an eye on that.