trafilatura
Downloads: add support to switch between proxies
The package now supports an optional setting that routes all requests through a proxy. However, as described in this comment, it is not currently possible to switch between proxies or to enable the proxy on demand for individual requests.
Trafilatura uses two different libraries under the hood: urllib3 (standard) and pycurl (optional). The latter can be easily adapted; the former, however, uses a connection pool for performance reasons, which makes it slightly more difficult to add this new functionality.
Maybe somebody is interested in drafting a PR. CC @andremacola in case you find some time.