darc
AttributeError: 'Proxy' object has no attribute 'add_to_capabilities'
Describe the bug
When process_loader tries to go over the link pool and call darc.crawl.loader, the loader throws an exception:
loader | Traceback (most recent call last):
loader | File "/app/darc/crawl.py", line 306, in loader
loader | with request_driver(link) as driver:
loader | File "/app/darc/selenium.py", line 69, in request_driver
loader | return driver()
loader | File "/app/darc/selenium.py", line 215, in tor_driver
loader | capabilities = get_capabilities('tor')
loader | File "/app/darc/selenium.py", line 175, in get_capabilities
loader | TOR_SELENIUM_PROXY.add_to_capabilities(capabilities)
loader | AttributeError: 'Proxy' object has no attribute 'add_to_capabilities'
The issue seems to be that the Proxy class in selenium.webdriver.common.proxy no longer has an add_to_capabilities() method. I assumed that it might have been changed to to_capabilities(), as shown in the documentation, but this method seems to just return the current capabilities. Not sure how to proceed here...
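One possible workaround, sketched here with feature detection rather than a version check: call add_to_capabilities() when the proxy object still has it, and otherwise store the dict returned by to_capabilities() under the "proxy" key. The helper name apply_proxy is hypothetical and not part of darc or Selenium, and the assumption that to_capabilities() returns only the proxy settings should be checked against the installed Selenium version.

```python
from typing import Any, Dict


def apply_proxy(proxy: Any, capabilities: Dict[str, Any]) -> Dict[str, Any]:
    """Merge proxy settings into a capabilities dict on either Selenium API.

    Hypothetical helper; `proxy` is expected to be a Selenium Proxy object.
    """
    if hasattr(proxy, "add_to_capabilities"):
        # Older Selenium: the Proxy object mutates the dict in place.
        proxy.add_to_capabilities(capabilities)
    elif hasattr(proxy, "to_capabilities"):
        # Newer Selenium (assumption): the Proxy object returns only its own
        # settings, so the caller stores them under the "proxy" key.
        capabilities["proxy"] = proxy.to_capabilities()
    else:
        raise TypeError(f"unsupported proxy object: {type(proxy).__name__}")
    return capabilities
```

In get_capabilities this would replace the failing line with something like apply_proxy(TOR_SELENIUM_PROXY, capabilities).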
To Reproduce
Steps to reproduce the behavior:
- Provide darc something to start crawling on. I used the onion version of the BBC News website.
- Observe how loader starts to load links from the link pool.
- Observe how the exception is thrown and shown in the logs.
Expected behavior
- Links are loaded via selenium without exceptions
- add_to_capabilities() works.
Desktop:
- Linux PopOS
UPD: After digging a bit more, it seems that desired_capabilities is no longer available, and webdriver.ChromeOptions() could be used instead. Will test it out and update here.
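A rough sketch of what the ChromeOptions route might look like, assuming the Tor SOCKS proxy listens on 127.0.0.1:9050 (the usual Tor default, not taken from darc's configuration). The function names chrome_proxy_argument and tor_chrome_options are hypothetical. It passes Chrome's --proxy-server switch through options.add_argument() instead of building a capabilities dict by hand.

```python
def chrome_proxy_argument(host: str = "127.0.0.1", port: int = 9050,
                          scheme: str = "socks5") -> str:
    """Build Chrome's --proxy-server command-line switch.

    The default host and port are assumptions (Tor's usual SOCKS endpoint).
    """
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"--proxy-server={scheme}://{host}:{port}"


def tor_chrome_options():
    """Return ChromeOptions that route traffic through the proxy."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(chrome_proxy_argument())
    return options
```

The driver would then be created with webdriver.Chrome(options=tor_chrome_options()), with no desired_capabilities argument involved.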
Pinned selenium<4 for now.
@sejego should you have any updates on how to migrate to 4.* version, do please lemme know. Thx.