KeyError: 'usproxy'

Open windowshopr opened this issue 2 years ago • 4 comments

Describe the bug

To reproduce:

Windows 10, Python 3.7.9

  1. Install Docker desktop
  2. Run the Docker install command for scylla as recommended on the first page
  3. Navigate to http://localhost:8899/#/
  4. Wait. And wait. And wait. Still no proxies show up in the list
  5. Open a new terminal window and check the logs for errors with docker logs -f scylla
  6. See this traceback:
2022-05-25 - 12:41:37 DEBUG: create new db connection
2022-05-25 - 12:41:38 INFO: Scheduler starts...
2022-05-25 - 12:41:38 DEBUG: feed 16 providers...
2022-05-25 - 12:41:38 INFO: Start python scheduler
2022-05-25 - 12:41:38 INFO: worker_process started
2022-05-25 - 12:41:38 INFO: validator_thread started
2022-05-25 - 12:41:38 DEBUG: fetch_ips...
2022-05-25 - 12:41:38 INFO: Start the web server
[2022-05-25 12:41:38 -0700] [8] [INFO] Goin' Fast @ http://0.0.0.0:8899
2022-05-25 - 12:41:38 INFO: Start forward proxy server on port 8081
[2022-05-25 12:41:38 -0700] [8] [INFO] Starting worker [8]
2022-05-25 - 12:41:39 DEBUG: Get a provider from the provider queue: A2uProvider
2022-05-25 - 12:41:39 INFO:  A2uProvider: feed 5 potential proxies into the validator queue
2022-05-25 - 12:41:39 DEBUG: Get a provider from the provider queue: CoolProxyProvider
2022-05-25 - 12:41:39 ERROR: worker.get_html failed: Event loop is closed
2022-05-25 - 12:41:39 ERROR: worker.get_html failed: Event loop is closed
2022-05-25 - 12:41:39 ERROR: worker.get_html failed: Event loop is closed
2022-05-25 - 12:41:39 DEBUG: Get a provider from the provider queue: Data5uProvider
2022-05-25 - 12:41:41 INFO:  Data5uProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:41 DEBUG: Get a provider from the provider queue: FreeProxyListProvider
2022-05-25 - 12:41:41 INFO:  FreeProxyListProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:41 DEBUG: Get a provider from the provider queue: HttpProxyProvider
2022-05-25 - 12:41:41 ERROR: worker.get_html failed: Event loop is closed
2022-05-25 - 12:41:41 ERROR: worker.get_html failed: Event loop is closed
2022-05-25 - 12:41:41 ERROR: worker.get_html failed: Event loop is closed
2022-05-25 - 12:41:41 DEBUG: Get a provider from the provider queue: SpyMeProvider
2022-05-25 - 12:41:41 INFO:  SpyMeProvider: feed 400 potential proxies into the validator queue
2022-05-25 - 12:41:41 DEBUG: Get a provider from the provider queue: SpysOneProvider
2022-05-25 - 12:41:41 ERROR: worker.get_html failed: Event loop is closed
2022-05-25 - 12:41:41 DEBUG: Get a provider from the provider queue: IpaddressProvider
2022-05-25 - 12:41:42 INFO:  IpaddressProvider: feed 29 potential proxies into the validator queue
2022-05-25 - 12:41:43 DEBUG: Get a provider from the provider queue: ProxyListProvider
2022-05-25 - 12:41:44 INFO:  ProxyListProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:44 INFO:  ProxyListProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:44 INFO:  ProxyListProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:44 INFO:  ProxyListProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:44 INFO:  ProxyListProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:44 INFO:  ProxyListProvider: feed 0 potential proxies into the validator queue
2022-05-25 - 12:41:44 DEBUG: Request for http://proxy-list.org/english/index.php?p=7 failed, status code: 503
2022-05-25 - 12:41:44 DEBUG: Request for http://proxy-list.org/english/index.php?p=8 failed, status code: 503
2022-05-25 - 12:41:44 DEBUG: Request for http://proxy-list.org/english/index.php?p=9 failed, status code: 503
2022-05-25 - 12:41:44 DEBUG: Request for http://proxy-list.org/english/index.php?p=10 failed, status code: 503
2022-05-25 - 12:41:44 DEBUG: Get a provider from the provider queue: ProxyScraperProvider
/app/scylla/scheduler.py:36: RuntimeWarning: coroutine 'Browser.new_page' was never awaited
  continue
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/app/scylla/scheduler.py", line 39, in fetch_ips
    proxies = provider.parse(html)
  File "/app/scylla/providers/proxy_scraper_provider.py", line 19, in parse
    if not json_object or type(json_object['usproxy']) != list:
KeyError: 'usproxy'

2022-05-25 - 12:41:50 DEBUG: Catch requests.RequestException for proxy ip: 80.252.5.34
2022-05-25 - 12:41:50 DEBUG: HTTPConnectionPool(host='80.252.5.34', port=7001): Max retries exceeded with url: http://api.ipify.org/?format=json (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response')))
2022-05-25 - 12:41:54 DEBUG: Catch requests.RequestException for proxy ip: 45.184.103.113
2022-05-25 - 12:41:54 DEBUG: HTTPConnectionPool(host='45.184.103.113', port=999): Max retries exceeded with url: http://api.ipify.org/?format=json (Caused by ProxyError('Cannot connect to proxy.', RemoteDisconnected('Remote end closed connection without response')))
2022-05-25 - 12:41:56 DEBUG: Catch requests.RequestException for proxy ip: 157.100.52.146
2022-05-25 - 12:41:56 DEBUG: HTTPConnectionPool(host='157.100.52.146', port=999): Max retries exceeded with url: http://api.ipify.org/?format=json (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f875c50c190>: Failed to establish a new connection: [Errno 111] Connection refused')))
2022-05-25 - 12:41:59 DEBUG: Catch requests.Timeout for proxy ip: 54.200.36.47
2022-05-25 - 12:42:06 DEBUG: Catch requests.Timeout for proxy ip: 104.248.63.49
2022-05-25 - 12:42:06 DEBUG: Catch requests.Timeout for proxy ip: 104.248.63.15
2022-05-25 - 12:42:06 DEBUG: Catch requests.Timeout for proxy ip: 45.77.71.140
2022-05-25 - 12:42:06 DEBUG: Catch requests.Timeout for proxy ip: 104.248.63.17
2022-05-25 - 12:42:06 DEBUG: Catch requests.Timeout for proxy ip: 104.248.63.18
2022-05-25 - 12:42:08 DEBUG: Catch requests.Timeout for proxy ip: 45.170.101.2
2022-05-25 - 12:42:08 DEBUG: Catch requests.Timeout for proxy ip: 103.239.200.186
2022-05-25 - 12:42:08 DEBUG: Catch requests.Timeout for proxy ip: 185.255.47.59
2022-05-25 - 12:42:08 DEBUG: Catch requests.Timeout for proxy ip: 103.90.32.206
2022-05-25 - 12:42:08 DEBUG: Catch requests.Timeout for proxy ip: 139.99.236.128
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 138.118.200.93
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 46.249.33.93
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 203.243.51.111
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 103.42.162.50
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 84.214.150.146
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 91.150.189.122
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 27.72.149.205
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 41.174.179.147
2022-05-25 - 12:42:09 DEBUG: Catch requests.Timeout for proxy ip: 202.142.158.114

I'm wondering whether the KeyError is what stops the rest of the code from running, which would explain why all I get afterwards is a bunch of timeout errors, or whether something else is going on here. Or is the "ERROR: worker.get_html failed: Event loop is closed" message the real problem? Ideas?
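
One possible reading, offered as a sketch and not a confirmed fix: the traceback shows the KeyError escaping fetch_ips and ending Process-1, so it plausibly stops any further providers from being fetched. The condition on line 19 of proxy_scraper_provider.py indexes json_object['usproxy'] directly, which raises KeyError whenever the parsed object is non-empty but has no 'usproxy' key. The function name extract_usproxy_entries below is hypothetical and not part of scylla; it only illustrates a lookup that tolerates a missing key, assuming the provider's response is a JSON object.

```python
import json


def extract_usproxy_entries(text):
    """Return the list stored under 'usproxy', or [] if it is missing or malformed.

    Hypothetical helper for illustration only; not scylla's actual code.
    """
    try:
        json_object = json.loads(text)
    except (TypeError, ValueError):
        # Covers None input and invalid JSON (JSONDecodeError is a ValueError).
        return []
    if not isinstance(json_object, dict):
        return []
    # dict.get returns None for a missing key instead of raising KeyError.
    entries = json_object.get('usproxy')
    return entries if isinstance(entries, list) else []
```

With a lookup like this, a response that lacks 'usproxy' yields an empty list, so the worker process would move on to the next provider instead of terminating.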

windowshopr avatar May 25 '22 20:05 windowshopr

I have the same issue

gremur avatar May 30 '22 18:05 gremur

I have the same issue

agungsantoso avatar Aug 30 '22 02:08 agungsantoso

same issue

hendrikbgr avatar Sep 19 '22 09:09 hendrikbgr

Same issue

adildg avatar Sep 27 '22 22:09 adildg