scrapydweb
Timer tasks not working with auth on
With auth enabled, my timer tasks stop working.
The response visible in the task result is:
So Scrapyd is trying to send a request to ScrapydWeb, but with auth enabled ScrapydWeb expects Basic auth credentials, which Scrapyd does not add to the request headers. Is there any way to fix this? It's worth mentioning that I have deployed ScrapydWeb with gunicorn & nginx.
Any advice would be helpful.
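The 401 above is exactly what happens when a request reaches the app without an Authorization header. A minimal stdlib-only sketch of attaching the missing Basic auth header to such a request (the URL, username, and password here are placeholders, not values from this setup):

```python
import base64
import urllib.request

def basic_auth_header(username: str, password: str) -> str:
    """Build the 'Authorization: Basic ...' header value for the given credentials."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"

# Hypothetical endpoint and credentials -- substitute your own USERNAME/PASSWORD
# from the ScrapydWeb config.
req = urllib.request.Request("http://127.0.0.1:5000/1/tasks/")
req.add_header("Authorization", basic_auth_header("admin", "s3cret"))
# urllib.request.urlopen(req)  # uncomment against a live instance
```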
- Click the history button on the timer tasks page, then post the related log.
- Run scrapydweb without gunicorn & nginx and try again.
History log:
[2021-04-08 16:20:05,034] WARNING in apscheduler: Fail to execute task #1 (upplandsbrohus sthlm 10min - edit) on node 1, would retry later: Request got {'status_code': 401, 'status': 'error', 'message': "<script>alert('Fail to login: basic auth for ScrapydWeb has been enabled');</script>"}
[2021-04-08 16:20:08,039] ERROR in apscheduler: Fail to execute task #1 (upplandsbrohus sthlm 10min - edit) on node 1, no more retries: Traceback (most recent call last):
File "/var/www/html/scrapydweb/views/operations/execute_task.py", line 89, in schedule_task
assert js['status_code'] == 200 and js['status'] == 'ok', "Request got %s" % js
AssertionError: Request got {'status_code': 401, 'status': 'error', 'message': "<script>alert('Fail to login: basic auth for ScrapydWeb has been enabled');</script>"}
[2021-04-08 16:20:40,519] WARNING in apscheduler: Shutting down the scheduler for timer tasks gracefully, wait until all currently executing tasks are finished
[2021-04-08 16:20:40,521] WARNING in apscheduler: The main pid is 1267. Kill it manually if you don't want to wait
Unfortunately, running ScrapydWeb with gunicorn & nginx has created all kinds of problems for me. I hope you one day add an official way to deploy ScrapydWeb so that we don't have to create workarounds :( Without a production server I've never had issues, so I know it would work otherwise.
My understanding is that each request goes through this middleware in run.py:
@app.before_request
def require_login():
    if app.config.get('ENABLE_AUTH', False):
        auth = request.authorization
        USERNAME = str(app.config.get('USERNAME', ''))  # May be 0 from config file
        PASSWORD = str(app.config.get('PASSWORD', ''))
        if not auth or not (auth.username == USERNAME and auth.password == PASSWORD):
            return authenticate()
My only workaround so far is to change this...
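That gate simply compares the parsed Authorization header against the configured credentials; any request arriving without the header (like the scheduler's internal call) is rejected. A standalone sketch of the same logic, with plain functions standing in for the real Flask objects (names here are mine, not ScrapydWeb's):

```python
import base64
from typing import Optional, Tuple

def parse_basic_auth(header: Optional[str]) -> Optional[Tuple[str, str]]:
    """Parse an 'Authorization: Basic ...' header into (username, password)."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode()
        username, _, password = decoded.partition(":")
        return username, password
    except (ValueError, UnicodeDecodeError):
        return None

def require_login(header: Optional[str], config: dict) -> bool:
    """Mirror of the before_request gate: True means the request may proceed."""
    if not config.get("ENABLE_AUTH", False):
        return True
    auth = parse_basic_auth(header)
    expected = (str(config.get("USERNAME", "")), str(config.get("PASSWORD", "")))
    return auth == expected

config = {"ENABLE_AUTH": True, "USERNAME": "admin", "PASSWORD": "s3cret"}
good = "Basic " + base64.b64encode(b"admin:s3cret").decode()
print(require_login(good, config))  # True
print(require_login(None, config))  # False -- this is the scheduler's 401
```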
Could you debug with the following steps first?
- Run scrapydweb without gunicorn & nginx and try again.
- Run scrapydweb with gunicorn and try again.
- Run scrapydweb with nginx and try again.
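The three runs above all come down to whether the Authorization header survives the path to the app. A self-contained probe (stdlib only, with a toy server standing in for ScrapydWeb, not the real thing) showing the expected 401-without / 200-with pattern at each stage:

```python
import base64
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

EXPECTED = "Basic " + base64.b64encode(b"admin:s3cret").decode()

class AuthHandler(BaseHTTPRequestHandler):
    """Toy stand-in for the auth gate: 401 unless the Basic auth header matches."""
    def do_GET(self):
        if self.headers.get("Authorization") == EXPECTED:
            self.send_response(200)
        else:
            self.send_response(401)
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

def probe(url, auth_header=None):
    """Return the HTTP status code for a GET, optionally with an auth header."""
    req = urllib.request.Request(url)
    if auth_header:
        req.add_header("Authorization", auth_header)
    try:
        return urllib.request.urlopen(req).status
    except urllib.error.HTTPError as e:
        return e.code

server = HTTPServer(("127.0.0.1", 0), AuthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"
print(probe(url))            # 401 -- like the scheduler's unauthenticated call
print(probe(url, EXPECTED))  # 200 -- same call with the header attached
server.shutdown()
```

Probing the port that gunicorn binds directly, and then the port nginx proxies, shows at which hop the header is lost.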