XLoader Hook Authorization Failure: Worker Runs as Anonymous User
Problem
XLoader jobs complete successfully, but the task status never updates because the worker process runs as an anonymous user. The xloader_hook function fails the authorization check when it tries to update the task status, leaving jobs stuck in the "pending" state.
Current Behavior
Job executes successfully in worker
xloader_hook called to update status
check_access('xloader_submit', context, metadata) fails - no authenticated user
Task status never updates to "complete"
Root Cause
In action.py line ~139:
p.toolkit.check_access('xloader_submit', context, metadata)
Worker context lacks user authentication, causing authorization failure.
Workaround
Adding context['ignore_auth'] = True before check_access resolves the issue.
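The failure and the workaround can be reproduced in isolation with a minimal sketch. The check_access below is a simplified stand-in written for illustration, not CKAN's implementation; it assumes only that authorization fails when the context carries no user and passes when ignore_auth is set. NotAuthorized, update_task_status and run_hook are hypothetical names used for this example.

```python
class NotAuthorized(Exception):
    """Stand-in for the exception raised on a failed auth check."""


def check_access(action, context, data_dict=None):
    # Simplified stand-in (assumption): allow when auth is explicitly
    # bypassed or when the context carries a user; otherwise refuse.
    if context.get("ignore_auth") or context.get("user"):
        return True
    raise NotAuthorized(f"Anonymous user may not call {action}")


def update_task_status(context, metadata, ignore_auth=False):
    # Hypothetical reduction of the hook: authorize, then mark complete.
    if ignore_auth:
        context["ignore_auth"] = True
    check_access("xloader_submit", context, metadata)
    return "complete"


def run_hook(context, metadata, ignore_auth=False):
    try:
        return update_task_status(context, metadata, ignore_auth)
    except NotAuthorized:
        # The status update never happens, so the task stays pending.
        return "pending"
```

With an empty context, run_hook({}, {}) leaves the task "pending", while run_hook({}, {}, ignore_auth=True) reaches "complete", which matches the behavior described above.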
Has anyone encountered this issue? Any suggestions for running the jobs worker with proper authentication context, or should xloader_hook automatically ignore auth since it's an internal callback?
Starting with CKAN 2.10, you need to set an API token so that jobs can be executed against the server:
ckanext.xloader.api_token = <your-CKAN-generated-API-Token>
ckan config-tool test.ini "ckanext.xloader.api_token=$(ckan -c test.ini user token add ckan_admin xloader | tail -n 1 | tr -d '\t')"
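A quick way to confirm that the setting is actually present in a given ini file is to parse it directly. This is only a sketch: it assumes the option lives in the [app:main] section, and xloader_token_is_set is a hypothetical helper name, not part of CKAN or ckanext-xloader.

```python
import configparser


def xloader_token_is_set(ini_text, section="app:main"):
    """Return True if ckanext.xloader.api_token has a non-empty value."""
    # interpolation=None keeps values such as %(here)s from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    token = parser.get(section, "ckanext.xloader.api_token", fallback="")
    return bool(token.strip())
```

Running it against the contents of each ini file in use (for example, open("test.ini").read()) shows whether a given process would see the token at all.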
Thanks @duttonw for the response! I already have ckanext.xloader.api_token configured in my setup.
However, I debugged the xloader_hook function and noticed that data_dict doesn't contain an api_key field when the hook is called from the worker. Looking at the code in action.py around line 25, I can see that api_key is passed to the job data but not directly to the hook's data_dict.
The issue seems to be that the xloader_hook function calls:
p.toolkit.check_access('xloader_submit', context, metadata)
But the context doesn't have the API token authentication when called from the worker process.
Question: Should the API token be passed through to data_dict in the xloader_hook function, or is the current approach where I add context['ignore_auth'] = True the correct solution since this is an internal callback?
Looking at the job submission code, the api_key is included in the job data but doesn't seem to make it to the hook's context for authentication. Is this the intended behavior or should the API token be used to authenticate the hook call?
Hmm, did you set ckanext.xloader.api_token for both the xloader workers and all of the web workers?
On our website, we do see the Authorization header being included on the loopback request:
Host: www.data.qld.gov.au
User-Agent: python-requests/2.32.4
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Type: application/json
Authorization: #REMOVED#.#REMOVED#.#REMOVED#
Content-Length: 434
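For comparison, here is a rough sketch of how a callback request ends up with or without that header. It uses urllib from the standard library rather than python-requests so that it runs on its own, and it only builds the request without sending it. build_callback_request, the URL and the payload are illustrative assumptions, not the xloader code.

```python
import json
import urllib.request


def build_callback_request(url, payload, api_token=None):
    """Build (but do not send) a JSON POST for the status callback."""
    headers = {"Content-Type": "application/json"}
    if api_token:
        # Without a token the request would arrive as an anonymous user.
        headers["Authorization"] = api_token
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical URL and payload, for illustration only.
with_token = build_callback_request(
    "http://localhost:5000/api/3/action/xloader_hook",
    {"status": "complete"},
    api_token="example-token",
)
without_token = build_callback_request(
    "http://localhost:5000/api/3/action/xloader_hook",
    {"status": "complete"},
)
```

If the worker's config has no token, the request would look like without_token, and the receiving side would have no user to authorize.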
We are using the following version, which has some slight improvements, but the core logic is still the same.
CKANExtXLoader: &CKANExtXLoader
  name: "ckanext-xloader-{{ Environment }}"
  shortname: "ckanext-xloader"
  description: "CKAN Express Loader Extension"
  type: "git"
  url: "https://github.com/qld-gov-au/ckanext-xloader.git"
  version: "2.1.1-qgov.1"
https://github.com/ckan/ckanext-xloader/compare/master...qld-gov-au:ckanext-xloader:2.1.1-qgov.1
https://github.com/qld-gov-au/ckanext-xloader/compare/2.1.1-qgov.1...ckan:ckanext-xloader:master