ckanext-qa icon indicating copy to clipboard operation
ckanext-qa copied to clipboard

bad url join in celery task when site_url has a path components

Open dlax opened this issue 10 years ago • 3 comments

Consider ckan.site_url = http://somehost/ckan in CKAN configuration file. With this, when building api_url here, one gets the wrong URL http://somehost/api/action because the "ckan" path component gets dropped by urljoin. One solution would be to have a trailing / in site_url configuration option but this is apparently not recommended. So I guess some url manipulation would be needed on extension side.

Note that other extensions (such as archiver and datastorer) have the same problem.

dlax avatar Nov 03 '15 09:11 dlax

You're quite right.

How about changing it to:

api_url = urlparse.urljoin(context['site_url'] + '/', 'api/action')

Perhaps you could create some pull requests with this or similar change?

davidread avatar Nov 03 '15 10:11 davidread

Yes I could. However, I'm not sure what the proper way to fix this. Sure your suggestion would work, but since this appears to be broken in other extensions as well, I was looking for a standard way to handle this... And looking into ckan source code, I could find a few different "solutions":

  • controllers/feed.py has '/'.join([site_url, resource_path]) and also base_url + h.url_for(controller='package', ...)

  • controllers/package.py has c.datastore_api = '%s/api/action' % \ config.get('ckan.site_url', '').rstrip('/') similar things in lib/dictization/model_dictize.py or model/package.py

  • lib/cli.py has

    fetch_url = config['ckan.site_url']
    url = h.url_for(controller='package', action='read', id=dd['name'])
    url = urlparse.urljoin(fetch_url, url[1:]) + '.rdf'
    

    which, I guess, would also fail in case site_url has a path component.

Did I miss something or should we really go and fix all usages of urljoin the way you suggests?

dlax avatar Nov 03 '15 12:11 dlax

site_url shouldn't have path components, that's what root_path is for. The url_for helper can be used to generate full paths correctly, I think.

wardi avatar Nov 03 '15 12:11 wardi