ckanext-xloader icon indicating copy to clipboard operation
ckanext-xloader copied to clipboard

Unclear support for loading FTP resources

Open Chealer opened this issue 5 years ago • 4 comments

Express Loader uses the "Requests: HTTP for Humans" library, which - unsurprisingly - only supports HTTP, as it is based on urllib3, which - despite its name - does not support FTP like urllib2 did. As a result, Express Loader cannot upload FTP resources to DataStore.

I'm filing this:

  1. so that the code can be cleaned up. An error message for unsupported schemes ("Only http, https, and ftp resources may be fetched.") suggests that FTP resources are supported. This seems to come from an artifact in a merge from DataPusher via commit 9316a07b6ebd2ce5646ea56713aa2064a9828eae.
  2. so that the actual error message displayed is improved. With our CKAN 2.8.4, the actual message we're seeing is: No connection adapters were found for 'ftp://transfert.mern.gouv.qc.ca/[...]csv' status=None url=ftp://transfert.mern.gouv.qc.ca/[...]csv response=None This is not clear, but also not quite cryptic, and an experienced administrator (or at least myself) is likely to infer that installing the appropriate "connection adapter" would suffice to support FTP. But unfortunately, that doesn't seem to be possible. This message comes straight from python-requests's sessions.py, and the only adapters requests defines are HTTP adapters. Either:
    1. python-requests should change its message to simply say that the scheme is unsupported.
    2. or xloader should simply and clearly say that FTP is unsupported.

Note that Express Loader's alternative DataPusher does not support FTP since version 0.0.13.

Chealer avatar Apr 24 '20 19:04 Chealer

Wonderful work sleuthing! I wasn’t aware of the ftp support or lack of it.

‘requests’ is loads better than urllib2 - it would make no sense to support go back, just for the sake of ftp.

I’m also not terribly excited about adding ftp support. Ftp is not very good these days for open access - it places a burden on clients. I guess there are a few legacy downloads still using ftp, that are too hard to change, though. So I’d be happy if you proposed a PR to add ftp support, as long as it doesn’t make the code much more complex. I wonder if you could catch the exception from the requests call to see if it looks likely that ftp might work and then try an ftp library instead to download the file. Do you think that would work?

davidread avatar Apr 24 '20 22:04 davidread

Thank you for the quick reply David, I apologize for the unclear request; despite the initial title, I did not fully mean this as a request to reintroduce support for FTP. I think clarifying that it is not currently supported would already be great.

That being said, we do use FTP resources and expect this loss to be problematic. I will discuss this issue with my team next week and we'll try to choose our course of action, which might be to modify xloader to support FTP. If so, I will share the results.

Chealer avatar Apr 24 '20 22:04 Chealer