django-downloadview

mod_xsendfile serving files with url-encoded names

Open AdrianLC opened this issue 11 years ago • 6 comments

Hello,

This is my first time using both xsendfile and downloadview so I might be doing something wrong. I apologize if that is the case.

So I'm trying to serve files with non-ASCII characters in their names, and the file names come out all messed up.

My httpd.conf:

    Alias /priv/ /var/www/wsgi/site/priv/
    <Directory /var/www/wsgi/site/priv>
        Require all denied
        XSendFile On
        XSendFilePath /var/www/wsgi/site/priv/jobfiles
    </Directory>

settings.py:

    MIDDLEWARE_CLASSES += ('django_downloadview.SmartDownloadMiddleware', )
    DOWNLOADVIEW_BACKEND = 'django_downloadview.apache.XSendfileMiddleware'
    DOWNLOADVIEW_RULES = [
        {
            'source_url': '/priv/',
            'destination_dir': "/var/www/wsgi/site/priv",
        },
    ]

The view:

    class DownloadJobResultsView(ObjectDownloadView):
        model = Job
        file_field = 'zipped_results'

~~I've been reading the changelogs of both downloadview and mod_xsendfile and it appears that downloadview might have been upgraded too soon ???~~

Below is the latest update of xsendfile, which is still in beta and not available in the Ubuntu or Fedora repositories.

Version 1.0
    Unescape/url-decode header value to support non-ascii file names
    XSendFileUnescape setting, to support legacy applications
    X-SENDFILE-TEMPORARY header and corresponding AllowFileDelete flag
    Fix: Actually look into the backend-provided headers for Last-Modified

So, if I understand correctly, previous versions don't url-decode the file names...

EDIT: Just tried with xsendfile 1.0 built from source and I still have the problem, so... I have no idea what is happening...

Response headers:

(Status-Line)       HTTP/1.1 200 OK
Date                Thu, 22 May 2014 20:16:48 GMT
Server              Apache/2.4.9 (Fedora) PHP/5.5.12 mod_wsgi/3.4 Python/2.7.5
Content-Disposition attachment; filename=%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip; filename*=UTF-8''%255Bpbs%252Bssh%2540example.com%253A22%255D-%255B2067%255D.zip
Content-Language    en
Vary                Accept-Language,Cookie
X-Frame-Options     SAMEORIGIN
X-Sendfile          /var/www/wsgi/site/priv/jobfiles/results/%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip
Content-Length      0
Keep-Alive          timeout=5, max=98
Connection          Keep-Alive
Content-Type        application/zip; charset=utf-8

Do you know how I could fix this?

Many thanks, Adrian

AdrianLC avatar May 22 '14 19:05 AdrianLC

> This is my first time using both xsendfile and downloadview so I might be doing something wrong. I apologize if that is the case.

And I am not an xsendfile expert... But let's try to find out what's going wrong ;)

If I understood correctly, in your client, you request a download of the file [pbs+ssh@example.com:22]-[2067].zip. What is the filename shown in your client? %5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip?

Just to be sure, does your setup work fine with full-ascii filenames?

It looks like the anomaly is in this response header: Content-Disposition attachment; filename=%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip; filename*=UTF-8''%255Bpbs%252Bssh%2540example.com%253A22%255D-%255B2067%255D.zip. The filename value seems urlencoded, and the filename*=UTF-8'' value seems urlencoded twice. I think filename should contain only ASCII and filename*=UTF-8'' should be urlencoded once, because that is how django_downloadview.response.content_disposition(filename) behaves.

That said, at the moment I do not know what is the cause of the anomaly above... I think we could check 2 things:

  • what does django_downloadview.response.content_disposition('[pbs+ssh@example.com:22]-[2067].zip') return?
  • what is sent to django_downloadview.response.content_disposition with your setup? I mean, is the filename already urlencoded when it reaches django_downloadview? Or perhaps django_downloadview does double urlencoding in some way?
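A quick way to tell the single and double encoding apart is to count how many times each value can be URL-decoded before it stops changing. This is only a minimal sketch using Python 3's standard library; the helper name encoding_depth is made up for illustration and is not part of django_downloadview. The two values are copied from the Content-Disposition header in the report.

```python
from urllib.parse import quote, unquote


def encoding_depth(value, limit=5):
    """Return (layers, decoded) where layers is how many times `value`
    could be URL-decoded before it stopped changing.

    Hypothetical helper, for illustration only.
    """
    depth = 0
    while depth < limit:
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
        depth += 1
    return depth, value


# Values copied from the reported Content-Disposition header.
filename_param = "%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip"
filename_star_param = (
    "%255Bpbs%252Bssh%2540example.com%253A22%255D-%255B2067%255D.zip"
)

plain_depth, plain_name = encoding_depth(filename_param)
star_depth, star_name = encoding_depth(filename_star_param)

print(plain_depth, plain_name)
print(star_depth, star_name)
```

If both values decode to the same name but with a different number of layers, the name was most likely already urlencoded once before it reached content_disposition().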

(note: I cannot promise I will investigate today...)

benoitbryon avatar May 23 '14 09:05 benoitbryon

    >>> from django_downloadview import response
    >>> response.content_disposition('[pbs+ssh@example.com:22]-[2067].zip')
    "attachment; filename=[pbs+ssh@example.com:22]-[2067].zip; filename*=UTF-8''%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip"

If django_downloadview is given a filename that is not urlencoded, it should return the content-disposition shown above.

Another thing that could happen is: django_downloadview returns the content-disposition above, and xsendfile urlencodes it again before sending it to the client. You can check what django_downloadview returns by deactivating xsendfile in apache configuration, then perform the request again and watch the response. Since xsendfile does not handle the response, you should see django_downloadview's raw "internal redirect" response, using x-sendfile headers.

benoitbryon avatar May 23 '14 09:05 benoitbryon

Thank you for answering so soon.

> If I understood correctly, in your client, you request a download of the file [pbs+ssh@example.com:22]-[2067].zip. What is the filename shown in your client? %5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip?

Yes.

> Just to be sure, does your setup work fine with full-ascii filenames?

Yes (for the file name, please see below).

> what does django_downloadview.response.content_disposition('[pbs+ssh@example.com:22]-[2067].zip') return?

    >>> from django_downloadview import response
    >>> response.content_disposition('[pbs+ssh@example.com:22]-[2067].zip')
    "attachment; filename=[pbs+ssh@example.com:22]-[2067].zip; filename*=UTF-8''%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip"

> Another thing that could happen is: django_downloadview returns the content-disposition above, and xsendfile urlencodes it again before sending it to the client. You can check what django_downloadview returns by deactivating xsendfile in apache configuration, then perform the request again and watch the response. Since xsendfile does not handle the response, you should see django_downloadview's raw "internal redirect" response, using x-sendfile headers.

I was getting the same response even after restarting httpd with XSendfile Off and the LoadModule commented out. Turns out that my xsendfile was never enabled. I realized I was serving 0 byte files. Now, I moved the XSendFile On outside of the <Directory> and this is in my logs:

(2)No such file or directory: [client 127.0.0.1:53519] xsendfile: cannot open file: /var/www/wsgi/site/priv/jobfiles/results/%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip, referer: http://localhost/execution/list/

So... perhaps my original assumption (that xsendfile pre-1.0 doesn't url-decode) was right after all ??? Because the file is there for sure.

EDIT: I tried again with xsendfile 1.0. It serves the file with the correct size but still with the wrong name. Also, with XSendFileUnescape off (which supposedly restores the <1.0 behaviour) it throws the same "No such file" error.
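The "No such file" error can be reproduced outside Apache. The sketch below only models the file lookup with Python 3's standard library and makes no claim about how mod_xsendfile is implemented: it creates a file with the literal name in a temporary directory (an assumption standing in for jobfiles/results), then looks it up once with the percent-encoded path from the X-Sendfile header and once after url-decoding that path.

```python
import os
import tempfile
from urllib.parse import quote, unquote

# The name of the file as it exists on disk.
real_name = "[pbs+ssh@example.com:22]-[2067].zip"

with tempfile.TemporaryDirectory() as directory:
    # Create the file under its literal name.
    real_path = os.path.join(directory, real_name)
    with open(real_path, "wb") as handle:
        handle.write(b"placeholder zip bytes")

    # Path as it appeared in the X-Sendfile header: name percent-encoded.
    header_path = os.path.join(directory, quote(real_name, safe=""))

    # Lookup without decoding, as when the header value is used verbatim.
    found_raw = os.path.exists(header_path)
    # Lookup after url-decoding the header value.
    found_decoded = os.path.exists(unquote(header_path))

print(found_raw, found_decoded)
```

The lookup only succeeds after decoding, which is consistent with the file being served correctly once the header value is unescaped and with the error coming back when XSendFileUnescape is off.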

AdrianLC avatar May 23 '14 15:05 AdrianLC

Sorry, I closed it by mistake.

AdrianLC avatar May 23 '14 15:05 AdrianLC

Think I've found where the double encode is happening.

FileSystemStorage's url() method escapes the filename with django.utils.encoding.filepath_to_uri. The method is called in ProxiedDownloadMiddleware through the url property of FieldFile.
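To illustrate how that could produce the header from the original report, here is a rough sketch in plain Python 3. Both functions are stand-ins written for this example: filepath_to_uri_sketch assumes that filepath_to_uri percent-encodes everything outside a small set of safe characters, storage_url_sketch assumes url() simply prepends a base URL, and "/priv/" is taken from the source_url rule above rather than from the real MEDIA_URL.

```python
import posixpath
from urllib.parse import quote


def filepath_to_uri_sketch(path):
    # Stand-in for django.utils.encoding.filepath_to_uri (assumption:
    # it percent-encodes everything except a few safe characters).
    return quote(path.replace("\\", "/"), safe="/~!*()'")


def storage_url_sketch(name, base_url="/priv/"):
    # Hypothetical stand-in for FileSystemStorage.url():
    # base URL followed by the escaped storage name.
    return base_url + filepath_to_uri_sketch(name)


stored_name = "jobfiles/results/[pbs+ssh@example.com:22]-[2067].zip"
url = storage_url_sketch(stored_name)

# A name derived from the URL keeps the first layer of encoding...
name_from_url = posixpath.basename(url)
# ...and encoding it again for filename*= adds the second layer.
double_encoded = quote(name_from_url, safe="")

# A name derived from the storage name is encoded only once.
name_from_storage = posixpath.basename(stored_name)
single_encoded = quote(name_from_storage, safe="")

print(url)
print(double_encoded)
print(single_encoded)
```

Under these assumptions, name_from_url matches the filename value in the reported header and double_encoded matches the filename*= value, while starting from the storage name gives the single encoding that content_disposition() returns in the interactive session.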

AdrianLC avatar May 23 '14 17:05 AdrianLC

#135 Will this fix this issue?

pkaczynski avatar Oct 12 '16 11:10 pkaczynski