Uploading Large Files through Apache mod_wsgi Flask Error

Hi,

We are trying to set up a site that should support large file uploads (more than ~2 GB). We are using:

python3-mod_wsgi version: 4.7.1-2.el7
OS: CentOS 7
Flask: 2.0.2
Apache: 2.4.39

When trying to upload a file that is larger than about 100 MB (takes more than 15 seconds) we get this error in the Apache logs:

"[Mon Jan 03 10:40:46.863635 2022] [wsgi:error] [pid XXXXX] [client XXX.XX.X.XX:XXXXX] mod_wsgi (pid=XXXXX): Request data read error when proxying data to daemon process: The timeout specified has expired., referer: XXXXXX"

Here are the logs from the Flask app:

"2022-01-03 14:32:58,514[ERROR][app.py][log_exception]: Exception on / [POST]
Traceback (most recent call last):
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/wsgi.py", line 921, in read
    read = self._read(to_read)
OSError: Apache/mod_wsgi request data read error: Partial results are valid but processing is incomplete.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/formparser.py", line 142, in wrapper
    return f(self, stream, *args, **kwargs)
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/formparser.py", line 292, in _parse_multipart
    form, files = parser.parse(stream, boundary, content_length)
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/formparser.py", line 458, in parse
    for data in iterator:
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/wsgi.py", line 653, in _make_chunk_iter
    item = _read(buffer_size)
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/wsgi.py", line 923, in read
    return self.on_disconnect()
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/wsgi.py", line 893, in on_disconnect
    raise ClientDisconnected()
werkzeug.exceptions.ClientDisconnected: 400 Bad Request: The browser (or proxy) sent a request that this server could not understand.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/flask/app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/flask/app.py", line 1518, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/flask/app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/flask/app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/data/www/flask/fltr_backend/app/init.py", line 132, in upload_file
    if 'file' not in request.files:
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/utils.py", line 97, in get
    value = self.fget(obj) # type: ignore
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/wrappers/request.py", line 499, in files
    self._load_form_data()
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/flask/wrappers.py", line 113, in _load_form_data
    RequestBase._load_form_data(self)
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/wrappers/request.py", line 289, in _load_form_data
    self.mimetype_params,
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/formparser.py", line 265, in parse
    return parse_func(self, stream, mimetype, content_length, options)
  File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/formparser.py", line 150, in wrapper
    chunk = stream.read(1024 * 64)
OSError: Apache/mod_wsgi request data read error: Input is already in error state. "

This is our Apache configuration file:

"<VirtualHost *:80>
TimeOut 10800
ServerName genomefltr.tau.ac.il

WSGIDaemonProcess fltr_backend user=bioseq group=bioseq threads=5 connect-timeout=10800 request-timeout=10800 socket-timeout=10800
WSGIScriptAlias / /data/www/flask/fltr_backend/wsgi.py

<Directory /data/www/flask/fltr_backend>
    WSGIProcessGroup fltr_backend
    WSGIApplicationGroup %{GLOBAL}
    Require all granted
</Directory>
ErrorLog /data/www/flask/fltr_backend/logs/aphace/genomefltr.error_log
TransferLog /data/www/flask/fltr_backend/logs/aphace/genomefltr.access_log

</VirtualHost>"

When we run a local dev server through Flask instead, the issue disappears...

Thanks, and a happy new year!

Jan 03 '22 12:01 idotan286

You shouldn't be setting connect-timeout to such a high value. That is a really bad idea and could result in you running out of capacity, especially since the daemon process is defined with such a low thread count for this situation. Usually setting a high thread count is itself a bad idea, but with long-running requests which are just I/O and no CPU, it might be acceptable depending on how many concurrent requests you expect.

How many concurrent requests are you expecting, and does this occur even for a single request, or only when you have many?

Are you running with LogLevel info? This will result in mod_wsgi outputting more information about process restarts for the daemon processes.

In general I would say it is a bad idea to use WSGI for handling large file uploads given it is synchronous and that will impose various limitations. You would be better using an async capable web server. So perhaps create a small custom aiohttp server for handling just the upload URL.
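
As a rough illustration of that suggestion, here is a minimal sketch of an aiohttp handler that streams an upload to disk in chunks. It is only a sketch under stated assumptions, not a tested deployment: the `/upload` route, the `save_stream` and `make_app` names, the temp-file naming, and the 64 KiB chunk size are all hypothetical, and it assumes the client POSTs the raw file bytes as the request body rather than a multipart form. aiohttp is imported lazily so the chunk-copy helper can be exercised without it.

```python
import asyncio
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # 64 KiB per read; an arbitrary choice for this sketch


async def save_stream(chunks, dest_path):
    """Write an async iterable of byte chunks to dest_path; return bytes written."""
    written = 0
    with open(dest_path, "wb") as out:
        async for chunk in chunks:
            # Note: this is a blocking file write; acceptable for a sketch.
            out.write(chunk)
            written += len(chunk)
    return written


def make_app(upload_dir):
    """Build an aiohttp application with a single upload route (hypothetical)."""
    from aiohttp import web  # imported lazily; requires aiohttp to be installed

    async def handle_upload(request):
        # Hypothetical naming scheme: one temp file per upload.
        fd, dest_path = tempfile.mkstemp(dir=upload_dir, suffix=".upload")
        os.close(fd)
        # request.content is the body stream; read it piece by piece
        # instead of loading the whole upload into memory.
        size = await save_stream(request.content.iter_chunked(CHUNK_SIZE), dest_path)
        return web.json_response({"bytes": size})

    app = web.Application()
    app.router.add_post("/upload", handle_upload)
    return app


def self_check():
    """Exercise save_stream without aiohttp, using a fake chunk source."""
    async def fake_chunks():
        yield b"abc"
        yield b"defg"

    path = os.path.join(tempfile.mkdtemp(), "check.bin")
    return asyncio.run(save_stream(fake_chunks(), path)), path
```

To serve it, something along the lines of `web.run_app(make_app("/some/upload/dir"), port=8081)` should work, with Apache proxying only the upload URL to that port. The directory and port here are placeholders.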

Jan 03 '22 23:01 GrahamDumpleton

Hi, we changed to LogLevel info, but there doesn't seem to be much more information around this error. Here is the new log: [image: err]

We are still getting this error too in the Flask app log:

File "/data/www/flask/fltr_backend/venv/lib/python3.6/site-packages/werkzeug/wsgi.py", line 921, in read
    read = self._read(to_read)
OSError: Apache/mod_wsgi request data read error: Partial results are valid but processing is incomplete.

We also changed the Apache configuration:

<VirtualHost *:80>
TimeOut 7200
ServerName **********

WSGIDaemonProcess fltr_backend user=bioseq group=bioseq threads=5 request-timeout=7200
WSGIScriptAlias / /data/www/flask/fltr_backend/wsgi.py
KeepAlive On
KeepAliveTimeout 20000

<Directory /data/www/flask/fltr_backend>
    WSGIProcessGroup fltr_backend
    WSGIApplicationGroup %{GLOBAL}
    Require all granted
</Directory>
ErrorLog /data/www/flask/fltr_backend/logs/aphace/genomefltr.error_log
TransferLog /data/www/flask/fltr_backend/logs/aphace/genomefltr.access_log

With CGI this server is able to handle much larger requests, but with mod_wsgi it breaks down at very small sizes (even 15 MB). We think whether an upload succeeds depends on how long it takes rather than on the actual file size.

Thank you for your time.

Jan 24 '22 11:01 elyawy

You left out the useful part of the logs. The timestamps and the bit just before what you showed. That forced restart is significant and the messages before that should say why it occurred.

Jan 24 '22 11:01 GrahamDumpleton

It reloaded because I saved the wsgi script, which prompted a forced reload. Here it is with the timestamps and the previous line:

[Mon Jan 24 13:13:31.468681 2022] [wsgi:error] [pid 24951] [remote ...:] <Request 'http://.../' [GET]>
[Mon Jan 24 13:14:36.495367 2022] [wsgi:info] [pid 24951] [remote ...:] mod_wsgi (pid=24951): Force restart of process 'fltr_backend'.
[Mon Jan 24 13:14:36.495739 2022] [wsgi:info] [pid 24951] mod_wsgi (pid=24951): Shutdown requested 'fltr_backend'.
[Mon Jan 24 13:14:36.496344 2022] [wsgi:info] [pid 21353] [client ...:] mod_wsgi (pid=21353): Connect after WSGI daemon process restart, attempt #1.
[Mon Jan 24 13:14:36.496634 2022] [wsgi:info] [pid 24951] mod_wsgi (pid=24951): Stopping process 'fltr_backend'.
[Mon Jan 24 13:14:36.496651 2022] [wsgi:info] [pid 24951] mod_wsgi (pid=24951): Destroying interpreters.
[Mon Jan 24 13:14:36.496660 2022] [wsgi:info] [pid 24951] mod_wsgi (pid=24951): Cleanup interpreter ''.
[Mon Jan 24 13:14:36.496935 2022] [wsgi:info] [pid 24951] mod_wsgi (pid=24951): Terminating Python.
[Mon Jan 24 13:14:36.718211 2022] [wsgi:info] [pid 24951] mod_wsgi (pid=24951): Python has shutdown.
[Mon Jan 24 13:14:36.718266 2022] [wsgi:info] [pid 24951] mod_wsgi (pid=24951): Exiting process 'fltr_backend'.
[Mon Jan 24 13:14:37.465950 2022] [wsgi:info] [pid 28763] mod_wsgi (pid=28763): Attach interpreter ''.
[Mon Jan 24 13:14:37.480856 2022] [wsgi:info] [pid 28763] mod_wsgi (pid=28763): Imported 'mod_wsgi'.
[Mon Jan 24 13:14:37.481588 2022] [wsgi:info] [pid 28763] [remote ...:] mod_wsgi (pid=28763, process='fltr_backend', application=''): Loading Python script file '/data/www/flask/fltr_backend/wsgi.py'.
[Mon Jan 24 13:15:39.446649 2022] [wsgi:error] [pid 21366] [client ...:] mod_wsgi (pid=21366): Request data read error when proxying data to daemon process: The timeout specified has expired., referer: http://...*/'

Jan 24 '22 11:01 elyawy

The error:

Request data read error when proxying data to daemon process: The timeout specified has expired., referer: XXXXXX"

is actually generated by the Apache child process, not the mod_wsgi daemon process. When it says read error, it is therefore talking about reading from the HTTP client, so this error usually indicates that the HTTP client stopped sending data. The problem is that you have Timeout set to 7200, which is a ridiculously high 2 hours. Setting it this high is generally really bad practice: it isn't a timeout on how long the whole request takes, just on a single socket read/write. The default of 60 seconds is generally more appropriate, and it could be argued that even 60 seconds is too high.
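
To make the distinction between a per-read timeout and a whole-request timeout concrete, here is a small Python sketch using plain sockets. It is an analogy only and does not use Apache or mod_wsgi; the `read_with_idle_timeout`, `slow_sender`, and `demo` names and all the timings are made up for illustration. A transfer that takes longer in total than the timeout still succeeds as long as each individual read gets data in time, while a sender that goes quiet trips the timeout quickly.

```python
import socket
import threading
import time


def read_with_idle_timeout(sock, idle_timeout):
    """Read until EOF, or until a single recv() waits longer than idle_timeout."""
    sock.settimeout(idle_timeout)  # applies to each recv(), not the whole transfer
    total = 0
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                return total, "eof"
            total += len(chunk)
    except socket.timeout:
        return total, "timed out"


def slow_sender(sock, chunks, gap, stall):
    """Send 100-byte chunks with a pause after each; optionally go silent after."""
    for _ in range(chunks):
        sock.sendall(b"x" * 100)
        time.sleep(gap)
    if not stall:
        sock.close()  # clean end of stream
    # If stall is true the socket stays open and nothing more is sent.


def demo(idle_timeout, chunks, gap, stall):
    """Run one transfer and return (bytes_read, status, elapsed_seconds)."""
    sender, receiver = socket.socketpair()
    worker = threading.Thread(target=slow_sender, args=(sender, chunks, gap, stall))
    start = time.monotonic()
    worker.start()
    total, status = read_with_idle_timeout(receiver, idle_timeout)
    elapsed = time.monotonic() - start
    worker.join()
    sender.close()
    receiver.close()
    return total, status, elapsed
```

For example, `demo(idle_timeout=0.5, chunks=12, gap=0.05, stall=False)` finishes with status `"eof"` even though the transfer as a whole lasts longer than 0.5 seconds, whereas `demo(idle_timeout=0.2, chunks=2, gap=0.01, stall=True)` ends with `"timed out"` once the sender stops sending.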

If this is reliably reproducible on a test server, I would suggest running with settings:

LogLevel debug
WSGIVerboseDebugging On

This may help in trying to correlate the error with something happening in the corresponding daemon process which is supposed to handle it, or whether it may relate to large request content specifically when a daemon process restart is triggered.

Jan 25 '22 05:01 GrahamDumpleton

Hi, we are getting the same error. @elyawy were you able to get this fixed? If so, how?

For clarity, we are getting the following error: Apache/mod_wsgi request data read error: Input is already in error state.

Jun 02 '22 19:06 michaelakin

Also @GrahamDumpleton Please forgive my ignorance, but where do you set these settings for verbose logging? I tried searching the docs but could not find it.

LogLevel debug
WSGIVerboseDebugging On

Jun 02 '22 19:06 michaelakin

You put those directives inside the VirtualHost if you have per-VirtualHost error logs, or you could also set them at global scope in the Apache configuration.

Jun 04 '22 07:06 GrahamDumpleton

@michaelakin Hi, for us it was an issue with the Apache version. After updating to a newer version the problem vanished. It was probably related to the bug described here: https://bz.apache.org/bugzilla/show_bug.cgi?id=63617 @GrahamDumpleton Sorry we didn't keep you informed.

Thank you for all the help.

Jun 06 '22 06:06 elyawy
