wsgi-request-logger
Empty files when request logger enabled.
I'm serving dynamically generated files with my WSGI application and streaming them to the client. When I apply the wsgi-request-logger middleware, the files arrive empty. The reason is that when calculating the content length, the middleware calls len() on the joined retval, which on a file-like object moves the internal pointer to the end of the file.
# file /__init__.py, line 51
content_length = content_lengths[0] if content_lengths else len(b''.join(retval))
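To see why this empties the response, here is a minimal sketch using io.BytesIO as a stand-in for the streamed file-like object:

```python
import io

f = io.BytesIO(b"Some data...")
# b''.join(f) iterates the stream to compute the length...
length = len(b''.join(f))
print(length)    # 12
# ...which leaves the read pointer at EOF, so the client receives nothing.
print(f.read())  # b''
```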
A minimal example with flask, where the middleware breaks the application:
# file /main.py
import logging

from flask import Flask, send_file

import requestlogger

app = Flask(__name__)
handler = logging.StreamHandler()
app.wsgi_app = requestlogger.WSGILogger(app.wsgi_app, [handler], requestlogger.ApacheFormatter())

@app.route('/file', methods=['GET'])
def serve_file():
    f = open('data', 'rb')
    return send_file(f, mimetype="application/octet-stream")

if __name__ == '__main__':
    app.run(port=4500, debug=True)
# file /data
Some data...
This application serves an empty file. Commenting out the middleware serves the file properly.
Thanks for the detailed bug report. You're right, the behaviour is not optimal and I will improve it. I will fix it in wsgi-request-logger, but you can work around the issue by setting the "Content-Length" header in your application. If you call send_file() with a file name instead of a file-like object, this is done automatically. If you really need to pass send_file() a file-like object, you can also set the Content-Length yourself (and you should, if there is any way of knowing it at the moment you create the stream, so that browsers can display an ETA). In your example, this could be done like this:
import os

@app.route('/file', methods=['GET'])
def serve_file():
    f = open('data', 'rb')
    sf = send_file(f, mimetype="application/octet-stream")
    sf.headers['Content-Length'] = str(os.path.getsize('data'))
    return sf
Concerning how to fix it in wsgi-request-logger, I'm thinking about something like this:
if content_lengths:
    content_length = content_lengths[0]
elif all(hasattr(retval, attr) for attr in ('seek', 'tell')):
    retval.seek(0, 2)
    content_length = retval.tell()
    retval.seek(0)
elif hasattr(retval, 'file') and hasattr(retval.file, 'seek') and hasattr(retval.file, 'tell'):
    retval.file.seek(0, 2)
    content_length = retval.file.tell()
    retval.file.seek(0)
else:
    content_length = len(b''.join(retval))
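As a quick illustration of the seek/tell trick used above, this sketch measures a seekable stream's size without consuming it (io.BytesIO stands in for the response; measured_length is a hypothetical helper, not part of wsgi-request-logger):

```python
import io

def measured_length(stream):
    """Measure a seekable stream's size without consuming it."""
    stream.seek(0, 2)     # whence=2: jump to end of stream
    size = stream.tell()  # byte offset at EOF == total size
    stream.seek(0)        # rewind so the real consumer still gets the data
    return size

buf = io.BytesIO(b"Some data...")
print(measured_length(buf))  # 12
print(buf.read())            # b'Some data...' -- content still intact
```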
Not optimal yet... If you're really streaming content whose size you don't know at the time you start serving it (e.g. live content such as MJPEG frames from a webcam), the seek(0, 2) will most probably block. So what should the log file contain as the file size in those cases? I'm not sure.
Let me know what you think and thanks again for reporting the issue.
An alternative is to delay logging until the response is closed; you have to do some counting while the response streams out. I was just hit by this and am now looking for an alternative.
A simple POC:
from timeit import default_timer as clock  # time.clock() was removed in Python 3.8

def __call__(self, environ, start_response):
    start = clock()
    status_codes = []
    content_lengths = []

    def custom_start_response(status, response_headers, exc_info=None):
        status_codes.append(int(status.partition(' ')[0]))
        for name, value in response_headers:
            if name.lower() == 'content-length':
                content_lengths.append(int(value))
                break
        return start_response(status, response_headers, exc_info)

    retval = self.application(environ, custom_start_response)
    try:
        counter = 0
        for i in retval:
            yield i
            counter += len(i)
    finally:
        runtime = int((clock() - start) * 10**6)
        content_length = content_lengths[0] if content_lengths else counter
        msg = self.formatter(status_codes[0], environ, content_length,
                             ip_header=self.ip_header, rt_us=runtime)
        self.logger.info(msg)
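The counting idea can be exercised in isolation. A minimal sketch, with counting_middleware as a hypothetical stand-in for the wrapped __call__: bytes are tallied as they stream past, and the total is reported only once the iterable is exhausted, so no Content-Length knowledge is needed up front.

```python
def counting_middleware(app_iter, record):
    """Yield chunks unchanged; report the total byte count when done."""
    total = 0
    try:
        for chunk in app_iter:
            total += len(chunk)
            yield chunk
    finally:
        # Runs once the response is fully consumed (or the client disconnects),
        # mirroring the delayed-log idea above.
        record(total)

sizes = []
body = iter([b"abc", b"defg"])
print(b"".join(counting_middleware(body, sizes.append)))  # b'abcdefg'
print(sizes)                                              # [7]
```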