RESPONSE.write does not stream data in WSGI
What seems to be a regression in the ZServer -> WSGI migration is that RESPONSE.write no longer sends the response data to the client in a streaming way, but instead sends everything at once when the published method returns. For example, consider this Script (Python) (with id stream_test at the root of the application):
    request = container.REQUEST
    response = request.RESPONSE
    for i in range(1000):
        for x in range(1000):
            y = x ** x  # this introduces a delay
        response.write(b'.')
If I use curl --no-buffer http://127.0.0.1:8080/stream_test, with ZServer we can see the . characters printed one after the other, but with WSGI we see all of the ..... at once after a delay.
RESPONSE.write is (still) documented in the Zope book as a way to stream data: https://github.com/zopefoundation/Zope/blob/e9f10101b8a4db64a2440e3a51561a7805ea44f5/docs/zdgbook/ObjectPublishing.rst#L1258-L1267
It's also used in OFS.File to stream file chunks: https://github.com/zopefoundation/Zope/blob/b9a346903fe4a6874face0930876184ac321853e/src/OFS/Image.py#L279-L298
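For reference, the pattern in the linked OFS.Image code is roughly the following (a simplified sketch, not a verbatim copy): the file body is stored as a linked chain of Pdata records, and each record is written to the response as soon as it is loaded, so the whole file never has to sit in memory at once.

    # Simplified sketch of the streaming pattern in OFS/Image.py:
    # walk the Pdata chain and push each chunk to the response.
    data = self.data
    if isinstance(data, bytes):          # small files are stored as plain bytes
        response.write(data)
    else:
        while data is not None:          # large files: linked Pdata records
            response.write(data.data)    # write one chunk
            data = data.next             # previous chunk can be garbage collected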
The WSGIResponse.write docstring also states that "This allows the browser to display partial results while computation of a response to proceed", but that is currently not the case.
Jérome Perrin wrote at 2019-6-24 19:07 -0700:
...
Your example requires a lot of support (which may not all be in place):
In order to use Response.write, usually the final Content-Length must be known. The reason for this: even if the response is streamed, the response header must be produced before the body starts. Without a valid content length, the server must either use a chunked response or close the communication channel to indicate that the response is finished. For efficiency reasons, the server (component) might cache some output before it decides what to do.
WSGI itself provides strong control over the delivery of response parts; thus, you can precisely control streaming in principle. However, WSGI support for Response.write requires a special (WSGI) interface method, the use of which WSGI discourages. The ZPublisher integration may support only the typical use case for Response.write (avoiding the construction of a huge structure in memory) and not precise streaming control.
Note that, in general, Response.write should not wait until the chunk is delivered (which is what WSGI's streaming control would enforce).
Note that ZServer (unlike WSGI) does not provide precise streaming control: Response.write simply hands a body part to ZServer, which may deliver it at its discretion.
Thanks for the feedback, I did not know about that "discouraged interface". I assume this is the "deprecated method" from the examples in https://stackoverflow.com/a/16775731, i.e.:
    def application(environ, start_response):
        write = start_response(status, headers)
        write('content block 1')
        write('content block 2')
        write('content block 3')
        return None
If I try a quick and dirty patch doing something like this:

    response.write = start_response(status, headers)
    ...

then the response data seems to be properly sent chunk by chunk with waitress, and other things look like they are working ... but that's a discouraged approach.
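For completeness, here is a minimal standalone WSGI app showing the same mechanism outside Zope (a sketch, assuming any server that implements the PEP 3333 write callable, e.g. waitress; whether each chunk reaches the client immediately is up to the server):

    # Minimal sketch: streaming via the (discouraged) write callable
    # that start_response returns per PEP 3333.
    import time

    def application(environ, start_response):
        write = start_response('200 OK', [('Content-Type', 'text/plain')])
        for _ in range(10):
            write(b'.')        # may be pushed to the client right away
            time.sleep(0.5)    # simulates per-chunk computation
        return []              # body was already produced via write()

    # To try it:  from waitress import serve; serve(application)
    # then observe with:  curl --no-buffer http://127.0.0.1:8080/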
About the Content-Length header: methods can set this header before starting to write; this is what OFS.Image does: https://github.com/zopefoundation/Zope/blob/b9a346903fe4a6874face0930876184ac321853e/src/OFS/Image.py#L264
Not sending a Content-Length header and letting the client wait for the server to close the connection is also valid, in my understanding of https://tools.ietf.org/html/rfc7230#section-3.3.3.
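A minimal sketch of that pattern (the _iter_chunks helper is hypothetical; the point is only the ordering: set the header before the first write):

    # Sketch: declare the total size up front, then stream the body.
    # With a valid Content-Length the server needs neither chunked
    # transfer coding nor a connection close to delimit the response.
    def download(self, REQUEST, RESPONSE):
        RESPONSE.setHeader('Content-Type', 'application/octet-stream')
        RESPONSE.setHeader('Content-Length', str(self.size))  # known in advance
        for chunk in self._iter_chunks():  # hypothetical chunk source
            RESPONSE.write(chunk)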
The use case I have in mind is the one you could guess: preventing the server from having to keep the full structure in memory, as OFS.Image does. Sending the data in chunks with exactly the same granularity as the chunks written to the response is not so important, I guess.
I'm still wondering if there's a way for WSGIPublisher.publish_module to return a generator that would yield from what is "pushed" by response.write and then what the published method returns ... I assume this would be the recommended WSGI way.
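One conceivable shape for that, purely illustrative (all names are mine, this is not how WSGIPublisher works; it costs a dedicated thread per request and omits error handling):

    # Illustrative sketch: turn push-style response.write into a
    # pull-style WSGI iterable via a queue and a worker thread.
    import queue
    import threading

    _DONE = object()  # sentinel marking the end of the body

    def streaming_publish(publish, request, response):
        chunks = queue.Queue()
        response.write = chunks.put        # hypothetical: writes go into the queue

        def run():
            try:
                body = publish(request)    # classic call stack, writes as it goes
                if body:
                    chunks.put(body)
            finally:
                chunks.put(_DONE)          # exceptions are swallowed in this sketch

        threading.Thread(target=run).start()
        while True:                        # the WSGI server iterates over this generator
            chunk = chunks.get()
            if chunk is _DONE:
                return
            yield chunk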
Hi @perrinjerome, when I was going through the documentation at the Zope sprint last month I am pretty sure I tried to run the code in the example, but obviously I did not check the "streaming" feature. I am sorry!
Once a proper solution is found, please update the documentation accordingly.
@d-maurer Thanks for your explanation. My Zope app once had a newsletter module which streamed the response and stopped working at some point, even under Zope 2.13; your hint may explain why. Meanwhile the newsletter module has been deleted anyway.
I'm still wondering if there's a way for WSGIPublisher.publish_module to return a generator that would yield from what is "pushed" by response.write and then what the published method returns ... I assume this would be the recommended WSGI way.
I don't see how that can be done. I know of nothing in Python that can turn a classic call stack (from mapply to RESPONSE.write) into a kind of coroutine.
then the response data seems to be properly sent chunk by chunk with waitress, and other things look like they are working ... but that's a discouraged approach.
Then, if there's no approach that isn't discouraged, let's rather remove RESPONSE.write.
With the switch from Medusa to WSGI, we already broke all applications that stream data and I suppose many have already switched to stream iterators.
Another problem with RESPONSE.write is that it's easy to use it for something other than streaming data, and that's a bad idea. I mean it allows strange things like both calling write() and returning data, and then which should be output first? If write() does not stream anymore, it would be better to reply properly if any exception happens after a call to write(), i.e. ignore all calls to write() (currently, we can get the strange behaviour that some written data is followed by a truncated error page). To sum up, RESPONSE.write is useless and error-prone.
Julien Muchembled wrote at 2019-11-5 08:12 -0800:
... Then, if there's no approach that isn't discouraged, let's rather remove RESPONSE.write.
I still (current master) see a use of Response.write in OFS.Image.File (line 388 and others). It is used there to decouple the ZODB use from the delivery of the file content -- such that downloading a huge file does not need a ZODB connection during the whole download time.
In my view, we should think about removing Response.write only after at least this use case has another good solution.
We are talking about the fact that your use case is already broken: with WSGI, it makes no difference between:

- using Response.write
- making your OFS.Image.File code instantiate an io.BytesIO() object, fill it and return its .getvalue()

BTW, I am surprised by your example because one usually prefers the contrary: stream data directly from the DB to the user agent, without reading everything first into RAM or disk (someone pausing the download of a big file should cost nothing besides a few file descriptors). For this use case, there's nothing left: Response.write is broken and existing iterators are processed after the DB connection is closed. That's actually the first thing on which we plan to work (a discussion was started at https://lab.nexedi.com/nexedi/erp5/merge_requests/883#note_84239).
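To illustrate the claim with a hedged sketch (under the current WSGI integration, both variants hold the whole body in memory before anything reaches the client):

    import io

    # Variant 1: push chunks through the response object.
    def with_write(RESPONSE, chunks):
        for chunk in chunks:
            RESPONSE.write(chunk)      # buffered, not streamed, under WSGI

    # Variant 2: accumulate everything and return it at the end.
    def with_bytesio(chunks):
        buf = io.BytesIO()
        for chunk in chunks:
            buf.write(chunk)
        return buf.getvalue()          # whole body built in memory, then returned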
In my view, we should think about removing Response.write only after at least this use case has another good solution.
Sure. It's premature to clean up code if there's no other good solution anyway. But my comment is rather meant to announce the direction I'd like us to choose.
Julien Muchembled wrote at 2019-11-5 11:32 -0800:
We are talking about the fact that your use case is already broken: with WSGI, it makes no difference between:

- using Response.write
- making your OFS.Image.File code instantiate an io.BytesIO() object, fill it and return its .getvalue()
Then, likely, the WSGI interface should improve!
BTW, I am surprised by your example because one usually prefer the contrary: stream data directly from the DB to the user agent, without reading everything first into RAM or disk (someone pausing the download of a big file should cost nothing besides a few file descriptors).
Of course, this requires a database connection -- during the whole delivery of the file content. The required time depends on the bandwidth, which is not under the control of the server. With Zope, the database connection management is implicit: as long as a database connection is open, a worker thread is blocked. It is therefore vital to make the time the connection needs to stay open independent of things not under server control.
The ZServer interface was aware of these issues and addressed them (by transparently buffering, if necessary). The WSGI interface should do so as well.
In my opinion, Response.write is an adequate interface for this.
It looks like the future is ASGI, and if one solution for this current issue brings us closer to ASGI, we should choose it. I looked at the WSGI compatibility of https://github.com/encode/uvicorn and their start_response does not return a write callable, probably because PEP 3333 deprecates this mechanism. Changing Uvicorn to return write would be trivial, but I wouldn't be surprised if they refused such a patch.
IOW, there are WSGI implementations that simply don't implement the write API, and if we don't have good arguments to convince all of them to implement it, we had better give up now.
The ZServer interface was aware of these issues and addressed them (by transparently buffering, if necessary).
I didn't know about the transparent buffering for content bigger than 128k (each chunk bigger than 200 bytes goes via a temporary file). That's actually something we don't want, so it should not be hardcoded. By trying to avoid reaching the maximum number of ZODB connections, you get a lot of IO and you could fill the disk. That can also cause useless workload if the client decides after all to cancel its request.
I don't see the problem with increasing the maximum number of ZODB connections and workers (currently threads, but not necessarily in the future).
I've just looked at the code of waitress (that's what we chose for WSGI) and it does the same kind of transparent buffering as ZServer. What's strange is that it also buffers to file for unbounded streams. All this looks bad to me :(
Julien Muchembled wrote at 2019-11-14 07:32 -0800:
...
The ZServer interface was aware of these issues and addressed them (by transparently buffering, if necessary).

I didn't know about the transparent buffering for content bigger than 128k (each chunk bigger than 200 bytes goes via a temporary file). That's actually something we don't want, so it should not be hardcoded.
The buffering is an elegant way to decouple request processing and response delivery. Decoupling is a good thing as response delivery can take arbitrarily long (depending on actions on the client and the available bandwidth).
By trying to avoid reaching the maximum number of ZODB connections, you get a lot of IO and you could fill the disk.
Otherwise, you use a large number of ZODB connections (limited by the availability of file descriptors), and those are much rarer than disk space.
If you do not decouple response delivery and request processing, I can easily design a DOS attack: perform a large number of requests and refuse to read the responses. Thus, you must have some form of decoupling - disk buffering is one of the easiest ways.
Otherwise, you use a large number of ZODB connections (limited by the availability of file descriptors)
I was wrong in a previous comment. A ZODB Connection (or a thread) does not use any file descriptor (what may use file descriptors is the ZODB Storage from which Connections get data, and the number of file descriptors does not depend on the number of Connections). You may argue that this is specific to ZODB and that other DBs like MariaDB would use one socket per connection.
On the other hand, current implementations of transparent buffering use one fd per request (for the temporary file). Sure, this could be improved with something more complex, like a single sqlite3 DB buffering data for all requests, and even freeing data as soon as it's sent to the client.
I can easily design a DOS attack: perform a large number of requests and refuse to read the responses.
Currently, it looks like the decoupling was not implemented for this reason: at least the fd argument is in favor of my position.
Anyway, that's not where we can protect from a DoS attack. By not decoupling, it is indeed trivial to reach the maximum number of requests if one finds a request that returns too much data. With decoupling, 1 request finishes from time to time, giving some client a chance to get a slot, but frankly, the attacker can simply continue to open requests. I doubt decoupling helps significantly here (I mean the site is down for most users), whereas the cost is high: disk filled, wasted IO/CPU to fill the buffer.
Julien Muchembled wrote at 2019-11-14 15:51 -0800:
Otherwise, you use a large number of ZODB connections (limited by the availability of file descriptors)

I was wrong in a previous comment. A ZODB Connection (or a thread) does not use any file descriptor

I am aware of this. My remark "limited by the availability of file descriptors" refers to my DoS scenario -- create lots of requests and refuse to read the responses -- each such request uses a connection to the server and thus a file descriptor.
... Currently, it looks like the decoupling was not implemented for this reason: at least the fd argument is in favor of my position.
OFS.Image.File uses Response.write in order to decouple the request processing (bound to one of very few worker threads) from the response delivery (with its timing completely outside the control of the server).
When I looked at the ZServer code, I got the impression that many of its design decisions were influenced by the uncontrollable delivery time; among them the disk buffering for large responses. Think about it: if the server produces more data than the client is ready to consume, something must be done with the excess. Something like ZServer can either drop it (potentially punishing well-behaved clients in unfavorable situations (small bandwidth)) or keep it. Apparently, ZServer decided to keep it - and to use disk space as a relatively cheap resource for this.
I am aware that you would like, e.g. in the case of OFS.Image.File, to use a stream iterator in order to interleave reading from the ZODB and delivering partial data (and thus avoiding large intermediate data). This would require a new ZODB connection (not associated with the current request) and nontrivial transaction handling (to ensure that the new ZODB connection sees potential modifications by the current request). It can be done, but it is considerably more difficult than the use of Response.write.
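A rough sketch of what such a stream iterator might look like (all names hypothetical; it deliberately ignores the transaction-visibility problem described above, which is the hard part):

    # Hypothetical: an iterator that owns a second ZODB connection, so the
    # request's connection and worker thread can be released before delivery.
    class ZODBStreamIterator:

        def __init__(self, db, oid):
            self._conn = db.open()              # independent of the request
            self._pdata = self._conn.get(oid)   # assumed head of a Pdata-like chain

        def __iter__(self):
            return self

        def __next__(self):
            if self._pdata is None:
                self.close()
                raise StopIteration
            chunk = self._pdata.data
            self._pdata = self._pdata.next
            return chunk

        def close(self):                        # WSGI servers call close() on the iterable
            if self._conn is not None:
                self._conn.close()
                self._conn = None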
Anyway, that's not where we can protect from a DoS attack. By not decoupling, it is indeed trivial to reach the maximum number of requests if one finds a request that returns too much data. With decoupling, 1 request finishes from time to time, giving some client a chance to get a slot, but frankly, the attacker can simply continue to open requests. I doubt decoupling helps significantly here (I mean the site is down for most users), whereas the cost is high: disk filled, wasted IO/CPU to fill the buffer.
The decoupling was likely not done to prevent DoS attacks but to ensure satisfactory behaviour despite delivery odds (like low bandwidth, temporary network problems, etc.) at low cost (use disk space rather than RAM).
A ZODB Connection (or a thread) does not use any file descriptor (what may use file descriptors is the ZODB Storage from which Connections get data, and the number of file descriptors does not depend on the number of Connections)
That depends on the storage implementation. A storage that natively implements IMVCCStorage has a new instance created for each Connection. Those new instances may require new sockets. RelStorage is an IMVCCStorage and may require up to two sockets per ZODB Connection.
Oops, I forgot RelStorage.
bound to one of very few worker threads
Because of the GIL and the connection cache, one may indeed limit the number of threads within a process that serves heterogeneous requests. Sometimes it's possible to specialize processes using a balancer, and then you can configure as many threads as you want, with a small connection cache to avoid wasting RAM.
(Given everything that has been said previously, I assume FDs aren't an issue: there's already one for the socket to the client, so an extra one for the DB is in the same order of magnitude; current buffering implementations are bad; and the limit on FDs can be increased.)
I am aware that you would like, e.g. in the case of OFS.Image.File, to use a stream iterator in order to interleave reading from the ZODB and delivering partial data (and thus avoiding large intermediate data). This would require a new ZODB connection (not associated with the current request) and nontrivial transaction handling (to ensure that the new ZODB connection sees potential modifications by the current request). It can be done, but it is considerably more difficult than the use of Response.write.
This does not require a new ZODB connection. This can be done with the same connection as the one used for the request, by postponing the closing of the connection until after the iteration is over. This is trivial with the deprecated WSGI write API. With the recommended WSGI API, it is a bit tricky, as you can see at https://lab.nexedi.com/nexedi/erp5/commit/4ca4d7f65393d2caf08abfa774ef9a52b5497f63: the iterator is wrapped within an auto_close_app_iter function that closes the ZODB connection in a finally clause, plus something else to switch to this way of closing.
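The shape of that wrapping is roughly the following (a sketch of the idea, not the actual commit):

    # Keep the ZODB connection open while the WSGI server iterates over
    # the body; close it only when iteration ends or the client aborts.
    def auto_close_app_iter(app_iter, zodb_connection):
        try:
            for chunk in app_iter:
                yield chunk
        finally:
            # runs on normal exhaustion and also when the WSGI server
            # calls close() on the generator (e.g. client disconnect)
            zodb_connection.close()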
You also seem to mix up the use of write and buffering. These are two independent things (modulo what I wrote in the previous paragraph). Removing Response.write does not prevent anything: it remains possible to buffer or not.
I didn't know that ZServer and waitress already do buffering, which means that buffering is off-topic (if I want to get this feature removed, I'll discuss it elsewhere), and that WSGIPublisher does redundant buffering.
So the way I'd like this issue to be resolved is, at least:

- removing Response.write
- removing the use of BytesIO from WSGIPublisher.py
Julien Muchembled wrote at 2019-11-18 11:04 -0800:
...
I am aware that you would like, e.g. in the case of OFS.Image.File, to use a stream iterator in order to interleave reading from the ZODB and delivering partial data (and thus avoiding large intermediate data). This would require a new ZODB connection (not associated with the current request) and nontrivial transaction handling (to ensure that the new ZODB connection sees potential modifications by the current request). It can be done, but it is considerably more difficult than the use of Response.write.

This does not require a new ZODB connection. This can be done with the same connection as the one used for the request, by postponing the closing of the connection until after the iteration is over.
Currently, there is an automatism which couples the request, the ZODB connection and the worker thread. If you drop Response.write in favour of keeping the ZODB connection open longer, you must invent something which allows you to control this automatism, and I have my doubts that this will be more natural.
... So the way I'd like this issue to be resolved is, at least:

- removing Response.write
- removing the use of BytesIO from WSGIPublisher.py
Apparently, someone needed a streaming facility and found that Response.write does not provide it under WSGI (unlike under ZServer). The most natural way would be to make Response.write stream under WSGI as well, not to drop Response.write altogether.
Any solution you come up with must at least accommodate the use case in OFS.Image.File well. Show us such a complete solution and maybe it will be convincing...
BTW, @d-maurer which WSGI implementations do you use?
Julien Muchembled wrote at 2019-11-19 07:37 -0800:
BTW, @d-maurer which WSGI implementations do you use?

The default one (i.e. waitress).
This discussion focuses on the use case of OFS.Image.File, delivering data from the ZODB. But that might not be the only use case. We use a (not version-controlled) file repository on the hard disk, with methods that allow reading and delivering files from this very specific directory. This might have the same problems, but when delivering gigabytes of data to a client, Response.write is the only option, isn't it?
FYI: we added an item to our roadmap (at PerFact) that (possibly large) static files should no longer be served by Zope, relying instead on a lighttpd running in parallel (maybe with Zope handling the authentication). We are just under the impression that this use case is not one of Zope's strong points, in part due to the issue discussed here.