connexion
connexion copied to clipboard
Mark endpoints for streaming uploads
Fixes #592 based on the suggestion by @MatteoRagni: https://github.com/zalando/connexion/issues/592#issuecomment-775153162. Also Fixes #1332, I think.
Changes proposed in this pull request:
- Adds an
x-stream-uploadoption for requests. When this is specified, connexion will passflask.request.streamas therequest.bodyinstead of callingget_data(). This allows streaming uploads to disk instead of reading the entire file into memory.
Hi @tst-lsavoie, thanks for the PR.
- I'm a bit hesitant to define a custom extension. Is there a reason we cannot inspect
Content-Typeinstead? - We moved to the
mainbranch. Can you rebase or merge and change the target branch of the PR?
@RobbeSneyders I didn't use the Content-Type because it's less reliable since a client could leave the content type out or set it incorrectly. There's also the question of which content types should be streamed, which might be different for each application, or even different for different endpoints in the same application. I don't love the idea of adding a new extension either but it seemed like the least bad option.
I changed the base branch to main but it looks like there are non-trivial merge conflicts I need to figure out. I'm going to hold off on that until we have a clearer picture of how the PR will need to change.
I've been looking into this a bit more, and I would suggest the following solution.
Define a body property with body_getter on ConnexionRequest, similar to the json property with json_getter.
https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/lifecycle.py#L25-L27
We can then assign request.get_data as the boddy_getter for Flask, which would delay stream consumption until the body property is evaluated. There's only two places in Connexion where request.body is evaluated:
-
During request body validation. Here the body is only consumed if Connexion supports validation for the provided
Content-Type, which is currently only forjsonorform: https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/decorators/validation.py#L137-L190 -
During parameter parsing. Here the body is currently always consumed: https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/decorators/parameter.py#L79-L84 But it's actually not used if the body parameter is not defined in the view function parameters: https://github.com/zalando/connexion/blob/f46551cee8056dca65c1177e8252f19805d95d50/connexion/operations/swagger2.py#L277-L282 If we refactor the parameter parsing a bit, we can prevent body consumption here as well if the body parameter is not present in the view function parameters.
You can then prevent stream consumption in connexion by
- Defining a
Content-Typefor which Connexion does not support validation, this should be the case for all streaming types - Not defining a body parameter in the view function
WDYT?
You can find a quick and dirty PoC of the solution here: https://github.com/thermopylae/connexion/compare/mark-endpoints-for-streaming-uploads...spec-first:streaming-uploads-poc
I think I understand the proposed solution. So, if the content type was not form or JSON, and if the user didn't add a parameter for the body to their function, then the stream would be available.
I could see that working. I worry about it for the future, though. One small, innocent looking update that accesses the request body could break it. Hopefully testing would catch that, but it's not guaranteed.
We'd have to do some refactoring for the OpenAPI 3 version of the code as well: https://github.com/zalando/connexion/blob/c014299435990ff1d4c20596396ca563f7b266b4/connexion/operations/openapi.py#L270-L300
The biggest problem is going to be time. I'm not sure when I'll next have a good chunk of time to focus on this issue.
This is great!
I have used it marking my endpoint with x-stream-upload and work like a charm, thanks :)
Fixed by https://github.com/spec-first/connexion/pull/1618