anonlink-entity-service
streamed processing of CLKs in the front-end
Previously, we read the CLKs from the request as a stream to avoid parsing the whole JSON document in one go. This lets the web front-end run with less memory.
However, connexion is very strict about validating JSON input: it always consumes the whole stream first in order to validate it against the spec. Hence the backflip to reading the CLKs fully into memory as JSON.
Possible approaches are:
- uploading CLKs as something different to `application/json`, then connexion won't touch it.
- bypass the front-end altogether, as in #20
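To illustrate the first option, here is a minimal sketch of reading CLKs from a binary stream one record at a time, so the full upload is never held in memory. It assumes, purely for illustration, that each CLK is a fixed-size record of 128 bytes (1024 bits); the name `iter_clks` and the record layout are hypothetical and not taken from the service.

```python
import io
from typing import BinaryIO, Iterator

# Assumption for this sketch: every CLK is a fixed 128-byte (1024-bit) record.
CLK_SIZE = 128


def _read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read up to n bytes, tolerating streams that return short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:  # end of stream
            break
        buf.extend(chunk)
    return bytes(buf)


def iter_clks(stream: BinaryIO, clk_size: int = CLK_SIZE) -> Iterator[bytes]:
    """Yield one fixed-size CLK at a time from a binary stream."""
    while True:
        record = _read_exact(stream, clk_size)
        if not record:
            return  # clean end of stream
        if len(record) != clk_size:
            raise ValueError("truncated CLK record at end of stream")
        yield record


if __name__ == "__main__":
    fake_upload = io.BytesIO(bytes(CLK_SIZE * 3))
    print(sum(1 for _ in iter_clks(fake_upload)))
```

Memory use is bounded by one record rather than by the size of the upload, which is the property the JSON path loses once the body is validated in full.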
Aha! Link: https://csiro.aha.io/features/ANONLINK-16
The API now supports uploading via a binary stream in #208, but connexion still interferes. Reported upstream: https://github.com/zalando/connexion/issues/592
Perhaps we can make a separate app (without connexion) just to deal with binary data uploading?
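A rough sketch of what such a separate app could look like, written as a bare WSGI callable so that nothing validates or pre-reads the body. It copies the request body to a file in fixed-size chunks. The names `make_upload_app` and `upload_dir`, the `.clks` suffix, and the response format are all hypothetical; a small Flask app without connexion would presumably work along the same lines.

```python
import os
import tempfile

CHUNK = 64 * 1024  # copy the body 64 KiB at a time


def make_upload_app(upload_dir):
    """Build a plain WSGI app that streams POST bodies into upload_dir."""

    def app(environ, start_response):
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed",
                           [("Content-Type", "text/plain")])
            return [b"POST only\n"]

        # WSGI servers are not required to signal EOF on wsgi.input,
        # so never read more than CONTENT_LENGTH bytes.
        try:
            remaining = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            remaining = 0

        body = environ["wsgi.input"]
        fd, path = tempfile.mkstemp(dir=upload_dir, suffix=".clks")
        with os.fdopen(fd, "wb") as out:
            while remaining > 0:
                chunk = body.read(min(CHUNK, remaining))
                if not chunk:  # client disconnected early
                    break
                out.write(chunk)
                remaining -= len(chunk)

        start_response("201 Created", [("Content-Type", "text/plain")])
        return [os.path.basename(path).encode() + b"\n"]

    return app
```

For a quick local try-out, `wsgiref.simple_server.make_server("", 8000, make_upload_app("/tmp"))` from the standard library can serve it; this sketch does not handle chunked transfer encoding or authentication.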
What about this idea of using nginx to buffer the upload to disk and then passing the filename to our Flask backend?
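On the backend side this could be sketched roughly as follows. nginx can write the request body to a file (the `client_body_in_file_only` directive) and expose its path as `$request_body_file`, which could be forwarded in a request header. The backend should then check that the path really lies inside the buffer directory before opening it. The function name `resolve_buffered_upload` and the idea of passing the path in a header are assumptions for illustration.

```python
import os


def resolve_buffered_upload(path_from_proxy: str, buffer_dir: str) -> str:
    """Return the real path of a proxy-buffered upload.

    Raises ValueError if the path escapes buffer_dir and
    FileNotFoundError if it does not point at a regular file.
    """
    real_dir = os.path.realpath(buffer_dir)
    real_path = os.path.realpath(path_from_proxy)

    # Refuse anything outside the buffer directory (e.g. "../" tricks
    # or symlinks), since the value arrives in a request header.
    if os.path.commonpath([real_dir, real_path]) != real_dir:
        raise ValueError("path is outside the upload buffer directory")

    if not os.path.isfile(real_path):
        raise FileNotFoundError(real_path)

    return real_path
```

The Flask view would then open the returned path and read it in chunks instead of touching the request body. The header must be set only by nginx and stripped from client requests, otherwise a client could point the backend at arbitrary files within the buffer directory.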