twirp icon indicating copy to clipboard operation
twirp copied to clipboard

Proposal: Support streaming in the Twirp spec

Open jacksontj opened this issue 7 years ago • 59 comments

I think this is a great addition for RPC for Go (as well as other languages!). The main thing that I expect will limit adoption is the lack of streaming support. There are mechanisms to do streaming on http/1.x (a great example is the grpc-gateway stuff) where they effectively just use chunked encoding with blobs to do the streaming.

jacksontj avatar Jan 16 '18 23:01 jacksontj

Yeah, let's get this discussion going. I agree that this is an important feature.

One important constraint: we need to continue to work on HTTP 1.1. I think that means we can't ever support full-duplex, bidirectional streaming, but I'd be very happy to be proven wrong.

Could you (and others) write a bit about the use cases for streaming RPCs? At Twitch, we haven't really needed any, so I'd like to have concrete users to build for to guide the design of the user API. The underlying implementation seems pretty clear, but I want to make sure the generated code feels nice to use.

spenczar avatar Jan 17 '18 00:01 spenczar

My use case is usually file uploads from one backend service to another. I've written a CI/CD where I have multiple containers build different docker images and stream data from all these containers to a central backend that tars all of them and uploads them as a github release asset. Those could be many megabytes and obviously had to be chunk'd and sent out as streams. RPC streams were perfect for my use case : )

marwan-at-work avatar Jan 17 '18 07:01 marwan-at-work

I don't have any needs for Twirp, but I do use gRPC streaming.

In all cases so far, I've never need bi-directional on the same RPC. The one place I could think of is if you have a single RPC that sends state deltas and receives state deltas. But in that case, you're really looking for classic long-polling e.g. Websockets, not the RPC style of doing things. You can always invent an identifier on top of the RPC mechanism that would allow concurrency safe pub-sub on two uni-directional streams.

So if the only concern is support for bi-directionality, I'd say start with mutually exclusive directions, WLOG. That also cuts the question of full duplex short. I mean, on the wire, HTTP/1.1 could definitely support full duplex streaming, but I assume the problem is that the XHR API requires a request to be finished before reading the response and is hence practically unusable. (Is it more common for HTTP server APIs to support FD BD streaming? Thinking about the Go HTTP library got me to "yes".)

tommie avatar Jan 17 '18 09:01 tommie

We should also consider websockets. Client support is pretty widely available these days and we'd get bidirectional streaming support out of the box.

The biggest argument I can think of against is that LB/Proxy support would be poorer which is certainly unfortunate. AWS ALBs, Nginx, and HAProxy all support websockets and I imagine those would be the most commonly used Proxy tools with Twirp.

thelazyfox avatar Jan 17 '18 18:01 thelazyfox

The streaming support can be added using a wrapper message, such as:

// A wrapper for (stream Foo)
message StreamFoo {
  repeated Foo messages = 1;
  Status status = 2;
}

Pros:

  • It has the correct trailer semantics.
  • Works for both client and server streaming.
  • Works for all HTTP/* protocols.
  • Works for HTTP JSON as well.
  • Standard error payload.
  • Extremely simple to use.

Cons:

  • Needs HTTP/2 for bidi streaming.

wora avatar Jan 26 '18 01:01 wora

@spenczar thanks for the quick response (and sorry for mine being so slow :) ). I'll try to find some time here in the next couple of weeks to get a proposal (with PoC) together to review here.

jacksontj avatar Jan 26 '18 02:01 jacksontj

If we could get away with this without using websockets, I'd be happier. Support exists in proxies and stuff, but it can be shaky. You often have to do some configuration. Nobody has to configure their proxy or load balancer to be able to serve HTTP 1 requests, though.

Websockets libraries usually aren't as battle-tested as HTTP libraries, too, so we'd likely be running people into more bugs.

I think we don't need to use websockets to get what we want. I think we can do unidirectional streaming in pure HTTP 1, and say that bidirectional streaming is only available in http/2. I think that's going to be fine for almost everyone's use cases, and will make things simpler for everyone.

I like @wora's proposal. To flesh it out a bit:

Streams are represented as a sequence of messages, terminated by a single "trailer" message. The trailer holds a status code. This trailer has two functions. First, it signals the end of the messages, so the recipient knows the stream is done. Second, it lets the sender send an error, even if things started out well.

For an rpc like this:

rpc UploadFile(stream FileChunk) returns UploadFileResponse

We'd create a helper type:

message FileChunkStream {
  repeated FileChunk messages = 1 [packed=true];
  Trailer trailer = 2;
}

The "Trailer" type can be figured out later - I won't get into it here. It doesn't really matter for the core design.

A sender writes out the serialization of the Stream helper type to the wire, which would either be a HTTP request body or response body. The recipient would need a streaming deserializer for this to work.

A streaming deserializer would work in JSON, because each message is balanced by matching { and } characters, so you know when you've received a complete message.

It would work in Protobuf because repeated messages use length prefixing in front of each message - it looks like <tag><len><encoded_msg><tag><len><encoded_msg>..., where <len> is the size of the encoded_msg in bytes, so a reader can always know how many bytes to chomp.

Most JSON and Protobuf deserializers are not so low-level that they let you emit each message on the fly like this, though. Most would want the whole thing to be loaded into memory. We don't want that, so we would probably need to implement our own deserializer in generated code. It's hard to tell how difficult that would be - we should take a crack at it to get a feel for its difficulty before accepting or declining this particular proposal.

spenczar avatar Jan 26 '18 02:01 spenczar

Using HTTP/1.1 for unidirectional streaming, with bidi being available on HTTP/2 sounds great. The message design / wire protocol sounds good.

Doing streaming processing shouldn't be too hard. For JSON, we might specify that messages are separated by \n. That would let us easily find the boundaries, be very readable on the wire, and for the Go implementation we could use encoding/json.Decoder out of the box (probably decoding to a json.RawMessage before handing off to jsonpb.Unmarshal). The protobuf framing is pretty straightforward too, though I don't know of any currently-open-source code for Go that does it.

I'm interested to hear about the Go API design. Currently the client and server use a single interface, which is a really useful property. Will we be able to keep it when adding streaming APIs?

rhysh avatar Jan 28 '18 22:01 rhysh

Glad it sounds good, @rhysh!

I was wary of using \n when I first thought about your suggestion because I didn't want to have to escape them, but then I learned that \n is forbidden inside JSON strings, so it actually shouldn't be too hard at all. I like that idea.

Could we use ,\n as the separator instead, though? Then the full recorded stream would be valid JSON, which seems like a nice property.

I'll make a separate issue for discussing the Go API design, I have some thoghts there but yes - it's tricky.

spenczar avatar Jan 29 '18 17:01 spenczar

Regarding streaming JSON, there is an RFC describing JSON text sequences ("application/json-seq"): https://tools.ietf.org/html/rfc7464. Basically, always lead with a RS and end with a LF (sometimes required to ensure no truncation). I've implemented a go library here https://github.com/jmank88/jsonseq, which adapts std encoding/json by default.

jmank88 avatar Jan 29 '18 19:01 jmank88

What would you think about using an array representation for streaming JSON? Writing [ and ] at the beginning and end, and separating with ,? That way there's no need to support a custom "line-split" format, any JSON decoder can read the stream, those that support streaming JSON arrays can produce items as they arrive, the others would load the entire array in memory but still work.

achille-roussel avatar Jan 29 '18 20:01 achille-roussel

Yes, @achille-roussel, that's what I am thinking of too. I think we should send a JSON array which is a sequence of messages, then the trailer.

~(I'm not sure what I was thinking when I suggested ,\n. The spec is quite clear that , is the only legal separator in arrays.)~ (Wrong again! The spec permits this.)

That's an interesting RFC, @jmank88, and it does seem to be designed for this case. But I have a few concerns:

  • Stuff will look pretty weird when printed to most terminals.
  • How good is support across languages? It's an obscure RFC.
  • It's no longer valid as a plain JSON object, which would be a nice feature as @achille-roussel said.

spenczar avatar Jan 29 '18 22:01 spenczar

The primary advantage of json-seq is recovery without dropping the stream (https://tools.ietf.org/html/rfc7464#section-2.3), but that may not be necessary here.

jmank88 avatar Jan 29 '18 22:01 jmank88

Some options for the JSON representation of a message stream of {"m":4}, followed by {"m":1}, and concluding with {"m":5}:

Space-separated

{"k":4} {"k":1} {"k":5}

Newline-separated (\n)

{"k":4}
{"k":1}
{"k":5}

Dense JSON Array

[{"k":4},{"k":1},{"k":5}]

JSON Array with Helpful Newlines

[
{"k":4}
,{"k":1}
,{"k":5}
]

Of these, I most prefer the ones with newlines. They're easy to work with on the command line (and manual debugging is important for the JSON protocol). Client and server implementations can safely split on the newlines, process a few bytes on either end of the line, and pass to a non-stream-aware JSON decoder.

But the difference between "Newline-separated" and "JSON Array with Helpful Newlines" is whether the wire protocol for a unary RPC is different from a streaming RPC with a single value. We'll need to consider this for the protobuf wire format as well.

I think that gRPC's wire format doesn't differentiate between streaming and unary RPCs. That design implies more complexity in some places, but allows simplification in others.

rhysh avatar Jan 30 '18 00:01 rhysh

+1 on having one item per line when JSON is used, it doesn't add much to the payload size and makes it much easier to debug.

achille-roussel avatar Jan 30 '18 00:01 achille-roussel

The entire response must be a valid JSON object, including the messages and the trailer. Otherwise, it won't be compatible with OpenAPI, and create all kind of compatibility issues. Incremental parser for JSON should be a trivial work, at least outside the browser. I don't think we need to worry too much about the complexity, since it is one time cost to write such a parser.

wora avatar Jan 30 '18 00:01 wora

I like your "JSON Array with Helpful Newlines" ("JAHN") proposal a lot, @rhysh. We would need one more constraint on serialization, which is no newlines within a serialized object. I want to forbid this:

[
{
  "k": 4,
}
,{"k": 1},
,{"k": 5}
]

@wora, I don't totally agree that we should discount the difficulty of making a deserializer. Simplicity in making clients is important for Twirp's success.

That said, I think it's okay for Streaming to be a slightly "advanced" feature. Client generators that can only handle deserializing streams APIs by holding everything in memory (like @achille-roussel described) will still work, they'll just be less good than well-made deserializers.

spenczar avatar Jan 30 '18 00:01 spenczar

If you only send the array, where do you put the trailer?

Adding the object wrapper does not add any complexity to the parser. The output looks like this:

{"messages":[
]}

It still uses a fixed first line with a few extra bytes.

wora avatar Jan 30 '18 00:01 wora

Oh yes, I was thinking of a mixed-type array:

[
{"k":1}
,{"k":2}
,{"k":3}
,{"status": "ok"}
]

Mixed-type arrays can be obnoxious to deserialize, though; I agree that wrapping is probably better, so it becomes

{"messages":[
{"k":1}
,{"k":2}
,{"k":3}
],"trailer": {"status": "ok"}
}

spenczar avatar Jan 30 '18 00:01 spenczar

@spenczar your last example LGTM

Thanks for clarifying that we're not relying on HTTP Trailers; those have spotty support in HTTP/1.1 software which would roughly translate into Twirp depending on HTTP/2.

Having the top level be a JSON message rather than an array is probably good for JavaScript too .. I think there's an attack involving overriding the constructor for Array, which allows intercepting XHR responses that use an array at the top level.

What are your thoughts on how unary RPCs look compared with streams that contain a single element? As a parallel, in protobuf messages we're allowed to change between "repeated" and non-repeated ("optional" in proto2) fields without breaking wire compatibility.

rhysh avatar Jan 30 '18 00:01 rhysh

Changing between repeated and non-repeated is a breaking change for JSON. That is how JSON works, array of ints is different from int.

wora avatar Jan 30 '18 01:01 wora

@rhysh I don't mind them looking different. Unary RPCs are different than streams. We can handle that difference, and so can our users - in fact, I think they'd be happier with this. It's much simpler and clearer.

@wora, I think Rhys is suggesting using the same wrapper for unary RPCs, which would be compatible. I think that would be a mistake, though.

spenczar avatar Jan 30 '18 01:01 spenczar

Right, single-field -> repeated-field is a breaking change for JSON. Is it worthwhile to make unary RPC -> streaming RPC be a non-breaking change for JSON?

Is it worthwhile to make unary RPC -> streaming RPC a non-breaking change for protobuf?

rhysh avatar Jan 30 '18 01:01 rhysh

OK, thanks Spencer.

rhysh avatar Jan 30 '18 01:01 rhysh

unary vs streaming is a breaking change in practice, just like repeated vs non-repeated. Let's not pretend otherwise. Even it doesn't break wire, it would certainly break client. You can return a million items over a stream, and blow out memory of naive clients all over the places.

wora avatar Jan 30 '18 01:01 wora

About the {"message":[...],"trailer":{...}} format, it seems like the "message" field has to come before the "trailer", that's kind of unintuitive with JSON where the the order of keys usually doesn't matter in an object. Would/Should it be valid for a client to send the "trailer" field first? It would be valid JSON but invalid in the context of Twirp?

Progressively decoding a JSON object is also a bit more complex than just decoding a JSON array (read [, decode a value, read ,, repeat until ] + skip white spaces).

I wonder if there's really a need for trailing values:

  • gRPC's streaming doesn't support it for example
  • the almost inexistent use of HTTP trailers suggests that web services are mostly fine without it
  • HTTP/1.1 chunk encoding already carries information about when the stream ends (with the 0\r\n marker). All of this is already managed by the net/http package.

For systems that do need trailer support, it can easily be built at the application level with a definition like this:

service Service {
  rpc MyRPC(stream Value) returns (Result)
}

message Value {
  ...
  Trailer trailer; // the client only sets this value on the last item
}

this has zero overhead since the trailer is omitted on all values but the last, and on the last value all other fields are absent as well.

achille-roussel avatar Jan 30 '18 05:01 achille-roussel

How about supporting EventSource for streaming?

surajbarkale avatar Jan 30 '18 06:01 surajbarkale

I think we should have identical semantics between JSON and protobuf. This is necessary to maintain a stable and healthy wire protocol in the long term, e.g. next 10+ years.

Adding a trailer message to every request message is a poor experience. It forces developers to check for trailer for each message, and they won't enjoy it. We should see how complicated it is to do the JSON incremental parsing. It is unlikely to be a major issue in the long term.

The complexity of Twirp protocol is much much less than the EventSource. It is better to keep things as simple as possible, which is the selling point of Twirp.

mocax avatar Jan 30 '18 06:01 mocax

Just to be clear, I'm not suggesting to force the use of a trailer field in order to use streaming. I'm saying that most applications out there don't need trailers, and for those that do, they can still implement it within their API.

I'm having a hard time seeing how trailers is an important feature of an RPC system, gRPC's popularity being a good example of it.

achille-roussel avatar Jan 30 '18 06:01 achille-roussel

The proposed the trailer support requires a few lines of code. The question is whether there is enough value to provide it at little cost? We can omit it for now and add it later.

Without the trailer, how do you report error from server to client? We have to introduce another design pattern to handle it, which has non-zero cost either. If we have to do something, it is better to use a known design pattern than an unknown one. Giving the problem to developers will create significant inconsistency across APIs.

wora avatar Jan 30 '18 07:01 wora