RestEase icon indicating copy to clipboard operation
RestEase copied to clipboard

String responses are not passed to Response Deserializer

Open KevivL opened this issue 7 years ago • 14 comments

Hi,

I'm finding that if a route has a return type as string, the http content from the response is directly passed back instead of being first sent to the deserializer. It might be a saving but it's wrong in the case of custom deserializers.

I use restease exclusively with a protobuf-net serializer and this causes a rather unexpected gotcha as you can imagine. As i get the protobuf message directly interpreted as a string...

Let me know if that makes sense? Can dig around and make the PR if it helps. Thanks.

KevivL avatar Sep 20 '16 13:09 KevivL

Indeed, that's the defined behaviour. With JSON that's sensible behaviour, as a bare string is not valid JSON. My understanding is that the same is true for protobuf - it's legal to send messages around, but not just primitives? SO seems to think so.

When you say that you use RestEase with a protobuf serializer, do you mean that you're talking to a REST endpoint which returns binary responses in protobuf format? If so, using RestEase with a custom serializer is probably not the right thing to do: it will try and read the response as a string, and could fall over as not every byte sequence is a valid string. There's a better approach, but I'll only go into if it it's relevant here.

canton7 avatar Sep 20 '16 13:09 canton7

Hmm. Fair enough. But yes, i'm using protobuf-net and requesting a custom media type from a REST endpoint (which is protobuf formatted), hence running into this.

Would be interested in the better approach though. Else can always fall back on not returning string primitives.

But in general, proto or no proto. If you provide a custom deserializer support, it's non-intuitive to do the string pass through? The assumption that HTTP protocol content string equals the string type data payload doesn't seem right when you have a custom deserializer in the middle.

Edit: on the reading bytes as string, i just make sure in the deserialization to read the content stream and not as a string.

KevivL avatar Sep 20 '16 15:09 KevivL

If your endpoint is returning raw binary data, you really need to be subclassing Requester, and overriding RequestWithResponseAsync to NOT call response.Content.ReadAsStringAsync() - as that can and will fail if your binary data is not a valid UTF-8 string. The README talks about how to do this.

While you're there, you can override RequestRawAsync, so that requests for string responses go through your deserializer as well.

I can see your argument for passing string response types through the custom deserializer as well. The arguments for doing this the way it's currently done are:

  1. None of the common serialization formats allow bare strings (JSON, XML, protobuf, and many others all forbid it). Passing raw strings to people's custom deserializers is confusion waiting to happen, as they'll invariably fall over.
  2. It allows a way of fetching the raw content of the response without it passing through a deserializer. 99.999% of REST endpoints return strings (which may contain serialized data, but they aren't binary), so this is useful behaviour.
  3. Refit does it like this.
  4. I don't think I can change this without breaking backwards compatibility anyway.

In theory, I'm in favour of better support for binary endpoints in RestEase, since that's apparently a use-case. This wouldn't be first-class support, but probably an alternative IRequester implementation which doesn't assume that the response is a string, and doesn't assume you want to serialize and deserialize everything as JSON. It probably wouldn't support Response<T> responses either.

Else can always fall back on not returning string primitives.

As the SO post I linked to says, passing a raw string gives you no forwards compatibility. You can't add anything to the response without breaking existing consumers. This somewhat defeats the point of protobuf.

canton7 avatar Sep 20 '16 15:09 canton7

Okay. Fair enough. Glancing at the code, the custom requester looks like a good place to start. Will close the issue as backward compatibility seems to put a full stop to this being changed...

I will come back and say though that the custom serializer support then completely blows up as a paradigm. Both as you already read the content as string instead of passing in the raw content to be interpreted as well as this string interception. The whole point of a custom serializer is that i get to use the requester and routing without any judgements on the data payload. I don't think it's valid to say "strings will blow up anyways". If someone is using a relatively sophisticated feature, they ought to be able to handle their protocol specific cases.

Point 2 also kinda makes sense, but only on the surface. You have a type-safe rest client, by definition. And support for custom serialization. So where's the excuse to read and pass raw API response strings? Only for testing, exploration and that's best done in a browser or elsewhere.

I guess it comes back to the custom serialization aspect feeling pretty leaky (and that was the reason to try here over Refit).

KevivL avatar Sep 20 '16 15:09 KevivL

Will close the issue as backward compatibility seems to put a full stop to this being changed...

Reopening, as I want to think about binary requester support.

The whole point of a custom serializer is that i get to use the requester and routing without any judgements on the data payload

That's not correct. RestEase makes an assumption that the endpoint returns a string, and it does this in order to make the majority use-case smoother for people. It's a custom string deserializer, if you prefer.

Point 2 also kinda makes sense, but only on the surface. You have a type-safe rest client, by definition. And support for custom serialization. So where's the excuse to read and pass raw API response strings? Only for testing, exploration and that's best done in a browser or elsewhere.

Unfortunately it's not possible to make assumptions like this. You can guarantee that, whatever your assumptions, someone will come along with a use-case which breaks all of them. I can guarantee that, somewhere, someone wants a raw string. Maybe RestEase's deserialization abstractions aren't enough for them to do their own deserialization? Maybe they just want to log the response, but not interpret it?

You're a perfect example of this: you've got a binary endpoint, and you're breaking the spec for the binary protocol you're using. And yet RestEase still has ways to help you, but only because I didn't assume that people would never want to customise exactly how requests were formed and responses handled.

I guess it comes back to the custom serialization aspect feeling pretty leaky (and that was the reason to try here over Refit).

When you're writing a tool which gives people a better experience for the majority use case, you have to make assumptions and simplifications. That's the very nature of what you're doing: by making assumptions about what people are trying to do, you can make life easier for them. If you don't want to make assumptions, you can't do better than HttpClient on its own.

A good library will provide multiple layers: if the highest level of abstraction doesn't work, users can fall back to writing a bit more code and being able to customise the behaviour a bit further.

This is what RestEase does.

  1. The first layer assumes that you're returning JSON: life is really easy if you're doing this.
  2. The second layer (custom serializers) assumes you're working with text, but not JSON: you have to write a bit of code, but you can do this too.
  3. The third layer (subclassing Requester) assumes you're working with HttpClient, but allows you to control exactly how the request is constructed and the response interpreted.
  4. The fourth layer (implementing IRequester) gives you absolute control: you don't even need to be using HttpClient, or talking HTTP at all.

If the custom serializers layer didn't assume you were working with text, then life would be harder for that majority which are working with text. It wouldn't be achieving its aim of making life easier for the majority.

In hindsight, if I wanted to add first-class support for binary endpoints from the start, I would have structured IResponseDeserializer to not accept a string, and instead provided an abstract StringResponseDeserializer class which did the reading of the string. I'm not sure how Response<T> would have worked however... The truth is that I never even considered that someone would have a binary endpoint, because that's so fantastically rare.

You obviously have your own expectations about where each of the layers of abstraction should stop and start, and that's OK, but sadly irrelevant. I had my own ideas when I defined them, and that's what got written. If you were around to ask when I was designing RestEase I'm sure I'd have taken your opinion into consideration, but you weren't, so I went with what I thought they should be.

If someone is using a relatively sophisticated feature, they ought to be able to handle their protocol specific cases.

That's what I talked about above. If someone is using a relatively sophisticated feature, they have to fall back to a lower level of abstraction (in this case, back to Requester), but they still have the tools to do what they want.

Subclassing Requester or providing your own IRequester entirely is a perfectly supported way of customising how RestEase creates the request and interprets the response. It's not a workaround because RestEase doesn't do what you want: it's one of the supported and documented ways to get RestEase do handle your advanced scenario.

canton7 avatar Sep 20 '16 16:09 canton7

Cool. Thanks for taking the time to respond. It just comes back to expectation mismatch then. Written a little code with a custom requester, should be easy enough to integrate. It's fair to have the string/text slant as after all this is HTTP, i suppose i just expected "custom serializer" to mean only that, custom handling of a data payload. Thanks again for the help.

KevivL avatar Sep 20 '16 16:09 KevivL

No problem. I'm happy to update the README to clarify the stringy nature of most of RestEase, and point people who want to talk binary in the right direction (including providing a sample IRequester implementation for them / a full-blown BinaryRequester they can use).

canton7 avatar Sep 20 '16 16:09 canton7

Hi,

I have to say, I really like RestEase. So I just tried it and it just worked. I've injected my own authorisation, used custom deserializer. Works great, but before switching from RestSharp (and transitioning to .NET core), I would like to ask for one modification.

NOTE: Actually I can implement changes myself, and send you pull request. I would like to have your approval prior to that, though.

My data always travels wrapped in some kind of envelope. Let's call it Envelope<T>. So proper definition of a interface method would be Task<Envelope<StuffDTO>> GetStuff(). Because it always come like this I would like to make it implicit and declare this methods as Task<StuffDTO> GetStuff(). It would be greate as for the interface consumer there is no difference between this one and "normal" interface. I would like to handle the fact of wrapping in deserializer, in fact, I did and it works. Summary: data travels wrapped in some kind of object across REST-boundary, it is just so implicit that it does not have to be visible on the interface.

The only problem is for Task<Envelope<string>> as I cannot hide it behind Task<string> as this one gets redirected to different method on Requester (and decition is made in ImplementationBuilder)

The non-breaking solution I can think of would be:

  • add new attribute, let's call it [ForceDeserializer] which forces redirection to deserializer regardless of the type.
  • change ImplementationBuilder.AddRequestMethodInvocation(ILGenerator, MethodInfo) accordingly (we have access to method info, so we can access its attributes)

In such case:

Task<string> GetStuff() would still return content string, but [ForceDeserializer] Task<string> GetStuff() would redirect result straight to deserializer.

Optional improvement: [ForceDeserializer] can be placed on whole interface Better, yet breaking change: Task<string> gets deserialized as any Task<T>, Task<StringResponse> does whatever Task<string> does now (where class StringResponse { string Value { get; set; } })

Note: I don't like the name [ForceDeserializer]. Can you think of better one?

MiloszKrajewski avatar Jun 10 '17 13:06 MiloszKrajewski

I wonder whether a better way would be to have two methods on the deserializer, one for string responses and one for T responses. That requires a breaking change to the deserializer interface, but that's less disruptive IMO. We can change it to be an abstract base class at the same time to avoid breakages in the future.

canton7 avatar Jun 11 '17 11:06 canton7

I think I can change to abstract base classes instead of interfaces without breaking backwards compatibility if I'm careful (I'll add deprecation notices but they won't be mandatory).

canton7 avatar Jun 11 '17 21:06 canton7

@canton7 As I understand binary response support was not implemented yet. Are there any plans to do it? Would be great if serializer/deserializer will have direct access to request/response stream and support async i/o.

cd21h avatar Aug 17 '18 23:08 cd21h

There are two things going on here. The first is that RestEase will try and read the response as a string in many cases, and a return type of Task<string> won't cause the deserializer to be called. That hasn't yet been fixed.

However deserializers do have direct access to the response stream, (and request serializers just have access to whatever type is being serialized). If your return type is Task<T> or Task<Response<T>> then the deserializer only gets called once the whole response has been downloaded, so you don't need to do anything asynchronous in the deserializer.

If you return a Task<HttpResponseMessage> or a Task<Stream>, then HttpCompletionOption.ResponseHeadersRead is used, so that you can choose whether or not the response body should be fetched (or report its download progress, etc). If however you return a Task<T>, Task<string>, or Task<Response<T>>, then HttpCompletionOption.ResponseContentRead is used, meaning that any CancellationToken that you pass will cancel the body download. If you return a Task, then the response body isn't fetched, unless an ApiException is thrown.

The long-term plan is to change the deserialize method to be async (as it's a little less confusing for people), and to not receive the body as a string (which can now be done in a backwards-compatible manner). We'll also pass responsibility to handling a return type of Task<string> to the deserializer.

canton7 avatar Aug 18 '18 09:08 canton7

a bare string is not valid JSON

That is true if you look at json.org, but has been superseded in 2014 by RFC-7159 and ECMA-404.

From RFC-7159:

Note that certain previous specifications of JSON constrained a JSON text to be an object or an array.

So arguably RestEase is doing the wrong thing here if the server returns a body of the form "foo" and Content-Type application/json by not stripping the quotes, if the interface method signature is Task<string> GetFoo().

herebebeasties avatar Feb 14 '19 18:02 herebebeasties

See #230 - as part of the fix for that, I've introduced ResponseDeserializer.HandlesStrings, which lets you write a custom deserializer which customises how strings are handled.

canton7 avatar Jul 07 '22 21:07 canton7