cloudstate icon indicating copy to clipboard operation
cloudstate copied to clipboard

Request/Response metadata

Open viktorklang opened this issue 5 years ago • 21 comments
trafficstars

Consider adding support in the Cloudstate protocol for propagating request and response metadata (think for instance headers) so that it is possible to inspect and return things like HTTP or HTTP/2 headers for a given request.

Open questions:

Should this apply to ALL requests/responses? Should they be preserved no matter what? Do we only allow standard headers? Do we only allow custom headers? Do we only allow specific headers? Do we exclude specific headers? Can we achieve most of the use-cases without changing user code or the protocol? Do we only allow it for some specific types? (think HttpBody requests)

Discuss!

viktorklang avatar Jan 07 '20 22:01 viktorklang

Maybe we should just delegate the headers via proxy to the user space. Implementers could access these headers via Context like this pseudo kotlin code:

@CommandHandler
fun getCart(ctx: CommandContext): ShoppingCart {
    val allHeadersFromRequest: MutableMap<String, Any> = ctx.headers
    val userAgentHeader = allHeadersFromRequest["User-Agent"]

    // send response with response header
    ctx.addHeader("Custom-Header", "Custom-Value")
    // or maybe ctx.addHeaders(headers)
    ....
   return .....
}

I think that would be simple

sleipnir avatar Feb 28 '20 20:02 sleipnir

Of course we could allow annotations on the protobuf to enable or disable this feature or allow filters for which headers we are interested in having access to.

I think a possible approach would be:

  1. The proxy must pass the metadata headers of the gRPC request by sending them via cloudstate grpc user custom header protocol to the user role;

  2. The same should happen for http headers from http x grpc transcoding;

  3. Implementers of the cloudstate protocol must or can access these headers via the grpc protocol by accessing the request metadata. Another possibility would be to change the cloudstate protocol to include a new metadata entity, the proxy would then encapsulate the headers received in that message. But I think that would be a lot of overlap and I prefer the simplicity of only using what the underlying protocols (http, grpc) already offer;

  4. Libraries like java-support, go, and others can make this metadata available via context in methods signatures, as I exemplified in a previous comment;

sleipnir avatar Feb 28 '20 20:02 sleipnir

@sleipnir I like 1 and 2. would you allow all the headers to be passed? or as you said to be filtered by the proxy possibly during entity discovery?

marcellanz avatar Feb 29 '20 06:02 marcellanz

Hi @marcellanz I believe that if there is no note in proto that indicates which headers should pass, then all headers should pass, otherwise only the headers specified in the proto should be forwarded.

Examples:

service ShoppingCart {

    rpc GetCart(GetShoppingCart) returns (Cart) {
        option (google.api.http) = {
          get: "/carts/{user_id}",
          additional_bindings: {
            get: "/carts/{user_id}/items",
            response_body: "items"
          }
        };
       option (.cloudstate.metadata.enable) = "true";
       option (.cloudstate.metadata.filter) = "(Cache-Control | Content-Type)";
    }
}

In this case the user is defining that he wants to use the metadata resource and is only interested in the cache and content type headers.

This opens the possibility for more filters (inclusion and exclusion). For example if it were declared like this:

option (.cloudstate.metadata.filter) = "(?! Cache-Control | Content-Type)";

This would mean that the user wants all headers except (Negative Lookahead) for Cache-Control and Content-Type.

If ".cloudstate.metadata.filter" is defined in the proto file then ".cloudstate.metadata.enable" can be optional, since it is implied that the user wants to receive headers.

sleipnir avatar Mar 02 '20 13:03 sleipnir

@viktorklang @marcellanz ping

sleipnir avatar Mar 04 '20 19:03 sleipnir

@sleipnir I'd like to avoid adding too much stuff into the proto definition, because of poor discoverability from a user perspective (I.e. how do I know I need to whitelist things there?)

I'd be fine with defining a set of supported headers and semantics for how they are to be interpreted by the proxy. Having a whitelist means more work adding support for it in the proxy, but at least the user does not need to recomile anything in order for them to be supported in this situation.

It would be good to come up with some canonical use-cases that we can use to drive the development of this feature forward. Wdyt?

viktorklang avatar Mar 04 '20 20:03 viktorklang

"I'd be fine with defining a set of supported headers and semantics for how they are to be interpreted by the proxy"

  • I'm not sure if I was able to visualize / understand what you meant by that

sleipnir avatar Mar 04 '20 21:03 sleipnir

Remembering that the proto definition would be an exception not in the rule. Seen everything I said above

sleipnir avatar Mar 04 '20 21:03 sleipnir

"It would be good to come up with some canonical use-cases that we can use to drive the development of this feature forward. Wdyt?"

  • Looks good!

sleipnir avatar Mar 04 '20 21:03 sleipnir

Hi @viktorklang and @marcellanz

We will make a small review of what has been proposed so far. I think that can help.

  • I made a first suggestion that the proxy would simply send all received headers (grpc, http, or from messaging systems) to the user role via grpc custom headers. In turn, implementers of the protocol would use the existing grpc infrastructure to deliver this to the end user. @marcellanz in comment seemed to like this approach too.

  • I also suggested that it was possible for the client to specify some rules or define which headers he would be interested in receiving. I gave an example that this was done via protobuf extensions via annotations. @viktorklang didn't like to include things in the protobuf. But as I am stubborn and I would like the feature I think we could, as Marcel well noted, do this in the discovery stage by extending EntityDiscovery to support the equivalent of the notes I suggested. This would allow, for example, to enable the feature and which headers you want during the entity registration step:

fun main() {
    
    cloudstate {
       // enable access to headers explicitly
       allowMetadata = true

        registerEventSourcedEntity {
            entityService = PingPongEntity :: class.java
            descriptor = Pingpong.getDescriptor (). findServiceByName ("PingPongService")
            additionalDescriptors = arrayOf (Pingpong.getDescriptor ())
            
            // Register filters here
            metadataFilter = arrayOf ("Cf-Ipcountry", "CF-Connecting-IP", "X-Forwarded-For")
        }
    } .start ()
            .toCompletableFuture ()
            .get ()
}

In addition I think it makes perfect sense that this is an overlap of a Cloudstate Operator configuration, that is, these behaviors could be defined in the Operator's config map and if the user explicitly defines it in the client library, he would be overlapping this configuration:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cloudstate-operator-config
data:
  config: |
    cloudstate.operator {
      
      # Metada behaviours
      allowMetadata = "false"
      metadataFilters = " * "

      # Watch configuration
      .......

Of course, if we can define this in the config map then we could also do it, or override it, via the metadata of the StatefulService deployment like this:

---
apiVersion: cloudstate.io/v1alpha1
kind: StatefulService
metadata:
  name: shopping-cart
  labels:
    user-container: shopping-cart
    io.cloudstate/allow-metadata: "true"
    io.cloudstate/metada-filter: "*" 
spec:
  datastore:
    name: inmemory
  containers:
    - image: cloudstate/shopping-cart:0.4.3

What do you think?

Now regarding the cases I found some:

  1. CDNS, Load Balances, Proxies Headers Many users operate behind content networks and dns providers that use http headers to indicate behavior or information from the original client of the request. Cloudflare for example has many custom http headers that can, for example, be used to assist with the logic of content distribution or restriction of content to certain regions. A user might be interested in certain specific headers like Cf-Ipcountry or CF-Connecting-IP when processing the request in a user role. In this case, not all headers are important to the business and therefore being able to filter which headers it wants may be desired.

  2. Security: It is important to be able to send certain headers during a response to a request. Headings like: X-XSS-Protection Referrer-Policy Content-Security-Policy X-Frame-Options HSTS X-Content-Type-Options

and many others are crucial when distributing secure content. And so we must be able to send them during a response.

  1. Cache: It is interesting to be able to define values ​​for cache headers such as: Etag Cache-Control Last-Modified and etc..

  2. Feature not yet supported for external request: One of the arguments for Cloudstate and other Faas frameworks with or without a managed state is to hide the infrastructure from the developer, delegate it to the sidecar proxy and focus only on the business development side. That said, one of the things I miss is the possibility of handling a request and issuing another request to the other external http API as a result. Webhooks for example fit this. In this case, the proxy would be able to send any headers that I wanted in this request. X-Twilio-Signature or other HMAC security custom headers for example are very common in many API's. The ability to send headers becomes crucial

  3. Eventing Headers With Eventing support recently added and the natural tendency for new messaging systems other than GPubSub to be included in the solution, it would be natural for the proxy to send the headers of those systems in the same way as it would with http and grpc. As a reference, follow this link to the Google PubSub documentation that mentions message headers. Other types like AMQP, JMS and others have even more needs in this regard.

Excuse me if I said too much or if I insisted on some point overcome :)

sleipnir avatar Mar 05 '20 18:03 sleipnir

Thanks for driving this forward @sleipnir! I really appreciate it.

Would we get a mostly-beneficial solution if we put a whitelist in the proxy's configuration, and pair that with a spec about expected behaviors for the case where the app sends a header which is not allowed?

Then we can expose the functionality in the cloudstate protos and propose some APIs for the the JS and Java libs which can be used for other languages to be inspired by?

Thoughts?

viktorklang avatar Mar 06 '20 13:03 viktorklang

Hi @viktorklang , thanks!

I think we’ll have a good start if we can do that. @marcellanz let us know what you think

sleipnir avatar Mar 06 '20 14:03 sleipnir

I think it could be possible to add to the reference.conf for cloudstate-proxy:

custom-metadata-allow = ["Cache-Control", …]

Then read them out, put them into a Set, and only retain the appropriate ones, logging a warning about the others.

viktorklang avatar Mar 06 '20 15:03 viktorklang

Hi @viktorklang , thanks!

I think we’ll have a good start if we can do that. @marcellanz let us know what you think

@sleipnir sounds reasonable to me, thank you.

marcellanz avatar Mar 09 '20 09:03 marcellanz

I think it could be possible to add to the reference.conf for cloudstate-proxy:

custom-metadata-allow = ["Cache-Control", …]

Then read them out, put them into a Set, and only retain the appropriate ones, logging a warning about the others.

Beautiful ;)

sleipnir avatar Mar 09 '20 13:03 sleipnir

I think it could be possible to add to the reference.conf for cloudstate-proxy:

custom-metadata-allow = ["Cache-Control", …]

Then read them out, put them into a Set, and only retain the appropriate ones, logging a warning about the others.

If it were possible to put this in the yaml it would be great! But let's go in parts, it's already great.

sleipnir avatar Mar 09 '20 13:03 sleipnir

I’m not convinced that it makes sense to put in the YAML (by that I mean the YAML descriptor for the function deployment, i.e. the StatefulService resource)—the headers are set in compiled code, and the YAML can’t change that. However, the proxy controls in & output so it makes sense that you’d want to have a ”gatekeeper” there.

Cheers, √

viktorklang avatar Mar 09 '20 13:03 viktorklang

I’m not convinced that it makes sense to put in the YAML (by that I mean the YAML descriptor for the function deployment, i.e. the StatefulService resource)—the headers are set in compiled code, and the YAML can’t change that. However, the proxy controls in & output so it makes sense that you’d want to have a ”gatekeeper” there. -- Cheers, √

Hi @viktorklang , thanks for the reply. I work in a SRE team and I'm used to solving many problems just by writing annotations in yaml's kubernetes. So from this perspective I think that maybe the proxy should have its settings reflected in a kubernetes configmap and not just in reference.conf. Think of something like an application.conf that replaces what is defined in reference.conf. However, in the cluster! I think that would bring more deployment flexibility. I understand that it would take some time to achieve, but I think it would be good in general.

But I think we can start from some reasonable point and gradually evolve

sleipnir avatar Mar 09 '20 13:03 sleipnir

@sleipnir Agreed—it should be possible to override/add via k8s config. /cc @jroper

viktorklang avatar Mar 09 '20 14:03 viktorklang

@sleipnir Agreed—it should be possible to override/add via k8s config. /cc @jroper

Cool!

sleipnir avatar Mar 09 '20 14:03 sleipnir

@jroper any news?

sleipnir avatar Mar 19 '20 14:03 sleipnir