Golang filter: provide a way to get all headers
Title: Golang filter: provide a way to get all headers
Description:
Describe the desired behavior, what scenario it enables and how it would be used.
We can add a new method func (h *Header) GetAll(copy bool) map…. The method takes a boolean to indicate whether we should return a copy of internal headers. If the headers are not copied, it is the user's duty to ensure the headers won't be written later or accessed concurrently.
Here are the advantages:
- it is more effective than
Rangeas we don't need to lock for long - it is easier to write than
Rangeif people want to dump all the headers - Wasm provides a similar method so we can align to that implementation
If this method is added, we can retire the RangeWithCopy because if people are afraid of the deadlock, they can use GetAll with copy instead.
cc @doujiang24 @wangfakang @StarryVae golang's filter code-owners.
I'm fine with GetAll + copy, but I'm not sure with nocopy, since nocopy will expose the internal type map[string][]string.
I was thinking of an optimization: maybe we could use a higher performance type instead of map[string][]string.
In the current map[string][]string, one unique header name, requires at least two GC pointers:
- One for the slice
[]string - Another one for the string data of the value.
Compared to map[string]string, it requires one more GC pointer.
But, I haven't figured out a better way yet. So just a little worried about exposing the internal type to users.
map[string][]string is used as header because we follow the same type used by Go: https://github.com/golang/go/blob/495830acd6976c8a2b39dd4aa4fdc105ad72de52/src/net/http/header.go#L24.
If the value is string instead of []string, it will cause some trouble when handling repeated headers.
If nocopy is not needed, we can use Clone, which is the same as the http.Header type in the Go.
yep, I agree that we should handle the repeated header correctly. just a little worried about exposing the internal type to users, that will make it harder to optimize it in the feature.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
I also think a method like GetAllHeaders would be useful.
I don't know much about C++ programming so apologies if the following paragraph is uninformed.
Presumably it would be faster to pass a data structure containing all the headers to the Golang plugin, rather than having the Golang plugin retrieve them from the HeaderMap using Range
Also, the issue with Range is that you iterate over all the headers, which is a waste of time because the special ones like scheme etc are already set in the header map as values - so it would be good to have a capability to get all the other headers
I also think a method like GetAllHeaders would be useful.
I don't know much about C++ programming so apologies if the following paragraph is uninformed.
Presumably it would be faster to pass a data structure containing all the headers to the Golang plugin, rather than having the Golang plugin retrieve them from the
HeaderMapusingRange
We cache the entire header map on the Golang side, in a single C++ Golang interchange.
Range doesn't mean getting values from C++ one by one, there won't be much performance difference.
Also, the issue with
Rangeis that you iterate over all the headers, which is a waste of time because the special ones like scheme etc are already set in the header map as values - so it would be good to have a capability to get all the other headers
okay, you means all headers exclude the special ones like scheme, path and etc, right?
I think this might be a proper choice, it always do a copy for safety, thoughts?
func (h *Header) GetAll(skip_internal bool) map[string][]string
We cache the entire header map on the Golang side, in a single C++ Golang interchange. Range doesn't mean getting values from C++ one by one, there won't be much performance difference.
OK, great - I wasn't aware of this.
okay, you means all headers exclude the special ones like scheme, path and etc, right?
Yes exactly, because the special ones are already available as attributes - https://github.com/envoyproxy/envoy/blob/5aadcb0e68ae168562ab2842b875a75b1bbee6ce/contrib/golang/common/go/api/type.go#L138-L141
I think this might be a proper choice, it always do a copy for safety, thoughts?
That seems perfect to me. Will the looping occur on the golang side or the C++ side? Just thinking the optimal thing would be to do any sort of loop on the C++ side so the whole operation is as fast as possible.
Given the header map object is already cached, I wonder if the additional headers, i.e. the non-special ones, can't me made available in some sort of simple data structure in Golang - e.g. all_other_headers map[string][]string
This prevents the user of the filter needing to create and populate such a data structure themselves for every request
Will the looping occur on the golang side or the C++ side? Just thinking the optimal thing would be to do any sort of loop on the C++ side so the whole operation is as fast as possible.
Golang side should be good enough, it's fast enough, no cgo call overhead.
This prevents the user of the filter needing to create and populate such a data structure themselves for every request
It's a trade-off, since we already have a map[string][]string cache for the whole headers.
This depends on how often the API will be used.
It's a trade-off, since we already have a map[string][]string cache for the whole headers. This depends on how often the API will be used.
Fair enough, is it possible to access the existing cache without creating another map[string][]string?
Fair enough, is it possible to access the existing cache without creating another
map[string][]string?
it could be done in the current Range api, but it contains the special headers.
it could be done in the current Range api, but it contains the special headers.
Oh I see, I'm beginning to understand how it all works now.
So there's an existing map[string][]string under the hood, so to speak, and to access each element you can use Range
But the OP was proposing to write a func which would return a copy of the entire existing map[string][]string, is that right?
I think that would be a good idea because it would be much faster.
In some cases users will want to build other data structures using each header value as an attribute. With the existing Range approach we need to loop over every key, and also every value - so, in the the circumstance of repeated headers, a key will contain several values in the []string array and it will take time. However, with a func which returns map[string][]string we can pass the entire []string to the attribute without having to loop over every item in the array.
Also, could the same method be added to the cluster_specifier_plugin so we can get all the headers, then route to a particular cluster based on the headers of each request.
But the OP was proposing to write a func which would return a copy of the entire existing map[string][]string, is that right?
yep.
Also, could the same method be added to the cluster_specifier_plugin so we can get all the headers, then route to a particular cluster based on the headers of each request.
Do you really want all of the headers? maybe HeaderMap.Values(key string) []string works for your? to get the values of a specified key.
Do you really want all of the headers? maybe HeaderMap.Values(key string) []string works for your? to get the values of a specified key.
~~Well if it works like that I don't see the point of the plugin.~~
~~Let's say I use the golang http plugin to dynamically set a header, e.g. X-SPECIAL-HEADER: cluster-z, and then use the golang cluster_specifier plugin to get the value of that header and route to cluster-z~~
~~I can already use envoy's config to route to a cluster based on the value of X-SPECIAL-HEADER, so there's no point in using the golang cluster_specifier plugin because it will be slower than using native envoy config.~~
~~On the other hand, if I can use one plugin to read all the headers and then route to a particular cluster it's much faster; I can read the existing headers, decide on a cluster and then route to it, without needing to add X-SPECIAL-HEADER: cluster-z~~
Edit, I made a mistake. With Envoy's native config you can set a route based on a header value, but you can't dynamically route to any cluster based on a header value - which is something you can do with the golang cluster_specifier plugin.
So I guess the general idea is you can set a header in the golang http plugin, and then route to a cluster using that header with the golang cluster_specifier plugin.
Personally, I think it would be great if you could retrieve all headers in the golang cluster_specifier plugin because then the flow will be much faster. You can just read the headers, decide on a cluster, and route to it - without having to add a header or invoke two separate plugins in the request flow.
but you can't dynamically route to any cluster based on a header value
cluster_header might be it: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-msg-config-route-v3-routeaction
I think it would be great if you could retrieve all headers in the golang
yep, GetAll could be better when you need all headers.
If you only need some specified headers with pre-known header names, Get/Values should be works too.
Anyway, there is no disagreements about adding the new API:
func (h *Header) GetAll(skip_internal bool) map[string][]string
Would you like to have a try? Seems it could be pure Golang implementation, no C++ code changes need.
cluster_header might be it: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-msg-config-route-v3-routeaction
Ah yes! Thank you.
So you can set a header, then route based on that header's value using Envoy's native config.
So, if the golang cluster_specifier can also access the underlying map[string][]string header cache then this would be the fastest mechanism for routing using a golang plugin, without needing to call another plugin:
- Read all headers by accessing HeaderMap
map[string][]string - Decide on a cluster
- Return chosen cluster name
Therefore, could 1. be added to golang cluster_specifier?
In the mean time I will try the golang implementation you mentioned in for the golang http plugin.
Therefore, could 1. be added to golang cluster_specifier?
yep. it should be a common API for HeaderMap.
In the mean time I will try the golang implementation you mentioned in for the golang http plugin.
Cool!
I thought about it and I realised that it probably doesn't make sense to exclude the internal/special headers. This is because the user will most likely want all the headers - the only reason to exclude them is that in the case of Range it would be faster to do so. But if we're passing an entire copy of the underlying map[string][]string there's no time penalty incurred if we pass all the headers.
The draft PR is here - do you have any thoughts? It's very simple. https://github.com/envoyproxy/envoy/pull/33821
cluster_header might be it: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/route/v3/route_components.proto#envoy-v3-api-msg-config-route-v3-routeaction
Noticed that cluster_header actually doesn't work for this use-case, as detailed here: https://github.com/envoyproxy/envoy/issues/33878
the only reason to exclude them is that in the case of
Rangeit would be faster to do so.
okay, it's really not a good idea to design API by the performance that you thought. As said previously, headers already cached in Golang side, there shouldn't be much performance diff.
The draft PR is here - do you have any thoughts? It's very simple. #33821
added some comments there.
Therefore, could 1. be added to golang cluster_specifier?
yep. it should be a common API for HeaderMap.
I'm happy to make an attempt on the Golang side of this, but I'll need the c++ parts to be done by someone else because I don't have experience with it
I'm happy to make an attempt on the Golang side of this, but I'll need the c++ parts to be done by someone else because I don't have experience with it
maybe you can copy some code from golang http plugin, I'm happy to help if you need it.
I'm happy to make an attempt on the Golang side of this, but I'll need the c++ parts to be done by someone else because I don't have experience with it
maybe you can copy some code from golang http plugin, I'm happy to help if you need it.
Yes I thought I might be able to do that.. OK, I'll give that a go in another PR and then tag you
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.