gateway-api-rs
gateway-api-rs copied to clipboard
improvement(dev-ex): compatible/identical types should be aliased and reused where possible
It would dramatically improve the crate's usability if instead of defining discrete types for every nested object within a CRD, compatible/identical types were defined once and automatically reused.
For example, the HeaderMatch type is identical between GRPCRoute and HTTPRoute resources, and do not require discrete handling when writing code to handle them. Another good example would be the way that HTTPRouteFilters are a superset of GRPCRouteFilters and can therefore also be handled by the same code without special consideration (i.e. any code that handles all HTTPRouteFilter types inherently also handles all GRPCRouteFilter types).
As the crate currently is, a dev would have to write such code with the newtype pattern or work a bit of black magic to skirt the orphan rule. A much more ergonomic solution would be if types that were actually the same to simply have exported aliases where you'd expect them to be ([example]) or for types where the compatible type is effectively a subset to impl From<SubsetType> for SupersetType ([example]).
In the case of aliased types, I'd imagine we'd want something like:
//! NOTE - module structure and naming are for
//! illustrative purposes only, actual real world
//! names and structure should be assumed to carry
//! on as they currently are
pub mod shared {
// ...
#[derive(
Eq,
Clone,
// ...
)]
#[serde(tag = "type", rename_all = "PascalCase")]
pub enum HeaderMatch {
// ...
}
// ...
}
pub mod grpcroute {
// ...
pub type GRPCHeaderMatch = super::shared::HeaderMatch;
// ...
}
pub mod httproute {
// ...
pub type HTTPHeaderMatch = super::shared::HeaderMatch;
// ...
}
and in the case of compatible types:
//! NOTE - see note in example above re: naming and structure
pub mod shared {
// ...
#[derive(
Eq,
Clone,
// ...
)]
#[serde(tag = "type", rename_all = "PascalCase")]
pub struct RequestHeaderFilter {
// ...
}
#[derive(
Eq,
Clone,
// ...
)]
#[serde(tag = "type", rename_all = "PascalCase")]
pub struct RequestMirrorFilter {
// ...
}
// ...
}
pub mod grpcroute {
// ...
#[serde(tag = "type", rename_all = "PascalCase")]
pub enum GrpcRouteFilter {
// ...
/// ...
#[serde(rename_all = "camelCase")]
RequestHeaderModifier {
request_header_modifier: super::shared::RequestHeaderFilter,
}
// ...
}
impl From<GrpcRouteFilter> for super::httproute::HttpRouteFilter {
// ...
}
}
pub mod httproute {
// ...
#[serde(tag = "type", rename_all = "PascalCase")]
pub enum HttpRouteFilter {
// ...
#[serde(rename_all = "camelCase")]
RequestMirror {
request_mirror: super::shared::RequestMirrorFilter,
},
/// ...
#[serde(rename_all = "camelCase")]
RequestHeaderModifier {
request_header_modifier: super::shared::RequestHeaderFilter,
}
// ...
}
}
@the-wondersmith @kflynn and myself all sync'd about this recently. This would be a large boon to the ergonomics of using the library. There were a few options on the table, but we all agreed doing this at the CRD level in the upstream repository might be the best one because it has the additional potential to help other languages and clients in the future.
/cc @aryan9600 as this also relates to our statuses work in #37
@shaneutt @kflynn just as a general "keep alive", what are our next steps here?
(Don't worry, this has not fallen off my radar I have a reminder set for it)
I was hoping to hear from @aryan9600 (ping!) but I think the general idea of the Gateway API project providing CRD information for client generation seems generally helpful and reasonable to ask for at some level. Basically any proposal should follow the GEP process, and in this case I think it would be fair to start a discussion referencing both this issue and #39 and reasoning, and then we can continue discussion there as well as bring it up at the community meeting.
Once it's had some time for discussion, we'll see if we can gather any other allies and then ultimately find out who are going to be the champion(s) driving it forward to completion. Please feel free to start that discussion!
oh and @astoycos and @clux, you might find this interesting as well.
@the-wondersmith checking in?
@shaneutt apologies, I genuinely thought I'd opened that discussion. I've for sure done so now 😅.
Excited to get some feedback there
No worries! Let's discuss there for a bit and see what people think :saluting_face:
I think we should try to bring this topic up in a Gateway API community meeting to try to spark more interest. If no further interest can be sparked, then it might be a situation where we just ask (the Gateway API community) "is anyone opposed to us creating a GEP for this?" and see where we can go from there.
I think we should try to bring this topic up in a Gateway API community meeting {...}
Agreed. When / where is that?
https://gateway-api.sigs.k8s.io/contributing/#meetings ~<- it alternates, but right now: 3 weeks it's Monday at 3pm PST and one week it's Tuesday at 8am PST.~
(Don't we now alternate 3PM PST and 8AM PST every other week?)
Ooop yep, Flynn is right :+1:
So we haven't gotten any feedback in upstream, so I guess it's really up to us how motivated we are, and what specific way we're motivated to do this. My thoughts at this point is that this would be ultimately more beneficial to add in kopium because I expect it will have a positive impact on many projects which use that, whereas I wonder if we do this in Gateway API if it will be there just for us for a prolonged period of time. Thoughts?
So we haven't gotten any feedback in upstream, {...}. My thoughts at this point is that this would be ultimately more beneficial to add in
kopium{...}. Thoughts?
Apologies for the delay, the notification got buried 😖.
I couldn't agree more - kopium seems like exactly the right place for this, as it seems like it would be much easier to add type-reuse at the point of generation than as an after-thought.
What's a good starting point for us on that path?
@clux I think at this point we should ask your opinion on the matter:
tldr; when generating code from CRDs there can be duplicated types which are identical (e.g. re-used sub-types in the source API). We have been considering how we could influence kopium to identify these before emission, and collapse duplicates into single emitted types in the Rust code.
(more details above and in the links)
This would be a great ergonomic improvement in general to have.
My bet here would be on some extra key/info in the schema that can serves as some kind of namespaced $ref replacement. If we had that, then I imagine we could make the matching easier (cross a struct once in iteration, use this new key for naming the struct, short-circuit struct creation next occurrence to avoid producing duplicates).
NB1: It's a shame that we can't set $ref in crd schemas because they are very helpful for bigger code-generators like k8s-openapi / k8s-pb to ensure we have one instance of the structs. Upstream issue: https://github.com/kubernetes/kubernetes/issues/62872
NB2: The thing we* recently started doing; matching for well-known apimachinery type structs is so heavy handed that it's not really fit for this purpose / exposable (without injecting arbitrary code somewhere).
Thanks for your thoughts on this @clux we appreciate it!
In terms of priority this is probably our highest priority issue because we consider it a requirement for a v1 release, and we've heard that there's some interest from Linkerd folks in this project that hinges on getting this resolved.
This is definitely NOT a good-first-issue, however if someone is really interested in digging in (to kopium) it seems quite open for someone from the community to jump in here and take a look at what can be done so we'll mark it as help-wanted.
Potential brute-force solution, given the lack of standard signalling: https://github.com/kube-rs/kopium/issues/298
That sounds like a great first big step, thank you for writing that up @clux. :bow:
With that implemented I think we could consider this resolved as it covers a lot of the problems (cc @the-wondersmith @kflynn, let me know your thoughts on that). The remaining limitations (including de-duplicating with k8s-openapi types) sound like a good follow ups we should be able to track separately and iterate towards.
@shaneutt I think @clux's proposal seems exactly as advertised - a big ol' hammer, but importantly a functional hammer 😅. I agree that, once implemented, it would largely cover sizable portion of the problems and provide a solid base for future iterative solutions.
I'd be interested to hear @kflynn's thoughts though, as I know he's recently been doing some cross-language type work in the emissary codebase, and so might have more "recently been in the trenches" insight than I can offer at the current moment.
@shaneutt I was wondering what is the latest on this?
@shaneutt I was wondering what is the latest on this?
Check out https://github.com/kube-rs/kopium/issues/298, which covers the work to make this happen.
Currently I'm assigned to it and am planning to move it forward, as we didn't have any takers for some time. However, I'm not going to be able to get it done in any particular hurry for a variety of reasons, one of which is that I'm waiting for https://github.com/kube-rs/kopium/pull/327 to shake out as it has some connective tissue with https://github.com/kube-rs/kopium/issues/298 and I want to see that merge first.
This is one of the largest single requirements I want to have completed before we would consider a version 1 release of the library. If you are interested in working on this, please let us know. Otherwise for updates stay subscribed to both this issue and https://github.com/kube-rs/kopium/issues/298.
Thanks, I kinda started to hack around this problem just out of interest and created this PoC https://github.com/dawid-nowak/ast_parser I hope it might be helpful but it also highlights some problems.
-
Finding potential candidates which could become a shared type (based on the elements of a struct) seems to be straightforward enough.
-
The biggest problem at the moment is to derive a sensible name for a shared type based on names of candidates. At the moment this is kinda problematic algorithmically. For example HTTPRouteParentRefs, HTTPRouteStatusParentsParentRef, GRPCRouteParentRefs, GRPCRouteStatusParentsParentRef could be a shared type, but there is no "common" name.
Do you mean to say that it's difficult to find a name that's common, which doesn't seem wrong in some contexts? Because I think what we would ultimately ant to end up with in the specific situation you mentioned is just ParentRef.
Yeah, agreed...what would be the best algorihtm so it finds good names consistently without human input... and not much work 😄
so the usecases here are:
| Type Names | Mapping | |
|---|---|---|
| 1 | HTTPRouteParentRefs, HTTPRouteStatusParentsParentRef, GRPCRouteParentRefs, GRPCRouteStatusParentsParentRef | ParentRef |
| 2 | "HTTPRouteRulesBackendRefsFiltersRequestHeaderModifierAdd", "HTTPRouteRulesBackendRefsFiltersRequestHeaderModifierSet", "HTTPRouteRulesBackendRefsFiltersResponseHeaderModifierAdd", "HTTPRouteRulesBackendRefsFiltersResponseHeaderModifierSet", "HTTPRouteRulesFiltersRequestHeaderModifierAdd", "HTTPRouteRulesFiltersRequestHeaderModifierSet", "HTTPRouteRulesFiltersResponseHeaderModifierAdd", "HTTPRouteRulesFiltersResponseHeaderModifierSet", "GRPCRouteRulesBackendRefsFiltersRequestHeaderModifierAdd", "GRPCRouteRulesBackendRefsFiltersRequestHeaderModifierSet", "GRPCRouteRulesBackendRefsFiltersResponseHeaderModifierAdd", "GRPCRouteRulesBackendRefsFiltersResponseHeaderModifierSet", "GRPCRouteRulesFiltersRequestHeaderModifierAdd", "GRPCRouteRulesFiltersRequestHeaderModifierSet", "GRPCRouteRulesFiltersResponseHeaderModifierAdd", "GRPCRouteRulesFiltersResponseHeaderModifierSet" | HeaderModifier |
| 3 | "HTTPRouteRulesBackendRefsFiltersExtensionRef", "HTTPRouteRulesFiltersExtensionRef", "GRPCRouteRulesBackendRefsFiltersExtensionRef", "GRPCRouteRulesFiltersExtensionRef" | FiltersExtensionRef |
I have tried initially to find a longest matching substring... which works well for 1. not so well for 3. (it gives sFiltersRequestMirrorBackendRef) and doesn't work for 2.
I think if we split the names into separate words based on a letter capitalization we could get a better way of finding the common name ...
Right, in the before mentioned project I got to the stage where I can generate "shared" types for http and grpc routes and then properly change/generate new code so it compiles...
If you think this is useful, I could potentially add this code to the xtask so then the update script could generate either the original code or the code with shared types or maybe both ?
It sounds useful, please do feel free to drop a PR for that and we can talk through it 👍
So here is a draft PR just to get conversation going https://github.com/kube-rs/gateway-api-rs/pull/147