gateway-api-rs icon indicating copy to clipboard operation
gateway-api-rs copied to clipboard

improvement(dev-ex): compatible/identical types should be aliased and reused where possible

Open the-wondersmith opened this issue 1 year ago • 26 comments

It would dramatically improve the crate's usability if instead of defining discrete types for every nested object within a CRD, compatible/identical types were defined once and automatically reused.

For example, the HeaderMatch type is identical between GRPCRoute and HTTPRoute resources, and do not require discrete handling when writing code to handle them. Another good example would be the way that HTTPRouteFilters are a superset of GRPCRouteFilters and can therefore also be handled by the same code without special consideration (i.e. any code that handles all HTTPRouteFilter types inherently also handles all GRPCRouteFilter types).

As the crate currently is, a dev would have to write such code with the newtype pattern or work a bit of black magic to skirt the orphan rule. A much more ergonomic solution would be if types that were actually the same to simply have exported aliases where you'd expect them to be ([example]) or for types where the compatible type is effectively a subset to impl From<SubsetType> for SupersetType ([example]).

In the case of aliased types, I'd imagine we'd want something like:

//! NOTE - module structure and naming are for
//! illustrative purposes only, actual real world
//! names and structure should be assumed to carry
//! on as they currently are

pub mod shared {
    // ...

    #[derive(
        Eq,
        Clone,
        // ...
    )]
    #[serde(tag = "type", rename_all = "PascalCase")]
    pub enum HeaderMatch {
        // ...
    }

   // ...
}

pub mod grpcroute {
    // ...

    pub type GRPCHeaderMatch = super::shared::HeaderMatch;

    // ...
}

pub mod httproute {
    // ...

    pub type HTTPHeaderMatch = super::shared::HeaderMatch;

    // ...
}

and in the case of compatible types:

//! NOTE - see note in example above re: naming and structure

pub mod shared {
    // ...

    #[derive(
        Eq,
        Clone,
        // ...
    )]
    #[serde(tag = "type", rename_all = "PascalCase")]
    pub struct RequestHeaderFilter {
        // ...
    }

    #[derive(
        Eq,
        Clone,
        // ...
    )]
    #[serde(tag = "type", rename_all = "PascalCase")]
    pub struct RequestMirrorFilter {
        // ...
    }

   // ...
}

pub mod grpcroute {
    // ...

    #[serde(tag = "type", rename_all = "PascalCase")]
    pub enum GrpcRouteFilter {
        // ...

        /// ...
        #[serde(rename_all = "camelCase")]
        RequestHeaderModifier {
            request_header_modifier: super::shared::RequestHeaderFilter,
        }

        // ...
    }

    impl From<GrpcRouteFilter> for super::httproute::HttpRouteFilter {
        // ...
    }
}

pub mod httproute {
    // ...

    #[serde(tag = "type", rename_all = "PascalCase")]
    pub enum HttpRouteFilter {
        // ...
    
        #[serde(rename_all = "camelCase")]
        RequestMirror {
            request_mirror: super::shared::RequestMirrorFilter,
        },

        /// ...
        #[serde(rename_all = "camelCase")]
        RequestHeaderModifier {
            request_header_modifier: super::shared::RequestHeaderFilter,
        }

       // ...
    }
}

the-wondersmith avatar May 14 '24 14:05 the-wondersmith

@the-wondersmith @kflynn and myself all sync'd about this recently. This would be a large boon to the ergonomics of using the library. There were a few options on the table, but we all agreed doing this at the CRD level in the upstream repository might be the best one because it has the additional potential to help other languages and clients in the future.

/cc @aryan9600 as this also relates to our statuses work in #37

shaneutt avatar May 14 '24 15:05 shaneutt

@shaneutt @kflynn just as a general "keep alive", what are our next steps here?

the-wondersmith avatar May 17 '24 14:05 the-wondersmith

(Don't worry, this has not fallen off my radar I have a reminder set for it)

I was hoping to hear from @aryan9600 (ping!) but I think the general idea of the Gateway API project providing CRD information for client generation seems generally helpful and reasonable to ask for at some level. Basically any proposal should follow the GEP process, and in this case I think it would be fair to start a discussion referencing both this issue and #39 and reasoning, and then we can continue discussion there as well as bring it up at the community meeting.

Once it's had some time for discussion, we'll see if we can gather any other allies and then ultimately find out who are going to be the champion(s) driving it forward to completion. Please feel free to start that discussion!

oh and @astoycos and @clux, you might find this interesting as well.

shaneutt avatar May 21 '24 01:05 shaneutt

@the-wondersmith checking in?

shaneutt avatar Jun 21 '24 15:06 shaneutt

@shaneutt apologies, I genuinely thought I'd opened that discussion. I've for sure done so now 😅.

Excited to get some feedback there

the-wondersmith avatar Jun 24 '24 17:06 the-wondersmith

No worries! Let's discuss there for a bit and see what people think :saluting_face:

shaneutt avatar Jun 24 '24 19:06 shaneutt

I think we should try to bring this topic up in a Gateway API community meeting to try to spark more interest. If no further interest can be sparked, then it might be a situation where we just ask (the Gateway API community) "is anyone opposed to us creating a GEP for this?" and see where we can go from there.

shaneutt avatar Aug 05 '24 12:08 shaneutt

I think we should try to bring this topic up in a Gateway API community meeting {...}

Agreed. When / where is that?

the-wondersmith avatar Aug 05 '24 16:08 the-wondersmith

https://gateway-api.sigs.k8s.io/contributing/#meetings ~<- it alternates, but right now: 3 weeks it's Monday at 3pm PST and one week it's Tuesday at 8am PST.~

shaneutt avatar Aug 05 '24 16:08 shaneutt

(Don't we now alternate 3PM PST and 8AM PST every other week?)

kflynn avatar Aug 05 '24 16:08 kflynn

Ooop yep, Flynn is right :+1:

shaneutt avatar Aug 05 '24 16:08 shaneutt

So we haven't gotten any feedback in upstream, so I guess it's really up to us how motivated we are, and what specific way we're motivated to do this. My thoughts at this point is that this would be ultimately more beneficial to add in kopium because I expect it will have a positive impact on many projects which use that, whereas I wonder if we do this in Gateway API if it will be there just for us for a prolonged period of time. Thoughts?

shaneutt avatar Sep 09 '24 14:09 shaneutt

So we haven't gotten any feedback in upstream, {...}. My thoughts at this point is that this would be ultimately more beneficial to add in kopium {...}. Thoughts?

Apologies for the delay, the notification got buried 😖.

I couldn't agree more - kopium seems like exactly the right place for this, as it seems like it would be much easier to add type-reuse at the point of generation than as an after-thought.

What's a good starting point for us on that path?

the-wondersmith avatar Sep 26 '24 20:09 the-wondersmith

@clux I think at this point we should ask your opinion on the matter:

tldr; when generating code from CRDs there can be duplicated types which are identical (e.g. re-used sub-types in the source API). We have been considering how we could influence kopium to identify these before emission, and collapse duplicates into single emitted types in the Rust code.

(more details above and in the links)

shaneutt avatar Oct 01 '24 13:10 shaneutt

This would be a great ergonomic improvement in general to have.

My bet here would be on some extra key/info in the schema that can serves as some kind of namespaced $ref replacement. If we had that, then I imagine we could make the matching easier (cross a struct once in iteration, use this new key for naming the struct, short-circuit struct creation next occurrence to avoid producing duplicates).


NB1: It's a shame that we can't set $ref in crd schemas because they are very helpful for bigger code-generators like k8s-openapi / k8s-pb to ensure we have one instance of the structs. Upstream issue: https://github.com/kubernetes/kubernetes/issues/62872

NB2: The thing we* recently started doing; matching for well-known apimachinery type structs is so heavy handed that it's not really fit for this purpose / exposable (without injecting arbitrary code somewhere).

clux avatar Oct 01 '24 20:10 clux

Thanks for your thoughts on this @clux we appreciate it!

In terms of priority this is probably our highest priority issue because we consider it a requirement for a v1 release, and we've heard that there's some interest from Linkerd folks in this project that hinges on getting this resolved.

This is definitely NOT a good-first-issue, however if someone is really interested in digging in (to kopium) it seems quite open for someone from the community to jump in here and take a look at what can be done so we'll mark it as help-wanted.

shaneutt avatar Oct 18 '24 21:10 shaneutt

Potential brute-force solution, given the lack of standard signalling: https://github.com/kube-rs/kopium/issues/298

clux avatar Dec 17 '24 02:12 clux

That sounds like a great first big step, thank you for writing that up @clux. :bow:

With that implemented I think we could consider this resolved as it covers a lot of the problems (cc @the-wondersmith @kflynn, let me know your thoughts on that). The remaining limitations (including de-duplicating with k8s-openapi types) sound like a good follow ups we should be able to track separately and iterate towards.

shaneutt avatar Dec 17 '24 13:12 shaneutt

@shaneutt I think @clux's proposal seems exactly as advertised - a big ol' hammer, but importantly a functional hammer 😅. I agree that, once implemented, it would largely cover sizable portion of the problems and provide a solid base for future iterative solutions.

I'd be interested to hear @kflynn's thoughts though, as I know he's recently been doing some cross-language type work in the emissary codebase, and so might have more "recently been in the trenches" insight than I can offer at the current moment.

the-wondersmith avatar Dec 17 '24 16:12 the-wondersmith

@shaneutt I was wondering what is the latest on this?

dawid-nowak avatar Apr 18 '25 18:04 dawid-nowak

@shaneutt I was wondering what is the latest on this?

Check out https://github.com/kube-rs/kopium/issues/298, which covers the work to make this happen.

Currently I'm assigned to it and am planning to move it forward, as we didn't have any takers for some time. However, I'm not going to be able to get it done in any particular hurry for a variety of reasons, one of which is that I'm waiting for https://github.com/kube-rs/kopium/pull/327 to shake out as it has some connective tissue with https://github.com/kube-rs/kopium/issues/298 and I want to see that merge first.

This is one of the largest single requirements I want to have completed before we would consider a version 1 release of the library. If you are interested in working on this, please let us know. Otherwise for updates stay subscribed to both this issue and https://github.com/kube-rs/kopium/issues/298.

shaneutt avatar Apr 19 '25 09:04 shaneutt

Thanks, I kinda started to hack around this problem just out of interest and created this PoC https://github.com/dawid-nowak/ast_parser I hope it might be helpful but it also highlights some problems.

  1. Finding potential candidates which could become a shared type (based on the elements of a struct) seems to be straightforward enough.

  2. The biggest problem at the moment is to derive a sensible name for a shared type based on names of candidates. At the moment this is kinda problematic algorithmically. For example HTTPRouteParentRefs, HTTPRouteStatusParentsParentRef, GRPCRouteParentRefs, GRPCRouteStatusParentsParentRef could be a shared type, but there is no "common" name.

dawid-nowak avatar Apr 22 '25 18:04 dawid-nowak

Do you mean to say that it's difficult to find a name that's common, which doesn't seem wrong in some contexts? Because I think what we would ultimately ant to end up with in the specific situation you mentioned is just ParentRef.

shaneutt avatar Apr 24 '25 12:04 shaneutt

Yeah, agreed...what would be the best algorihtm so it finds good names consistently without human input... and not much work 😄

so the usecases here are:

Type Names Mapping
1 HTTPRouteParentRefs, HTTPRouteStatusParentsParentRef, GRPCRouteParentRefs, GRPCRouteStatusParentsParentRef ParentRef
2 "HTTPRouteRulesBackendRefsFiltersRequestHeaderModifierAdd", "HTTPRouteRulesBackendRefsFiltersRequestHeaderModifierSet", "HTTPRouteRulesBackendRefsFiltersResponseHeaderModifierAdd", "HTTPRouteRulesBackendRefsFiltersResponseHeaderModifierSet", "HTTPRouteRulesFiltersRequestHeaderModifierAdd", "HTTPRouteRulesFiltersRequestHeaderModifierSet", "HTTPRouteRulesFiltersResponseHeaderModifierAdd", "HTTPRouteRulesFiltersResponseHeaderModifierSet", "GRPCRouteRulesBackendRefsFiltersRequestHeaderModifierAdd", "GRPCRouteRulesBackendRefsFiltersRequestHeaderModifierSet", "GRPCRouteRulesBackendRefsFiltersResponseHeaderModifierAdd", "GRPCRouteRulesBackendRefsFiltersResponseHeaderModifierSet", "GRPCRouteRulesFiltersRequestHeaderModifierAdd", "GRPCRouteRulesFiltersRequestHeaderModifierSet", "GRPCRouteRulesFiltersResponseHeaderModifierAdd", "GRPCRouteRulesFiltersResponseHeaderModifierSet" HeaderModifier
3 "HTTPRouteRulesBackendRefsFiltersExtensionRef", "HTTPRouteRulesFiltersExtensionRef", "GRPCRouteRulesBackendRefsFiltersExtensionRef", "GRPCRouteRulesFiltersExtensionRef" FiltersExtensionRef

I have tried initially to find a longest matching substring... which works well for 1. not so well for 3. (it gives sFiltersRequestMirrorBackendRef) and doesn't work for 2.

I think if we split the names into separate words based on a letter capitalization we could get a better way of finding the common name ...

dawid-nowak avatar Apr 27 '25 12:04 dawid-nowak

Right, in the before mentioned project I got to the stage where I can generate "shared" types for http and grpc routes and then properly change/generate new code so it compiles...

If you think this is useful, I could potentially add this code to the xtask so then the update script could generate either the original code or the code with shared types or maybe both ?

dawid-nowak avatar Apr 30 '25 19:04 dawid-nowak

It sounds useful, please do feel free to drop a PR for that and we can talk through it 👍

shaneutt avatar May 01 '25 11:05 shaneutt

So here is a draft PR just to get conversation going https://github.com/kube-rs/gateway-api-rs/pull/147

dawid-nowak avatar Jun 09 '25 14:06 dawid-nowak