gateway-api-rs

Gateway API Inference Extension

Open shaneutt opened this issue 3 months ago • 15 comments

The purpose of this task is to add the APIs from the Gateway API Inference Extension (GIE), adapting our current generators to emit those alongside the regular APIs.

A new inference-extension feature should be added to enable these APIs.

shaneutt avatar Sep 08 '25 13:09 shaneutt

I can work on this. I am still familiarizing myself with the project tooling, but I should be able to get started in some time.

nitishkumar71 avatar Oct 20 '25 06:10 nitishkumar71

Sounds good @nitishkumar71. Keep in mind that the GIE is an optional extension. It's related, but technically it's its own set of APIs. We currently have standard and experimental as features; this would need to be an optional feature as well. It actually needs both dimensions: its own feature to enable it, and then its own standard and experimental variants. Let us know if you need any support!

shaneutt avatar Oct 23 '25 11:10 shaneutt

Thanks @shaneutt.

I am just not sure how to support the standard and experimental parts of GIE in the current project structure. We can make GIE a feature, but we would still want users to choose between standard and experimental.

project-root/
└── gateway-api/
    └── src/
        ├── apis/
        │   ├── experimental/
        │   └── standard/
        └── inference-extension/
            ├── experimental/
            └── standard/

I am not sure how we can allow the user to choose in the current structure. Should it be something like inference_standard and inference_experimental?

nitishkumar71 avatar Oct 23 '25 12:10 nitishkumar71

Indeed, we need to figure this out because more extensions are coming (see the agentic networking subproject).

These are called "extensions" in that they require and interact with the core Gateway API resources. From a user perspective, I think it makes sense to emit them close to the other APIs within experimental/ and standard/, but only when the extension is opted into via its feature (and the feature for an extension should never be a default feature). What makes this a little weird is that they have different API groups. We are still a v0, so we have some flexibility on moving things around in later releases.

For now I think a good starting place would be new extensions/ sub-directories under both standard/ and experimental/ for GIE. Then the user has to opt into a combination of either standard and inference-extension features, OR experimental and inference-extension features, and we can create the corresponding #[cfg(all(feature = "standard", feature = "inference-extension"))] and #[cfg(all(feature = "experimental", feature = "inference-extension"))] configuration attributes.

The user could then enable standard and inference-extension and then import like:

apis::standard::gatewayclasses::{GatewayClass, GatewayClassSpec}
apis::standard::gateways::{Gateway, GatewaySpec, GatewayStatus, GatewayStatusAddresses, GatewayStatusListeners}
apis::standard::extensions::inference::{InferencePool, InferencePoolStatus}
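A minimal sketch of how that gating could look, assuming the feature names standard, experimental and inference-extension from above. The InferencePool struct here is a placeholder with a made-up field, not the generated type, and the two helper functions are hypothetical, added only so the effect of the feature combination can be observed. Nesting the two #[cfg] attributes is equivalent to the all(...) form.

```rust
// Sketch only: placeholder types, not the output of the real generators.
pub mod apis {
    /// Compiled only when the `standard` feature is enabled.
    #[cfg(feature = "standard")]
    pub mod standard {
        /// Extensions additionally require their own opt-in feature, so this
        /// module exists only for `standard` + `inference-extension`.
        #[cfg(feature = "inference-extension")]
        pub mod extensions {
            pub mod inference {
                #[derive(Debug, Default, Clone, PartialEq)]
                pub struct InferencePool {
                    pub name: String, // placeholder field
                }
            }
        }
    }

    /// Compiled only when the `experimental` feature is enabled.
    #[cfg(feature = "experimental")]
    pub mod experimental {
        /// Exists only for `experimental` + `inference-extension`.
        #[cfg(feature = "inference-extension")]
        pub mod extensions {
            pub mod inference {
                #[derive(Debug, Default, Clone, PartialEq)]
                pub struct InferencePool {
                    pub name: String, // placeholder field
                }
            }
        }
    }
}

/// Hypothetical helper: true when the standard-channel GIE types are compiled in.
pub fn standard_inference_enabled() -> bool {
    cfg!(all(feature = "standard", feature = "inference-extension"))
}

/// Hypothetical helper: true when the experimental-channel GIE types are compiled in.
pub fn experimental_inference_enabled() -> bool {
    cfg!(all(feature = "experimental", feature = "inference-extension"))
}
```

Built without any features, neither extensions module exists and both helpers return false; enabling only inference-extension without a channel feature also compiles nothing extra.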

LMKWYT? 🤔

shaneutt avatar Oct 23 '25 12:10 shaneutt

This makes more sense. I missed the extension point of view. This will allow us to easily add more extensions in the future, and we can iterate on it if required.

Thanks, I think I have a clear path forward.

nitishkumar71 avatar Oct 23 '25 12:10 nitishkumar71

I am facing some issues with ./update.sh, even when I run it on the main branch without any changes of my own. It introduces changes in my local checkout that it should not.

I am following the contribution guide, and my local OS is Debian. Ideally, it should not generate any changes. Am I doing something wrong?

nitishkumar71 avatar Oct 26 '25 17:10 nitishkumar71

Could you be more specific about the changes? It does emit some artifacts, but these are not meant to be checked in and should be ignored by git.

shaneutt avatar Oct 27 '25 13:10 shaneutt

Changes are visible under gateway-api/src/apis/ in both the standard and experimental folders. In some cases they are formatting changes. In other cases, the customized or rename-mapped names are not being applied. Please see the attached screenshot.

Image

I can't see anything in the logs generated by executing ./update.sh. Attaching the log file.

logs.txt

nitishkumar71 avatar Oct 27 '25 13:10 nitishkumar71

Which version of kopium are you using?

shaneutt avatar Oct 27 '25 13:10 shaneutt

The kopium --version command outputs kopium 0.22.5.

nitishkumar71 avatar Oct 27 '25 14:10 nitishkumar71

Thanks @nitishkumar71. I did some digging and reproduced the problem on main. Part of the problem is related to changes in kopium 0.22.5, which should now be resolved on the head of main. There are also problems I've found with our new code generators. I've patched some of these issues, but there's quite a bit more work needed on this, which I'm tracking in #197.

With the fixes I've added, it may be possible to continue working on the inference extension. Just note that when you add new types, right now the type-reducer may try to remove and collide types that don't at first appear relevant to your changes. You might have to be a bit patient with it and read through the update.sh and type-reducer to resolve it. If you get stuck however, maybe just wait until I've resolved #197 and had time to fix and refactor the type-reducer. Let me know if you need any more help! 🖖

shaneutt avatar Oct 30 '25 14:10 shaneutt

Thank you, I will give it another try in some time.

nitishkumar71 avatar Oct 30 '25 14:10 nitishkumar71

@nitishkumar71: @dawid-nowak was suggesting alternative ways of going about this, over in #197. Particularly: separating the extension out into its own crate, maybe its own repository.

@dawid-nowak: can you go into more details about this alternative, and enumerate some of the motivating factors for this please?

shaneutt avatar Nov 17 '25 15:11 shaneutt

My proposal would be to publish two separate crates, one for Gateway API and one for Inference Extension, and to allow some code duplication across the crates.

  1. Gateway API and Inference Extension, while closely related, are independent projects that move at completely different cadences. If we publish one crate with the APIs for both of them, versioning becomes that much more difficult to handle. My suggestion would be to keep the versioning of gateway-api and inference-api in lockstep with the published versions of the Gateway API and Inference Extension standards. So the gateway-api version should indicate whether it is tracking version 1.3 or 1.4, and separately inference-api should indicate that it is tracking 1.1.0.

  2. I think we can achieve this without creating a new project; it should be sufficient to add a new crate to the current workspace. This means that we can re-use the existing scripts/code/approach.

  3. Unfortunately, this will still result in duplicate types across the different crates, particularly when it comes to simple types like references. The only way to avoid that would be to dump all generated code into a common folder, run type-reducer, and then either publish as a single crate or split into two crates with a common dependency. At the moment, I am not convinced that it is worth the effort.
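To make the "two crates with a common dependency" option from point 3 concrete, here is a rough sketch in which three modules stand in for three crates in one workspace. Every name in it (common, ObjectReference, BackendSketch, PoolSketch, shares_reference) is hypothetical and chosen for illustration only; none of them come from the generated code or the type-reducer.

```rust
// Sketch only: modules stand in for crates; all names are hypothetical.

/// Stand-in for a shared crate holding types both API sets need.
pub mod common {
    /// A simple reference type that would otherwise be duplicated.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct ObjectReference {
        pub group: String,
        pub kind: String,
        pub name: String,
    }
}

/// Stand-in for the Gateway API crate.
pub mod gateway_api {
    // Re-export instead of generating a second copy of the type.
    pub use super::common::ObjectReference;

    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct BackendSketch {
        pub target: ObjectReference,
    }
}

/// Stand-in for the Inference Extension crate.
pub mod inference_api {
    // Same re-export, so both crates expose one and the same type.
    pub use super::common::ObjectReference;

    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct PoolSketch {
        pub target: ObjectReference,
    }
}

/// Compiles only because both fields have the same shared type; with
/// duplicated types this comparison would need a manual conversion.
pub fn shares_reference(
    backend: &gateway_api::BackendSketch,
    pool: &inference_api::PoolSketch,
) -> bool {
    backend.target == pool.target
}
```

The trade-off is the one described above: the shared crate removes the duplication, but it adds a third published artifact whose version has to stay compatible with both API crates.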

dawid-nowak avatar Nov 18 '25 10:11 dawid-nowak

Thanks @dawid-nowak, the above suggestion sounds correct to me too. Since both projects have their own release cycles, they should have separate crates too.

To start with, I would keep the common code separate too. Later, we can look at keeping it together.

Sorry, my responses will be delayed, as I won't be able to access my machine frequently until the end of next week.

CC: @shaneutt

nitishkumar71 avatar Nov 20 '25 04:11 nitishkumar71