[BUG] Unable to use GPUs besides "nvidia.com/gpu" due to hardcoding

Open Sovietaced opened this issue 3 months ago • 2 comments

Flyte & Flytekit version

Flyte v1.15.3

Describe the bug

We are trying to use GPU resource types other than "nvidia.com/gpu" but there appears to be hardcoding in Flyte Propeller.

Expected behavior

I would expect Flyte Propeller not to hardcode "nvidia.com/gpu" since you can configure the GPU resource type to use in other areas.

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • [x] Yes

Have you read the Code of Conduct?

  • [x] Yes

Sovietaced avatar Nov 17 '25 22:11 Sovietaced

Flyte supports configuring the GPU resource name via the Helm values file (e.g., gpu-resource-name), which is meant to let you use resource types other than "nvidia.com/gpu". However, in Flyte v1.15.3 there are reports, and evidence in the code, that "nvidia.com/gpu" is still hardcoded in some parts of FlytePropeller. That hardcoding can prevent alternative GPU resource names from taking effect even when they are configured elsewhere. This is a known limitation that has been discussed in the community and addressed in later PRs and documentation updates. See the official Flyte GPU configuration documentation for details and workarounds.

If you need to use a custom GPU resource name, ensure you set gpu-resource-name in your FlytePropeller configuration, but be aware that full support for arbitrary GPU resource names may require a newer Flyte version where the hardcoding has been fixed. Would you like more detail on the specific code paths or upgrade recommendations?

This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!

runllm[bot] avatar Nov 17 '25 22:11 runllm[bot]

More concretely, it looks like this function has a hardcoded nvidia.com/gpu, which seems to be taking precedence over what we are configuring: https://github.com/flyteorg/flyte/blob/6a1efa58599874038ccbbed7df5ad0bcaf53c1a5/flytepropeller/pkg/controller/nodes/task/taskexec_context.go#L194-L210
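
To illustrate the kind of problem described above, here is a minimal, self-contained Go sketch. It is not the actual FlytePropeller code: `TaskResources`, `hardcodedRequirements`, and `configurableRequirements` are hypothetical names, plain maps stand in for Kubernetes resource lists, and "amd.com/gpu" is only an example of an alternative resource name. The first function always emits the "nvidia.com/gpu" key; the second takes the resource name as a parameter and falls back to the NVIDIA default only when none is configured.

```go
package main

import "fmt"

// defaultGPUResourceName is the literal the issue reports as hardcoded.
const defaultGPUResourceName = "nvidia.com/gpu"

// TaskResources is a hypothetical, simplified stand-in for the
// quantities a task requests. It is not a Flyte type.
type TaskResources struct {
	CPU    string
	Memory string
	GPU    string
}

// hardcodedRequirements sketches the problematic pattern: the GPU key
// is fixed, so any configured resource name is ignored.
func hardcodedRequirements(r TaskResources) map[string]string {
	return map[string]string{
		"cpu":                  r.CPU,
		"memory":               r.Memory,
		defaultGPUResourceName: r.GPU,
	}
}

// configurableRequirements sketches one possible alternative: the GPU
// key comes from configuration, with the NVIDIA name as the fallback.
func configurableRequirements(r TaskResources, gpuResourceName string) map[string]string {
	if gpuResourceName == "" {
		gpuResourceName = defaultGPUResourceName
	}
	return map[string]string{
		"cpu":           r.CPU,
		"memory":        r.Memory,
		gpuResourceName: r.GPU,
	}
}

func main() {
	r := TaskResources{CPU: "2", Memory: "4Gi", GPU: "1"}

	// The GPU quantity always lands under "nvidia.com/gpu".
	fmt.Println(hardcodedRequirements(r))

	// The GPU quantity lands under the configured name instead.
	fmt.Println(configurableRequirements(r, "amd.com/gpu"))
}
```

With the hardcoded variant, a cluster that only advertises a different GPU resource would never see a matching request, which is consistent with the behavior we are observing.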

Sovietaced avatar Nov 17 '25 23:11 Sovietaced