Update QUICKSTART-EKS.md
Adding details pertinent to EKS Auto Mode
Last week I reached out to AWS Support because the nodes provisioned by the NodePool, and the containers running on them, did not see the Neuron devices. AWS Support wasn't aware of the QUICKSTART, and the AWS documentation describes only Provisioners. (The auto-generated response, before I got to speak to an engineer, was: this is a known issue, see screenshot.)
I then asked in an internal Slack channel, where I got help. One critical piece was specifying the resource request so that the neuron-plugin exposes the device. The other critical piece was that EKS Auto Mode should work.
Here is a screenshot of AWS support Q:
Testing done:
After the changes, I deployed via Flyte and verified that the devices were running inference. Here is my pod definition:
    return PodTemplate(
        primary_container_name="inferentia-primary",
        pod_spec=V1PodSpec(
            containers=[
                V1Container(
                    name="inferentia-primary",
                    # IPC_LOCK is required for the neuron devices to be visible to the container
                    # https://awsdocs-neuron.readthedocs-hosted.com/en/latest/containers/docker-example/
                    security_context=V1SecurityContext(
                        capabilities=V1Capabilities(add=["IPC_LOCK"])
                    ),
                    # must be requested, neuron devices aren't exposed automatically (7/22/25)
                    resources={
                        "requests": {"aws.amazon.com/neuroncore": str(num_cores)},
                        "limits": {"aws.amazon.com/neuroncore": str(num_cores)},
                    },
                )
            ],
            affinity=V1Affinity(
                node_affinity=V1NodeAffinity(
                    required_during_scheduling_ignored_during_execution=V1NodeSelector(
                        node_selector_terms=[
                            V1NodeSelectorTerm(
                                match_expressions=[
                                    V1NodeSelectorRequirement(
                                        key="eks.amazonaws.com/instance-family",
                                        operator="In",
                                        values=[instance_type],
                                    )
                                ]
                            )
                        ]
                    )
                )
            ),
            tolerations=[
                V1Toleration(
                    key="aws.amazon.com/neuron",
                    operator="Exists",
                    effect="NoSchedule",
                )
            ],
            host_ipc=True,
            host_network=True,
            # must be set explicitly for the neuron devices to be visible to the container
            # https://github.com/bottlerocket-os/bottlerocket/blob/develop/QUICKSTART-EKS.md#neuron-support
            security_context=V1PodSecurityContext(
                run_as_user=1001,
                run_as_group=2001,
                fs_group=3001,
            ),
        ),
    )
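
For anyone reproducing the test, a quick way to confirm from inside the container that the devices were exposed is to look for the device nodes. This is only a minimal sketch, not part of the change itself: it assumes the Neuron driver surfaces devices as `/dev/neuron<N>`, and the helper name `visible_neuron_devices` is hypothetical.

```python
import glob
import re


def visible_neuron_devices(dev_dir: str = "/dev") -> list[str]:
    """Return Neuron device nodes found in dev_dir, sorted by device index.

    Assumption: devices appear as <dev_dir>/neuron0, <dev_dir>/neuron1, ...
    """
    pattern = re.compile(r"neuron(\d+)$")
    found = []
    for path in glob.glob(f"{dev_dir}/neuron*"):
        match = pattern.search(path)
        if match:
            # Keep the numeric index so neuron10 sorts after neuron2.
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


if __name__ == "__main__":
    devices = visible_neuron_devices()
    if devices:
        print("Neuron devices visible:", ", ".join(devices))
    else:
        print("No Neuron devices visible; check the neuroncore resource request")
```

An empty result in a pod scheduled onto a Neuron node would match the symptom described above, where the resource request was missing.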
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.