pulumi-kubernetes

`.status.loadBalancer` field was not updated with a hostname/IP address.

Open yellowhat opened this issue 3 years ago • 11 comments

Hello!

  • Vote on this issue by adding a 👍 reaction
  • To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already)

Issue details

Hi, I am trying to run the following code:

monitoring = kubernetes.helm.v3.Release(
    "monitoring",
    name="monitoring",
    chart="kube-prometheus-stack",
    namespace="monitoring",
    repository_opts=kubernetes.helm.v3.RepositoryOptsArgs(
        repo="https://prometheus-community.github.io/helm-charts",
    ),
    values={
        "grafana": {
            "ingress": {
                "enabled": True,
            },
        },
    },
)

grafana_ingress = kubernetes.networking.v1.Ingress.get(
    "grafana-ingress",
    Output.concat(monitoring.status.namespace, "/", monitoring.status.name, "-grafana"),
    opts=ResourceOptions(depends_on=[monitoring]),
)
grafana_ip = grafana_ingress.status.load_balancer.ingress[0].ip

And I get:

$ pulumi up
...
kubernetes:networking.k8s.io/v1:Ingress (grafana-ingress):
    error: 2 errors occurred:
    	* Resource 'monitoring-grafana' was created but failed to initialize
    	* Ingress .status.loadBalancer field was not updated with a hostname/IP address.
        for more information about this error, see https://pulumi.io/xdv72s

I monitored the ingress during the deployment. helm.v3.Release waits only for the monitoring-grafana ingress to be created, but the ingress itself got a public IP only after 90 seconds. That is why the lookup fails.

@rawkode

yellowhat avatar Feb 24 '22 08:02 yellowhat

I am afraid that, as it stands, this is more of a bug in Helm's await logic itself. It would likely reproduce with the Helm CLI if you installed with --wait and then tried to query the ingress. We have talked about some of the limitations of get in the context of getResource for the v3.Chart resource here: https://github.com/pulumi/pulumi-kubernetes/issues/1656 but I believe these issues apply more generally across providers/resources. Perhaps some improvements to the semantics of get in general are worth considering here.

CC @mikhailshilkov and @lblackstone for input.

viveklak avatar Feb 28 '22 22:02 viveklak

Is there a wrapper I can write?

I have tried:

def get_ingress_ip(args):
    """Return Ingress ip"""

    ingress = args[0]
    for _ in range(10):
        ...
        sleep(5)

grafana_ingress = kubernetes.networking.v1.Ingress.get(
    "grafana-ingress",
    Output.concat(monitoring.status.namespace, "/", monitoring.status.name, "-grafana"),
)
grafana_ip = Output.all(grafana_ingress).apply(get_ingress_ip)

I am not sure how to check for the ip availability value.

yellowhat avatar Mar 01 '22 14:03 yellowhat
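
A minimal sketch of the kind of wait loop asked about above, under stated assumptions. As far as I understand, the object returned by Ingress.get holds the values read at lookup time, so a loop inside apply keeps seeing the same status unless it re-reads the Ingress from the cluster. The helper therefore takes a hypothetical fetch callable (not a Pulumi API; it could wrap any Kubernetes client call) that returns the Ingress as a JSON-style dict, and polls until .status.loadBalancer.ingress carries an IP or hostname. All names and defaults here are illustrative.

```python
import time
from typing import Any, Callable, Dict, Optional


def extract_ingress_address(ingress: Dict[str, Any]) -> Optional[str]:
    """Return the first IP or hostname in .status.loadBalancer.ingress, else None."""
    status = ingress.get("status") or {}
    load_balancer = status.get("loadBalancer") or {}
    for entry in load_balancer.get("ingress") or []:
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None


def wait_for_ingress_address(
    fetch: Callable[[], Dict[str, Any]],  # hypothetical: re-reads the Ingress from the cluster
    attempts: int = 10,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch() until the Ingress reports an address or the attempts run out."""
    for attempt in range(attempts):
        address = extract_ingress_address(fetch())
        if address is not None:
            return address
        if attempt < attempts - 1:
            sleep(delay)  # no sleep after the final attempt
    raise TimeoutError(f"Ingress had no load balancer address after {attempts} attempts")
```

The sleep function is a parameter so the loop can be exercised without real delays. Blocking inside a Pulumi program this way is a workaround rather than a fix for the await behaviour discussed in this thread.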

@yellowhat

Have you made progress on a workaround for this? It works for me if I run pulumi refresh and then pulumi up again, but that's not optimal.

I'm on an older version of pulumi and pulumi-kubernetes where this is not a problem. I've been holding off on upgrading until there's a proper solution or at least a good workaround, so I'm curious.

Evan-S avatar Mar 08 '22 16:03 Evan-S

May I ask which version you are running?

yellowhat avatar Mar 08 '22 16:03 yellowhat

May I ask which version are you running?

pulumi 3.5.1 and pulumi-kubernetes 3.4.1

If it is a helm issue as stated I'm not sure why versioning should matter, but this works for us without problem until we update.

Evan-S avatar Mar 08 '22 16:03 Evan-S

Are you using kubernetes.helm.v3.Chart instead of kubernetes.helm.v3.Release? Unfortunately I do not have a workaround for it.

yellowhat avatar Mar 08 '22 16:03 yellowhat

Yes it's kubernetes.helm.v3.Chart, although I'm not aware of the difference.

I just realized that you are getting the ingress, while I am creating an ingress attached to the service created by the chart. I get the same error message, but it cannot be the issue with get as viveklak described.

I'd like to ping @mikhailshilkov and @lblackstone again, as mentioned, to see if they have any idea why this is only a problem for newer versions and whether there's a workaround, so that people getting or creating an ingress based on a chart are not locked into older releases. I was hopeful that #1810 was the same issue and would fix this, but its resolution did not address my problem, so the two evidently differ.

Edit: It only happens for an ingress pointing to a service created by a chart, not for any ingress for our own custom deployments and corresponding services. So it's definitely a helm specific problem, but doesn't seem to be the same one viveklak mentioned.

Evan-S avatar Mar 08 '22 17:03 Evan-S

OK, this problem is really only applicable to Helm Release (which delegates all chart deployment duties to Helm). Helm Chart uses Pulumi's await logic, which has had significant improvements to handle such await scenarios. I would expect it to work if you are running the latest Kubernetes provider. Are you seeing issues after upgrading to a newer Kubernetes provider? Let's use a separate issue if so.

viveklak avatar Mar 08 '22 17:03 viveklak

I wish it was that simple:

monitoring = kubernetes.helm.v3.Chart(
    "monitoring",
    kubernetes.helm.v3.ChartOpts(
        chart="kube-prometheus-stack",
        version="33.2.0",
        namespace=monitoring_ns.metadata.name,
        fetch_opts=kubernetes.helm.v3.FetchOpts(
            repo="https://prometheus-community.github.io/helm-charts",
        ),
        values={
            "grafana": {
                "ingress": {
                    "enabled": True,
                },
            },
        },
    ),
    opts=ResourceOptions(provider=k8s_provider),
)

$ pulumi version
v3.25.1
$ pip list
Package           Version
----------------- -----------
...
pulumi            3.25.1
pulumi-aws        4.38.0
pulumi-command    0.0.3
pulumi-gcp        6.14.0
pulumi-kubernetes 3.16.0
...
$  pulumi up --yes 
Previewing update (dev):
     Type                             Name           Plan       Info
     pulumi:pulumi:Stack              gcp-gke-dev               3 errors; 2 messages
 +   ├─ kubernetes:helm.sh/v3:Chart   monitoring     create     
 +   └─ kubernetes:core/v1:Namespace  monitoring-ns  create     
 
Diagnostics:
  pulumi:pulumi:Stack (gcp-gke-dev): 
    error: Program failed with an unhandled exception:
    error: Traceback (most recent call last):
      File "/usr/local/bin/pulumi-language-python-exec", line 107, in <module>
        loop.run_until_complete(coro)
      File "/usr/lib/python3.10/asyncio/base_events.py", line 641, in run_until_complete
        return future.result()
      File "/usr/lib/python3.10/site-packages/pulumi/runtime/stack.py", line 126, in run_in_stack
        await run_pulumi_func(lambda: Stack(func))
      File "/usr/lib/python3.10/site-packages/pulumi/runtime/stack.py", line 51, in run_pulumi_func
        await wait_for_rpcs()
      File "/usr/lib/python3.10/site-packages/pulumi/runtime/stack.py", line 110, in wait_for_rpcs
        raise exception
      File "/usr/lib/python3.10/site-packages/pulumi/runtime/resource.py", line 685, in do_register_resource_outputs
        serialized_props = await rpc.serialize_properties(outputs, {})
      File "/usr/lib/python3.10/site-packages/pulumi/runtime/rpc.py", line 172, in serialize_properties
        result = await serialize_property(
      File "/usr/lib/python3.10/site-packages/pulumi/runtime/rpc.py", line 343, in serialize_property
        value = await serialize_property(
      File "/usr/lib/python3.10/site-packages/pulumi/runtime/rpc.py", line 326, in serialize_property
        future_return = await asyncio.ensure_future(awaitable)
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 169, in run
        value = await self._future
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 123, in get_value
        val = await self._future
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 206, in run
        return await transformed.future(with_unknowns=True)
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 123, in get_value
        val = await self._future
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 206, in run
        return await transformed.future(with_unknowns=True)
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 123, in get_value
        val = await self._future
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 169, in run
        value = await self._future
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 447, in gather_futures
        return await asyncio.gather(*value_futures_list)
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 123, in get_value
        val = await self._future
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 169, in run
        value = await self._future
      File "/usr/lib/python3.10/site-packages/pulumi/output.py", line 194, in run
        transformed: Input[U] = func(value)
      File "/usr/lib/python3.10/site-packages/pulumi_kubernetes/yaml/yaml.py", line 543, in <lambda>
        CustomResourceDefinition(f"{x}", opts, **obj)))]
      File "/usr/lib/python3.10/site-packages/pulumi_kubernetes/apiextensions/v1/CustomResourceDefinition.py", line 126, in __init__
        __self__._internal_init(resource_name, *args, **kwargs)
    TypeError: CustomResourceDefinition._internal_init() got an unexpected keyword argument 'status'
    error: an unhandled error occurred: Program exited with non-zero exit code: 1

kubernetes.helm.v3.Release at least allows me to deploy. I think that "got an unexpected keyword argument 'status'" is a known problem, see #1481.

yellowhat avatar Mar 08 '22 19:03 yellowhat
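
A possible workaround for the TypeError above, sketched under assumptions rather than confirmed in this thread: the traceback shows a chart manifest for a CustomResourceDefinition carrying a status key that the Python constructor rejects. ChartOpts accepts a transformations list of callables that receive each manifest as a dict, so one could strip that key before the resource is constructed. The function name drop_crd_status is hypothetical, and whether this avoids the error on a given provider version is untested here.

```python
from typing import Any, Dict, Optional


def drop_crd_status(obj: Dict[str, Any], opts: Optional[Any] = None) -> None:
    """Remove the server-populated `status` field from CustomResourceDefinition manifests.

    Mutates the manifest dict in place; all other kinds are left untouched.
    """
    if obj.get("kind") == "CustomResourceDefinition":
        obj.pop("status", None)
```

It would be passed as transformations=[drop_crd_status] inside kubernetes.helm.v3.ChartOpts(...), alongside chart, version and values.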

I am trying to implement a way to wait for the ip to be available:

def get_ingress_ip(args):
    """Return Ingress ip"""

    ingress = args[0]
    for _ in range(10):
        ...
        sleep(5)

grafana_ingress = kubernetes.networking.v1.Ingress.get(
    "grafana-ingress",
    Output.concat(monitoring.status.namespace, "/", monitoring.status.name, "-grafana"),
)
grafana_ip = Output.all(grafana_ingress).apply(get_ingress_ip)

I am not sure how to check for the ip availability value.

yellowhat avatar Mar 21 '22 08:03 yellowhat

Traefik doesn't update the status field by default -- you'll need to explicitly enable that behavior by passing something like

  • --providers.kubernetesingress.ingressendpoint.ip=127.0.0.1 or
  • --providers.kubernetesingress.ingressendpoint.hostname=localhost

I was able to reproduce this interaction with Pulumi and confirmed that setting ip/hostname allows everything to complete as expected.

See also https://github.com/traefik/traefik/issues/2173 and the Traefik docs for the kubernetesingress provider.

blampe avatar Jun 11 '22 22:06 blampe
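
The flags from the comment above could be generated from Python and handed to a Traefik Helm deployment, roughly as sketched here. The helper name ingress_endpoint_args is hypothetical, and the additionalArguments values key is an assumption about the Traefik Helm chart that should be checked against the chart version in use; only the two flag names come from the comment.

```python
from typing import Dict, List, Optional


def ingress_endpoint_args(ip: Optional[str] = None, hostname: Optional[str] = None) -> List[str]:
    """Build the Traefik CLI flags that make it publish an address in Ingress status."""
    if ip is None and hostname is None:
        raise ValueError("set ip or hostname")
    args: List[str] = []
    if ip is not None:
        args.append(f"--providers.kubernetesingress.ingressendpoint.ip={ip}")
    if hostname is not None:
        args.append(f"--providers.kubernetesingress.ingressendpoint.hostname={hostname}")
    return args


# Assumption: the Traefik Helm chart reads extra CLI flags from `additionalArguments`.
traefik_values: Dict[str, List[str]] = {
    "additionalArguments": ingress_endpoint_args(ip="127.0.0.1"),
}
```

The resulting traefik_values dict would go into the values argument of the Release or Chart that installs Traefik.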