pulumi-kubernetes
hang with 100% cpu during preview of ConfigFile resources
Issue details
Trying to pulumi up my prod environment after not touching it for a while, and it is hanging (100% CPU on the python process) in preview. I believe it is related to the large (alb and cert-manager) k8s.yaml.ConfigFile resources I have. Everything has been fine for many months, but since the last time I touched it, pulumi, python, pulumi-kubernetes, and my laptop (new M1) have gone through many updates (which I have just applied). I've tried logging as suggested on the troubleshooting page, but can't see anything interesting. pulumi refresh seems to work OK. The aws CLI and kubectl are connecting. If I comment out the ConfigFile resource, then preview completes normally (and offers to delete my resources).
$ pulumi version
v3.22.0
$ pip freeze
Arpeggio==1.10.2
attrs==21.4.0
certifi==2021.10.8
charset-normalizer==2.0.10
dill==0.3.4
grpcio==1.43.0
idna==3.3
parver==0.3.1
protobuf==3.19.3
pulumi==3.22.0
pulumi-aws==4.34.0
pulumi-eks==0.36.0
pulumi-kubernetes==3.14.0
PyYAML==6.0
requests==2.27.1
semver==2.13.0
six==1.16.0
urllib3==1.26.8
I saw #1731 but there was no solution for me there. I tried uninstalling awscli but got this error message: Error: Could not find aws CLI for EKS.
Also, I'm on a MacBook 16 M1 Max.
$ aws --version
aws-cli/2.4.11 Python/3.9.9 Darwin/21.2.0 source/arm64 prompt/off
I spun up a new debian 11 vm, installed aws v1, kubectl, pulumi, created a new python venv, installed the above list of python packages, copied the source from my machine, logged into pulumi cloud, ran pulumi up, and it works as expected.
results of:
pulumi up --stack xxxx/yyyy.devx --logflow --logtostderr -v=9 2> out.txt
out.txt
This may be enough to reproduce:
import pulumi_kubernetes as k8s

cert_manager_crds = k8s.yaml.ConfigFile(
    'cert-manager',
    # opts=ResourceOptions(provider=k8s_provider),
    # from: https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.crds.yaml
    file='manifests/cert-manager-v1.8.0.crds.yaml',
)
This hang occurs just trying to preview. If I comment out the above code in my project, everything runs fine.
Copy/paste from slack:
Sorry, I'm back again. This issue is still not resolved for me. I have updated pulumi and libraries to current releases, but python still hangs. I managed to attach a python debugger to the process and it seems to get stuck forever in grpc/protobuf code. I tried stepping through but the stack was 50 deep and just low level serialization code. This is the yaml I'm trying to apply: https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.crds.yaml If I comment out half of it, it seems fine. If I put some back in, it hangs. But it doesn't seem to matter which bit I comment out, more so the amount. To recap: this used to work on my Intel mac. It works now on a debian arm64 VM running on my m1 mac. It hangs on the mac using python 3.9 arm64 build.
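Since the hang seems to depend on how much of the manifest is loaded rather than which part, it can help to see how many documents the file holds and how large each one is. The following is only a rough sketch: `split_manifest` is a hypothetical helper, and it splits on bare `---` separator lines instead of using a real YAML parser, so it would mis-split a manifest that has `---` at the start of a line inside a block scalar.

```python
import sys


def split_manifest(text):
    """Split a multi-document YAML string on '---' separator lines.

    Simplification: no YAML parsing is done, so this is only suitable
    for plain Kubernetes manifests.
    """
    docs, current = [], []
    for line in text.splitlines():
        if line.rstrip() == "---":
            # Close the current document if it has any non-blank content.
            if any(part.strip() for part in current):
                docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):
        docs.append("\n".join(current))
    return docs


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python split_manifest.py manifests/cert-manager-v1.8.0.crds.yaml
    with open(sys.argv[1], encoding="utf-8") as handle:
        documents = split_manifest(handle.read())
    for index, doc in enumerate(documents):
        print(f"document {index}: {len(doc)} characters")
```

Writing a growing prefix of the documents to a temporary file and pointing ConfigFile at it is one way to find roughly the size at which preview stops finishing.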
I'm running into the same problem, but with KEDA. Am also on a Mac using python 3.9 arm64 build.
# Install KEDA
keda = k8s.yaml.ConfigFile(
    "keda",
    file="https://github.com/kedacore/keda/releases/download/v2.2.0/keda-2.2.0.yaml",
    transformations=[remove_status],
)
I've narrowed it down to grpc/protobuf...
I have attached a script to reproduce with only the pulumi python package as a dependency. test.py.txt
The summary is that it takes 15s to serialize a small in-memory structure to a byte buffer. On Linux it takes 1ms.
No idea why. It's really tough to debug deep recursive structures in protobuf code. I suppose this issue should really be reported to them, but would appreciate some guidance here first.
Thanks.
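For readers without the attachment, the kind of measurement described above can be approximated as follows. This is not the attached test.py.txt, only a minimal sketch under assumptions: the nested dictionary shape, `build_nested`, and `time_serialize` are made up for illustration, and the only protobuf calls used are `struct_pb2.Struct`, `update`, and `SerializeToString`.

```python
import time


def build_nested(depth, width):
    """Build a nested dict loosely resembling a CRD schema (hypothetical shape)."""
    node = {"type": "string", "description": "leaf"}
    for _ in range(depth):
        node = {
            "type": "object",
            "properties": {f"field{i}": node for i in range(width)},
        }
    return node


def time_serialize(data):
    """Return the seconds spent serializing `data` as a protobuf Struct.

    Returns None when the protobuf package is not installed.
    """
    try:
        from google.protobuf import struct_pb2
    except ImportError:
        return None
    message = struct_pb2.Struct()
    message.update(data)  # copy the nested dict into the message
    start = time.perf_counter()
    message.SerializeToString()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = time_serialize(build_nested(depth=4, width=6))
    if elapsed is None:
        print("protobuf is not installed")
    else:
        print(f"serialize={elapsed:.4f}")
```

Running the same script in the affected venv and in a Linux VM gives a like-for-like comparison that does not involve pulumi at all.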
I see https://github.com/protocolbuffers/protobuf/issues/9839 has already been raised. Our internal testing suggests that this is just a protobuf issue and there's nothing pulumi specific about it. We'll watch and assist that issue although we're limited in engineers who have access to M1s to develop with.
Confirming on my m1 for more data points:
parse=0.0079
serialize_1=14.2372
serialize_2=2.4117
Running on a Linux VM on the same machine (Linux fedora 5.11.12-300.fc34.aarch64 #1 SMP Wed Apr 7 16:12:21 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux):
parse=0.0001
serialize_1=0.0003
serialize_2=0.0004
So it was pretty simple in the end. The current protobuf python package does not build the extension for M1, and the pure python implementation either doesn't work at all, or works too slowly with large chunks of yaml.
see: https://github.com/protocolbuffers/protobuf/issues/9839
Forcing the build of the local extension in the venv of my pulumi project resolves the issue.
A recipe for users with brew:
$ cd my-project
$ source venv/bin/activate
$ export CFLAGS="-I$(brew --prefix protobuf)/include"; export LDFLAGS="-L$(brew --prefix protobuf)/lib"
$ pip install --force-reinstall protobuf=="$(brew list --version protobuf | awk '{print $2}')" --install-option="--cpp_implementation"
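To check whether the venv is using the compiled extension or the pure Python fallback, before and after the recipe, protobuf can be asked directly. This is a minimal sketch: `protobuf_backend` is a hypothetical helper name, and `api_implementation` is part of protobuf's internal package, so it may change between releases.

```python
def protobuf_backend():
    """Return the name of the active protobuf implementation.

    Returns None when the protobuf package is not installed.
    """
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    # Typically "python" for the pure Python fallback, or "cpp" / "upb"
    # when a compiled backend is in use.
    return api_implementation.Type()


if __name__ == "__main__":
    print(f"protobuf backend: {protobuf_backend()}")
```

If this prints `python` inside the project's venv, the slow pure Python path described above is the one in use.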
This didn't work for me. I am pretty new to Macs in general, and I got an error:
pip install --force-reinstall protobuf=="$(brew list --version protobuf | awk '{print $2}')" --install-option="--cpp_implementation"
WARNING: Disabling all use of wheels due to the use of --build-option / --global-option / --install-option.
ERROR: Could not find a version that satisfies the requirement protobuf== (from versions: 2.0.0b0, 2.0.3, 2.3.0, 2.4.1, 2.5.0, 2.6.0, 2.6.1, 3.0.0a2, 3.0.0a3, 3.0.0b1.post2, 3.0.0b2, 3.0.0b2.post1, 3.0.0b2.post2, 3.0.0b3, 3.0.0b4, 3.0.0, 3.1.0.post1, 3.2.0rc1, 3.2.0rc1.post1, 3.2.0rc2, 3.2.0, 3.3.0, 3.4.0, 3.5.0.post1, 3.5.1, 3.5.2, 3.5.2.post1, 3.6.0, 3.6.1, 3.7.0rc2, 3.7.0rc3, 3.7.0, 3.7.1, 3.8.0rc1, 3.8.0, 3.9.0rc1, 3.9.0, 3.9.1, 3.9.2, 3.10.0rc1, 3.10.0, 3.11.0rc1, 3.11.0rc2, 3.11.0, 3.11.1, 3.11.2, 3.11.3, 3.12.2, 3.12.4, 3.13.0rc3, 3.13.0, 3.14.0rc1, 3.14.0rc2, 3.14.0rc3, 3.14.0, 3.15.0rc1, 3.15.0rc2, 3.15.0, 3.15.1, 3.15.2, 3.15.3, 3.15.4, 3.15.5, 3.15.6, 3.15.7, 3.15.8, 3.16.0rc1, 3.16.0rc2, 3.16.0, 3.17.0rc1, 3.17.0rc2, 3.17.0, 3.17.1, 3.17.2, 3.17.3, 3.18.0rc1, 3.18.0rc2, 3.18.0, 3.18.1, 3.19.0rc1, 3.19.0rc2, 3.19.0, 3.19.1, 3.19.2, 3.19.3, 3.19.4, 3.20.0rc1, 3.20.0rc2, 3.20.0, 3.20.1rc1, 3.20.1, 4.0.0rc1, 4.0.0rc2)
ERROR: No matching distribution found for protobuf==
I tried just stripping it down to pip install --force-reinstall protobuf==3.20.1 --install-option="--cpp_implementation". It uninstalls okay and then fails with:
In file included from google/protobuf/pyext/descriptor.cc:33:
./google/protobuf/pyext/descriptor.h:39:10: fatal error: 'google/protobuf/descriptor.h' file not found
#include <google/protobuf/descriptor.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
error: command '/usr/bin/clang' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
WARNING: No metadata found in ./venv/lib/python3.9/site-packages
Rolling back uninstall of protobuf
Moving to /Users/gabrielmccoll/quickstart/venv/lib/python3.9/site-packages/google/protobuf/
from /Users/gabrielmccoll/quickstart/venv/lib/python3.9/site-packages/google/~rotobuf
Moving to /Users/gabrielmccoll/quickstart/venv/lib/python3.9/site-packages/protobuf-3.20.1-nspkg.pth
from /private/var/folders/84/9sk86cr1095dwf1_yn3fxlj00000gn/T/pip-uninstall-ew65e1cj/protobuf-3.20.1-nspkg.pth
Moving to /Users/gabrielmccoll/quickstart/venv/lib/python3.9/site-packages/protobuf-3.20.1.dist-info/
from /Users/gabrielmccoll/quickstart/venv/lib/python3.9/site-packages/~rotobuf-3.20.1.dist-info
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> protobuf
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
any hints appreciated @cbcmg
@gabrielmccoll It looks like you don't have protobuf installed with brew. Do you have brew installed?
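The empty `protobuf==` in the error above is what the command substitution produces when brew reports no installed protobuf. A small pre-check makes that failure explicit before pip is run. This is a sketch only: `parse_brew_version` and `brew_protobuf_version` are hypothetical helpers, and the output format is assumed to be the formula name followed by its version, as the awk expression in the recipe implies.

```python
import subprocess


def parse_brew_version(output):
    """Extract the version from output such as 'protobuf 3.19.4'.

    Returns None when the output is empty (formula not installed).
    """
    fields = output.split()
    return fields[1] if len(fields) >= 2 else None


def brew_protobuf_version():
    """Ask brew for the installed protobuf version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["brew", "list", "--versions", "protobuf"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # brew itself is not installed or not on PATH.
        return None
    return parse_brew_version(result.stdout)


if __name__ == "__main__":
    version = brew_protobuf_version()
    if version is None:
        print("protobuf is not installed with brew; run: brew install protobuf")
    else:
        print(f"brew protobuf version: {version}")
```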
Thank you for the fast reply. I seem to have gotten protobuf when I installed Pulumi via pip?
I tried just running brew install protobuf, but it brought in version 3.19, and Pulumi seemed to think the sdk wasn't installed anymore.
Apologies as I'm probably just being very novice
Yes, installing pulumi will install protobuf-3.20.1. Brew does not yet have that version, but 3.19.4 seems to work fine with pulumi.
"Pulumi seemed to think the sdk wasn't installed anymore"
Not sure what happened there. If you have a clean pulumi project, brew installed, and protobuf installed with brew, and the script above completes without errors, it should work.
Okay, got it, thank you. I ran pip uninstall protobuf, then brew install protobuf, then ran your script above, and it seems to work now.
thanks a lot!