[Failed to deploy openwhisk deployment] Unable to successfully deploy openwhisk.yml
Environment details:
- Ubuntu 18.04.6 LTS (Bionic Beaver)
Steps to reproduce the issue:
- https://github.com/apache/openwhisk/blob/master/ansible/README.md
- Run the command "~/openwhisk/ansible# ansible-playbook -i environments/local/ openwhisk.yml"
Provide the expected results and outputs:
Unable to
fatal: [controller0]: FAILED! => {"attempts": 12, "changed": false, "content": "", "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "https://172.17.0.1:10001/ping"}
Status code was -1 and not [200]: Request failed: <urlopen error [Errno
111] Connection refused>
PLAY RECAP *********************************************************************
controller0 : ok=25 changed=3 unreachable=0 failed=1
etcd0 : ok=0 changed=0 unreachable=0 failed=0
kafka0 : ok=10 changed=4 unreachable=0 failed=0
Monday 10 October 2022 02:12:15 +0000 (0:02:09.112) 0:04:54.659
===============================================================================
controller : wait until the Controller in this host is up and running - 129.11s
zookeeper : (re)start zookeeper ---------------------------------------- 71.94s
kafka : (re)start kafka using 'wurstmeister/kafka:2.13-2.7.0' --------- 17.49s
kafka : wait until the kafka server started up ------------------------- 15.52s
zookeeper : wait until the Zookeeper in this host is up and running ----- 8.65s
controller : (re)start controller --------------------------------------- 5.12s
controller : copy certificates ------------------------------------------ 3.67s
controller : copy nginx certificate keystore ---------------------------- 3.07s
Gathering Facts --------------------------------------------------------- 2.83s
controller : copy jmxremote password file ------------------------------- 2.73s
Gathering Facts --------------------------------------------------------- 2.56s
Gathering Facts --------------------------------------------------------- 2.47s
controller : populate environment variables for controller -------------- 2.31s
controller : copy jmxremote access file --------------------------------- 1.88s
controller : check if whisk_local_whisks with CouchDB exists ------------ 1.74s
controller : Add akka environment to controller environment ------------- 1.68s
controller : ensure controller config directory is created with permissions --- 1.61s
kafka : add kafka default env vars -------------------------------------- 1.48s
controller : add seed nodes to controller environment ------------------- 1.40s
controller : prepare controller port ------------------------------------ 1.38s
root@ubuntu-s-2vcpu-4gb-sgp1-01:~/openwhisk/ansible#
Please help me find out what I am missing in my settings.
How can I get a successful deployment?
Thank you in advance.
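The failing task polls https://172.17.0.1:10001/ping (the URL in the error above) and gives up after 12 attempts. Below is a minimal sketch for repeating that probe by hand, assuming `curl` is installed. The function name `probe_ping` is made up for illustration, and `-k` is used on the assumption that the local deployment serves a self-signed certificate.

```shell
#!/usr/bin/env bash
# probe_ping URL: print the HTTP status code returned by URL,
# or 000 when no HTTP response was received (e.g. "Connection refused").
probe_ping() {
  local url="$1" code
  # -k: skip TLS verification; -s: silent; -o /dev/null: discard the body;
  # -w '%{http_code}': print only the status code; --max-time: do not hang.
  code="$(curl -k -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")" || true
  printf '%s\n' "${code:-000}"
}

# Example: probe_ping https://172.17.0.1:10001/ping
```

A result of 200 would mean the controller is answering. 000 means nothing is listening on that port, which usually points at the controller container having exited, so its logs are the next place to look.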
Try to figure out the reason in the controller logs.
If you used the default configuration, it would be under the /tmp/wsklogs/controller directory.
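A rough sketch of how the logs could be collected, assuming the component name matches both the log subdirectory and the Docker container name (for example `controller0`, as in the play recap above). The function name `show_component_logs` and the 50-line tail are illustrative choices, not part of OpenWhisk.

```shell
#!/usr/bin/env bash
# show_component_logs NAME [LOG_ROOT]: print the tail of a component's logs.
# Looks for *.log files under LOG_ROOT/NAME first, then falls back to
# `docker logs` for a container called NAME.
show_component_logs() {
  local name="$1" root="${2:-/tmp/wsklogs}" found=0 f
  for f in "$root/$name"/*.log; do
    [ -f "$f" ] || continue   # the glob stays literal when nothing matches
    found=1
    echo "== $f"
    tail -n 50 "$f"
  done
  if [ "$found" -eq 0 ]; then
    echo "no log files under $root/$name, trying docker logs" >&2
    docker logs --tail 50 "$name"
  fi
}

# Example: show_component_logs controller0
```

If the log directory is somewhere else in your checkout, pass it as the second argument.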
@style95 Thank you for your suggestion, but I am unable to find the /tmp/wsklogs/controller directory in my environment. Is there any additional setting, or do I need to open ports for a successful installation?
@style95
Now I am getting this error when I execute the command: ansible-playbook -i environments/local/ openwhisk.yml
fatal: [kafka0]: FAILED! => {"attempts": 10, "changed": true, "cmd": "(echo dump; sleep 1) | nc 172.17.0.1 2181 | grep /brokers/ids/0", "delta": "0:00:01.017253", "end": "2022-10-16 05:55:31.438724", "msg": "non-zero return code", "rc": 1, "start": "2022-10-16 05:55:30.421471", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
[FAILED]
(echo dump; sleep 1) | nc 172.17.0.1 2181 | grep /brokers/ids/0 non-zero return code
PLAY RECAP *********************************************************************
etcd0 : ok=0 changed=0 unreachable=0 failed=0
kafka0 : ok=9 changed=3 unreachable=0 failed=1
We can't figure out the reason from what you have provided. Please share the container logs or component logs.
You can find logs under this directory or with the docker logs command.
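The failed Kafka task pipes ZooKeeper's `dump` command through `nc` and greps for `/brokers/ids/0`, so an empty result can mean either that ZooKeeper did not answer at all or that the broker never registered. The sketch below separates the two cases. It assumes `nc` is available and reuses the host, port and broker ID from the failed command; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# zk_dump HOST PORT: send ZooKeeper's "dump" command and print the raw reply.
zk_dump() {
  (echo dump; sleep 1) | nc "$1" "$2"
}

# diagnose_broker HOST PORT ID: tell "ZooKeeper silent" apart from
# "ZooKeeper answered, but broker ID is not registered".
diagnose_broker() {
  local reply
  reply="$(zk_dump "$1" "$2")" || true
  if [ -z "$reply" ]; then
    echo "zookeeper-no-reply"
  elif printf '%s\n' "$reply" | grep -q "/brokers/ids/$3"; then
    echo "broker-registered"
  else
    echo "broker-missing"
  fi
}

# Example: diagnose_broker 172.17.0.1 2181 0
```

`zookeeper-no-reply` suggests looking at the ZooKeeper container first, while `broker-missing` suggests looking at the Kafka container logs.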
Hi @style95, sorry for the late reply. Let me explain it simply. On my new machine (Ubuntu 18.04.6 LTS), I am trying to install OpenWhisk. After a successful build, I executed the Ansible commands one by one:
ansible-playbook -i environments/local/ couchdb.yml
ansible-playbook -i environments/local/ initdb.yml
ansible-playbook -i environments/local/ wipe.yml
ansible-playbook -i environments/local/ apigateway.yml
I encountered the error when I ran this command:
root@ubuntu-s-1vcpu-2gb-sgp1-01:~/openwhisk/ansible# ansible-playbook -i environments/local/ openwhisk.yml
fatal: [172.17.0.1]: FAILED! => {"changed": false, "msg": "The header parameter requires a key:value,key:value syntax to be properly parsed."}
The header parameter requires a key:value,key:value syntax to be properly parsed.
PLAY RECAP *********************************************************************
172.17.0.1 : ok=11 changed=9 unreachable=0 failed=1
Please share the container logs or components logs.
Here is a log:
root@ubuntu-s-1vcpu-2gb-sgp1-01:~/openwhisk/tmp/wsklogs# cat controller0/controller0_logs.log
[2022-10-17T23:10:35.980Z] [INFO] Slf4jLogger started
[2022-10-17T23:10:37.158Z] [INFO] Remoting started with transport [Artery tcp]; listening on address [akka://[email protected]:25520] with UID [9046092925284476410]
[2022-10-17T23:10:37.227Z] [INFO] Cluster Node [akka://[email protected]:25520] - Starting up, Akka version [2.6.12] ...
[2022-10-17T23:10:37.520Z] [INFO] Cluster Node [akka://[email protected]:25520] - Registered cluster JMX MBean [akka:type=Cluster]
[2022-10-17T23:10:37.522Z] [INFO] Cluster Node [akka://[email protected]:25520] - Started up successfully
[2022-10-17T23:10:37.784Z] [INFO] Cluster Node [akka://[email protected]:25520] - No downing-provider-class configured, manual cluster downing required, see https://doc.akka.io/docs/akka/current/typed/cluster.html#downing
[2022-10-17T23:10:37.786Z] [INFO] Cluster Node [akka://[email protected]:25520] - No seed nodes found in configuration, relying on Cluster Bootstrap for joining
[2022-10-17T23:10:39.666Z] [WARN] Failed to attach the instrumentation because the Kamon Bundle is not present on the classpath
[2022-10-17T23:10:40.076Z] [INFO] Started the Kamon StatsD reporter
[2022-10-17T23:10:41.333Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.triggers.fires.perMinute
[2022-10-17T23:10:41.334Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.sequence.maxLength
[2022-10-17T23:10:41.334Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.invokes.concurrent
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for whisk.api.host.name
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for limits.actions.invokes.perMinute
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for whisk.api.host.proto
[2022-10-17T23:10:41.336Z] [INFO] [#tid_sid_unknown] [Config] environment set value for whisk.api.host.port
[2022-10-17T23:10:41.337Z] [INFO] [#tid_sid_unknown] [Config] environment set value for runtimes.manifest
[2022-10-17T23:10:41.337Z] [INFO] [#tid_sid_unknown] [Config] environment set value for port
[2022-10-17T23:10:41.759Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic completed0 created
[2022-10-17T23:10:41.759Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic health created
[2022-10-17T23:10:41.759Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic cacheInvalidation created
[2022-10-17T23:10:41.760Z] [INFO] [#tid_sid_unknown] [LeanMessagingProvider] topic events created
[2022-10-17T23:10:42.460Z] [INFO] [#tid_sid_controller] [Controller] starting controller instance 0 [marker:controller_startup0_counter:1190]
[2022-10-17T23:10:44.458Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
[2022-10-17T23:10:44.656Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
[2022-10-17T23:10:45.060Z] [INFO] [#tid_sid_unknown] [InvokerReactive] LogStoreProvider: class org.apache.openwhisk.core.containerpool.logging.DockerToActivationLogStore
[2022-10-17T23:10:45.933Z] [INFO] [#tid_sid_unknown] [DockerClientWithFileAccess] Detected docker client version 18.06.3-ce
[2022-10-17T23:10:46.124Z] [INFO] [#tid_sid_invoker] [DockerClientWithFileAccess] running /usr/bin/docker ps --quiet --no-trunc --all --filter name=wsk0_ (timeout: 1 minute) [marker:invoker_docker.ps_start:4858]
[2022-10-17T23:10:46.406Z] [INFO] [#tid_sid_invoker] [DockerClientWithFileAccess] [marker:invoker_docker.ps_finish:5140:145]
[2022-10-17T23:10:46.412Z] [INFO] [#tid_sid_invoker] [DockerContainerFactory] removing 0 action containers.
[2022-10-17T23:10:48.650Z] [INFO] [#tid_sid_invoker] [CouchDbRestStore] [QUERY] 'whisk_local_subjects' searching 'namespaceThrottlings/blockedNamespaces [marker:database_queryView_start:7384]
[2022-10-17T23:10:50.224Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 2000, pipeline fill at = 2000, pipeline depth = 4000
[2022-10-17T23:10:50.325Z] [INFO] [#tid_sid_invoker] [CouchDbRestStore] [marker:database_queryView_finish:9058:1672]
[2022-10-17T23:10:50.327Z] [INFO] [#tid_sid_unknown] [InvokerReactive] updated blacklist to 0 entries
[2022-10-17T23:10:50.703Z] [INFO] [#tid_sid_invokerWarmup] [ContainerPool] found 0 started and 0 starting; initing 2 pre-warms to desired count: 2 for kind:nodejs:14 mem:256 MB
[2022-10-17T23:10:50.727Z] [INFO] [#tid_sid_controller] [Controller] loadbalancer initialized: LeanBalancer
[2022-10-17T23:10:51.084Z] [INFO] [#tid_sid_controller] [KindRestrictor] all kinds are allowed, the white-list is not specified
[2022-10-17T23:10:51.624Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 256 --memory 256m --memory-swap 256m --network bridge -e __OW_API_HOST=https://172.17.0.1 -e __OW_ALLOW_CONCURRENT=True --name wsk0_1_prewarm_nodejs14 --cap-drop NET_RAW --cap-drop NET_ADMIN --ulimit nofile=1024:1024 --pids-limit 1024 --log-driver json-file openwhisk/action-nodejs-v14:nightly (timeout: 1 minute) [marker:invoker_docker.run_start:10358]
[2022-10-17T23:10:51.630Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] running /usr/bin/docker run -d --cpu-shares 256 --memory 256m --memory-swap 256m --network bridge -e __OW_API_HOST=https://172.17.0.1 -e __OW_ALLOW_CONCURRENT=True --name wsk0_2_prewarm_nodejs14 --cap-drop NET_RAW --cap-drop NET_ADMIN --ulimit nofile=1024:1024 --pids-limit 1024 --log-driver json-file openwhisk/action-nodejs-v14:nightly (timeout: 1 minute) [marker:invoker_docker.run_start:10362]
[2022-10-17T23:10:53.314Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] [marker:invoker_docker.run_finish:12047:1680]
[2022-10-17T23:10:53.354Z] [INFO] [#tid_sid_invokerWarmup] [DockerClientWithFileAccess] [marker:invoker_docker.run_finish:12088:1728]
[2022-10-17T23:10:56.377Z] [INFO] [#tid_sid_controller] [ActionsApi] actionSequenceLimit '50'
[2022-10-17T23:10:57.534Z] [WARN] Binding with a connection source not supported with HTTP/2. Falling back to HTTP/1.1.
Thank you in advance.
You need to tell us the step at which you faced the error. What you provided is just an error log, and we cannot figure out the step from it. Also, you said you provided the controller logs, but that seems to be an invoker log?
Thank you for the comments. I am not sure which steps you are asking about, but here are the steps I am following for the setup:
I am following the tutorial from here https://github.com/apache/openwhisk/blob/master/ansible/README.md
(1) apt install git
Next, clone the repo to the local directory:
(2) git clone https://github.com/apache/openwhisk.git openwhisk
(3) cd openwhisk && cd tools/ubuntu-setup && ./all.sh
(4) Next, configure a persistent storage database for OpenWhisk, with CouchDB:
export OW_DB=CouchDB
export OW_DB_USERNAME=root
export OW_DB_PASSWORD=root123
export OW_DB_PROTOCOL=http
export OW_DB_HOST=172.17.0.1
export OW_DB_PORT=5984
(5) In the openwhisk/ansible directory:
ansible-playbook -i environments/local/ setup.yml
Next, use CouchDB to deploy OpenWhisk and make sure that db_local.ini is available locally.
(6) Execute the build command in the openwhisk/ directory:
./gradlew distDocker
(7) Next, enter the openwhisk/ansible directory:
ansible-playbook -i environments/local/ couchdb.yml
ansible-playbook -i environments/local/ initdb.yml
ansible-playbook -i environments/local/ wipe.yml
ansible-playbook -i environments/local/ apigateway.yml
(8) ansible-playbook -i environments/local/ openwhisk.yml
ansible-playbook -i environments/local/ postdeploy.yml
Steps 1 to 7 (installation and build) completed without any trouble. But at step 8 I faced the error, as explained in: https://github.com/apache/openwhisk/issues/5331#issuecomment-1281621529
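Since the maintainers asked which step failed, one option is to run the playbooks from steps 7 and 8 through a small wrapper that stops at the first failure and names it. This is only a sketch: `run_playbooks` and the `ANSIBLE_PLAYBOOK_CMD` override are hypothetical helpers, not part of the OpenWhisk tooling.

```shell
#!/usr/bin/env bash
# run_playbooks INVENTORY PLAYBOOK...: run each playbook in order and stop
# at the first one that fails, printing its name to stderr.
# ANSIBLE_PLAYBOOK_CMD can override the command (handy for dry runs).
run_playbooks() {
  local inventory="$1" pb
  shift
  for pb in "$@"; do
    echo ">>> $pb"
    if ! "${ANSIBLE_PLAYBOOK_CMD:-ansible-playbook}" -i "$inventory" "$pb"; then
      echo "FAILED at $pb" >&2
      return 1
    fi
  done
}

# Example (from the openwhisk/ansible directory):
# run_playbooks environments/local/ couchdb.yml initdb.yml wipe.yml \
#   apigateway.yml openwhisk.yml postdeploy.yml
```

The `FAILED at ...` line, together with the Ansible task name printed just before it, is the kind of detail that makes the report actionable.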
Hey. Since this is still open: I am having the same issue with setting up the controller when I run openwhisk.yml with ansible-playbook. The controller log says that the keystore password is wrong. I assume this password is generated by setup.yml and then copied later when I run openwhisk.yml or controller.yml. Any suggestions to solve this?
The step that failed with openwhisk.yml:
TASK [controller : add seed nodes to controller environment] *******************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.114) 0:00:28.125 *******
ok: [controller0] => (item=[0, '172.17.0.1'])
TASK [controller : Add akka environment to controller environment] *************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.169) 0:00:28.295 *******
ok: [controller0]
TASK [controller : lean controller setup] **************************************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.242) 0:00:28.537 *******
skipping: [controller0]
TASK [controller : (re)start controller] ***************************************************************************************************
Monday 27 February 2023 12:29:27 -0600 (0:00:00.038) 0:00:28.576 *******
changed: [controller0]
TASK [controller : wait until the Controller in this host is up and running] ***************************************************************
Monday 27 February 2023 12:29:28 -0600 (0:00:01.088) 0:00:29.664 *******
FAILED - RETRYING: wait until the Controller in this host is up and running (12 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (11 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (10 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (9 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (8 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (7 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (6 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (5 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (4 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (3 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (2 retries left).
FAILED - RETRYING: wait until the Controller in this host is up and running (1 retries left).
fatal: [controller0]: FAILED! => {"attempts": 12, "changed": false, "elapsed": 0, "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "https://172.17.0.1:10001/ping"}
Status code was -1 and not [200]: Request failed: <urlopen error [Errno
111] Connection refused>
PLAY RECAP *********************************************************************************************************************************
controller0 : ok=25 changed=7 unreachable=0 failed=1 skipped=14 rescued=0 ignored=0
etcd0 : ok=0 changed=0 unreachable=0 failed=0 skipped=7 rescued=0 ignored=0
kafka0 : ok=10 changed=4 unreachable=0 failed=0 skipped=7 rescued=0 ignored=0
Monday 27 February 2023 12:31:33 -0600 (0:02:04.909) 0:02:34.574 *******
===============================================================================
controller : wait until the Controller in this host is up and running ------------------------------------------------------------- 124.91s
kafka : wait until the kafka server started up -------------------------------------------------------------------------------------- 7.49s
zookeeper : (re)start zookeeper ----------------------------------------------------------------------------------------------------- 2.07s
kafka : (re)start kafka using 'wurstmeister/kafka:2.13-2.7.0' ---------------------------------------------------------------------- 1.90s
controller : copy certificates ------------------------------------------------------------------------------------------------------ 1.68s
zookeeper : wait until the Zookeeper in this host is up and running ----------------------------------------------------------------- 1.42s
controller : populate environment variables for controller -------------------------------------------------------------------------- 1.36s
Gathering Facts --------------------------------------------------------------------------------------------------------------------- 1.23s
controller : (re)start controller --------------------------------------------------------------------------------------------------- 1.09s
Gathering Facts --------------------------------------------------------------------------------------------------------------------- 0.92s
controller : check if whisk_local_activations with CouchDB exists ------------------------------------------------------------------- 0.87s
controller : copy nginx certificate keystore ---------------------------------------------------------------------------------------- 0.86s
controller : check if whisk_local_whisks with CouchDB exists ------------------------------------------------------------------------ 0.85s
Gathering Facts --------------------------------------------------------------------------------------------------------------------- 0.83s
controller : copy jmxremote password file ------------------------------------------------------------------------------------------- 0.80s
controller : check if whisk_local_subjects with CouchDB exists ---------------------------------------------------------------------- 0.79s
controller : copy jmxremote access file --------------------------------------------------------------------------------------------- 0.50s
controller : ensure controller config directory is created with permissions --------------------------------------------------------- 0.41s
kafka : create kafka certificate directory ------------------------------------------------------------------------------------------ 0.40s
controller : check, that required databases exist ----------------------------------------------------------------------------------- 0.31s
Controller logs
[2023-02-27T18:29:37.129Z] [INFO] [#tid_sid_controller] [Controller] loadbalancer initialized: ShardingContainerPoolBalancer
[2023-02-27T18:29:37.135Z] [INFO] [#tid_sid_dispatcher] [MessageFeed] handler capacity = 128, pipeline fill at = 128, pipeline depth = 256
[2023-02-27T18:29:37.282Z] [INFO] [#tid_sid_controller] [KindRestrictor] all kinds are allowed, the white-list is not specified
[2023-02-27T18:29:38.217Z] [INFO] [#tid_sid_controller] [ActionsApi] actionSequenceLimit '50'
Exception in thread "main" java.io.IOException: keystore password was incorrect
at java.base/sun.security.pkcs12.PKCS12KeyStore.engineLoad(PKCS12KeyStore.java:2117)
at java.base/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:222)
at java.base/java.security.KeyStore.load(KeyStore.java:1479)
at org.apache.openwhisk.common.Https$.applyHttpsConfig(Https.scala:58)
at org.apache.openwhisk.common.Https$.connectionContextServer(Https.scala:92)
at org.apache.openwhisk.http.BasicHttpService$.$anonfun$startHttpService$1(BasicHttpService.scala:174)
at org.apache.openwhisk.http.BasicHttpService$$$Lambda$2199/00000000D6E0BEB0.apply(Unknown Source)
at scala.Option.map(Option.scala:230)
at org.apache.openwhisk.http.BasicHttpService$.startHttpService(BasicHttpService.scala:174)
at org.apache.openwhisk.core.controller.Controller$.start(Controller.scala:285)
at org.apache.openwhisk.core.controller.Controller$.main(Controller.scala:233)
at org.apache.openwhisk.core.controller.Controller.main(Controller.scala)
Caused by: java.security.UnrecoverableKeyException: failed to decrypt safe contents entry: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
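The trace shows the controller failing while it loads a PKCS12 keystore, so one way to confirm a password mismatch outside the container is to try opening the keystore with `keytool`, which ships with the JDK. This is a sketch only: `keystore_opens` is a made-up helper, and the keystore path and password in the example are placeholders, because the thread does not say where the deployed keystore or its password live.

```shell
#!/usr/bin/env bash
# keystore_opens FILE PASSWORD: succeed only if the PKCS12 keystore FILE
# can be listed with PASSWORD (keytool exits non-zero on a wrong password).
keystore_opens() {
  keytool -list -storetype PKCS12 -keystore "$1" -storepass "$2" >/dev/null 2>&1
}

# Example with placeholder values:
# if keystore_opens /path/to/keystore.p12 'the-configured-password'; then
#   echo "password matches"
# else
#   echo "password does not match this keystore"
# fi
```

If the check fails with the password the deployment is configured to use, the keystore file and the configuration were probably generated at different times.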
Hello @peimanfth
I am facing the exact same issue you've described above. Did you find a fix to this?
I am hitting a similar issue. Did you find a fix? @vishalvrv9 @amitbatajoo @peimanfth
I believe it was an issue with pre-existing OpenWhisk credentials. I cleaned the OpenWhisk installation and deleted the directory, outside the OpenWhisk directory, that holds the OpenWhisk credentials. I also cleaned the CouchDB instance and re-installed everything. This solved the issue for me. @vishalvrv9 @Dakzh10
Hi, I am facing the exact same issue. Could you please be more specific about the files you deleted? And by "cleaned the CouchDB instance", do you mean removing the CouchDB container? @peimanfth
If you are running it on Ubuntu, you can find the credentials under /var/tmp/wskconf, where there are credentials for each component. For instance, if you only have a single controller in your deployment, its credentials should be under controller/controller0. If you delete the whole directory, it will be regenerated each time you run setup.yml.
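A cautious variant of that cleanup is to move the directory aside instead of deleting it, so the old credentials can still be inspected. The sketch below assumes a standard `mv` and `date`; `backup_and_clear` and the `.bak.<timestamp>` suffix are illustrative choices.

```shell
#!/usr/bin/env bash
# backup_and_clear DIR: move DIR to DIR.bak.<timestamp> so it can be
# regenerated from scratch; prints the backup path on success.
backup_and_clear() {
  local dir="${1%/}" backup
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
  mv "$dir" "$backup" && echo "$backup"
}

# Example:
# backup_and_clear /var/tmp/wskconf
# ansible-playbook -i environments/local/ setup.yml
```

After setup.yml has regenerated the directory, openwhisk.yml needs to be re-run so that the components pick up the new files.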
On a further note, since you are reproducing rainbowCake, make sure OpenWhisk can access your CouchDB instance. The problem could be that you already have a local CouchDB on your machine as well as another instance deployed on your Docker engine. In that case, I would recommend removing your manually deployed CouchDB and letting the yml scripts install CouchDB using Docker images.
I did the following:
- Removed the directories inside /var/tmp/wskconf
- Removed all the Docker containers, images, networks and volumes using the following:
docker stop $(docker ps -aq) && docker rm $(docker ps -aq)
docker kill $(docker ps -q)
docker image rm $(docker image ls -aq)
docker volume rm $(docker volume ls -q)
docker network rm $(docker network ls -q)
docker system prune -a --volumes -f
- Then cleaned the OpenWhisk deployment as suggested here, using:
ansible-playbook -i environments/$ENVIRONMENT openwhisk.yml -e mode=clean
ansible-playbook -i environments/$ENVIRONMENT controller.yml -e mode=clean
But still no luck. I think the problem is that I have two Python versions installed, Python 3.10.12 and Python 3.9.0. I have configured this project to use Python 3.9.0, but some commands in the setup script still use Python 3.10.12.
I spun up a cloud VM running Ubuntu 20.04 LTS with Python 3.8.0 (not Python 3.10), and when I ran the scripts inside this VM, they ran to completion without any issue. So it seems the Python version is the main cause.
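For anyone who wants to check their own machine against this observation, here is a small sketch that prints the interpreter version and compares it with 3.10. The 3.10 threshold comes only from the report above (3.8.0 worked, 3.10.12 was suspected), not from any documented OpenWhisk requirement. The function names are made up, and `sort -V` assumes GNU coreutils, as shipped with Ubuntu.

```shell
#!/usr/bin/env bash
# version_lt A B: succeed when dotted version A is strictly lower than B.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# current_python_version: print python3's version as MAJOR.MINOR.PATCH.
current_python_version() {
  python3 -c 'import sys; print("%d.%d.%d" % sys.version_info[:3])'
}

# Example:
# v="$(current_python_version)"
# if version_lt "$v" 3.10.0; then
#   echo "python3 is $v (older than 3.10)"
# else
#   echo "python3 is $v (3.10 or newer; this thread reports trouble there)"
# fi
```

Running `ansible --version` is also worth doing, since its output includes the Python version Ansible itself uses, which may differ from the `python3` on your PATH.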