
ceilometer-dbsync fails on first run of controller role

Open tangestani opened this issue 11 years ago • 10 comments

ceilometer-dbsync exits with a failure when applying the controller role for the first time on a clean system.

Notice: /Stage[main]/Ceilometer::Db/Exec[ceilometer-dbsync]/returns: 2014-01-30 12:12:31.182 26917 TRACE ceilometer ConnectionFailure: could not connect to 172.16.33.4:27017: [Errno 111] ECONNREFUSED
Notice: /Stage[main]/Ceilometer::Db/Exec[ceilometer-dbsync]/returns: 2014-01-30 12:12:31.182 26917 TRACE ceilometer
Error: /Stage[main]/Ceilometer::Db/Exec[ceilometer-dbsync]: Failed to call refresh: ceilometer-dbsync --config-file=/etc/ceilometer/ceilometer.conf returned 1 instead of one of [0]
Error: /Stage[main]/Ceilometer::Db/Exec[ceilometer-dbsync]: ceilometer-dbsync --config-file=/etc/ceilometer/ceilometer.conf returned 1 instead of one of [0]

The problem seems to be that Puppet executes ceilometer-dbsync immediately after starting the mongod service. That does not always work, because mongod takes some time to allocate a journal before it accepts incoming connections on port 27017. On my VM this takes about 15 seconds.

tangestani avatar Jan 31 '14 23:01 tangestani

Yes. While I would consider this to be a bug in the MongoDB startup scripts (my opinion is that they should not return until the database is initialized, precisely because of problems like this), it's something that needs to be reliably addressed. I'm thinking a script that tries n times with m seconds between each try.
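A minimal sketch of the "try n times with m seconds between each try" idea, written in plain Python rather than as part of any Puppet module. The function name `run_with_retries` and its parameters are hypothetical; the `runner` and `sleep` arguments exist only so the loop can be exercised without actually running a command.

```python
import subprocess
import time


def run_with_retries(cmd, tries=10, delay=3.0,
                     runner=subprocess.call, sleep=time.sleep):
    """Run cmd until it exits with status 0.

    Makes at most `tries` attempts and waits `delay` seconds between
    failed attempts. Returns the number of the attempt that succeeded,
    or None if every attempt failed.
    """
    for attempt in range(1, tries + 1):
        if runner(cmd) == 0:
            return attempt
        # Do not sleep after the final failed attempt.
        if attempt < tries:
            sleep(delay)
    return None


# Hypothetical usage with the command from the log above:
# run_with_retries(
#     ["ceilometer-dbsync", "--config-file=/etc/ceilometer/ceilometer.conf"],
#     tries=10, delay=3.0)
```

With mongod needing roughly 15 seconds on the reporter's VM, 10 tries at 3 seconds apart would leave some headroom, though the right values depend on the host.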

hogepodge avatar Jan 31 '14 23:01 hogepodge

Mine's actually running the ceilometer-dbsync before it installs mongo. So there is some dependency ordering issue here.

benh57 avatar Mar 12 '14 23:03 benh57

(which is odd considering the explicit arrows in the controller role, mongo is before ceilometer-api)

benh57 avatar Mar 12 '14 23:03 benh57

Oh, the role ordering will do almost nothing to ensure the dependency ordering. Contained classes will float away and become unordered. There are workarounds that I find offensive. I'll take a look at making stronger dependency ordering within the profile. It should be possible. Sorry for taking so long to close this.

hogepodge avatar Mar 13 '14 00:03 hogepodge

(I'm actually pulling out that ordering in future versions since "the goggles do nothing").

hogepodge avatar Mar 13 '14 00:03 hogepodge

Ordering is the main reason I can't use the stackforge modules for anything other than demo envs :-(

I'm hoping 'contains' will make the situation better in the future

beddari avatar Mar 13 '14 09:03 beddari

I'm not sure this is the right way to fix it, but I submitted a patch at https://review.openstack.org/#/c/81950/ that makes ceilometer-dbsync retry on a failed connection.

It's really mongodb's fault, but we can't really help that.

hunner avatar Mar 21 '14 00:03 hunner

A better way to solve this would be to make the mongodb::server::service class block on the service using a "validate connection" resource similar to the one in the puppetdb module.
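To illustrate what "blocking on the service" could look like, here is a rough Python sketch that waits until a TCP port accepts connections. It is not the puppetdb module's resource and not a Puppet type; the name `wait_for_port` and its parameters are assumptions made for this example. A successful connect only shows that the ECONNREFUSED condition from the log has cleared, not that MongoDB is otherwise ready.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Block until host:port accepts a TCP connection.

    Returns True once a connection succeeds, or False if `timeout`
    seconds pass without one. Failed attempts are `interval` seconds apart.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again immediately; only
            # reachability of the port is being checked.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Hypothetical usage with the address from the log above:
# wait_for_port("172.16.33.4", 27017, timeout=60.0)
```

Compared with retrying ceilometer-dbsync itself, a check like this keeps the wait next to the service that causes it, so anything else that depends on mongod benefits as well.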

hunner avatar Mar 21 '14 01:03 hunner

As a note, I'm currently using this solution: https://github.com/Katello/puppet-service_wait

beddari avatar Mar 21 '14 14:03 beddari

Workaround: http://openstack.redhat.com/Workarounds_2014_01#Failed_to_start_mongodb

ltartarini90 avatar Jul 04 '14 13:07 ltartarini90