
maestro restart doesn't handle service dependencies correctly

Open petrkalina opened this issue 7 years ago • 11 comments

containers are restarted in the order in which they are stopped

the correct behaviour is to delay starting the dependent services until their prerequisite services have been restarted

the easiest way to achieve this is to decompose the execution internally, performing maestro stop and maestro start sequentially

petrkalina avatar Dec 13 '16 14:12 petrkalina

Might be a silly question but are you using the -d flag?

Can you give more details about your services, containers and their dependencies, and the order you're seeing when doing maestro restart?

Thanks

mpetazzoni avatar Dec 13 '16 17:12 mpetazzoni

Hi Maxime,

IMO, considering that service A depends on service B, the sequence should be: stop A, stop B, start B, start A

whereas it seems to be: stop A, start A, stop B, start B

my concrete example is this:

this is the skeleton of the maestro yaml file I'm using.

elasticsearch:

kibana:
    requires: [elasticsearch]

logstash:
    requires: [elasticsearch]

db-master:

ldap:

archive:
    requires: [logstash, db-master, ldap]
    instances:
        archive1b:
        archive2b:

the services seem to stop in the correct order (dependent services before their prerequisites); however, as illustrated by the picture below, they are restarted immediately after the stop completes. it may be that their prerequisites are considered running because they have not been stopped yet. the "-d" flag doesn't affect this.

this picture was taken while the maestro restart was in progress - as you can see, archive1a and archive1b are being started while the prerequisite ldap, db-master and elasticsearch services have not even been stopped yet. this means they are going to be restarted DURING the archives' start-up, which is not correct.
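The expected sequence for the skeleton above can be sketched in a few lines of Python. This is only an illustration of the ordering being asked for, not Maestro's code: the `REQUIRES` map and both function names are made up for this example, and it relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical map mirroring the 'requires' entries of the YAML skeleton above.
REQUIRES = {
    "elasticsearch": [],
    "kibana": ["elasticsearch"],
    "logstash": ["elasticsearch"],
    "db-master": [],
    "ldap": [],
    "archive": ["logstash", "db-master", "ldap"],
}


def start_order(requires):
    """Return services so that each one comes after everything it requires."""
    return list(TopologicalSorter(requires).static_order())


def restart_plan(requires):
    """Stop dependents first, then start prerequisites first."""
    order = start_order(requires)
    stops = [("stop", service) for service in reversed(order)]
    starts = [("start", service) for service in order]
    return stops + starts


plan = restart_plan(REQUIRES)
for action, service in plan:
    print(action, service)
```

With this plan, archive is stopped before ldap, db-master and logstash, and it is started again only after all three are back up.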

thanks! petr


petrkalina avatar Dec 14 '16 14:12 petrkalina

I could also reproduce the same issue faced by @petrkalina

Is restarting B first (stop B, start B), then restarting A (stop A, start A), an acceptable solution? Or should we stop everything based on dependencies, and then start all the stopped services based on dependencies (a 'Stop' play followed by a 'Start' play)?

zsuzhengdu avatar Apr 11 '17 02:04 zsuzhengdu

The current behavior is definitely not ideal, and most likely not what one would expect when restarting multiple services with some dependencies.

First, the important thing to understand is that the -d flag does not control in what order things are done, only what containers will be affected. By specifying -d, you explicitly ask Maestro to consider the dependencies (or the dependents) of the things you specified on the command line. In the case of stop and restart, specifying -d will make Maestro include all downstream dependencies. The logic behind this being that if B depends on A, maestro stop -d A should also consider B and stop it before A.
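To make the selection step concrete, here is a minimal sketch of "include everything that depends on what was named". It is an assumption-laden illustration rather than Maestro's implementation: `with_dependents` and the `REQUIRES` map are hypothetical names, and only the selection (not the ordering) is modelled.

```python
def with_dependents(selected, requires):
    """Expand `selected` with every service that transitively requires one of them."""
    result = set(selected)
    changed = True
    while changed:
        changed = False
        for service, prereqs in requires.items():
            # Pull in any service whose prerequisites overlap the current selection.
            if service not in result and result.intersection(prereqs):
                result.add(service)
                changed = True
    return result


# Hypothetical graph: B depends on A, C depends on B, D is independent.
REQUIRES = {"A": [], "B": ["A"], "C": ["B"], "D": []}

print(sorted(with_dependents({"A"}, REQUIRES)))
print(sorted(with_dependents({"D"}, REQUIRES)))
```

Selecting A pulls in B and C, while selecting D pulls in nothing else. The order in which the selected containers are then processed is a separate question.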

With regard to the order of a restart, the problem today is that Maestro does a per-container restart instead of an ascending stop + descending start of all identified containers. So if Maestro has identified that the orchestration play will involve containers A1, A2, B1 and B2, from services A and B (with B depending on A), today the order will be:

  1. stop B1, start B1
  2. stop B2, start B2
  3. stop A1, start A1
  4. stop A2, start A2

Instead of:

  1. stop B1
  2. stop B2
  3. stop A1
  4. stop A2
  5. start A1
  6. start A2
  7. start B1
  8. start B2

It gets a little tricky to keep the --only-if-changed functionality, but I'll try to work on a fix.
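The two orderings above can be written down as a small sketch. Nothing here is taken from Maestro's source: `PLAY` and both function names are hypothetical, and `--only-if-changed` is deliberately left out.

```python
# Hypothetical play: services listed dependents-first (B depends on A).
PLAY = [("B", ["B1", "B2"]), ("A", ["A1", "A2"])]


def per_container_restart(play):
    """Order described as today's behavior: stop and start each container in turn."""
    steps = []
    for _service, containers in play:
        for name in containers:
            steps += [("stop", name), ("start", name)]
    return steps


def stop_all_then_start_all(play):
    """Alternative order: stop dependents-first, then start prerequisites-first."""
    stops = [("stop", name) for _service, containers in play for name in containers]
    starts = [("start", name) for _service, containers in reversed(play) for name in containers]
    return stops + starts


for step in stop_all_then_start_all(PLAY):
    print(*step)
```

The second function reverses the service order for the start phase but keeps the container order within each service, which reproduces the eight-step list above.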

mpetazzoni avatar Apr 11 '17 16:04 mpetazzoni

So, giving more thought to this, the main issue with the proposal above is that it's not a graceful restart at all. At step 4, the whole stack is down. With the current behavior of restarting one container at a time, the stack is always operational.
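The availability trade-off can be checked with a short simulation. This is a sketch under simple assumptions: a container counts as "down" from its stop step until its start step, and the step lists and the `max_down` helper are written out by hand for this example.

```python
def max_down(steps):
    """Return the largest number of containers that are stopped at the same time."""
    down = set()
    worst = 0
    for action, name in steps:
        if action == "stop":
            down.add(name)
        else:
            down.discard(name)
        worst = max(worst, len(down))
    return worst


# One container at a time (current behavior).
one_at_a_time = [
    ("stop", "B1"), ("start", "B1"), ("stop", "B2"), ("start", "B2"),
    ("stop", "A1"), ("start", "A1"), ("stop", "A2"), ("start", "A2"),
]

# Stop everything, then start everything (the proposal).
stop_then_start = [
    ("stop", "B1"), ("stop", "B2"), ("stop", "A1"), ("stop", "A2"),
    ("start", "A1"), ("start", "A2"), ("start", "B1"), ("start", "B2"),
]

print(max_down(one_at_a_time), max_down(stop_then_start))
```

Under these assumptions at most one container is down at a time with the per-container order, while all four are down at once with the stop-then-start order.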

Thoughts?

mpetazzoni avatar Apr 11 '17 21:04 mpetazzoni

Restart all A service containers first, then the B service containers, since B depends on A.


zsuzhengdu avatar Apr 11 '17 23:04 zsuzhengdu

It's hard to say what is correct here. I feel like there could be different answers based on what the nature of the dependency is.

mpetazzoni avatar Apr 12 '17 00:04 mpetazzoni

Giving a clear and universally acceptable definition of dependency is the key to getting the restart play right.

My understanding, in my use case, is this: if service A depends on service B (in other words, service A requires service B), then service B is lower, or closer to the root, in the dependency tree. When restarting, service B should be restarted before its dependent, which is service A.

E.g. Kafka requires zookeeper. When restarting both with '-d', we'd better restart zookeeper before kafka. Either stop kafka, stop zookeeper (stop play with '-d'), start zookeeper, start kafka (start play with '-d'); or stop zookeeper, start zookeeper (restart zookeeper), then stop kafka, start kafka (restart kafka).

Or we could mimic docker-compose with links or depends_on.

zsuzhengdu avatar Apr 12 '17 01:04 zsuzhengdu

My scenario is:

wildfly, db, ldap, logstash

wildfly depends on db, logstash and ldap

the only correct way to start things up is for wildfly to start only after its prerequisites are running - otherwise, some things in the archive fail to deploy. on the other hand, once started, I may want to intentionally restart the db or ldap while keeping the archive running. Therefore, in my case, there is good reason to implement a switch to ignore or consider the dependencies

I often use maestro start/stop/restart service/instance. I have never minded that the dependencies are not considered there. I think this is correct.

I often use maestro start - and I'm happy that here the dependencies are considered and the services start in the right order - otherwise I could not use it.

However, currently I cannot use maestro restart, because the containers are started in the wrong order if I do so. For my use case it would be best if the default behavior for restart was

  • only containers that are specified are considered, unless -d is specified. i.e. maestro restart ldap archive would not affect logstash and db, and maestro restart ldap would not affect the archive.
  • no containers are started before their prerequisites have already been restarted
  • no containers are restarted before the services depending on them have been stopped

full maestro restart then boils down to:

  1. stop the archive
  2. after it has stopped, in arbitrary order, restart db, ldap and logstash
  3. start the archive
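The two ordering rules above can be turned into a small checker for a proposed plan. This is a hedged sketch for discussion, not part of Maestro: `violations`, `REQUIRES` and the two sample plans are hypothetical, and service names follow the scenario described here.

```python
# Hypothetical dependency map for the scenario above.
REQUIRES = {"archive": ["db", "ldap", "logstash"], "db": [], "ldap": [], "logstash": []}


def violations(plan, requires):
    """Check a list of (action, service) steps against the two ordering rules."""
    problems = []
    in_plan = {service for _action, service in plan}
    down, restarted = set(), set()
    for action, service in plan:
        if action == "stop":
            # Rule: a service may only go down once its dependents are down.
            for dependent, prereqs in requires.items():
                if service in prereqs and dependent in in_plan and dependent not in down:
                    problems.append(f"{service} stopped while {dependent} is running")
            down.add(service)
        elif action == "start":
            # Rule: a service may only start after its prerequisites were restarted.
            for prereq in requires.get(service, []):
                if prereq in in_plan and prereq not in restarted:
                    problems.append(f"{service} started before {prereq} was restarted")
            down.discard(service)
            restarted.add(service)
    return problems


proposed = [("stop", "archive"),
            ("stop", "db"), ("start", "db"),
            ("stop", "ldap"), ("start", "ldap"),
            ("stop", "logstash"), ("start", "logstash"),
            ("start", "archive")]

observed = [("stop", "archive"), ("start", "archive"),
            ("stop", "db"), ("start", "db"),
            ("stop", "ldap"), ("start", "ldap"),
            ("stop", "logstash"), ("start", "logstash")]

print(violations(proposed, REQUIRES))
print(violations(observed, REQUIRES))
```

The three-step plan passes both rules, while the per-service order reported in this issue breaks them. Prerequisites that are not part of the plan are ignored, which matches the first bullet: only the specified containers are considered.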


petrkalina avatar Apr 12 '17 04:04 petrkalina

in my previous mail, I used archive and wildfly inconsistently - they are the same thing, sorry!


petrkalina avatar Apr 12 '17 04:04 petrkalina

Another potential fix demo @ https://www.youtube.com/watch?v=c0DzG4m5vSY

Does this meet your requirement, @petrkalina?

Stop wildfly, stop ldap, stop ldap-db, start ldap-db, start ldap and start wildfly.

@mpetazzoni May I have your opinion? Shall we restart services with dependencies by adopting the PR or the captured demo, or provide an option so the user can choose based on the use case?

I noticed that with the links and depends_on options in docker-compose, services restart in random order, without following the service 'dependencies'.

zsuzhengdu avatar Apr 19 '17 16:04 zsuzhengdu