incubator-uniffle

Execute start script with nohup

Open xianjingfeng opened this issue 2 years ago • 13 comments

What changes were proposed in this pull request?

Execute start script with nohup

Why are the changes needed?

The process does not exit when the start script is executed through Ansible, so we cannot start servers in batch.
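The actual diff is not shown in this thread. As a minimal sketch of the general technique named in the title, the function below launches a command under nohup, detaches its standard streams, and records its PID. The name start_detached and the log and PID file arguments are hypothetical, not taken from the Uniffle scripts.

```shell
#!/usr/bin/env bash
# Sketch only: start_detached and its arguments are illustrative names,
# not part of the Uniffle start scripts.

# start_detached LOG_FILE PID_FILE CMD [ARGS...]
start_detached() {
  local log_file="$1" pid_file="$2"
  shift 2
  # nohup makes the child ignore SIGHUP, so it keeps running after the
  # launching session ends. Redirecting stdin, stdout and stderr releases
  # the caller's pipes, so the caller can return immediately.
  nohup "$@" >"$log_file" 2>&1 </dev/null &
  # nohup execs the command, so $! is the PID of the command itself.
  echo $! >"$pid_file"
}
```

For example, `start_detached /tmp/demo.log /tmp/demo.pid sleep 30` returns at once and leaves `sleep` running. In a real start script the command would be the Java launch line.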

Does this PR introduce any user-facing change?

No

How was this patch tested?

Already verified in our production environment.

xianjingfeng avatar Jul 30 '22 09:07 xianjingfeng

Codecov Report

Merging #110 (698875e) into master (deb7fe4) will not change coverage. The diff coverage is n/a.

@@            Coverage Diff            @@
##             master     #110   +/-   ##
=========================================
  Coverage     56.52%   56.52%           
  Complexity     1183     1183           
=========================================
  Files           149      149           
  Lines          8019     8019           
  Branches        767      767           
=========================================
  Hits           4533     4533           
  Misses         3242     3242           
  Partials        244      244           


codecov-commenter avatar Jul 30 '22 09:07 codecov-commenter

@zuston I don't have an Ansible environment. Could you help me review and verify this patch?

jerqi avatar Jul 30 '22 09:07 jerqi

I use Ansible's shell module to start the shuffle server without this PR, and it doesn't block. The command is: ansible -i hosts_xxx -m shell -a "./start-shuffle-server.sh"

Could you provide some details about which Ansible module doesn't work? I will test it. @xianjingfeng

zuston avatar Jul 30 '22 12:07 zuston

> I use Ansible's shell module to start the shuffle server without this PR, and it doesn't block. The command is: ansible -i hosts_xxx -m shell -a "./start-shuffle-server.sh"
>
> Could you provide some details about which Ansible module doesn't work? I will test it. @xianjingfeng

I forget. I'll ask my colleague next week. I tried it myself just now, and I found that the JVM exits immediately without this PR.

xianjingfeng avatar Jul 30 '22 13:07 xianjingfeng

@zuston It blocked because we have built a wrapper layer around Ansible that checks whether the process started successfully. With native Ansible, the process does not start successfully without this PR.

xianjingfeng avatar Aug 01 '22 03:08 xianjingfeng

But for me, the process starts successfully with native Ansible.

zuston avatar Aug 02 '22 03:08 zuston

If this PR only solves an issue specific to your company's internal implementation, I would prefer not to merge it.

jerqi avatar Aug 02 '22 03:08 jerqi

@jerqi @zuston [Screenshot 202282134548]

xianjingfeng avatar Aug 02 '22 05:08 xianjingfeng

@zuston What's your suggestion?

jerqi avatar Aug 03 '22 03:08 jerqi

Hadoop, Spark, Hive, and Trino support running a service either in the foreground or as a daemon. Can we follow them?
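A minimal sketch of what a launcher with both modes could look like, assuming a hypothetical run_service function that takes the mode as its first argument. It is not how any of the projects above implement this, and none of these names come from the Uniffle scripts.

```shell
#!/usr/bin/env bash
# Sketch only: run_service and its mode names are hypothetical.

# run_service MODE LOG_FILE CMD [ARGS...]
run_service() {
  local mode="$1" log_file="$2"
  shift 2
  case "$mode" in
    foreground)
      # Stay attached to the caller's terminal and wait for the command;
      # useful for debugging or when a supervisor manages the process.
      "$@"
      ;;
    daemon)
      # Detach from the caller: ignore SIGHUP and release stdin/stdout/stderr
      # so that a remote executor can return right away.
      nohup "$@" >"$log_file" 2>&1 </dev/null &
      echo "started pid $!"
      ;;
    *)
      echo "usage: run_service foreground|daemon LOG_FILE CMD [ARGS...]" >&2
      return 2
      ;;
  esac
}
```

With this shape, `run_service foreground /dev/null java ...` keeps the process attached, while `run_service daemon /tmp/server.log java ...` returns immediately after printing the PID.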

pan3793 avatar Aug 03 '22 06:08 pan3793

I also use Ansible 2.9.27 but see no such problem. @xianjingfeng And I don't see any blocking in your screenshot.

> Hadoop, Spark, Hive, and Trino support running a service either in the foreground or as a daemon. Can we follow them?

Agree with you @pan3793

zuston avatar Aug 03 '22 07:08 zuston

> And I don't see any blocking in your screenshot.

It is not blocked. The JVM process does not start when it is launched by Ansible.

xianjingfeng avatar Aug 03 '22 08:08 xianjingfeng

Got your point, @xianjingfeng. But I think we should dig into the root cause rather than simply using nohup. I have no solution at the moment, though.

zuston avatar Aug 04 '22 02:08 zuston