flink-sql-gateway

How to specify a different user to execute SQL in YARN per-job mode

Open wooplevip opened this issue 5 years ago • 18 comments

Hi all,

We can set HADOOP_USER_NAME for each job when submitting it in YARN per-job mode via the flink command. But how can we specify a different user for each job via the REST API in YARN per-job mode? Thanks.

wooplevip avatar Apr 24 '20 09:04 wooplevip

Which flink command supports "HADOOP_USER_NAME"?

Do you mean --yarnname?

godfreyhe avatar Apr 24 '20 10:04 godfreyhe

Hi @godfreyhe ,

Maybe "set HADOOP_USER_NAME" was not the right way to put it. For example, we add HADOOP_USER_NAME to optsArray, set cmd to ${FLINK_HOME}/bin/flink run ..., and then invoke Process.apply(cmd, None, optsArray: _*).!.

wooplevip avatar Apr 24 '20 11:04 wooplevip

Can you give the complete flink run command?

godfreyhe avatar Apr 24 '20 11:04 godfreyhe

The command looks like this:

flink-1.10.0/bin/flink run --jobmanager yarn-cluster --detached --class com.xxx.StreamApp --parallelism 1 --yarnname test --yarnjobManagerMemory 1024 --yarnqueue default --yarnslots 1 --yarntaskManagerMemory 1024 -yD flink.master=yarn-cluster -yD group.id=test1586745136882 xxx.jar

Process.apply(cmd, None, optsArray: _*).! is equivalent to first running export HADOOP_USER_NAME=user1 and then running the command above.
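For anyone doing the same from Java rather than Scala, here is a minimal sketch of the idea: the variable is set only in the environment of the child process, so each submission can use a different user. The class and method names (SubmitAsUserSketch, buildSubmit) are made up for illustration, and the flink arguments in main are placeholders, not a complete command.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubmitAsUserSketch {
    /**
     * Prepares a child process for the given command with HADOOP_USER_NAME
     * set in that child's environment only. The parent JVM's environment
     * is left untouched.
     */
    static ProcessBuilder buildSubmit(String hadoopUser, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(new ArrayList<>(command));
        pb.environment().put("HADOOP_USER_NAME", hadoopUser);
        pb.inheritIO(); // forward the child's stdout/stderr to this process
        return pb;
    }

    public static void main(String[] args) {
        // Placeholder arguments; substitute the real flink run command.
        List<String> cmd = Arrays.asList("flink-1.10.0/bin/flink", "run", "xxx.jar");
        ProcessBuilder pb = buildSubmit("user1", cmd);
        // Only show what would be launched; call pb.start().waitFor() to actually submit.
        System.out.println(pb.command() + " as " + pb.environment().get("HADOOP_USER_NAME"));
    }
}
```

Because the variable lives in the child's environment, two submissions for user1 and user2 do not interfere with each other, which is the property that is hard to get inside a single long-running gateway process.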

wooplevip avatar Apr 24 '20 12:04 wooplevip

So user1 does not come from the flink run command. You run export HADOOP_USER_NAME=user1 just before flink run?

godfreyhe avatar Apr 26 '20 03:04 godfreyhe

Yes. As I understand it, we have to set HADOOP_USER_NAME in the system environment for each user. So I did some further testing: following this blog, I forcibly set HADOOP_USER_NAME in the system environment before submitting a SELECT SQL job. It works as I expected, but it is probably not a common way to solve this. Is there a better suggestion? Thanks.

wooplevip avatar Apr 26 '20 04:04 wooplevip

Have you tried adding "HADOOP_USER_NAME" to the env.java.opts.taskmanager option in flink-conf.yaml?

godfreyhe avatar Apr 26 '20 07:04 godfreyhe

It does not work in YARN per-job mode. From my testing, HADOOP_USER_NAME must be in the system environment of the gateway process.

wooplevip avatar Apr 26 '20 08:04 wooplevip

If I only add System.setProperty("HADOOP_USER_NAME", "user1"); before final ProgramDeployer deployer = new ProgramDeployer(configuration, jobName, pipeline);, it also works. I will do more testing to double check.
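A rough sketch of why the system property can stand in for the environment variable: as far as I understand Hadoop's simple (non-Kerberos) login, it checks the HADOOP_USER_NAME environment variable first and falls back to the JVM system property of the same name. The code below only models that lookup order with plain JDK types; the class and method names are hypothetical and this is not Hadoop's actual code.

```java
import java.util.Map;
import java.util.Properties;

class HadoopUserLookupSketch {
    static final String KEY = "HADOOP_USER_NAME";

    /**
     * Models the assumed lookup order: environment variable first,
     * then JVM system property. Returns null when neither is set.
     */
    static String resolve(Map<String, String> env, Properties props) {
        String user = env.get(KEY);
        if (user == null) {
            user = props.getProperty(KEY);
        }
        return user;
    }

    public static void main(String[] args) {
        System.setProperty(KEY, "user1");
        // Prints the environment value if one is set, otherwise the property value.
        System.out.println(resolve(System.getenv(), System.getProperties()));
    }
}
```

If that ordering holds, the property only takes effect when the gateway process was started without HADOOP_USER_NAME in its environment, and a JVM-wide property is shared by every concurrent request.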

wooplevip avatar Apr 26 '20 08:04 wooplevip

Why not copy what Livy does for Spark and add a doAs() service to the REST APIs?

fpompermaier avatar Apr 26 '20 08:04 fpompermaier

@fpompermaier thanks, good idea, I will look into Livy's implementation.

@godfreyhe as a workaround, I can invoke UserGroupInformation.setLoginUser(null); and System.setProperty("HADOOP_USER_NAME", "user1"); before executing the job operation.

Could you think about this requirement? Thanks very much.

wooplevip avatar Apr 26 '20 09:04 wooplevip

@wooplevip, maybe @fpompermaier's suggestion is a good approach.

godfreyhe avatar Apr 26 '20 10:04 godfreyhe

Thanks. Do you have any plan to support multiple users sharing the same gateway server?

wooplevip avatar Apr 26 '20 10:04 wooplevip

you can take this ticket if you have time.

godfreyhe avatar Apr 26 '20 11:04 godfreyhe

OK, I will have a try.

wooplevip avatar Apr 26 '20 23:04 wooplevip

@wooplevip could you briefly describe your design before submitting a PR?

godfreyhe avatar Apr 27 '20 01:04 godfreyhe

Hi @godfreyhe ,

Add proxyUser and doAs user for REST API

image.png

wooplevip avatar Apr 28 '20 01:04 wooplevip

Hi @godfreyhe ,

Can we add a Map<String, String> executionConf field to StatementExecuteRequestBody to carry user-specified properties, for example proxyUser, queue, owner, and so on?
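To make the proposal concrete, here is a rough sketch of what such a request body could look like. It is a standalone class, not the gateway's real StatementExecuteRequestBody: the class name, the statement field, and the proxyUserOr helper are placeholders for illustration, and only the executionConf map is the actual proposal.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical request body carrying free-form per-statement properties. */
final class StatementExecuteRequestBodySketch {
    private final String statement;
    private final Map<String, String> executionConf;

    StatementExecuteRequestBodySketch(String statement, Map<String, String> executionConf) {
        this.statement = statement;
        // Defensive copy so later changes by the caller do not leak in;
        // a missing map is treated as "no properties".
        this.executionConf = executionConf == null
                ? Collections.<String, String>emptyMap()
                : Collections.unmodifiableMap(new HashMap<>(executionConf));
    }

    String getStatement() {
        return statement;
    }

    Map<String, String> getExecutionConf() {
        return executionConf;
    }

    /** Returns the requested proxy user, or the fallback when none was given. */
    String proxyUserOr(String fallback) {
        return executionConf.getOrDefault("proxyUser", fallback);
    }
}
```

A plain string map keeps the REST schema stable when new keys such as queue or owner are added later; the cost is that key names and values have to be validated on the server side.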

wooplevip avatar Apr 29 '20 03:04 wooplevip