incubator-streampark
[Proposal] Unify flink configuration
Search before asking
- [X] I had searched in the feature and found no similar feature requirement.
Description
Currently, the StreamPark project has no unified specification for parameter settings. This proposal defines one for all parameter settings. It covers the job's environment settings (stream env | table env), the user's parameters, and the user's Flink SQL content.
before:
flink:
  deployment:
    property:
      ${StreamExecutionEnvironment.key} : $value
  # table
  table:
    planner: blink # (blink|old|any)
    mode: streaming # (batch|streaming)
after:
env:
  option: # cli option args
    target: yarn-application # yarn-application, yarn-perjob
    shutdownOnAttachedExit:
    jobmanager:
    ...
  property:
    ${StreamExecutionEnvironment.key} : $value
    ...
    table:
      ${TableEnvironment.key} : $value
      ...
sql: # flinksql
  my_flinksql: |
    CREATE TABLE datagen (
      f_sequence INT,
      ts AS localtimestamp,
      WATERMARK FOR ts AS ts
    ) WITH (
      ....
    );
  ...
Usage Scenario
No response
Related issues
No response
Are you willing to submit a PR?
- [X] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Hi @wolfboys , thanks for your great proposal.
I have some questions:
- Why is the SQL content written in the config file? As I understand it, the proposal aims to unify the table config, right?
- For the table config, could we use env.table-property as the prefix? In other words, I don't think env.property.table is a good idea, because all configs under env.property will be passed to the Flink Env.
I use table.exec.mini-batch.enabled as an example:
env:
  option: # cli option args
    target: yarn-application # yarn-application, yarn-perjob
    shutdownOnAttachedExit:
    jobmanager:
    ...
  property:
    ${StreamExecutionEnvironment.key} : $value
    ...
  table-property:
    table.exec.mini-batch.enabled : true
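The env.table-property layout suggested above can be sketched as follows. This is only an illustration in Python, not StreamPark code: the `config` dict stands in for the parsed YAML, and the helper name `split_by_section` is hypothetical. The point is that each section is read verbatim, so no key name has to be inspected.

```python
# Hypothetical in-memory form of the YAML above (values are illustrative).
config = {
    "env": {
        "property": {"parallelism.default": 2},
        "table-property": {"table.exec.mini-batch.enabled": True},
    }
}


def split_by_section(cfg):
    """Return (env_conf, table_conf) by copying the two sibling sections as-is."""
    env = cfg.get("env", {})
    env_conf = dict(env.get("property", {}))
    table_conf = dict(env.get("table-property", {}))
    return env_conf, table_conf


env_conf, table_conf = split_by_section(config)
print(env_conf)
print(table_conf)
```

With this layout the section a key is written in decides where it goes, regardless of how the key itself is named.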
Hi @1996fanrui:
The table config is defined under env.property.table. Everything under env.property is passed to the Flink Env, except keys with the table prefix, e.g.:
env:
  property:
    ${key1} : ${value1}
    table:
      ${key2} : ${value2}
${key1} is a Flink Env config; ${key2} is a Flink table config, not a Flink Env config. All configurations with the env.property.table prefix are Flink table configs.
Using table.exec.mini-batch.enabled as an example:
env:
  option: # cli option args
    target: yarn-application # yarn-application, yarn-perjob
    shutdownOnAttachedExit:
    jobmanager:
  property:
    taskmanager.numberOfTaskSlots: 1
    parallelism.default: 2
    table:
      exec.mini-batch.enabled : true
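One way to read the env.property.table rule is sketched below in Python. It is an assumption about how the split could work, not the actual StreamPark implementation: the helpers `flatten` and `split_env_property` are hypothetical, and the sketch assumes the nested `table` block is flattened back into dotted keys that keep the `table.` prefix.

```python
def flatten(node, prefix=""):
    """Flatten a nested dict into dotted keys, e.g. {'a': {'b': 1}} -> {'a.b': 1}."""
    flat = {}
    for key, value in node.items():
        full = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, full))
        else:
            flat[full] = value
    return flat


def split_env_property(prop):
    """Route flattened keys: 'table.*' goes to the table config, the rest to the Flink Env."""
    flat = flatten(prop)
    table_conf = {k: v for k, v in flat.items() if k.startswith("table.")}
    env_conf = {k: v for k, v in flat.items() if not k.startswith("table.")}
    return env_conf, table_conf


# Hypothetical parsed form of the env.property section shown above.
prop = {
    "taskmanager.numberOfTaskSlots": 1,
    "parallelism.default": 2,
    "table": {"exec.mini-batch.enabled": True},
}
env_conf, table_conf = split_env_property(prop)
print(env_conf)
print(table_conf)
```

Under this reading the nested `exec.mini-batch.enabled` entry ends up as the Flink key `table.exec.mini-batch.enabled`, while the other two keys stay in the Flink Env config.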
Hi @wolfboys ,
If the prefix of a Flink table config isn't table, what can we do? StreamPark should not be affected by Flink's parameter naming.
For example, some table configs have the prefix sql-client:
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/config/#sql-client-display-max-column-width
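The concern can be made concrete with a small Python sketch. It assumes, purely for illustration, that routing is decided by a `table.` key prefix; the function name `route` is hypothetical and not part of StreamPark.

```python
def route(key):
    """Prefix-based routing: only keys that start with 'table.' count as table config."""
    return "table" if key.startswith("table.") else "env"


# Key name taken from the Flink 1.15 table config page linked above.
sql_client_key = "sql-client.display.max-column-width"

# Written directly under env.property, the key would be routed to the Flink Env.
print(route("table.exec.mini-batch.enabled"))
print(route(sql_client_key))

# Written under the nested table block, flattening would prepend 'table.',
# which no longer matches the key name that Flink documents.
nested_key = "table." + sql_client_key
print(nested_key)
```

Either way the key is misplaced or renamed, which is why a prefix-based rule ties the config layout to Flink's naming.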
BTW, could you share this information on the mailing list? Then more developers can join the discussion.