flink
flink copied to clipboard
[FLINK-31223] sql-client.sh fails to start with ssl enabled
What is the purpose of the change
SqlClient now uses SqlGateway for job submission. It will connect to the gateway's RestServer through the RestClient to interact. when we create the RestClient, we load all the configuration options, but when we create the rest server, we only load the EndpointOptions. As a result, if we want to ensure the security of the connection through SSL, the client will enable SSL, but the server does not. The problem of parameter inconsistency will lead to potential problems in the future. We'd better avoid it directly.
Brief change log
- Pass all flink parameters to the
SqlGatewayRestEndpoint.
Verifying this change
This change added tests and can be verified by SqlClientSSLTest.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
@Public(Evolving): yes - The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
Documentation
- Does this pull request introduce a new feature? no
CI report:
- 0e59da21f04c1b4a63226fa2deb6cdca559df686 Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:@flinkbot run azurere-run the last Azure build
Any update on this? Still found this happening in 1.18.1
@reswqa We have hit this issue as well; I noticed there has been no activity on this for 14 months. It appears to be a regression from Flink 1.16 , running SQL Client with security now fails. We would like this to be merged urgently; I am happy to help if required.
@davidradl, this is stale as no reviewer focus on this part has enough time to review. Maybe I should try reach out someone to take a look recently.
@davidradl, this is stale as no reviewer focus on this part has enough time to review. Maybe I should try reach out someone to take a look recently.
That would be fabulous - thank so much for moving this forward.
That would be fabulous - thank so much for moving this forward.
@davidradl I saw your comment in jira ticket.
If I remember correctly, I should have tested this. But our code base has gone through too many version iterations, so I'm not sure if it still works correctly. It would be great if you could test it locally.
That would be fabulous - thank so much for moving this forward.
@davidradl I saw your comment in jira ticket.
If I remember correctly, I should have tested this. But our code base has gone through too many version iterations, so I'm not sure if it still works correctly. It would be great if you could test it locally.
Absolutely - could you resolve the conflicts and I will test.
@flinkbot run azure
That would be fabulous - thank so much for moving this forward.
@davidradl I saw your comment in jira ticket. If I remember correctly, I should have tested this. But our code base has gone through too many version iterations, so I'm not sure if it still works correctly. It would be great if you could test it locally.
Absolutely - could you resolve the conflicts and I will test.
@reswqa I have built the fix locally on the latest main and tested with a self signed certificate with security.ssl.enabled and separately with security.ssl.rest.enabled setting the keystores, truststores and passwords. The SQL client starts with no errors. So as far as I can see, the fix works for us. Can we go to merge?
@davidradl Thanks for testing this 👍 I think you should be able to review it, right? If so, I will merge it after get your +1 approval.
Thanks for the review! merging...
Thanks for the review! merging...
@reswqa thanks for doing this :-) We are looking for this to me back ported to 1.18 and 1.19. Are you ok to do this and I can review - or would you like me to back port and you review?
@davidradl Make sense to back port this as we should treat this as a bugfix because sql client previously supported SSL, which is a kind of regresssion.
If you want, just go ahead.
@davidradl Make sense to back port this as we should treat this as a bugfix because sql client previously supported SSL, which is a kind of regresssion.
If you want, just go ahead.
Ok will do
@davidradl Make sense to back port this as we should treat this as a bugfix because sql client previously supported SSL, which is a kind of regresssion. If you want, just go ahead.
Ok will do
@reswqa Hi I have had a quick look at the back port it is not straight forward. I forgot to ask for the commits to be squashed; the first 2 commits come in nicely with cherry pick but the 3rd with the 10 files does not. It makes changes to files that are not present at 118 for example flink-table/flink-sql-gateway-api/src/main/java/org/apache/flink/table/gateway/api/endpoint/SqlGatewayEndpointFactory.java. I think that more files need to be backported. Could you advise on what else is required for me to do the backport please, unless you want to take over. kind regards, David.
@davidradl IIRC, SqlGatewayEndpointFacory should be already present in release-1.18 🤔
see: https://github.com/apache/flink/blob/release-1.18/flink-table/flink-sql-gateway-api/src/main/java/org/apache/flink/table/gateway/api/endpoint/SqlGatewayEndpointFactory.java
@reswqa my bad - I was cherry picking the commit shas from you branch -when I needed to get them from master. Just building and testing then I will create the pr.
@reswqa I have raised pr for the backport to 1.18 (I squashed the commits). - I will do 1.19 now.
@reswqa I have raised https://github.com/apache/flink/pull/24742 for the backport to 1.19 (I squashed the commits)