zeppelin icon indicating copy to clipboard operation
zeppelin copied to clipboard

[ZEPPELIN-6095] add jdbc interpreter url validation check condition

Open s2moon98 opened this issue 1 year ago • 2 comments

What is this PR for?

Add some validation check conditions to existing url validator in jdbc interpreter. So now it can check URLs with the conditions below if it has an unallowable configuration.

  • UTF-8 encoded

What type of PR is it?

Improvement

Todos

  • [ ] - Task

What is the Jira issue?

ZEPPELIN-6095

How should this be tested?

Input the url with unallowable configurations in UTF-8 encoded in JDBC type interpreter. Then run the command in notebook and see if the command is blocked from running.

Screenshots (if appropriate)

Questions:

  • Does the license files need to update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

s2moon98 avatar Sep 19 '24 14:09 s2moon98

do we really need to remove white spaces here?

I have the same question too. Is there any potential risk if we have encoded while spaces in the JDBC URL?

In Apache Kyuubi cases, it allows the user to pass the spark.app.name as part of JDBC URL, so the encoded URL might be like

jdbc:kyuubi://host:port/db;#spark.app.name=Hello%20World;spark.driver.memory=4g

For example,

$  bin/beeline -u 'jdbc:kyuubi://localhost:10009/#spark.app.name=Hello%20World'
0: jdbc:hive2://localhost:10009/#spark.app.na> set spark.app.name;

24/09/20 00:30:25 INFO ExecuteStatement:
           Spark application name: Hello World
                 application ID: local-1726763393107
                 application web UI: http://192.168.8.150:53761
                 master: local[*]
                 deploy mode: client
                 version: 3.3.1
           Start time: 2024-09-20T00:29:52.154
           User: anonymous
...
2024-09-20 00:30:26.123 INFO org.apache.kyuubi.operation.ExecuteStatement: Processing anonymous's query[a8771241-5233-44fe-8c69-ad94a8f0217d]: RUNNING_STATE -> FINISHED_STATE, time taken: 0.128 seconds
+-----------------+--------------+
|       key       |    value     |
+-----------------+--------------+
| spark.app.name  | Hello World  |
+-----------------+--------------+
1 row selected (0.401 seconds)

pan3793 avatar Sep 19 '24 16:09 pan3793

@tbonelee @pan3793 Thank you for mentioning the white space issue.

I put the eliminating white spaces logic to block the unallowable configurations with white space included such as : allow%20LoadLocalInfile=true

The URL with eliminating white spaces is used only when validating, not when creating connection with the URL. So I think it does not cause risky situation like a URL containing white space is not connected.

However I confirmed that if user put url with white space contained, the configuration does not apply and JDBC throws an error. So I will remove the eliminating white spaces logic.

s2moon98 avatar Sep 22 '24 09:09 s2moon98

https://github.com/apache/zeppelin/pull/4882 (Java release to 11) is merged to master branch. Can @pan3793 review this pr, so I can merge this branch?

s2moon98 avatar Oct 27 '24 07:10 s2moon98

some thoughts:

  • PR title and desc are misleading, the key change of this PR is, validating the decoded URL instead of the encoded URL
  • if the decoded URL is an illegal UTF-8 sequence, the new code just skips validation, I'm not sure this is the correct behavior. from the security perspective, we should fail immediately.

pan3793 avatar Oct 28 '24 03:10 pan3793

  • changed the PR title to show the key change.
  • changed code to fail immediately if url decoding throws error. Previously, I thought if decoding step fails, validating logic has a fault. However since @pan3793 says above, fail immediately is more close behavior to secure, I changed it.
  • use StandardCharsets in URLDecoder.decode method's second input parameter as @Reamer says above

s2moon98 avatar Nov 05 '24 12:11 s2moon98

LGTM, I have no further concerns.

pan3793 avatar Nov 05 '24 12:11 pan3793

Thanks for @Reamer, I fix the URLDecoder.decode method to fit for java 11. Additionally, in java 11 URLDecoder, it never throws UnsupportedEncodingException so I removed the relevant try-catch statement.

s2moon98 avatar Nov 06 '24 01:11 s2moon98

@Reamer I rebased my pr after the master branch, and applied your reviews (fix indent & remove unused import). Please check for it!

s2moon98 avatar Nov 06 '24 11:11 s2moon98

Merged into master/branch-0.12

Reamer avatar Nov 07 '24 07:11 Reamer