PHOENIX-7393 Update transitive dependency of woodstox-core to 5.4.0
Update transitive dependency of woodstox-core to 5.4.0
This dependency is a transitive dependency of hadoop, which is using a 5.3.0 version which is currently flagged with 5 CVEs.
https://nvd.nist.gov/vuln/detail/CVE-2022-40152
This is a transitive dependency from hadoop, it is most likely not needed for phoenix. Notice that any product that is using phoenix-client-embedded to use Phoenix internally, is flagged with this CVEs
@lhofhansl @JamesRTaylor @palashc Can you please take a look? This is used in Trino phoenix connector. Then it makes entire Trino flagged with his CVE.
CC: @wendigo @kgniew
Thank you. I have created the jira ticket, see https://issues.apache.org/jira/browse/PHOENIX-7393
Has anyone looked at what Hadoop uses this for ?
Looking at https://github.com/apache/hadoop/pull/5087 this seems to have the potential to break Phoenix in rare cases.
While I'm not wild about managing versions for transitive dependencies, in this case I would prefer dependencymanaging the woodstock version instead.
Yeah this has been problematic dependency. @stoty do you know if any/all of our Jackson usages need woodstox?
Build results: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1957/
@kokosing btw did you also check dependency:tree output and saw woodstox no longer present from any other upstream?
We are certainly not referring the classes directly, and based on the test results it is not needed for any of our test cases either. I am not familiar with Jackson internals, and when exactly Woodstox is needed.
Reading the Hadoop ticket, it seems that Woodstock is only necessary if hbase-site.xml contains a DTD definition i.e. a header.
I think it would wiser not risk breaking Phoenix in case someone has decided to a DTD header to their config files.
btw did you also check dependency:tree output and saw woodstox no longer present from any other upstream
Yes. I have checked that. After this change there is no more woodstox in phoenix build.
Reading the Hadoop ticket, it seems that Woodstock is only necessary if hbase-site.xml contains a DTD definition i.e. a header.
Can we have an unit test to expose that behavior? That way it would be easier to see what it can be done to make sure such use case is still supported.
If we keep Woodstock and dependency manage it to a non-vulnerable version then we can be sure that we won't break anything, and the scanners will be happy. We already do this for a few components.
Sounds good @stoty, it does not break any compatibility. @kokosing since you have already made nice progress, could you please also include the dependency on pom directly? or you want someone else to do that? The tests are not impacted so we are good anyway but would be great to include CVE fixed version directly in Phoenix pom.
Yes. I am happy to do it. I will update this PR within few days.
Before we have following woodstox-core usage:
kokosing@m2:~/src/apache/phoenix$ mvn dependency:tree -o | grep woodstox | cut -d + -f 2 | sort -u
- com.fasterxml.woodstox:woodstox-core:jar:5.3.0:compile
- com.fasterxml.woodstox:woodstox-core:jar:5.3.0:provided
- com.fasterxml.woodstox:woodstox-core:jar:5.3.0:test
- org.codehaus.woodstox:stax2-api:jar:4.2.1:compile
- org.codehaus.woodstox:stax2-api:jar:4.2.1:provided
- org.codehaus.woodstox:stax2-api:jar:4.2.1:test
5.3.0 is flagged with 5 CVEs, see https://mvnrepository.com/artifact/com.fasterxml.woodstox/woodstox-core/5.3.0, and 5.4.0 is without CVEs, see https://mvnrepository.com/artifact/com.fasterxml.woodstox/woodstox-core/5.4.0
After a change:
kokosing@m2:~/src/apache/phoenix$ mvn dependency:tree -o | grep woodstox | cut -d + -f 2 | sort -u
- com.fasterxml.woodstox:woodstox-core:jar:5.4.0:compile
- com.fasterxml.woodstox:woodstox-core:jar:5.4.0:provided
- com.fasterxml.woodstox:woodstox-core:jar:5.4.0:test
- org.codehaus.woodstox:stax2-api:jar:4.2.1:compile
- org.codehaus.woodstox:stax2-api:jar:4.2.1:provided
- org.codehaus.woodstox:stax2-api:jar:4.2.1:test
Yes. I am happy to do it. I will update this PR within few days.
Actually, it took me shorter :)
The challenge with the current approach is that someone would need to update this whenever the hadoop is updated. I hope it is not going to downgrade the transitive dependency in future.
+1, thank you @kokosing!
@stoty anything from your side?
The challenge with the current approach is that someone would need to update this whenever the hadoop is updated. I hope it is not going to downgrade the transitive dependency in future.
You are right, this is an anti-pattern. However, as long as we ship uberjars, we have limited options.
@stoty anything from your side?
Please update the jira descirption to match the commit message.
At some point we should group the transitive-only version overrides into a single section, but that's outside the scope of this ticket.
Please update the jira descirption to match the commit message.
Done
Fixed
Is there anything else I am required to do to push it forward? Are tests failure related?
No, our tests are flakey.
Thank you.
Are there any plans about the release?
Yes, 5.2.1 release planning related thread is just going on over dev@phoenix mailing list. I just mentioned that we should now move forward for 5.2.1 release after merging the PR.
@kokosing any other CVE that is coming from Hadoop and you think we should resolve it now? It would be great to get them in too if Trino is getting impacted.
Let me check that. Will get back on that.
@kgniew did a scan and returned with the following information:
With regards to the phoenix connector there are several CVEs related to
jackson-databindpackage located here phoenix5/phoenix-client-embedded-hbase-2.5.0-5.2.0.jar current package version is 2.4.0 and it should be updated to 2.12.7.1snappy-javapackage located here phoenix5/phoenix-client- embedded-hbase-2.5.0- 5.2.0.jar current package version is 1.0.5 and it should be updated to 1.1.10.4protobuf-javapackage located here phoenix5/phoenix-client- embedded-hbase-2.5.0- 5.2.0.jar current package version is 2.5.0 and it should be updated to 3.16.3avropackage located here phoenix5/phoenix-client- embedded-hbase-2.5.0- 5.2.0.jar current package version is 1.7.7 and it should be updated to 1.11.3json-smartpackage located here phoenix5/phoenix-client- embedded-hbase-2.5.0- 5.2.0.jar current package version is 1.3.2 and it should be updated to 2.4.9jettisionpackage located here phoenix5/phoenix-client- embedded-hbase-2.5.0- 5.2.0.jar current package version is 1.1 and it should be updated to 1.5.4nettyis also problematic however grype does not provide a fixed version for critical CVE-2019-20444