Serious Security Vulnerabilities in GATK

Open mohitmathew opened this issue 2 years ago • 17 comments

I am looking at using GATK and first checked the Docker image, pulled with docker pull broadinstitute/gatk.

This container image has 1460 vulnerabilities, and a lot of them are critical.

Screenshot 2023-02-21 212830

I then decided not to use this image and instead to build my own image that just deploys the released version 4.2.6.1 from here (https://github.com/broadinstitute/gatk/releases/download/4.2.6.1/gatk-4.2.6.1.zip).

Even this has many vulnerabilities, including ones stemming from log4j 1.2.17. These were fixed by the log4j team years ago, from version 2.17.1 onwards. I am really stunned that a popular library like GATK is not keeping up with basic security fixes.

Screenshot 2023-02-21 212751

The latest version of Docker Desktop has integrated image scanning and can very easily highlight the issues listed above.

Can we start addressing these issues sooner rather than later?

mohitmathew avatar Feb 22 '23 02:02 mohitmathew

@mohitmathew Thanks for the report! We are currently in the process of updating GATK to Java 17, which necessarily involves updating many of our dependencies. We are also updating our docker image to be based off of the latest Ubuntu LTS release. This should greatly reduce the number of critical vulnerabilities in our release image. After the Java 17 switchover we can revisit this and see what security issues remain.

droazen avatar Feb 23 '23 18:02 droazen

Java 17! Great news.

gokalpcelik avatar Feb 26 '23 09:02 gokalpcelik

I will say that a lot of the listed vulnerabilities are not actually problematic for us. Many of the scariest ones are only relevant in the context of reading untrusted data from the internet, which is not something that GATK typically does.

lbergelson avatar Feb 27 '23 20:02 lbergelson

@droazen Sorry for the late response. I agree that moving to Java 17 would help. I do see that GATK itself uses the newer version of log4j, but the transitive dependencies of the libraries it uses bring in the older version of log4j.

This creates a situation where the final compiled jar contains both versions of log4j, which could cause problems.
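A minimal sketch of how one might check a jar for this, assuming the classes have not been relocated (shaded) into another package. It relies on log4j 1.x classes living under org/apache/log4j/ and log4j 2.x classes under org/apache/logging/log4j/. The function name log4j_lines_in_jar is hypothetical, and a "1.x" hit is only a hint, since the log4j 2.x compatibility bridge (log4j-1.2-api) also ships classes in the org.apache.log4j package.

```python
import zipfile

# Package prefixes used by the two log4j release lines inside a jar.
LOG4J1_PREFIX = "org/apache/log4j/"
LOG4J2_PREFIX = "org/apache/logging/log4j/"


def log4j_lines_in_jar(jar_path):
    """Return the set of log4j release lines whose class packages appear in the jar.

    Hypothetical helper for illustration: a jar is a zip archive, so we only
    look at entry names and never load or execute any class.
    """
    found = set()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue
            if name.startswith(LOG4J1_PREFIX):
                found.add("1.x")
            elif name.startswith(LOG4J2_PREFIX):
                found.add("2.x")
    return found
```

A result containing both "1.x" and "2.x" would match the situation described above. Confirming which artifact actually contributes the classes still needs the build's dependency report.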

GATK, being a very useful tool, gets integrated into many other tools and pipelines, so in a way it affects the security posture of wherever it is used. The risk might be low for a standalone CLI tool, but it's a very hard conversation with information security :) .

May I ask for a ballpark ETA for the new version? I appreciate the work that's gone into this tool.

mohitmathew avatar Feb 28 '23 20:02 mohitmathew

It was released last week.

DarioS avatar Mar 22 '23 00:03 DarioS

The latest GATK release does significantly cut down on the number of critical vulnerabilities (mainly by moving to the latest Ubuntu 18.04 image), but there is definitely more work to be done here, so I'll keep this ticket open.

droazen avatar Apr 04 '23 15:04 droazen

I am still receiving security warnings about GATK 4.4.0.0:

Detected by File Paths: gatk-4.4.0.0/gatk-package-4.4.0.0-local.jar
Detected by Library: pkg:java/log4j:log4j
CPE: cpe:/a:apache:log4j:1.2.17
Version End of Life Date: August 4th, 2015 at 7:00 PM

superbsky avatar Apr 18 '23 21:04 superbsky

The number of vulnerabilities has gone down a bit, but the most serious ones are still there. Dependency upkeep is really needed to iron these out.

mohitmathew avatar May 11 '23 19:05 mohitmathew

Yes, this issue is not yet fully resolved. We intend to make additional progress in reducing vulnerabilities in our dependencies in the next GATK release.

droazen avatar May 12 '23 18:05 droazen

Hi @droazen, I see you were working on this issue and opened a PR but could not merge it because of test failures. I wanted to check whether you were able to make progress on this. Within my organization, infosec independently reviewed GATK and has denied its use :( . Let me know if you have an ETA for a security fix update.

Thank you!

mohitmathew avatar Aug 15 '23 14:08 mohitmathew

@mohitmathew Yes, we are still working on this! The PR is not yet in a usable state, but we intend to finish it for the next release.

droazen avatar Aug 16 '23 15:08 droazen

Thanks @droazen! Eagerly waiting for the next release.

mohitmathew avatar Aug 17 '23 19:08 mohitmathew

@mohitmathew With the GATK 4.5 release, we've again made significant progress on the known vulnerabilities in our dependencies, as well as in our docker image. There are still a few left, in "dependencies of our dependencies" that will be difficult to update, but we're getting there.

Note that the known vulnerabilities in log4j 1.x reported above are not the same as the infamous (and extremely serious) log4j 2.x vulnerabilities that were discovered a few years back. log4j 1.x completely lacks the feature that was exploited in the log4j 2.x vulnerability, and we patched our version of log4j 2.x in GATK almost as soon as that vulnerability was reported.
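The distinction between the two release lines can be sketched as a small version check. This is an illustration only: the names parse_version and log4j_line are hypothetical, and the 2.17.1 threshold is simply the version quoted earlier in this thread, not the result of an advisory lookup.

```python
# Threshold taken from the version mentioned earlier in this thread (assumption).
THRESHOLD = (2, 17, 1)


def parse_version(version):
    """Turn '2.17.1' into (2, 17, 1); suffixes such as '-beta9' are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def log4j_line(version):
    """Label a log4j version by release line, splitting 2.x at the threshold."""
    numbers = parse_version(version)
    major = numbers[0]
    if major != 2:
        # 1.x findings are a separate set from the log4j 2.x vulnerability.
        return f"{major}.x"
    if numbers < THRESHOLD:
        return "2.x-below-2.17.1"
    return "2.x-at-or-above-2.17.1"


print(log4j_line("1.2.17"))  # 1.x
print(log4j_line("2.14.1"))  # 2.x-below-2.17.1
print(log4j_line("2.17.1"))  # 2.x-at-or-above-2.17.1
```

Keeping the 1.x label separate avoids comparing 1.2.17 against a 2.x fix version, which is the conflation the note above warns about.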

droazen avatar Dec 14 '23 00:12 droazen

@droazen: Thanks a lot for prioritizing and attending to this. The security posture has greatly improved from where we started. The community greatly benefits from your effort.

I have migrated to the 4.5 release after some regression testing. Below is a list of critical and high findings for the 4.5 release. There are links to Snyk version update recommendations. I know it's sometimes not easy to simply upgrade a library version, as we could end up with runtime errors. I am adding this here so that it's handy whenever you look at this further.

Thanks again.

| packageName | version | severity | language | module_id |
| --- | --- | --- | --- | --- |
| com.google.protobuf:protobuf-java | 3.7.1 | high | java | SNYK-JAVA-COMGOOGLEPROTOBUF-2331703 |
| com.google.protobuf:protobuf-java | 3.7.1 | high | java | SNYK-JAVA-COMGOOGLEPROTOBUF-3167772 |
| io.netty:netty-codec-http2 | 4.1.96.Final | high | java | SNYK-JAVA-IONETTY-5953332 |
| log4j:log4j | 1.2.17 | high | java | SNYK-JAVA-LOG4J-2342645 |
| log4j:log4j | 1.2.17 | high | java | SNYK-JAVA-LOG4J-2342646 |
| log4j:log4j | 1.2.17 | high | java | SNYK-JAVA-LOG4J-2342647 |
| log4j:log4j | 1.2.17 | critical | java | SNYK-JAVA-LOG4J-572732 |
| net.minidev:json-smart | 1.3.2 | high | java | SNYK-JAVA-NETMINIDEV-3369748 |
| org.apache.zookeeper:zookeeper | 3.6.3 | high | java | SNYK-JAVA-ORGAPACHEZOOKEEPER-5961102 |
| org.codehaus.jettison:jettison | 1.1 | high | java | SNYK-JAVA-ORGCODEHAUSJETTISON-3168085 |
| org.codehaus.jettison:jettison | 1.1 | high | java | SNYK-JAVA-ORGCODEHAUSJETTISON-3367610 |
| org.eclipse.jetty:jetty-http | 9.4.52.v20230823 | high | java | SNYK-JAVA-ORGECLIPSEJETTY-5958847 |
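To see which packages account for most of the findings listed above, the rows can be tallied with a short script. This is a sketch, assuming the rows are available as whitespace-separated text in the column order shown; the function name summarize is hypothetical.

```python
from collections import Counter

# Rows copied from the findings list above:
# packageName, version, severity, language, module_id.
FINDINGS = """\
com.google.protobuf:protobuf-java 3.7.1 high java SNYK-JAVA-COMGOOGLEPROTOBUF-2331703
com.google.protobuf:protobuf-java 3.7.1 high java SNYK-JAVA-COMGOOGLEPROTOBUF-3167772
io.netty:netty-codec-http2 4.1.96.Final high java SNYK-JAVA-IONETTY-5953332
log4j:log4j 1.2.17 high java SNYK-JAVA-LOG4J-2342645
log4j:log4j 1.2.17 high java SNYK-JAVA-LOG4J-2342646
log4j:log4j 1.2.17 high java SNYK-JAVA-LOG4J-2342647
log4j:log4j 1.2.17 critical java SNYK-JAVA-LOG4J-572732
net.minidev:json-smart 1.3.2 high java SNYK-JAVA-NETMINIDEV-3369748
org.apache.zookeeper:zookeeper 3.6.3 high java SNYK-JAVA-ORGAPACHEZOOKEEPER-5961102
org.codehaus.jettison:jettison 1.1 high java SNYK-JAVA-ORGCODEHAUSJETTISON-3168085
org.codehaus.jettison:jettison 1.1 high java SNYK-JAVA-ORGCODEHAUSJETTISON-3367610
org.eclipse.jetty:jetty-http 9.4.52.v20230823 high java SNYK-JAVA-ORGECLIPSEJETTY-5958847
"""


def summarize(text):
    """Count findings per (package, version) and per severity."""
    by_package = Counter()
    by_severity = Counter()
    for line in text.strip().splitlines():
        package, version, severity, _language, _finding_id = line.split()
        by_package[(package, version)] += 1
        by_severity[severity] += 1
    return by_package, by_severity


by_package, by_severity = summarize(FINDINGS)
for (package, version), count in by_package.most_common():
    print(f"{count:2d}  {package} {version}")
print(dict(by_severity))
```

For this list, log4j:log4j 1.2.17 accounts for four of the twelve findings, including the only critical one.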

mohitmathew avatar Jan 13 '24 17:01 mohitmathew

@droazen, we have made fixes for the vulnerabilities after the Java 17 update, which was released last week.

Can you help integrate this into GATK so that we can have a new release? We have the patched files ready.

Thanks

vilay-nference avatar Jul 11 '24 04:07 vilay-nference

@vilay-nference You are always very welcome to submit a pull request on GitHub with any proposed changes to GATK!

Most of the remaining vulnerabilities are in dependencies-of-dependencies which can be difficult to update, but we are slowly chipping away at them. For example, log4j 1.x is a dependency of the latest release of Apache Spark 3.x, and 4.x is still in preview (and note again that the log4j 1.x vulnerabilities are not the same as the infamous and very serious vulnerability that affected log4j 2.x some years ago). We don't believe that any of the remaining library vulnerabilities pose a real-world threat to GATK in practice, but it would still be good to eliminate them.

droazen avatar Jul 11 '24 14:07 droazen

@droazen,

Apologies for the delay in getting back to you.

Given the nature of our work, it's essential that we address and remove any high and critical vulnerabilities, regardless of their real-world threat level. Ensuring our system remains secure is our top priority.

Here is the pull request with the modifications to address the high and critical vulnerabilities: #8950.

Please review and let me know if you have any feedback.

vilay-nference avatar Aug 13 '24 11:08 vilay-nference