Apache Pinot vulnerability issues
I scanned the following image, built from the master branch, and found some vulnerabilities:
apachepinot/pinot:1.2.0-SNAPSHOT-ddce06f9cc-20240620-17-ms-openjdk
Could you look into the critical and high vulnerabilities below?
https://www.cve.org/CVERecord?id=CVE-2024-32002 https://www.cve.org/CVERecord?id=CVE-2020-19726 https://www.cve.org/CVERecord?id=CVE-2022-47695 https://www.cve.org/CVERecord?id=CVE-2023-47038 https://www.cve.org/CVERecord?id=CVE-2022-2000 https://www.cve.org/CVERecord?id=CVE-2022-2042 https://www.cve.org/CVERecord?id=CVE-2023-4733 https://www.cve.org/CVERecord?id=CVE-2023-4735 https://www.cve.org/CVERecord?id=CVE-2023-4750 https://www.cve.org/CVERecord?id=CVE-2023-4751 https://www.cve.org/CVERecord?id=CVE-2023-4752 https://www.cve.org/CVERecord?id=CVE-2023-4781 https://www.cve.org/CVERecord?id=CVE-2023-5535
https://www.cve.org/CVERecord?id=CVE-2022-44840 https://www.cve.org/CVERecord?id=CVE-2022-45703 https://www.cve.org/CVERecord?id=CVE-2024-22667 https://www.cve.org/CVERecord?id=CVE-2022-1897 https://www.cve.org/CVERecord?id=CVE-2022-3510
https://www.cve.org/CVERecord?id=CVE-2022-3171 https://www.cve.org/CVERecord?id=CVE-2022-3509 https://www.cve.org/CVERecord?id=CVE-2021-46174 https://www.cve.org/CVERecord?id=CVE-2023-2976
https://www.opencve.io/cve/CVE-2024-32002 https://www.opencve.io/cve/CVE-2020-19726 https://www.opencve.io/cve/CVE-2022-47695 https://www.opencve.io/cve/CVE-2023-47038 https://www.opencve.io/cve/CVE-2022-2000 https://www.opencve.io/cve/CVE-2022-2042 https://www.opencve.io/cve/CVE-2023-4733 https://www.opencve.io/cve/CVE-2023-4735 https://www.opencve.io/cve/CVE-2023-4750 https://www.opencve.io/cve/CVE-2023-4751 https://www.opencve.io/cve/CVE-2023-4752 https://www.opencve.io/cve/CVE-2023-4781 https://www.opencve.io/cve/CVE-2023-5535
https://www.opencve.io/cve/CVE-2022-44840 https://www.opencve.io/cve/CVE-2022-45703 https://www.opencve.io/cve/CVE-2024-22667 https://www.opencve.io/cve/CVE-2022-1897 https://www.opencve.io/cve/CVE-2022-3510
https://www.opencve.io/cve/CVE-2022-3171 https://www.opencve.io/cve/CVE-2022-3509 https://www.opencve.io/cve/CVE-2021-46174 https://www.opencve.io/cve/CVE-2023-2976
Altogether 22 CVEs: 1x 9.0/10 Critical, 1x 8.8/10 High, 15x 7.8/10 High, 4x 7.5/10 High, 1x 7.1/10 High.
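The breakdown above can be cross-checked with a few lines of Python. This is only a minimal sketch: the score and count pairs are copied from the summary, the severity bands are the CVSS v3.x qualitative ratings (High is 7.0 to 8.9, Critical is 9.0 to 10.0), and the name `cvss_band` is made up for illustration.

```python
from collections import Counter

# (CVSS score, number of CVEs) pairs copied from the summary above.
SCORE_COUNTS = [(9.0, 1), (8.8, 1), (7.8, 15), (7.5, 4), (7.1, 1)]


def cvss_band(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Sum the CVE counts per severity band.
by_band = Counter()
for score, count in SCORE_COUNTS:
    by_band[cvss_band(score)] += count

total = sum(by_band.values())
print(total, dict(by_band))  # 22 {'Critical': 1, 'High': 21}
```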
Next time, please follow https://github.com/apache/pinot/security
The question is: how can these high-severity CVEs, some of them relatively old, remain in master
- if we have Dependabot active, which should keep everything up to date?
- when some of them have already been reported several times? See: https://github.com/apache/pinot/issues?q=is%3Aissue+is%3Aopen+CVE
Do we have any ideas for how we can fine-tune our processes to prevent this or lower the number?
Maybe of interest: the comments in https://github.com/apache/pinot/issues/12341#issuecomment-2183461835
Please may I know the SLA to get a fixable version for these vulnerabilities?
Please may I know the SLA to get a fixable version for these vulnerabilities?
If you are using the paid Pinot offering from StarTree, maybe you can contact their support directly. Otherwise, well... I'm afraid there is no SLA. Everybody would like this to be fixed, but it's open source, and everyone can contribute.
@vpriyam Could you please share what tool was used to flag these CVEs and, if possible, reproduce with the latest release of Pinot (jdk21)?
Also, I spot-checked a few, and it is unclear to me how they are related to Pinot. For example, the first one is about Git recursive clone, the second one is about a binutils library that I don't see in Pinot (when grepping the code base), and yet another one is about vim.
I suppose these might be coming from the docker image?
The following command shows the vulnerabilities below:
docker scout cves apachepinot/pinot:1.2.0-21-openjdk
127 vulnerabilities found in 36 packages
UNSPECIFIED 9
LOW 69
MEDIUM 20
HIGH 23
CRITICAL 6
I am assigning myself to work on this.
We are using this script to build docker runtime image: https://github.com/apache/pinot/blob/master/docker/images/pinot-base/pinot-base-runtime/amazoncorretto.dockerfile
I think the openjdk 21 base image has not been updated for a year and is marked as deprecated (https://hub.docker.com/_/openjdk), so I changed the build to use both Amazon Corretto and MS OpenJDK.
The sample run for base runtime image build is: https://github.com/apachepinot/pinot-fork/actions/workflows/build-pinot-base-runtime-docker-image.yml
For Amazon Corretto, there are 2 CVEs: CVE-2023-31486 and CVE-2023-31484. Both are in Perl-related libraries.
For MS OpenJDK, the one CVE is CVE-2024-26800, which has no fix yet.
@abhioncbr Please let me know if there is anything more you need.
Thanks @xiangfu0.
I think part of the problem can be resolved by building the pinot-base-runtime images once daily. If I am interpreting the data correctly, the last image build before today's run was six months ago. Is that correct?
Secondly, I am not sure why we are installing these packages: procps vim less wget curl git python sysstat perf libtasn1 zstd (reference). Some of the vulnerabilities come from these packages, and I am trying to understand why we need them. I am planning to remove some of them or, if they are required, we can build a new Pinot image, say Pinot-slim, with the minimum required packages installed.
Please comment if this makes sense. Thanks.
we can build a new Pinot image, say
Pinot-slim, with the minimum required packages installed.
+1 on building a smaller image. There is also an issue about this: https://github.com/apache/pinot/issues/8718 and another one: https://github.com/apache/pinot/issues/13726 and a 3rd one: https://github.com/apache/pinot/issues/12564 and a 4th one: https://github.com/apache/pinot/issues/5692
=> not sure if we need different images, or only one cleaned up (if there is anything to clean)
Edit: it's possibly also related to https://github.com/apache/pinot/issues/11507
As added in the comment above, there are at least 4-5 issues saying that we want or need to go smaller. Maybe we should open one final issue on this as a summary or parent issue and decouple that discussion from this security issue?
You know I'm strongly against using slim images based on Alpine, and to be honest I think it is a debate we should open in the context of vulnerabilities. My reasons for not using Alpine are:
- It is a cheap, uninformative solution. Docker layers are designed so that base images are paid for only once. For example, one Pinot image deriving from the Microsoft JDK 21 image pays less than 400 MB for the base image. One thousand Pinot images deriving from the same Microsoft JDK 21 image pay the exact same 400 MB (all together).
- In contrast, each Pinot image adds:
  - A layer of around 200 MB when installing extra software with apt-get (this could be reused if we actually reuse the Pinot base image, which I'm not sure we are doing).
  - 959 MB of Pinot code. This layer is always new. Of which:
    - 120 MB belong to examples (these are the same in each version; a separate layer for them would reduce the actual size by 120 MB!)
    - 304 MB belong to our shaded jar. By using one layer for libraries (which don't change that often) and another for our code, and especially by not using shading, we could reduce this significantly.
    - 463 MB belong to plugins. Without shading we can probably reduce this by half. Of which:
      - 25 MB belong to confluent-avro
      - 78 MB belong to orc
      - 114 MB belong to parquet
      - 50 MB to gcs
      - 46 MB to kafka 2.0
      - 45 MB to pulsar
      - 36 MB to hadoop
      - 36 MB to spark
If we actually want to reduce Docker disk usage, we need to split our Pinot layer into more than one layer, leave out the large examples (or dedicate a layer to them), and especially remove shading. We should also make sure we reuse the base image when we create Pinot images (instead of running apt-get for each one). By applying these changes we could reduce the actual image size from around 1.5 GB to something like 500 MB.
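The layer-sharing argument can be made concrete with a rough model. The sketch below assumes that identical layers are stored once per host, that the sizes quoted in this comment are accurate enough for an estimate, and that the examples really are byte-identical across versions. The function `disk_usage_mb` and the choice of 10 images are hypothetical, picked only for illustration.

```python
# Sizes in MB, taken from the approximate figures quoted in this comment.
BASE_JDK_MB = 400      # JDK base image, shared by every Pinot image
APT_LAYER_MB = 200     # extra packages installed with apt-get
PINOT_LAYER_MB = 959   # Pinot code, new for every version
EXAMPLES_MB = 120      # examples, currently part of the Pinot layer


def disk_usage_mb(num_images: int, shared_mb: int, unique_mb: int) -> int:
    """Disk used by images that share some layers and each add unique ones.

    Shared layers are stored once; unique layers are stored once per image.
    """
    if num_images <= 0:
        return 0
    return shared_mb + num_images * unique_mb


N = 10  # hypothetical number of Pinot versions kept on one host

# Case 1: only the JDK base is shared; apt-get runs again for every image.
apt_not_reused = disk_usage_mb(N, BASE_JDK_MB, APT_LAYER_MB + PINOT_LAYER_MB)

# Case 2: the apt-get layer comes from a reused Pinot base image.
apt_reused = disk_usage_mb(N, BASE_JDK_MB + APT_LAYER_MB, PINOT_LAYER_MB)

# Case 3: as case 2, with the examples moved into their own shared layer.
examples_shared = disk_usage_mb(
    N,
    BASE_JDK_MB + APT_LAYER_MB + EXAMPLES_MB,
    PINOT_LAYER_MB - EXAMPLES_MB,
)

print(apt_not_reused, apt_reused, examples_shared)  # 11990 10190 9110
```

In this model the base image size barely matters once more than a few images share it; the per-image layers dominate the total.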
By using Alpine we just reduce the size of the base image by at most a third. And that change won't actually matter in environments where we have already pulled another image sharing the same base (which will be common when upgrading Pinot or storing images in a local repository).
But migrating to Alpine is not free. Alpine uses musl instead of glibc, and that can cause issues that are difficult to prevent. See for example this post.
If we want to offer a Pinot Alpine version, that is fine with me (each user may decide whether they want to face the risk of using musl instead of glibc), but IMHO it is mandatory to offer another Docker image (derived from Microsoft JDK, Corretto or Temurin) as the main Docker image.
Also notice that I've just run trivy image apachepinot/pinot:latest-21-ms-openjdk, which returned no vulnerabilities in the OS layer, and trivy image apachepinot/pinot:latest-21-amazoncorretto, which returned only low and medium vulnerabilities in the OS layer. Meanwhile, there are several high and critical issues in the Java layer. I don't know which tool @vpriyam used to find these vulnerabilities, but either trivy wasn't able to detect them or (more probably) they have been fixed automatically by the distributor of the base images we use.
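To compare OS-layer findings with Java-layer findings, a Trivy JSON report can be grouped by result class. This is a sketch under assumptions: it expects a report in the shape produced by `trivy image --format json`, where each entry in `Results` has a `Class` such as `os-pkgs` or `lang-pkgs`. The embedded sample is hand-written with placeholder targets, packages, and IDs; it is not output from a real scan of the Pinot images, and `count_by_class` is a hypothetical helper.

```python
import json
from collections import defaultdict

# Hypothetical, hand-written sample in the shape of a Trivy JSON report.
# Targets, packages, and IDs are placeholders, not real scan results.
SAMPLE_REPORT = """
{
  "Results": [
    {
      "Target": "example-os",
      "Class": "os-pkgs",
      "Vulnerabilities": [
        {"VulnerabilityID": "CVE-0000-0001", "PkgName": "pkg-a", "Severity": "LOW"},
        {"VulnerabilityID": "CVE-0000-0002", "PkgName": "pkg-b", "Severity": "MEDIUM"}
      ]
    },
    {
      "Target": "Java",
      "Class": "lang-pkgs",
      "Vulnerabilities": [
        {"VulnerabilityID": "CVE-0000-0003", "PkgName": "lib-x", "Severity": "HIGH"},
        {"VulnerabilityID": "CVE-0000-0004", "PkgName": "lib-y", "Severity": "CRITICAL"},
        {"VulnerabilityID": "CVE-0000-0005", "PkgName": "lib-z", "Severity": "HIGH"}
      ]
    },
    {"Target": "example-clean", "Class": "lang-pkgs"}
  ]
}
"""


def count_by_class(report: dict) -> dict:
    """Return {class: {severity: count}} across all results in the report."""
    counts = defaultdict(lambda: defaultdict(int))
    for result in report.get("Results", []):
        result_class = result.get("Class", "unknown")
        # Results without findings may omit the key or set it to null.
        for vuln in result.get("Vulnerabilities") or []:
            counts[result_class][vuln.get("Severity", "UNKNOWN")] += 1
    return {cls: dict(sev) for cls, sev in counts.items()}


summary = count_by_class(json.loads(SAMPLE_REPORT))
for cls, severities in sorted(summary.items()):
    print(cls, severities)
```

For a real image, the report could be written with `trivy image --format json --output report.json <image>` and loaded with `json.load` in place of the embedded sample.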