
Reduce docker image size

Open xiangfu0 opened this issue 3 years ago • 3 comments

The current Pinot docker image is pretty large, about 1 GB after the 0.10.0 release. This is mainly due to the shaded plugins.

144K	plugins/pinot-batch-ingestion/pinot-batch-ingestion-standalone
144K	plugins/pinot-batch-ingestion
87M	plugins/pinot-environment/pinot-azure
87M	plugins/pinot-environment
17M	plugins/pinot-file-system/pinot-adls
14M	plugins/pinot-file-system/pinot-gcs
12K	plugins/pinot-file-system/pinot-hdfs
29M	plugins/pinot-file-system/pinot-s3
60M	plugins/pinot-file-system
3.4M	plugins/pinot-input-format/pinot-avro
17M	plugins/pinot-input-format/pinot-confluent-avro
360K	plugins/pinot-input-format/pinot-csv
328K	plugins/pinot-input-format/pinot-json
46M	plugins/pinot-input-format/pinot-orc
49M	plugins/pinot-input-format/pinot-parquet
2.0M	plugins/pinot-input-format/pinot-protobuf
2.1M	plugins/pinot-input-format/pinot-thrift
119M	plugins/pinot-input-format
20M	plugins/pinot-metrics/pinot-dropwizard
20M	plugins/pinot-metrics/pinot-yammer
40M	plugins/pinot-metrics
92K	plugins/pinot-minion-tasks/pinot-minion-builtin-tasks
92K	plugins/pinot-minion-tasks
16K	plugins/pinot-segment-uploader/pinot-segment-uploader-default
16K	plugins/pinot-segment-uploader
20K	plugins/pinot-segment-writer/pinot-segment-writer-file-based
20K	plugins/pinot-segment-writer
19M	plugins/pinot-stream-ingestion/pinot-kafka-0.9
32M	plugins/pinot-stream-ingestion/pinot-kafka-2.0
14M	plugins/pinot-stream-ingestion/pinot-kinesis
60M	plugins/pinot-stream-ingestion/pinot-pulsar
124M	plugins/pinot-stream-ingestion
428M	plugins

Here are a few ways to reduce the docker image size:

  1. Remove unnecessary shaded jars, e.g. the Hadoop ingestion jars.
  2. Have the docker build script take parameters for which plugins to include, and package only the most popular shaded jars.
  3. Have a script associated with pinot-admin.sh that reads an env variable listing the plugins to load, then downloads the released shaded jars from the Maven repository before startup.
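Option 3 could be sketched roughly as below. This is only an illustration: the `PINOT_PLUGINS` variable, the `MAVEN_REPO` default, and the `-shaded` artifact naming are assumptions, not the actual release layout.

```shell
#!/usr/bin/env bash
# Hypothetical pre-startup hook: resolve plugin jars listed in an env
# variable and download them into the plugin directory before Pinot starts.

PINOT_VERSION="${PINOT_VERSION:-0.10.0}"
MAVEN_REPO="${MAVEN_REPO:-https://repo1.maven.org/maven2/org/apache/pinot}"
PLUGIN_DIR="${PLUGIN_DIR:-/opt/pinot/plugins}"

# Build the (assumed) download URL for one plugin artifact.
plugin_url() {
  echo "${MAVEN_REPO}/$1/${PINOT_VERSION}/$1-${PINOT_VERSION}-shaded.jar"
}

# PINOT_PLUGINS is a comma-separated list, e.g. PINOT_PLUGINS="pinot-s3,pinot-parquet"
IFS=',' read -ra plugins <<< "${PINOT_PLUGINS:-}"
for p in "${plugins[@]}"; do
  echo "fetching $(plugin_url "$p") into ${PLUGIN_DIR}/$p/"
  # curl -fSL --create-dirs -o "${PLUGIN_DIR}/$p/$p.jar" "$(plugin_url "$p")"
done
```

The actual download is left commented out; the point is that the image ships only the most popular plugins and everything else is resolved at container start.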

For users:

  1. To use jars that are not packaged, users can load them when starting Pinot. This requires an extra env variable to specify the plugins.

For plugin developers:

  1. If a plugin is not packaged, developers can build a customized Pinot image by using the existing Pinot image as the base image, building the plugin, copying the jar into the plugin directory, and publishing their own image.
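As a sketch, such a customized image could be as small as the following; the image tag, plugin name, and plugin directory path are hypothetical:

```dockerfile
# Hypothetical: extend the official image with a custom plugin jar.
FROM apachepinot/pinot:0.10.0

# Assumes the plugin was built locally into target/ and that the image
# scans /opt/pinot/plugins for shaded plugin jars.
COPY target/my-plugin-shaded.jar /opt/pinot/plugins/my-plugin/
```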

cc: @kishoreg

xiangfu0 avatar May 17 '22 09:05 xiangfu0

I've just made an analysis of the docker image.

It seems that the three biggest layers are:

  • 344 MB for the base image (jdk-slim), which is highly reusable. Given two images, they will probably share the same base, so the space/download cost is paid only once.
  • 617 MB for apt-update and apt-install, which is not reusable at all. We can improve this by creating a specific base image and reusing it.
  • 716 MB for apache-pinot, which is copied in a single layer. Of that:
    • 100 MB is examples, which are very static (they almost never change)
    • 454 MB is plugins, which are mostly shaded dependencies. They would be highly optimizable with docker layers, but we cannot take advantage of that because they are shaded.
    • 150 MB is Pinot itself and its dependencies. I guess most of it is the dependencies, which again could be layered, but they are shaded.

This means that each time we change a single character and create a new docker image, our pods store and download (617 + 716) MB of data. I think almost 1 GB of that is static information we could simply reuse if we used docker layers correctly.

What @xiangfu0 suggested, having different images with more or fewer plugins, or images that can download plugins at start time, could be a solution, but I think it would have the side effect of being noticeably harder for users to understand. If instead we just use the layers correctly, the first time users download an image they will need to pull about 1.3 GB, but when they download a later version, most of the layers will very probably be unchanged, so they will only need to pull around 150 MB. The same applies to our own pods, which would download those 150 MB instead of 1.3 GB on most upgrades.
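The layering described above could be sketched like this; the base tag and paths are assumptions, and the idea is simply to order the COPY instructions from most static to most volatile so that early layers stay byte-identical across releases:

```dockerfile
# Hypothetical layering: each COPY produces its own layer, and docker
# reuses a layer whenever its contents are unchanged from the last build.
FROM openjdk:11-jre-slim
WORKDIR /opt/pinot

# ~100 MB, almost never changes
COPY apache-pinot-bin/examples ./examples
# ~454 MB, changes only when plugins change
COPY apache-pinot-bin/plugins ./plugins
# ~150 MB, changes on every release; only this layer is re-downloaded
COPY apache-pinot-bin/lib ./lib
```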

gortiz avatar Jul 01 '22 15:07 gortiz

BTW, I had a very good experience with Jib in the past. Given the complexities of the Apache Pinot build process, I don't think we can use Jib directly, but we can replicate the way Jib layers its images (link and link)

gortiz avatar Jul 01 '22 15:07 gortiz

This is interesting! I feel that just making a pinot-base image could actually reduce the docker pull size a lot.
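A pinot-base image along these lines might look like the following sketch; the base tag and the exact apt packages are assumptions:

```dockerfile
# Hypothetical pinot-base: bakes the JDK and apt packages into a separate,
# rarely-rebuilt image so the large apt layer is shared across releases.
FROM openjdk:11-jdk-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends vim wget curl less \
    && rm -rf /var/lib/apt/lists/*
```

The release image would then start from `FROM pinot-base` and only copy the Pinot distribution on top.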

Also, for the base image, how often do you see it needing to be updated?

xiangfu0 avatar Aug 16 '22 09:08 xiangfu0