spark-on-kubernetes-docker
Not all Spark configurations are supported by the mask
The 4th container mask isn't currently supported, as the property does not get mapped to a Spark configuration. Specifically:
https://github.com/jahstreet/spark-on-kubernetes-docker/blob/1dc45d3053d1cf7689e9ba07c7270a52f2a16aa9/livy/0.7.0-incubating-spark_2.4.5_2.11-hadoop_3.1.0_cloud/entrypoint.sh#L47
Per the 4th mask, lower-case LIVY_SPARK should be picked up, but here it is not. Furthermore, we need a way to provide configurations without using numeric characters as replacements, because numeric characters can legitimately appear in Spark configuration properties, specifically in the keys. This is especially true for Azure, when configuring storage account access.
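Purely for illustration of the numeric-character problem (this is an assumed marker scheme, not the actual entrypoint.sh logic), suppose the mask replaced the digit 1 in an env var name with a dot when reconstructing the key:

```bash
# Assumed, illustrative scheme: digit 1 in the env var name decodes to a dot.
encoded="FS1AZURE1ACCOUNT1KEY1STORE123"
decoded=$(echo "$encoded" | sed 's/1/./g' | tr '[:upper:]' '[:lower:]')
echo "$decoded"
# -> fs.azure.account.key.store.23
# The literal digits in the storage account name "store123" collide with the
# marker digit, so the original key cannot be reconstructed unambiguously;
# hence the request for a digit-free mechanism.
```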
@marcjimz, I'm not quite sure I get your point about "Per the 4th mask, lower-case LIVY_SPARK should be picked up, but here it is not". Could you please provide an example?
As for the numeric property names: there is a limited character set that can be used in the names of env vars, so using some numbers as replacements seemed a good idea to me, though I'm open to suggestions.
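For context, POSIX shells restrict variable names to letters, digits, and underscores (and they may not start with a digit), so a Spark property key cannot be used verbatim as an env var name:

```bash
# Shell variable names may only contain [A-Za-z0-9_], so exporting a Spark
# key directly fails:
export spark.executor.memory=4g
# bash: export: `spark.executor.memory=4g': not a valid identifier
```

This is why some encoding of dots (and other separators) into the env var name is unavoidable.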
One alternative I see now is providing configs through env vars following the pattern LIVY_SPARK_CONF_ANY_SUFFIX="spark.conf.key=spark.conf.value" (same for Livy). In that case you can write the configs as-is. But then you need to control the _SUFFIXes on your own to avoid duplicates, and (or) config precedence will follow the alphabetical order of the env var names, so that later entries override earlier ones.
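A minimal sketch of how the entrypoint could consume that pattern (the LIVY_SPARK_CONF_ prefix and the suffix names here are assumptions for illustration, not existing code):

```bash
# Assumed example vars, suffixes chosen freely by the user:
#   LIVY_SPARK_CONF_0_MEM="spark.executor.memory=4g"
#   LIVY_SPARK_CONF_1_AZURE="fs.azure.account.key.store123.blob.core.windows.net=secret"
env | grep '^LIVY_SPARK_CONF_' | sort | while IFS='=' read -r name value; do
  # "value" keeps everything after the first '=', i.e. the "key=val" pair
  # exactly as the user wrote it, digits and all.
  key="${value%%=*}"
  val="${value#*=}"
  # Appending in sorted order means an alphabetically later suffix wins on
  # duplicate keys, since later lines override earlier ones when Spark loads
  # spark-defaults.conf (java.util.Properties semantics).
  echo "$key $val" >> "$SPARK_HOME/conf/spark-defaults.conf"
done
```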
In the case of Azure configurations (and others that require numbers to be part of the property name) you can still mount them with no limitations, as processed here and described here.
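For example (the ConfigMap and file names below are placeholders), a file-based route bypasses the env var mask entirely:

```bash
# Hypothetical names: put the numeric-keyed properties into a file and mount
# it, so the keys are used verbatim and never pass through the env var mask.
kubectl create configmap spark-azure-conf \
  --from-file=spark-defaults.conf=./spark-defaults.conf
# ...then mount the ConfigMap into the Livy/Spark pods at $SPARK_HOME/conf.
```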
Does it make sense for your use case?