
Not all Spark configurations are supported by the mask

Open: marcjimz opened this issue 4 years ago · 1 comment

The 4th container mask isn't currently supported, as the property does not get mapped to a Spark configuration. Specifically:

https://github.com/jahstreet/spark-on-kubernetes-docker/blob/1dc45d3053d1cf7689e9ba07c7270a52f2a16aa9/livy/0.7.0-incubating-spark_2.4.5_2.11-hadoop_3.1.0_cloud/entrypoint.sh#L47

Per the 4th mask, lowercase LIVY_SPARK should be picked up, but here it is not. Furthermore, we need a way to provide configurations without using numeric characters as replacements, because numeric characters can legitimately appear in Spark configuration properties, specifically the keys. This is especially true for Azure, where storage account access is configured through such keys.
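To make the numeric-placeholder problem concrete, here is a hypothetical sketch (the substitution rule and names below are illustrative, not the actual logic in entrypoint.sh): if a digit in the env var name is used as a stand-in for a special character such as a dot, any key that legitimately contains that digit becomes impossible to express.

```shell
#!/usr/bin/env sh
# Hypothetical digit-substitution mask (NOT the exact rule from entrypoint.sh):
# strip the LIVY_SPARK_ prefix, lowercase the rest, and turn every '0' into '.'.
to_spark_key() {
    echo "$1" | sed 's/^LIVY_SPARK_//' | tr '[:upper:]' '[:lower:]' | tr '0' '.'
}

# Works for keys without digits:
to_spark_key "LIVY_SPARK_SPARK0EXECUTOR0MEMORY"   # -> spark.executor.memory

# Breaks for keys that legitimately contain a digit, e.g. a (hypothetical)
# Azure storage account named "mystore01": the '0' in "01" is also replaced.
to_spark_key "LIVY_SPARK_FS0AZURE0KEY0MYSTORE01"  # -> fs.azure.key.mystore.1
```

The second call shows the collision: the placeholder digit and the literal digit in the key are indistinguishable in the env var name.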

marcjimz, May 13 '20

@marcjimz, I'm not quite sure I get your point about "Per the 4th mask, lower case LIVY_SPARK should be picked up but here it is not"; could you please provide an example? As for numeric property names: the character set allowed in the names of ENV vars is limited, so using digits as placeholders seemed a good idea to me, though I'm open to suggestions. One alternative I see now is providing configs through env vars following the pattern LIVY_SPARK_CONF_ANY_SUFFIX="spark.conf.key=spark.conf.value" (same for Livy). In that case you can write the configs as-is. But then you would need to manage the _SUFFIXes yourself to avoid duplicates, and (or) config precedence would follow the alphabetical order of the env var names, so that a later one overrides an earlier one. For Azure configurations (and others that require numbers as part of the property name) you can still mount them with no limitations, as processed here and described here. Does that make sense for your use case?
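A minimal sketch of that proposed suffix scheme (hypothetical variable names and paths; this is not the actual entrypoint.sh code): each LIVY_SPARK_CONF_<SUFFIX> var holds a literal "key=value" pair, so digits in the key pass through untouched, and pairs are applied in alphabetical order of the variable name.

```shell
#!/usr/bin/env sh
# Hypothetical sketch of the proposed scheme: any env var named
# LIVY_SPARK_CONF_<SUFFIX> holds a literal "spark.conf.key=value" pair,
# written to spark-defaults.conf as-is. Suffixes only keep names unique;
# pairs are applied in alphabetical order of the variable name, so in
# spark-defaults.conf a later duplicate key overrides an earlier one.

# Example vars (digits in the key are no problem here):
export LIVY_SPARK_CONF_A="spark.executor.memory=2g"
export LIVY_SPARK_CONF_B="fs.azure.account.key.mystore01.dfs.core.windows.net=secret"

SPARK_DEFAULTS="${SPARK_HOME:-/tmp/spark}/conf/spark-defaults.conf"
mkdir -p "$(dirname "$SPARK_DEFAULTS")"
: > "$SPARK_DEFAULTS"

env | grep '^LIVY_SPARK_CONF_' | sort | while IFS='=' read -r _name pair; do
    # $pair is everything after the first '='; split it once more into
    # the Spark key and its value, then append "key value" to the file.
    printf '%s %s\n' "${pair%%=*}" "${pair#*=}" >> "$SPARK_DEFAULTS"
done

cat "$SPARK_DEFAULTS"
```

The trade-off the comment above describes is visible here: nothing stops two suffixes from carrying the same Spark key, so deduplication is left to the alphabetical ordering and to whoever names the suffixes.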

jahstreet, May 24 '20