spark-jobserver Separate job config and system config in SparkJob API

At the moment SparkJob.validate() and SparkJob.runJob() have Config arguments that contain merged system config and request config values.

This can lead to a scenario where a SparkJob implementation that uses a system config parameter can have that config parameter changed by a request config that uses the same parameter name.
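
A minimal sketch of that collision, assuming request values win when keys clash. Plain `Map`s stand in for the Typesafe `Config` objects to keep it dependency-free, and the key name `max-rows` is hypothetical:

```scala
// Plain Maps stand in for Typesafe Config objects (an assumption made to keep
// this sketch dependency-free). The key name "max-rows" is hypothetical.
object MergedConfigCollision {
  val systemConfig: Map[String, String] = Map("max-rows" -> "1000")
  val requestConfig: Map[String, String] = Map("max-rows" -> "999999")

  // On a key collision the right-hand operand of ++ wins, so the request
  // value silently replaces the system value in the merged view.
  val merged: Map[String, String] = systemConfig ++ requestConfig

  def main(args: Array[String]): Unit = {
    // The job reads what it believes is a system setting, but sees the request's value.
    println(merged("max-rows"))
  }
}
```

A job that reads `max-rows` from the merged view has no way to tell that the value came from the request rather than from the server's own configuration.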

I propose that we don't merge system and request configs. Instead, runJob() should receive only the request config, and a separate init() method should be introduced that is called only once and receives just the system config. I think this can be achieved without breaking the API.

Sep 29 '15 05:09 shizambles

This sounds good to me. @zeitos @noorul?

Sep 29 '15 05:09 velvia

I don't get why the request config goes to init() and the system config goes to runJob(). runJob() will be called for each request, while init() will be called once by the system.

Anyway, I like the idea of not overriding default settings without letting the job know.

Sep 29 '15 14:09 zeitos

I think he meant it the other way around. Let’s prototype the API:

    def init(sysConfig: Config): Unit  // This gets the system config passed into the ActorSystem
    def runJob(contextLike: ContextLike, jobConfig: Config): Any
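
To make the prototype concrete, here is a rough, self-contained sketch of how it might look in use. It is only an illustration: `Config` and `ContextLike` are replaced by minimal stand-ins, and `SparkJobSketch`, `LimitedJob`, and the keys `max-rows` and `rows` are hypothetical names. Giving init() a default no-op body is one possible way to keep existing jobs compiling, not a settled design.

```scala
object SeparateConfigSketch {
  // Stand-ins (assumptions): the real API uses com.typesafe.config.Config and
  // spark-jobserver's ContextLike. Placeholders keep this sketch runnable alone.
  type Config = Map[String, String]
  trait ContextLike
  object DummyContext extends ContextLike

  trait SparkJobSketch {
    // Default no-op, so jobs that never override init() are unaffected.
    def init(sysConfig: Config): Unit = ()
    def runJob(contextLike: ContextLike, jobConfig: Config): Any
  }

  // Hypothetical job that takes its row cap from the system config only.
  class LimitedJob extends SparkJobSketch {
    private var maxRows: Int = 0

    override def init(sysConfig: Config): Unit =
      maxRows = sysConfig.getOrElse("max-rows", "100").toInt

    def runJob(contextLike: ContextLike, jobConfig: Config): Any = {
      val requested = jobConfig.getOrElse("rows", "10").toInt
      math.min(requested, maxRows)
    }
  }

  def main(args: Array[String]): Unit = {
    val job = new LimitedJob
    job.init(Map("max-rows" -> "1000")) // called once with the system config
    // A "max-rows" key in the request is ignored; the cap stays at the system value.
    println(job.runJob(DummyContext, Map("rows" -> "5000", "max-rows" -> "999999")))
  }
}
```

With this split, a request that reuses a system parameter name can no longer change what the job read during init().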

Sep 29 '15 17:09 velvia