Separate job config and system config in SparkJob API

Open shizambles opened this issue 10 years ago • 3 comments

At the moment SparkJob.validate() and SparkJob.runJob() have Config arguments that contain merged system config and request config values.

This can lead to a scenario where a SparkJob implementation that uses a system config parameter can have that config parameter changed by a request config that uses the same parameter name.
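The collision described above can be sketched as follows. This is only an illustration: the configs are modeled as plain `Map[String, String]` values rather than Typesafe `Config` objects, the key `storage.root` and the object name `ConfigCollision` are hypothetical, and the merge order (request values win) is assumed from the description in this issue.

```scala
// Hypothetical model: configs as plain Maps instead of Typesafe Config objects.
object ConfigCollision {
  type Conf = Map[String, String]

  // Merge so that request values win over system values when keys clash
  // (assumed merge order, based on the behavior described in this issue).
  def merged(system: Conf, request: Conf): Conf = system ++ request

  // "storage.root" is a made-up system setting used only for illustration.
  val systemConf: Conf = Map("storage.root" -> "/data/prod")

  // A request that happens to reuse the same parameter name.
  val requestConf: Conf = Map("storage.root" -> "/tmp/scratch", "input" -> "a b c")

  def main(args: Array[String]): Unit = {
    val conf = merged(systemConf, requestConf)
    // The job sees the request's value, not the system's.
    println(conf("storage.root"))
  }
}
```

With a merged config, the job has no way to tell that `storage.root` came from the request rather than from the system.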

I propose that we don't merge system and request configs. Instead, runJob() should receive only the request config, and a separate init() method should be introduced that is called only once and receives just the system config. I think this can be achieved without breaking the API.

shizambles avatar Sep 29 '15 05:09 shizambles

This sounds good to me. @zeitos @noorul?

velvia avatar Sep 29 '15 05:09 velvia

I don't get why the request goes to init() and the system goes in runJob(). runJob will be called by each request while init will be called once by the system.

Anyway, I like the idea of not overriding default settings without letting the job know.

zeitos avatar Sep 29 '15 14:09 zeitos

I think he meant it the other way around. Let’s prototype the API:

def init(sysConfig: Config): Unit // This gets the system config passed into the ActorSystem
def runJob(contextLike: ContextLike, jobConfig: Config): Any
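To make the prototype concrete, here is a minimal sketch of how a job might use the two methods. It is not the real spark-jobserver API: `Conf` and `ContextLike` are stand-in type aliases, and `SparkJobSketch`, `WordCountJob`, and the `storage.root` and `input` keys are hypothetical names chosen for illustration.

```scala
object SeparatedConfigSketch {
  // Stand-ins so the sketch compiles on its own; the real types would be
  // com.typesafe.config.Config and the job server's ContextLike.
  type Conf = Map[String, String]
  type ContextLike = AnyRef

  // Hypothetical trait mirroring the two prototyped signatures.
  trait SparkJobSketch {
    def init(sysConfig: Conf): Unit
    def runJob(contextLike: ContextLike, jobConfig: Conf): Any
  }

  class WordCountJob extends SparkJobSketch {
    private var storageRoot: String = "unset"

    // Called once; reads system-level settings only.
    def init(sysConfig: Conf): Unit =
      storageRoot = sysConfig.getOrElse("storage.root", "unset")

    // Called per request; sees only the request config, so a request key
    // named "storage.root" cannot override the system value captured in init.
    def runJob(contextLike: ContextLike, jobConfig: Conf): Any = {
      val words = jobConfig.getOrElse("input", "").split("\\s+").filter(_.nonEmpty)
      (storageRoot, words.length)
    }
  }
}
```

Because the system value is captured in init(), a request that reuses the same key leaves it untouched.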

velvia avatar Sep 29 '15 17:09 velvia