grails-spring-batch
Overlapping batch definitions and namespaces
I've got two batch configs in one Grails application, call them OneBatchConfig.groovy and TwoBatchConfig.groovy. They do very similar things, just with slightly different processing. For example:
OneBatchConfig.groovy:
    import org.springframework.batch.item.support.CompositeItemProcessor

    beans {
        xmlns batch:"http://www.springframework.org/schema/batch"
        batch.job(id: "batchJob1") {
            batch.step(id: 'loadFile') {
                batch.tasklet {
                    batch.chunk(
                        reader: 'fileReader',
                        processor: 'compositeProcessor',
                        writer: 'dbWriter')
                }
            }
        }
        // Ignore the fileReader and dbWriter for now
        compositeProcessor(CompositeItemProcessor) {
            delegates = [
                ref('filterInvalidNames'),
                ref('filterInvalidDepartments')
            ]
        }
        // Ignore the details of the filter* processors
    }
TwoBatchConfig.groovy:
    import org.springframework.batch.item.support.CompositeItemProcessor

    beans {
        xmlns batch:"http://www.springframework.org/schema/batch"
        batch.job(id: "batchJob2") {
            batch.step(id: 'loadFile') {
                batch.tasklet {
                    batch.chunk(
                        reader: 'fileReader',
                        processor: 'compositeProcessor',
                        writer: 'dbWriter')
                }
            }
        }
        // Ignore the fileReader and dbWriter for now
        compositeProcessor(CompositeItemProcessor) {
            delegates = [
                ref('filterInvalidNames'),
                ref('filterInvalidCities')
            ]
        }
        // Ignore the details of the filter* processors
    }
Both batch jobs read in a file, perform some processing, and save the results to the database. The configuration of the fileReader is different for each batch job, since they read different files, but the more obvious difference is the processing applied to each file. The first batch job filters out some records based on names and departments, while the second filters out records based on names and cities.
However, those filtering steps are configurable, in that the first job might be filtering out records with name = 'John', while the second job might be filtering out records with name = 'Steve'. That configuration is inside the batch job definition itself, not in the code.
So my question is this: in my testing, it appears as though all of the beans that are defined in both of these BatchConfig.groovy files are part of a global namespace, including the names of the steps. This means that having a step called loadFile in both OneBatchConfig.groovy and TwoBatchConfig.groovy is a bad idea, and only one of those steps will really exist after the Grails application starts up. This also means that the filterInvalidNames bean can only exist once, and therefore the same filtering will be applied to both jobs.
First, is that correct? It makes some sense, given how other bean names, like dataSource, are just available without having to do anything.
But if so, it also creates some havoc when managing a large number of BatchConfig.groovy files within a single Grails application. Is there a solution other than requiring unique names for every single step and every single bean across all BatchConfig.groovy files that are created? Like the ability to apply a namespace of some kind to the bean names within each batch.job block?
Maybe I'm just missing something simple in how to configure each job so it exists separately from other jobs?
I posted this to Stack Overflow as well; that might be a more appropriate forum.
Could you post the link to your Stack Overflow question? I'll be interested to see if anyone responds.
Ah, I found it: http://stackoverflow.com/questions/24946610/multiple-spring-batch-jobs-in-one-grails-application.
The components in a job are simply spring beans that are loaded into the global Grails (Spring) context.
Looks like we could provide this feature by using AutomaticJobRegistrar (from http://docs.spring.io/spring-batch/2.2.x/reference/html/configureJob.html). Job names would have to be unique, but the beans within the jobs shouldn't conflict then.
Until that is completed, you'll have to give your components globally unique names. I simply prefix each step with the job name, though this can make for some long bean names.
I'll put this on the todo list.
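As a rough illustration of the prefixing workaround described above, OneBatchConfig.groovy might look something like the sketch below. The prefixed bean names (batchJob1LoadFile, batchJob1FileReader, batchJob1DbWriter, and so on) are made up for this example and are not names the plugin requires; any convention works as long as every step and bean name is unique across all BatchConfig.groovy files.

```groovy
import org.springframework.batch.item.support.CompositeItemProcessor

beans {
    xmlns batch:"http://www.springframework.org/schema/batch"

    batch.job(id: "batchJob1") {
        // Step id carries the job name, so it cannot collide with batchJob2's step
        batch.step(id: 'batchJob1LoadFile') {
            batch.tasklet {
                batch.chunk(
                    reader: 'batchJob1FileReader',          // hypothetical bean name
                    processor: 'batchJob1CompositeProcessor',
                    writer: 'batchJob1DbWriter')            // hypothetical bean name
            }
        }
    }

    // Job-specific processor chain, registered under a job-prefixed name
    batchJob1CompositeProcessor(CompositeItemProcessor) {
        delegates = [
            ref('batchJob1FilterInvalidNames'),       // hypothetical bean name
            ref('batchJob1FilterInvalidDepartments')  // hypothetical bean name
        ]
    }
    // The reader, writer and filter beans would be defined under the same prefixed names
}
```

TwoBatchConfig.groovy would follow the same pattern with a batchJob2 prefix. A bean that is meant to be shared by both jobs can keep an unprefixed name and be defined once.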
Thanks Daniel, I appreciate the answer and pointers. I have pretty long job names currently, so I may need to come up with a shorter convention of some kind, but that should be fine. And the ability to purposely create some shared tasklets or other beans is kind of handy.
Shared bits are handy: it means you can access any bean or service in your app.
I believe the AutomaticJobRegistrar ought to allow access to everything in the parent context, but nothing would be shared in the children, so only your jobs would be separated.
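For reference, here is a minimal sketch of how AutomaticJobRegistrar is wired in plain Spring Batch 2.x, translated from the XML form in the linked reference guide into the bean DSL. This is not something the plugin supports as of this thread. It assumes a jobRegistry bean already exists in the parent context, and that each job lives in its own XML file matching the hypothetical pattern classpath*:/config/job*.xml; the bean names jobLoader and jobContextFactories are also just illustrative.

```groovy
import org.springframework.batch.core.configuration.support.AutomaticJobRegistrar
import org.springframework.batch.core.configuration.support.ClasspathXmlApplicationContextsFactoryBean
import org.springframework.batch.core.configuration.support.DefaultJobLoader

beans {
    // Registers each loaded job with the (assumed, pre-existing) jobRegistry bean
    jobLoader(DefaultJobLoader) {
        jobRegistry = ref('jobRegistry')
    }

    // Builds one child application context per matching resource.
    // The resource pattern is a placeholder, not a path the plugin uses.
    jobContextFactories(ClasspathXmlApplicationContextsFactoryBean) {
        resources = 'classpath*:/config/job*.xml'
    }

    // Loads every child context at startup; step and bean names inside a child
    // are local to it, so only the job names have to be globally unique
    automaticJobRegistrar(AutomaticJobRegistrar) {
        applicationContextFactories = ref('jobContextFactories')
        jobLoader = ref('jobLoader')
    }
}
```

With this arrangement, beans in the parent context such as dataSource should stay visible to every job, while a loadFile step or compositeProcessor bean defined in one child context would not clash with the same name in another.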