sidekiq-iteration
Pull into Sidekiq core?
Hey @fatkodima, would you be interested in integrating this functionality into Sidekiq core for 7.3 or have me do it? I've had several customers report this gem as very useful for solving their problems with long-running jobs, making deployments quicker and safer, etc. I think it's a good pattern/API to encourage people to use.
Hey! Wow, it would be awesome to get this merged into Sidekiq itself!
I will try to do that this weekend (or next weekend) and see how it goes. Let me know if you plan to release 7.3 sooner.
I have a 7.3 milestone targeting a summer release. 7.2.3 will be out very soon.
Wanted to ask, what API would you prefer?
1. (my preference)
class MyJob
  include Sidekiq::Job
  include Sidekiq::Iteration
end
2. or something like:
class MyJob
  include Sidekiq::Job
  sidekiq_options iteration: true, ...
end
And what API would you prefer for throttling (https://github.com/fatkodima/sidekiq-iteration/blob/master/guides/throttling.md)? Currently it is configured via a top-level call in the class body.
I'd probably go with:
class SomeJob
  include Sidekiq::Job
  include Sidekiq::Job::Iterable
  sidekiq_options iteration: { whatever: 123 }
end
Unlike Rails, I dislike top-level class methods like throttle_on, as they can be hard to test and mock. I would prefer that it be an instance method; server middleware provides an instance:
class ThrottleMiddleware
  include Sidekiq::ServerMiddleware
  def call(instance, job, queue)
    if instance.throttle_on?
      # do something
    end
  end
end
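To make the testability point concrete, here is a minimal plain-Ruby sketch that does not load Sidekiq at all. The names `ReportJob` and `ThrottleCheck`, and the `:throttled` / `:performed` return values, are made up for illustration; only the `call(instance, job, queue)` shape and the `throttle_on?` predicate come from the snippet above.

```ruby
# Plain-Ruby sketch; no Sidekiq required. All class names are hypothetical.
class ReportJob
  # Instance-level throttle predicate, as proposed above.
  def throttle_on?
    false
  end

  def perform
    :performed
  end
end

# Mimics the shape of a server middleware's call(instance, job, queue).
class ThrottleCheck
  def call(instance, job, queue)
    if instance.respond_to?(:throttle_on?) && instance.throttle_on?
      :throttled # a real middleware might re-enqueue the job here
    else
      yield
    end
  end
end

middleware = ThrottleCheck.new

# Default behaviour: the job runs.
job = ReportJob.new
normal = middleware.call(job, {}, "default") { job.perform }

# In a test, override the predicate on one instance; no class-level state changes.
throttled_job = ReportJob.new
throttled_job.define_singleton_method(:throttle_on?) { true }
throttled = middleware.call(throttled_job, {}, "default") { throttled_job.perform }
```

Because the predicate lives on the instance, a test can stub it per object without touching configuration stored on the class.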
As a suggestion, @mperham: I feel like the framework should be pulled into Sidekiq, but not the concrete implementations.
ActiveRecord (AR) usage could be suggested instead, as I reported in #9:
def build_enumerator(cursor:)
  Enumerator.new do |yielder|
    MyModel.in_batches(start: cursor) do |relation|
      yielder.yield(relation, relation.maximum(:id))
    end
  end
end
def each_iteration(relation)
  relation.update_all(...)
end
Or for batches:
def build_enumerator(cursor:)
  Enumerator.new do |yielder|
    MyModel.find_in_batches(start: cursor) do |batch|
      yielder.yield(batch, batch.last.id)
    end
  end
end
def each_iteration(batch)
  batch.each { ... }
end
Or for individual records:
def build_enumerator(cursor:)
  Enumerator.new do |yielder|
    MyModel.find_each(start: cursor) do |record|
      yielder.yield(record, record.id)
    end
  end
end
def each_iteration(record)
  record.update(...)
end
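For readers unfamiliar with the pattern, the following is a rough plain-Ruby sketch of how a framework might drive `build_enumerator` and `each_iteration` and resume from a saved cursor. It is not the gem's or Sidekiq's actual implementation: `IterationDriver`, `max_iterations` (a stand-in for "a shutdown was requested"), and `DoubleJob` are all hypothetical, and the sketch assumes the cursor marks the last processed item so that a resumed run starts after it.

```ruby
# Hypothetical driver, for illustration only.
class IterationDriver
  def initialize(job, max_iterations:)
    @job = job
    @max_iterations = max_iterations # stand-in for an interruption signal
  end

  # Runs iterations starting from `cursor`. Returns the cursor to resume
  # from if interrupted, or nil when the enumerator is exhausted.
  def run(cursor: nil)
    done = 0
    @job.build_enumerator(cursor: cursor).each do |item, next_cursor|
      @job.each_iteration(item)
      cursor = next_cursor
      done += 1
      return cursor if done >= @max_iterations
    end
    nil
  end
end

# Hypothetical job iterating over an in-memory array instead of a model.
class DoubleJob
  RECORDS = [10, 20, 30, 40, 50].freeze
  attr_reader :processed

  def initialize
    @processed = []
  end

  # `cursor` is the index of the last processed record (nil on the first run).
  def build_enumerator(cursor:)
    start = cursor.nil? ? 0 : cursor + 1
    Enumerator.new do |yielder|
      RECORDS.each_with_index.drop(start).each do |record, index|
        yielder.yield(record, index)
      end
    end
  end

  def each_iteration(record)
    @processed << record * 2
  end
end

# The same instance is reused here only to collect results; a real worker
# would build a fresh job instance and pass it the persisted cursor.
job = DoubleJob.new
first_cursor = IterationDriver.new(job, max_iterations: 2).run
second_cursor = IterationDriver.new(job, max_iterations: 10).run(cursor: first_cursor)
```

The first run stops after two records and hands back a cursor; the second run picks up after that cursor and finishes the remaining records.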
It feels like having the CSV, Array, and AR enumerators all built in may be too much. I'm not sure; just throwing ideas out here.
Having optimized support for a few well-known types/libraries is useful, but we should have generic Enumerable support too.
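As one possible reading of "generic Enumerable support", here is a small sketch of a helper that wraps any Enumerable and yields `(item, cursor)` pairs using the item's position as the cursor. `enumerable_with_cursor` is a hypothetical name, not an API of Sidekiq or sidekiq-iteration, and it assumes the collection enumerates in a stable order between runs.

```ruby
# Hypothetical helper: wraps any Enumerable into a cursor-aware Enumerator.
# The cursor is the zero-based position of the last processed item.
def enumerable_with_cursor(enumerable, cursor:)
  start = cursor.nil? ? 0 : cursor + 1
  Enumerator.new do |yielder|
    enumerable.each_with_index do |item, index|
      next if index < start # skip items already processed
      yielder.yield(item, index)
    end
  end
end

# Resuming after position 1 of an array.
letters = enumerable_with_cursor(%w[a b c d], cursor: 1).to_a

# Works for any Enumerable, such as a Range, starting from scratch.
ranged = enumerable_with_cursor(1..5, cursor: nil).first(2)
```

The trade-off of a positional cursor is that resuming re-walks the skipped prefix, which is why optimized enumerators for specific sources can still be worthwhile.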