sidekiq-batch-jobs

Batch tracking and completion callbacks for Sidekiq, backed by ActiveRecord (PostgreSQL).

A hand-rolled alternative to Sidekiq Pro batches. Group a set of perform_async calls into a batch, persist their state in the database, and fire a callback worker when the batch finishes — success or failure.

Installation

Add to your Gemfile:

gem "sidekiq-batch-jobs"

Then run the install generator and migrate:

bin/rails g sidekiq:batch:jobs:install
bin/rails db:migrate

The gem registers its client middleware, server middleware, and death handler automatically at boot.

To opt out (e.g. you need full control over middleware order), set this in config/application.rb before Rails boots:

Sidekiq::Batch::Jobs.auto_install = false

Then either call Sidekiq::Batch::Jobs.install! from your own config/initializers/sidekiq.rb, or wire the three pieces by hand:

# config/initializers/sidekiq.rb
Sidekiq.configure_client do |config|
  # Outermost on the client chain so any dedupe/suppression middleware
  # registered later with `chain.add` gets the final say on whether a
  # push actually happens. We only enroll jobs that the chain agrees to push.
  config.client_middleware do |chain|
    chain.prepend SidekiqBatch::ClientMiddleware
  end
end

Sidekiq.configure_server do |config|
  # Same prepend for the server's own client chain — covers `perform_async`
  # calls made from inside a running worker.
  config.client_middleware do |chain|
    chain.prepend SidekiqBatch::ClientMiddleware
  end

  # `add` (not `prepend`) so this runs LAST in the server chain — outside
  # Sidekiq's retry middleware. That way we see the terminal disposition
  # of each attempt and only mark the row failed on the final retry.
  config.server_middleware do |chain|
    chain.add SidekiqBatch::Middleware
  end

  # Reconciles jobs that died without re-entering middleware (SIGKILL, OOM,
  # pod eviction). Without this, a killed job's batch row stays `pending`
  # forever and the batch never completes.
  config.death_handlers << ->(job, ex) { SidekiqBatch::Middleware.handle_death(job, ex) }
end

Usage

A batch is a group of Sidekiq jobs whose collective fate you care about. Once every job in the batch ends in a terminal state, a callback worker fires — exactly once. You typically use a batch when there's "fan-in" work to do after a bunch of independent jobs finish: rebuilding a derived dataset after rescoring, sending one summary email after a thousand individual notifications, marking an import as ready once all rows are processed.

The four steps

# 1. Create the batch. Use `context` to stash any state the callback will need —
#    the callback only receives the batch id, not the surrounding closure, so
#    anything you'd otherwise close over goes here.
batch = SidekiqBatch.create!(
  description: "Rescore leaderboard #{leaderboard.id}",
  context: {
    "leaderboard_id" => leaderboard.id,
    "triggered_by"   => current_user.id,
    "reason"         => "manual rescore from admin panel",
  },
)

# 2. Register what happens when the batch finishes
batch.on(:complete, RebuildLeaderboardCacheWorker)
batch.on(:failure, AlertOpsOfFailedRescoreWorker)

# 3. Enqueue jobs *inside* the batch context — they get enrolled automatically
batch.jobs do
  leaderboard.entries.find_each do |entry|
    ScoreWorker.perform_async(entry.id)
  end
end

# 4. The batch is now running. Your callback worker will be invoked when
#    the last job lands in a terminal state — possibly hours later, on a
#    completely different worker process.

The context column

SidekiqBatch#context is a jsonb column for arbitrary per-batch metadata. The gem doesn't read it — it's a slot for you to pass information from the code that created the batch through to the callback worker, since the callback only receives batch_id and has to rehydrate everything else from the database.

Useful things to stash there:

  • Foreign keys the callback needs (leaderboard_id, import_id, tenant_id).
  • Provenance for debugging or audit (triggered_by, via, request_id).
  • Configuration the callback should branch on (notify_slack: true, recompute_strategy: "fast").

Keep it small and stable — it's metadata, not a payload. If you find yourself stuffing large arrays in there, that's a sign the data should live in its own table with a sidekiq_batch_id.
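If you want to enforce that ceiling mechanically, a small guard before batch creation does it. The constant and helper below are inventions of this example, not gem API — a sketch of one way to catch an oversized context early:

```ruby
require "json"

# Illustrative threshold — tune to taste. Anything bigger than this is
# probably a payload masquerading as metadata.
MAX_CONTEXT_BYTES = 2_048

# Raise before the batch row is ever written if the context has ballooned.
def assert_small_context!(context)
  bytes = JSON.generate(context).bytesize
  if bytes > MAX_CONTEXT_BYTES
    raise ArgumentError,
          "batch context is #{bytes} bytes; move this data to its own table"
  end
  context
end

assert_small_context!("leaderboard_id" => 42) # => returns the hash unchanged
```

You might call this just before SidekiqBatch.create!(context: ...) in code paths where the context is assembled dynamically.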

What batch.jobs do … end actually does

The block establishes a thread-local "enrollment context." While the block runs:

  • Every perform_async / perform_bulk / set(...).perform_async call made on this thread is intercepted by the client middleware.
  • For each pushed job, a SidekiqBatchJob row is written to Postgres with the job's jid, worker class, and args — before Sidekiq pushes the payload to Redis. That ordering is the whole point: when the worker later runs and the server middleware looks for a matching row, it's guaranteed to find one.
  • When the block returns, the batch transitions from pending to running and the thread-local enrollment context is cleared.
  • Jobs enqueued outside the block are not enrolled and don't count toward this batch's completion. The same goes for jobs that child workers enqueue while running: only enqueues made on the original thread, inside the block, are captured. This is by design; it keeps the batch's scope predictable.

You don't have to think about jid tracking, race conditions, or middleware ordering — that's the gem's job. You just write the block.
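The mechanism is easier to see in miniature. The sketch below simulates the enrollment described above in plain Ruby — the class and method names are invented for illustration, not the gem's internals:

```ruby
# A stand-in for the batch: collects enrolled jobs while its block runs.
class FakeBatch
  attr_reader :enrolled

  def initialize
    @enrolled = []
  end

  # Establish the thread-local enrollment context for the block's duration,
  # and clear it on the way out — even if the block raises.
  def jobs
    Thread.current[:current_batch] = self
    yield
  ensure
    Thread.current[:current_batch] = nil
  end
end

# What a client middleware does, conceptually: if a batch is active on this
# thread, record the job before the push proceeds; otherwise do nothing.
def enroll_if_batching(jid, worker_class, args)
  batch = Thread.current[:current_batch]
  batch.enrolled << { jid: jid, worker: worker_class, args: args } if batch
end

batch = FakeBatch.new
batch.jobs { enroll_if_batching("abc123", "ScoreWorker", [42]) }
enroll_if_batching("def456", "ScoreWorker", [43]) # outside the block: ignored
batch.enrolled.length # => 1
```

Because the context lives on Thread.current, two threads creating batches concurrently can't see each other's enrollments — which is why only the original thread's enqueues count.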

What the callback worker receives

The callback gets one argument: the batch id. From there it can inspect the batch's full state:

class RebuildLeaderboardCacheWorker
  include Sidekiq::Worker

  def perform(batch_id)
    batch          = SidekiqBatch.find(batch_id)
    leaderboard_id = batch.context.fetch("leaderboard_id")

    Rails.logger.info "Batch #{batch.description} finished: #{batch.progress}"
    # => Batch Rescore leaderboard 42 finished: {total: 1247, complete: 1247, failed: 0, pending: 0}

    LeaderboardCacheRebuilder.run(leaderboard_id)
  end
end

class AlertOpsOfFailedRescoreWorker
  include Sidekiq::Worker

  def perform(batch_id)
    batch = SidekiqBatch.find(batch_id)

    batch.failed_jobs.find_each do |bj|
      Ops.notify(
        "Rescore worker died: #{bj.worker_class} args=#{bj.args} " \
        "error=#{bj.error_class}: #{bj.error_message}",
      )
    end
  end
end

Either :complete or :failure will fire — never both, never twice. :complete fires when every enrolled job succeeded; :failure fires the moment any job ends in a failed terminal state (retries exhausted, or retry: false workers that raised).
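One plausible way to get the never-twice guarantee is an atomic compare-and-set on the batch's status, so that exactly one caller observes the terminal transition and enqueues the callback. The in-memory simulation below illustrates the idea; in the gem this would presumably be a single guarded UPDATE against the Postgres row (an assumption — the class here is invented for illustration):

```ruby
# Simulates a batch row whose status can be transitioned exactly once.
class StatusRow
  def initialize
    @status = "running"
    @lock = Mutex.new
  end

  # Compare-and-set: returns true only for the one caller that performs the
  # transition out of "running". Everyone else gets false and fires nothing.
  def transition!(to)
    @lock.synchronize do
      return false unless @status == "running"
      @status = to
      true
    end
  end
end

row = StatusRow.new
winners = Array.new(10) { Thread.new { row.transition!("complete") } }.map(&:value)
winners.count(true) # => 1 — only one caller gets to enqueue the callback
```

The SQL analogue would be an UPDATE with a `WHERE status = 'running'` guard, firing the callback only when the statement reports one affected row.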

Inspecting a batch from anywhere

batch.progress
# => { total: 1247, complete: 1101, failed: 3, pending: 143 }

batch.pending_jobs    # ActiveRecord relation
batch.failed_jobs
batch.completed_jobs

batch.status          # "pending" | "running" | "complete" | "failed"
batch.completed_at    # nil while running, set on terminal transition
batch.context         # the jsonb hash you stashed when creating the batch

Development

After checking out the repo, run bin/setup to install gem dependencies. Then run the suite with:

bin/test                  # starts Postgres via docker compose, then runs rspec
bin/test spec/models      # forwards args through to rspec

The test harness uses Combustion to boot a minimal Rails app under spec/internal/, and a Postgres container defined in docker-compose.yml. The container uses an ephemeral tmpfs for its data directory, so it's safe to docker compose down at any time.
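For orientation, a minimal Combustion boot might look like the sketch below; the repo's actual spec_helper likely does more (loading rspec-rails, the schema, and so on):

```ruby
# spec/spec_helper.rb (sketch, not the repo's actual file)
require "combustion"

# Boot a minimal Rails app from spec/internal/, loading only the frameworks
# the gem needs — here just ActiveRecord.
Combustion.initialize! :active_record
```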

If you'd rather skip the wrapper, the manual flow is:

docker compose up -d
bundle exec rspec
docker compose down

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/douglasgreyling/sidekiq-batch-jobs.