Class: ChronoForge::Executor::RetryPolicy

Inherits:
Object
Defined in:
lib/chrono_forge/executor/retry_policy.rb

Overview

A single, unified description of retry behavior shared by every retry site (workflow-level uncaught errors, durably_execute, durably_repeat, and wait_until’s condition errors).

It answers the only two questions a retry site ever asks:

- retryable?(error, attempts) — should this failure be retried?
- backoff_for(attempts)       — how long until the next attempt?

`attempts` is always the 1-based count of attempts made so far, including the one that just failed (matching ExecutionLog#attempts). So on the first failure `attempts == 1`.

Constructor Details

#initialize(max_attempts: 3, base: 1, cap: 30, jitter: true, retry_on: nil) ⇒ RetryPolicy

Returns a new instance of RetryPolicy.

Parameters:

  • max_attempts (Integer, nil) (defaults to: 3)

    cap on total attempts; nil = no count cap (bounded elsewhere, e.g. wait_until’s timeout)

  • base (Numeric, ActiveSupport::Duration) (defaults to: 1)

    delay of the first retry

  • cap (Numeric, ActiveSupport::Duration) (defaults to: 30)

    ceiling for a single delay

  • jitter (Boolean) (defaults to: true)

    apply equal jitter to spread retries

  • retry_on (Array<Class>, nil) (defaults to: nil)

    nil = retry any StandardError; an array = retry only those classes (and subclasses); [] = retry nothing



# File 'lib/chrono_forge/executor/retry_policy.rb', line 24

def initialize(max_attempts: 3, base: 1, cap: 30, jitter: true, retry_on: nil)
  @max_attempts = max_attempts
  @base = base
  @cap = cap
  @jitter = jitter
  @retry_on = retry_on
end

Instance Attribute Details

#base ⇒ Object (readonly)

Returns the value of attribute base.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 15

def base
  @base
end

#cap ⇒ Object (readonly)

Returns the value of attribute cap.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 15

def cap
  @cap
end

#jitter ⇒ Object (readonly)

Returns the value of attribute jitter.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 15

def jitter
  @jitter
end

#max_attempts ⇒ Object (readonly)

Returns the value of attribute max_attempts.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 15

def max_attempts
  @max_attempts
end

#retry_on ⇒ Object (readonly)

Returns the value of attribute retry_on.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 15

def retry_on
  @retry_on
end

Class Method Details

.compose(*policies) ⇒ Object

Build a composite policy from an ordered list of RetryPolicy objects.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 92

def self.compose(*policies)
  CompositeRetryPolicy.new(policies)
end

.step_default ⇒ Object



# File 'lib/chrono_forge/executor/retry_policy.rb', line 71

def self.step_default
  new(max_attempts: 3, base: 1, cap: 30, jitter: true, retry_on: nil)
end

.wait_default ⇒ Object



# File 'lib/chrono_forge/executor/retry_policy.rb', line 87

def self.wait_default
  new(max_attempts: nil, base: 1, cap: 30, jitter: true, retry_on: [])
end

.workflow_default ⇒ Object

Workflow-level (uncaught) errors retry the whole workflow from the top, replaying completed steps. They cover two populations the default can’t distinguish: transient infra blips, which are worth riding out, and deterministic bugs, where every replay is waste. With 10 attempts the nine backoffs sum to a tolerant window of up to ~8.5 min (no less than ~4.3 min and ≈6.4 min on average, since equal jitter puts each wait in [d/2, d]). That is enough for a DB failover or deploy restart without dragging out the bug case. The cap (600s / 10 min) bounds any single backoff and only binds if a caller configures more attempts.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 83

def self.workflow_default
  new(max_attempts: 10, base: 1, cap: 600, jitter: true, retry_on: nil)
end
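
The window quoted above can be checked with a few lines of arithmetic. This is only a sketch in plain Ruby, not ChronoForge code; it assumes that the 10th failure is not retried (so nine waits occur) and it works in Float seconds.

```ruby
# Un-jittered delay after each of failures 1..9, using workflow_default's
# base: 1 and cap: 600. Assumption: the 10th failure is not retried.
delays = (1..9).map { |attempts| [600.0, 1.0 * (2**(attempts - 1))].min }

longest  = delays.sum      # every wait lands at the top of [d/2, d]
shortest = delays.sum / 2  # every wait lands at the bottom of [d/2, d]

puts format("%.0f s (%.1f min) at most", longest, longest / 60)
puts format("%.1f s (%.1f min) at least", shortest, shortest / 60)
```

The largest single delay in this schedule is 256 s, which is why the 600 s cap does not bind at the default attempt count.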

Instance Method Details

#backoff_for(attempts) ⇒ Object

Equal jitter: half the computed delay plus a random portion of the other half. Computed once at re-enqueue time and never persisted, so the randomness does not affect replay determinism.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 39

def backoff_for(attempts)
  exponent = [attempts - 1, 0].max
  delay = [cap.to_f, base.to_f * (2**exponent)].min
  delay = (delay / 2) + rand(0.0..(delay / 2)) if jitter
  delay.seconds
end
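
To make the resulting schedule concrete, the sketch below repeats the same arithmetic in plain Ruby with jitter left out. The helper name `plain_backoff` is hypothetical and not part of ChronoForge, and it returns Float seconds rather than an ActiveSupport::Duration.

```ruby
# Hypothetical stand-alone helper mirroring backoff_for without jitter.
# Returns Float seconds instead of an ActiveSupport::Duration.
def plain_backoff(attempts, base: 1, cap: 30)
  exponent = [attempts - 1, 0].max
  [cap.to_f, base.to_f * (2**exponent)].min
end

# With base: 1 and cap: 30 the delay doubles until the cap takes over.
schedule = (1..7).map { |n| plain_backoff(n) }
p schedule # => [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]

# Equal jitter keeps half of the delay d and randomizes the other half,
# so the jittered value always lies in [d/2, d].
d = plain_backoff(4)
jittered = (d / 2) + rand(0.0..(d / 2))
```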

#budget_key ⇒ Object

Stable per-policy identifier derived from the errors this policy declares (its retry_on), not the error thrown. Inside a composite this keys the policy’s attempt budget, so the budget is shared across every class the policy lists (and their subclasses) and is independent of the policy’s position — reordering the composite does not reset counts. A catch-all (retry_on: nil) keys “*”.



# File 'lib/chrono_forge/executor/retry_policy.rb', line 67

def budget_key
  retry_on.nil? ? "*" : retry_on.map(&:name).sort.join(",")
end
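
The key's behavior can be seen by running the same expression outside the class. In this sketch `budget_key_for` is a hypothetical free-standing copy, and the error classes are ordinary Ruby built-ins chosen only for illustration.

```ruby
# Hypothetical stand-alone copy of the budget_key expression.
def budget_key_for(retry_on)
  retry_on.nil? ? "*" : retry_on.map(&:name).sort.join(",")
end

# Class names are sorted, so the order of the retry_on list does not matter.
p budget_key_for([IOError, ArgumentError]) # => "ArgumentError,IOError"
p budget_key_for([ArgumentError, IOError]) # => "ArgumentError,IOError"

# A catch-all policy keys "*"; an empty list keys the empty string.
p budget_key_for(nil) # => "*"
p budget_key_for([])  # => ""
```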

#matches?(error) ⇒ Boolean

Public routing predicate: would this policy handle this error at all, independent of the attempt cap? A nil retry_on matches any StandardError; [] matches nothing; a list matches those classes and their subclasses.

Returns:

  • (Boolean)


# File 'lib/chrono_forge/executor/retry_policy.rb', line 49

def matches?(error)
  retryable_error?(error)
end
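
The private retryable_error? helper is not shown on this page. The sketch below is one way the documented matching rules could be written, not the library's implementation; the method name `handles?` is hypothetical.

```ruby
# Hypothetical matcher following the documented retry_on semantics:
#   nil   -> any StandardError
#   []    -> nothing
#   array -> the listed classes and their subclasses
def handles?(error, retry_on)
  return error.is_a?(StandardError) if retry_on.nil?

  retry_on.any? { |klass| error.is_a?(klass) }
end

p handles?(IOError.new, nil)              # any StandardError matches
p handles?(Interrupt.new, nil)            # not a StandardError
p handles?(EOFError.new, [IOError])       # EOFError is a subclass of IOError
p handles?(ArgumentError.new, [IOError])  # class not listed
p handles?(IOError.new, [])               # empty list matches nothing
```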

#retry_backoff(error, attempts:) ⇒ Object

Single-call decision used by every retry site: the backoff Duration to retry, or nil to stop. A plain policy uses `attempts` and ignores any block (the block exists only so a CompositeRetryPolicy can supply a per-error count; see CompositeRetryPolicy#retry_backoff).



# File 'lib/chrono_forge/executor/retry_policy.rb', line 57

def retry_backoff(error, attempts:)
  retryable?(error, attempts) ? backoff_for(attempts) : nil
end
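
A retry site built on this contract only needs to branch on nil. The sketch below illustrates that with an in-process loop. Everything in it is hypothetical: `StubPolicy` stands in for a real policy and returns Float seconds, it assumes retries stop once attempts reaches max_attempts, and `run_with_retries` records delays instead of re-enqueueing as a real site would.

```ruby
# Hypothetical stand-in for a policy. It answers retry_backoff in Float
# seconds and assumes retries stop once attempts reaches max_attempts.
StubPolicy = Struct.new(:max_attempts, :base, :cap) do
  def retry_backoff(error, attempts:)
    return nil unless error.is_a?(StandardError)
    return nil if max_attempts && attempts >= max_attempts

    exponent = [attempts - 1, 0].max
    [cap.to_f, base.to_f * (2**exponent)].min
  end
end

# Hypothetical retry site: asks one question per failure and acts on it.
def run_with_retries(policy)
  attempts = 0
  waits = []
  begin
    attempts += 1 # 1-based, counts the attempt that is about to run
    yield attempts
  rescue StandardError => e
    delay = policy.retry_backoff(e, attempts: attempts)
    raise if delay.nil? # nil means stop, so the error surfaces

    waits << delay      # a real site would schedule the next run after delay
    retry
  end
  waits
end

policy = StubPolicy.new(3, 1, 30)
waits = run_with_retries(policy) { |n| raise IOError, "flaky" if n < 3 }
p waits # => [1.0, 2.0]
```

Returning the delay or nil from a single call keeps the "should I retry" and "how long" decisions from drifting apart at the call site.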

#retryable?(error, attempts) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/chrono_forge/executor/retry_policy.rb', line 32

def retryable?(error, attempts)
  within_attempt_cap?(attempts) && retryable_error?(error)
end