Class: Rubino::Security::DoomLoopDetector
- Inherits: Object
- Defined in: lib/rubino/security/doom_loop_detector.rb
Overview
Detects when the agent enters a doom loop: repeatedly calling the same tool with identical arguments without making progress.
Two dimensions, both config-driven (Hermes tool_guardrails alignment, #414):
- threshold: how many identical consecutive calls trip detection (default 5; Hermes grades 5-8). The old default was 3, which hard-denied a legitimate 3rd retry of an idempotent read.
- hard_stop: when true, a tripped detector means BLOCK (the policy returns :deny). When false (the default), it WARNS but allows the call: the policy surfaces a one-time warning to the model and lets the call run.
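The two settings above can be sketched from the caller's side. The example below is an illustration under assumptions, not the project's actual policy code: the class is a minimal stand-in so the snippet runs on its own, the signature format inside `record` is assumed (the real `generate_signature` is not shown on this page), and `decide` is a hypothetical helper. It also does not model the one-time nature of the warning.

```ruby
# Minimal stand-in mirroring the documented class so the sketch is runnable.
class DoomLoopDetector
  DEFAULT_THRESHOLD = 5
  attr_reader :threshold

  def initialize(threshold: DEFAULT_THRESHOLD, hard_stop: false)
    @threshold = threshold
    @hard_stop = hard_stop
    @history = []
  end

  def hard_stop?
    @hard_stop == true
  end

  def record(tool_name:, arguments:)
    # Assumed signature format; the real generate_signature is not documented here.
    @history << [tool_name, arguments].inspect
    @history.size >= @threshold && @history.last(@threshold).uniq.size == 1
  end
end

# Hypothetical policy helper: maps a detection to :deny, :warn, or :allow.
def decide(detector, tool_name:, arguments:)
  return :allow unless detector.record(tool_name: tool_name, arguments: arguments)

  detector.hard_stop? ? :deny : :warn
end

warn_only = DoomLoopDetector.new(threshold: 3)
blocking  = DoomLoopDetector.new(threshold: 3, hard_stop: true)
call = { tool_name: "read_file", arguments: { "path" => "a.txt" } }

warn_results  = Array.new(3) { decide(warn_only, **call) }
block_results = Array.new(3) { decide(blocking, **call) }
```

With a threshold of 3, the first two identical calls pass in both configurations; only the third differs, depending on `hard_stop`.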
Constant Summary
- DEFAULT_THRESHOLD = 5
Instance Attribute Summary
- #threshold ⇒ Object (readonly)
Returns the value of attribute threshold.
Instance Method Summary
- #hard_stop? ⇒ Boolean
True when the detector is configured to BLOCK on detection (vs. warn).
- #initialize(threshold: DEFAULT_THRESHOLD, hard_stop: false) ⇒ DoomLoopDetector (constructor)
A new instance of DoomLoopDetector.
- #record(tool_name:, arguments:) ⇒ Object
Records a tool call and returns true if a doom loop is detected (the last `threshold` calls are identical).
- #reset! ⇒ Object
Resets the detector (e.g., when the user provides new input).
Constructor Details
#initialize(threshold: DEFAULT_THRESHOLD, hard_stop: false) ⇒ DoomLoopDetector
Returns a new instance of DoomLoopDetector.
# File 'lib/rubino/security/doom_loop_detector.rb', line 21
def initialize(threshold: DEFAULT_THRESHOLD, hard_stop: false)
  @threshold = threshold
  @hard_stop = hard_stop
  @history = []
end
Instance Attribute Details
#threshold ⇒ Object (readonly)
Returns the value of attribute threshold.
# File 'lib/rubino/security/doom_loop_detector.rb', line 19
def threshold
  @threshold
end
Instance Method Details
#hard_stop? ⇒ Boolean
True when the detector is configured to BLOCK on detection (vs. warn).
# File 'lib/rubino/security/doom_loop_detector.rb', line 28
def hard_stop?
  @hard_stop == true
end
#record(tool_name:, arguments:) ⇒ Object
Records a tool call and returns true if a doom loop is detected (the last `threshold` calls are identical). Detection is independent of hard_stop: the caller decides whether a hit blocks or only warns.
# File 'lib/rubino/security/doom_loop_detector.rb', line 35
def record(tool_name:, arguments:)
  signature = generate_signature(tool_name, arguments)
  @history << signature

  # Check if the last N calls are identical
  if @history.size >= @threshold
    recent = @history.last(@threshold)
    return true if recent.uniq.size == 1
  end

  false
end
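To see how the window check behaves over a sequence of calls, the sketch below isolates the same comparison in a standalone helper. `tripped?` is a hypothetical name, and the single-letter strings stand in for whatever `generate_signature` actually produces, which this page does not show.

```ruby
# Standalone version of the window check: the last `threshold` entries
# must all be the same signature.
def tripped?(history, threshold)
  history.size >= threshold && history.last(threshold).uniq.size == 1
end

history = []
# "a" and "b" are placeholder signatures for two different tool calls.
results = %w[a a b a a a a].map do |signature|
  history << signature
  tripped?(history, 3)
end
```

A single differing call ("b") restarts the streak, and once the window is full of identical signatures every further identical call keeps reporting a hit. Note that the history only grows; nothing in the code shown trims it apart from `reset!`.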
#reset! ⇒ Object
Resets the detector (e.g., when the user provides new input).
# File 'lib/rubino/security/doom_loop_detector.rb', line 49
def reset!
  @history.clear
end
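A short sketch of the effect of `reset!` on an in-progress streak, using a compact stand-in for the class so it runs in isolation. The signature format is again an assumption, and the point at which a host application would call `reset!` (here, a new user message) is illustrative.

```ruby
# Compact stand-in for the documented class; signature format is assumed.
class DoomLoopDetector
  def initialize(threshold: 5)
    @threshold = threshold
    @history = []
  end

  def record(tool_name:, arguments:)
    @history << [tool_name, arguments].inspect
    @history.size >= @threshold && @history.last(@threshold).uniq.size == 1
  end

  def reset!
    @history.clear
  end
end

detector = DoomLoopDetector.new(threshold: 2)
call = { tool_name: "list_dir", arguments: { "path" => "." } }

before_reset = [detector.record(**call), detector.record(**call)]
detector.reset! # e.g. the user sent a new message
after_reset = detector.record(**call)
```

After the reset, the same call counts as the first of a new streak, so detection starts over.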