Class: RubynCode::Goal::Evaluator
- Inherits: Object
  - Object
  - RubynCode::Goal::Evaluator
- Defined in: lib/rubyn_code/goal/evaluator.rb
Overview
Judges whether a goal condition has been satisfied.
The evaluator is deliberately conservative: it returns true only when the model is confident the goal is genuinely complete. Any error or ambiguous answer is treated as “not met” so the agent keeps working rather than stopping prematurely.
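The conservative rule described above can be sketched as a small parsing helper. This is a minimal illustration, not the library's implementation: the page references a `verdict_yes?` helper but does not show its body, so the logic below (accept only an exact YES on the first line) is an assumption.

```ruby
# Hypothetical helper: an assumption about how a conservative verdict
# check could look; the real method body is not shown on this page.
# Returns true only when the first line of the reply is exactly YES
# (ignoring case and surrounding whitespace).
def verdict_yes?(text)
  first_line = text.to_s.lines.first.to_s.strip
  first_line.casecmp?('YES')
end

verdict_yes?("YES\nAll tests pass.")  # => true
verdict_yes?("NO\nSpecs still fail.") # => false
verdict_yes?('Probably yes?')         # => false (ambiguous, so "not met")
verdict_yes?(nil)                     # => false (no answer, so "not met")
```

Treating every reply other than a clean YES as false means a malformed or hedged answer keeps the agent working instead of stopping early.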
Constant Summary
- SYSTEM_PROMPT =
<<~PROMPT
  You are a strict completion judge. Given a GOAL and a transcript of an AI
  coding agent's recent work, decide whether the goal is genuinely and fully
  satisfied. Be conservative: if there is any doubt, or the work is only
  partially done, answer NO.
  Answer with exactly one word on the first line: YES or NO.
  Optionally add a short reason on the next line.
PROMPT
- TRANSCRIPT_WINDOW =
Number of trailing conversation messages to show the judge.
12
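As a rough illustration of how a trailing window of 12 messages might feed the judge, the sketch below builds a prompt from the most recent messages only. The `build_prompt` name, the `{ role:, content: }` message shape, and the prompt layout are all assumptions; the page does not show the private `prompt` helper.

```ruby
# Hypothetical prompt builder. The window default mirrors TRANSCRIPT_WINDOW (12);
# the message shape and the output layout are assumptions for illustration.
def build_prompt(condition, messages, window: 12)
  recent = Array(messages).last(window) # keep only the trailing messages
  transcript = recent.map { |m| "#{m[:role]}: #{m[:content]}" }.join("\n")
  "GOAL:\n#{condition}\n\nTRANSCRIPT:\n#{transcript}"
end

messages = (1..20).map { |i| { role: 'assistant', content: "step #{i}" } }
prompt = build_prompt('All specs pass', messages)
prompt.include?('step 9') # => true  (oldest message inside the window)
prompt.include?('step 8') # => false (dropped, outside the window)
```

Capping the transcript keeps the judge's input small and focused on the agent's latest work.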
Instance Method Summary
- #call(condition:, conversation: nil) ⇒ Boolean
  True only when the goal is confidently complete.
- #initialize(llm_client:) ⇒ Evaluator (constructor)
  A new instance of Evaluator.
Constructor Details
#initialize(llm_client:) ⇒ Evaluator
Returns a new instance of Evaluator.
# File 'lib/rubyn_code/goal/evaluator.rb', line 28
def initialize(llm_client:)
  @llm_client = llm_client
end
Instance Method Details
#call(condition:, conversation: nil) ⇒ Boolean
Returns true only when the goal is confidently complete.
# File 'lib/rubyn_code/goal/evaluator.rb', line 35
def call(condition:, conversation: nil)
  response = @llm_client.chat(
    messages: [{ role: 'user', content: prompt(condition, conversation) }],
    system: SYSTEM_PROMPT
  )
  verdict_yes?(answer_text(response))
rescue StandardError => e
  RubynCode::Debug.warn("Goal evaluation failed: #{e.message}")
  false
end
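To show the fail-closed behavior of #call without the real gem, here is a self-contained sketch. `SketchEvaluator`, `FixedReplyClient`, and `FailingClient` are hypothetical stand-ins, and the sketch assumes the client's `chat` returns plain text; the real client's return type and the private helpers are not shown on this page.

```ruby
# Stand-in that mirrors the shape of the documented #call. Prompt building
# and answer extraction are simplified assumptions, and the transcript
# is omitted for brevity.
class SketchEvaluator
  def initialize(llm_client:)
    @llm_client = llm_client
  end

  def call(condition:, conversation: nil)
    response = @llm_client.chat(
      messages: [{ role: 'user', content: "GOAL: #{condition}" }],
      system: 'Answer YES or NO on the first line.'
    )
    # Only an exact YES on the first line counts as "met".
    response.to_s.lines.first.to_s.strip.casecmp?('YES')
  rescue StandardError
    false # any failure is treated as "not met"
  end
end

# Hypothetical stub client that always returns the same reply text.
class FixedReplyClient
  def initialize(reply)
    @reply = reply
  end

  def chat(messages:, system:)
    @reply
  end
end

# Hypothetical stub client that simulates a transport failure.
class FailingClient
  def chat(messages:, system:)
    raise IOError, 'network down'
  end
end

SketchEvaluator.new(llm_client: FixedReplyClient.new("YES\nDone.")).call(condition: 'Specs pass') # => true
SketchEvaluator.new(llm_client: FixedReplyClient.new('NO')).call(condition: 'Specs pass')         # => false
SketchEvaluator.new(llm_client: FailingClient.new).call(condition: 'Specs pass')                  # => false
```

Because the rescue clause returns false, a flaky network or malformed response can never be mistaken for a completed goal.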