Class: Ace::Test::EndToEndRunner::Atoms::TcFidelityValidator

Inherits:
Object
Defined in:
lib/ace/test/end_to_end_runner/atoms/tc_fidelity_validator.rb

Overview

Validates that the test cases an agent reports match the scenario’s expected TCs.

Detects when an agent invents its own test cases instead of executing the defined standalone TC files. The check compares the number of reported test cases with the number of expected TC IDs and returns an error hash when the two differ.

Class Method Details

.validate(parsed, scenario, filtered_tc_ids: nil) ⇒ Hash?

Validates the parsed result against the expected test case count.

Parameters:

  • parsed (Hash)

    Parsed result from SkillResultParser (:test_cases, :status, etc.)

  • scenario (Models::TestScenario)

    The scenario with expected TCs

  • filtered_tc_ids (Array<String>, nil) (defaults to: nil)

    TC IDs to validate against when only a subset was requested; when given, this list replaces the scenario’s full TC ID list

Returns:

  • (Hash, nil)

    Error info hash if validation fails, nil if valid



# File 'lib/ace/test/end_to_end_runner/atoms/tc_fidelity_validator.rb', line 18

def self.validate(parsed, scenario, filtered_tc_ids: nil)
  # A requested subset takes precedence over the scenario's full TC list.
  expected_ids = filtered_tc_ids || scenario.test_case_ids
  # Nothing to compare against, so the result is treated as valid.
  return nil if expected_ids.empty?

  # A missing :test_cases key counts as zero reported test cases.
  reported_count = parsed[:test_cases]&.size || 0
  expected_count = expected_ids.size

  # Only the counts are compared, not the individual TC IDs.
  return nil if reported_count == expected_count

  {
    error: "TC fidelity mismatch: agent reported #{reported_count} test cases " \
           "but scenario has #{expected_count} (#{expected_ids.join(", ")})",
    expected_count: expected_count,
    reported_count: reported_count,
    expected_ids: expected_ids
  }
end
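
The behaviour of the listing above can be exercised with a small standalone script. This is a minimal sketch under stated assumptions, not code from the library: `ScenarioStub` is a hypothetical stand-in for Models::TestScenario, `FidelityCheck` is a local copy of the `.validate` logic so the script runs without the gem loaded, and the TC IDs and the shape of each `:test_cases` entry are invented for illustration.

```ruby
# Hypothetical stand-in for Models::TestScenario; only #test_case_ids is used.
ScenarioStub = Struct.new(:test_case_ids)

# Local copy of the documented logic so the sketch runs on its own.
module FidelityCheck
  def self.validate(parsed, scenario, filtered_tc_ids: nil)
    expected_ids = filtered_tc_ids || scenario.test_case_ids
    return nil if expected_ids.empty?

    reported_count = parsed[:test_cases]&.size || 0
    return nil if reported_count == expected_ids.size

    {
      error: "TC fidelity mismatch: agent reported #{reported_count} test cases " \
             "but scenario has #{expected_ids.size} (#{expected_ids.join(", ")})",
      expected_count: expected_ids.size,
      reported_count: reported_count,
      expected_ids: expected_ids
    }
  end
end

# Assumed ID format; the real scenario files may name TCs differently.
scenario = ScenarioStub.new(%w[TC-001 TC-002 TC-003])

# Two reported test cases against three expected ones yields an error hash.
parsed = { status: "pass", test_cases: [{ id: "TC-001" }, { id: "TC-002" }] }
mismatch = FidelityCheck.validate(parsed, scenario)
puts mismatch[:error] if mismatch

# With a filter, the expected count is the size of the filter, not the scenario.
subset_ok = FidelityCheck.validate(parsed, scenario, filtered_tc_ids: %w[TC-001 TC-002])
puts "subset run is valid" if subset_ok.nil?

# A result without :test_cases is treated as zero reported test cases.
empty_report = FidelityCheck.validate({ status: "error" }, scenario)

# Only counts are compared: three reported entries pass whatever their IDs are.
invented = { test_cases: [{ id: "X-1" }, { id: "X-2" }, { id: "X-3" }] }
same_count = FidelityCheck.validate(invented, scenario)

# A scenario with no expected TCs is always considered valid.
no_expectations = FidelityCheck.validate(parsed, ScenarioStub.new([]))
```

Because the comparison is count-based, a caller that needs to confirm the reported IDs themselves would have to add that check separately.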