Module: RobotLab::Convergence

Defined in:
lib/robot_lab/convergence.rb

Overview

Term-frequency cosine similarity utilities for detecting semantic convergence between two texts.

Common use cases:

  • Checking whether two independent verifiers have reached the same conclusion

  • Skipping a reconciler LLM call when verifiers already agree (fast-path)

  • Detecting when a multi-robot debate has converged on a consensus

Requires the optional ‘classifier’ gem (~> 2.3).
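
Because the gem is optional, calling code may want to probe for it before relying on convergence checks. The following is a minimal sketch of such a probe; classifier_available? is a hypothetical helper and not part of RobotLab, and the Gemfile line in the comment simply restates the version constraint given above.

```ruby
# Hypothetical helper (not part of RobotLab): reports whether the optional
# classifier gem can be loaded in the current environment.
#
# A Gemfile entry matching the documented constraint would look like:
#   gem "classifier", "~> 2.3"
def classifier_available?
  require "classifier"
  true
rescue LoadError
  # The gem is not installed (or not on the load path).
  false
end

puts "classifier gem loadable: #{classifier_available?}"
```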

Examples:

Skip reconciler when verifiers agree

score = RobotLab::Convergence.similarity(result_a.reply, result_b.reply)

router = ->(args) do
  a = args.context[:verifier_a]&.reply.to_s
  b = args.context[:verifier_b]&.reply.to_s
  RobotLab::Convergence.detected?(a, b) ? nil : ["reconciler"]
end
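
The router's fallback behavior can be exercised without RobotLab. The sketch below uses hypothetical stand-ins throughout: reply_struct and args_struct mimic the objects the router receives, and detected is a trivial exact-match lambda standing in for RobotLab::Convergence.detected?. It shows that a missing verifier result becomes an empty string (nil.to_s), so the router falls back to the reconciler.

```ruby
# Hypothetical stand-ins for the objects the router receives.
reply_struct = Struct.new(:reply)   # a verifier result exposing #reply
args_struct  = Struct.new(:context) # router arguments exposing #context

# Stand-in for RobotLab::Convergence.detected?: exact match after trimming.
# The real method scores similarity; this only illustrates the control flow.
detected = ->(a, b) { !a.strip.empty? && a.strip == b.strip }

router = lambda do |args|
  # &. yields nil when the verifier is absent; nil.to_s then gives "".
  a = args.context[:verifier_a]&.reply.to_s
  b = args.context[:verifier_b]&.reply.to_s
  detected.call(a, b) ? nil : ["reconciler"]
end

agree = args_struct.new(
  { verifier_a: reply_struct.new("ship it"), verifier_b: reply_struct.new("ship it") }
)
missing = args_struct.new({ verifier_a: reply_struct.new("ship it") })

agree_route   = router.call(agree)   # fast path: no reconciler requested
missing_route = router.call(missing) # one verifier absent: reconcile

p agree_route
p missing_route
```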

Constant Summary

DEFAULT_THRESHOLD =

Default cosine similarity threshold at or above which texts are considered convergent.

0.85

MIN_TEXT_LENGTH =

Minimum text length (characters) for meaningful term-frequency scoring. If either text is shorter than this, similarity returns 0.0.

30

Class Method Summary

Class Method Details

.detected?(text_a, text_b, threshold: DEFAULT_THRESHOLD) ⇒ Boolean

Determine whether two texts are semantically convergent.

Parameters:

  • text_a (String)
  • text_b (String)
  • threshold (Float) (defaults to: DEFAULT_THRESHOLD)

    minimum similarity to declare convergence (default 0.85)

Returns:

  • (Boolean)

Raises:

  • (DependencyError)

    if the ‘classifier’ gem is not installed

  • (ArgumentError)

    if threshold is outside [0.0, 1.0]



# File 'lib/robot_lab/convergence.rb', line 38

def self.detected?(text_a, text_b, threshold: DEFAULT_THRESHOLD)
  unless (0.0..1.0).cover?(threshold)
    raise ArgumentError, "threshold must be in [0.0, 1.0], got #{threshold}"
  end

  similarity(text_a, text_b) >= threshold
end
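
The range guard in the listing above also rejects values that are not numbers at all, because Range#cover? returns false when the comparison is undefined. A small sketch of that behavior follows; valid_threshold? is a hypothetical helper that only mirrors the guard expression.

```ruby
# Hypothetical helper mirroring the guard used by detected?.
def valid_threshold?(threshold)
  (0.0..1.0).cover?(threshold)
end

puts valid_threshold?(0.85)       # inside the range
puts valid_threshold?(1)          # integers compare against float bounds
puts valid_threshold?(1.01)       # just above the upper bound
puts valid_threshold?(nil)        # not comparable with Float
puts valid_threshold?(Float::NAN) # NaN is not comparable either
```

Since the final comparison is `>=`, a similarity exactly equal to the threshold counts as convergent.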

.similarity(text_a, text_b) ⇒ Float

Compute cosine similarity between two texts using stemmed term frequencies.

Uses String#word_hash (provided by the classifier gem) to build stemmed, stopword-filtered term-frequency vectors, then computes L2-normalized cosine similarity. Term frequencies (no IDF) are used because IDF on a 2-document corpus collapses shared terms to zero, which would incorrectly penalize texts that agree on the same topic.
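
The IDF collapse mentioned above can be checked with a few lines of arithmetic. This sketch assumes the classic unsmoothed formula idf = log(N / df); smoothed variants behave differently, and the idf helper is purely illustrative.

```ruby
# Unsmoothed IDF: log(N / df), where N is the number of documents in the
# corpus and df is the number of documents that contain the term.
def idf(corpus_size, doc_frequency)
  Math.log(corpus_size.to_f / doc_frequency)
end

shared_term_idf = idf(2, 2) # term appears in both texts: log(1)
unique_term_idf = idf(2, 1) # term appears in only one text: log(2)

puts shared_term_idf
puts unique_term_idf
```

With only two documents, every term the texts have in common gets weight zero, so the dot product of the two weighted vectors is zero no matter how strongly the texts agree.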

Returns 0.0 when either text is blank or shorter than MIN_TEXT_LENGTH.
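
The computation described above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the library's implementation: term_frequencies is a simplified stand-in for String#word_hash (lowercased word counts with no stemming and no stopword filtering), and tf_cosine and its min_length parameter are hypothetical names.

```ruby
# Simplified stand-in for String#word_hash: lowercase word counts only.
# No stemming or stopword filtering is applied here.
def term_frequencies(text)
  text.downcase.scan(/[a-z0-9]+/).tally
end

# Cosine similarity of two term-frequency vectors, with the same short-text
# guard the documentation describes (min_length mirrors MIN_TEXT_LENGTH).
def tf_cosine(text_a, text_b, min_length: 30)
  a = text_a.to_s.strip
  b = text_b.to_s.strip
  return 0.0 if a.length < min_length || b.length < min_length

  tf_a = term_frequencies(a)
  tf_b = term_frequencies(b)

  # Dot product over the terms of a; terms missing from b contribute 0.
  dot = tf_a.sum { |term, count| count * tf_b.fetch(term, 0) }
  norm_a = Math.sqrt(tf_a.values.sum { |c| c * c })
  norm_b = Math.sqrt(tf_b.values.sum { |c| c * c })
  return 0.0 if norm_a.zero? || norm_b.zero?

  # Clamp to absorb floating-point drift slightly above 1.0.
  (dot / (norm_a * norm_b)).clamp(0.0, 1.0)
end

sentence = "The cache must be invalidated on write"
puts tf_cosine(sentence, sentence)
puts tf_cosine("alpha beta gamma delta epsilon zeta eta",
               "one two three four five six seven eight")
puts tf_cosine("yes", "yes") # both texts are below the length guard
```

Counts are non-negative, so the score stays within [0.0, 1.0]: identical texts score approximately 1.0 and texts with no shared terms score 0.0.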

Parameters:

  • text_a (String)
  • text_b (String)

Returns:

  • (Float)

    in [0.0, 1.0]

Raises:

  • (DependencyError)

    if the ‘classifier’ gem is not installed



# File 'lib/robot_lab/convergence.rb', line 60

def self.similarity(text_a, text_b)
  a = text_a.to_s.strip
  b = text_b.to_s.strip

  return 0.0 if a.length < MIN_TEXT_LENGTH || b.length < MIN_TEXT_LENGTH

  TextAnalysis.tf_cosine_similarity(a, b)
end