Class: Legion::Extensions::Agentic::Social::Conscience::Helpers::MoralEvaluator

Inherits: Object
Defined in:
lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize ⇒ MoralEvaluator

Returns a new instance of MoralEvaluator.



# File 'lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb', line 12

def initialize
  @sensitivities = Constants::MORAL_FOUNDATIONS.keys.to_h do |foundation|
    [foundation, Constants::INITIAL_SENSITIVITY]
  end
end
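The constructor seeds every foundation at the same baseline sensitivity. A minimal standalone sketch of that construction, using assumed foundation keys and an assumed initial value in place of the real `Constants`:

```ruby
# Assumed stand-ins for Constants::MORAL_FOUNDATIONS and
# Constants::INITIAL_SENSITIVITY -- illustrative values only.
MORAL_FOUNDATIONS   = { care: {}, fairness: {}, loyalty: {} }
INITIAL_SENSITIVITY = 0.5

# Same construction as #initialize: every foundation starts at the
# same baseline sensitivity.
sensitivities = MORAL_FOUNDATIONS.keys.to_h do |foundation|
  [foundation, INITIAL_SENSITIVITY]
end
```

`Array#to_h` with a block (Ruby 2.6+) maps each foundation key to a `[key, value]` pair, so the result is one hash entry per foundation.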

Instance Attribute Details

#sensitivities ⇒ Object (readonly)

Returns the value of attribute sensitivities.



# File 'lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb', line 10

def sensitivities
  @sensitivities
end

Instance Method Details

#detect_dilemma(scores) ⇒ Object

Detects a dilemma when foundations strongly disagree with one another. Returns nil when there is no dilemma, otherwise a hash describing the conflict type and the disagreeing foundations.



# File 'lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb', line 61

def detect_dilemma(scores)
  pos_foundations = scores.select { |_, v| v > Constants::CONFLICT_THRESHOLD }
  neg_foundations = scores.select { |_, v| v < -Constants::CONFLICT_THRESHOLD }

  return nil if pos_foundations.empty? || neg_foundations.empty?

  dilemma_type = classify_dilemma(pos_foundations.keys, neg_foundations.keys)

  {
    type:            dilemma_type,
    approving:       pos_foundations.keys,
    opposing:        neg_foundations.keys,
    tension:         (pos_foundations.values.sum / pos_foundations.size.to_f).round(4),
    counter_tension: (neg_foundations.values.sum / neg_foundations.size.to_f).abs.round(4),
    detected_at:     Time.now.utc
  }
end
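A standalone sketch of the conflict-detection step (the `classify_dilemma` call is omitted, since it is not shown on this page), with an assumed threshold value:

```ruby
CONFLICT_THRESHOLD = 0.3 # assumed value; the real one lives in Constants

# Standalone version of the conflict detection: a dilemma exists only
# when at least one foundation strongly approves AND at least one
# strongly opposes.
def detect_dilemma(scores)
  pos = scores.select { |_, v| v > CONFLICT_THRESHOLD }
  neg = scores.select { |_, v| v < -CONFLICT_THRESHOLD }
  return nil if pos.empty? || neg.empty?

  {
    approving:       pos.keys,
    opposing:        neg.keys,
    tension:         (pos.values.sum / pos.size.to_f).round(4),
    counter_tension: (neg.values.sum / neg.size.to_f).abs.round(4)
  }
end

dilemma = detect_dilemma({ care: 0.8, fairness: 0.5, loyalty: -0.6 })
# approving: [:care, :fairness], opposing: [:loyalty]
# tension: 0.65, counter_tension: 0.6
```

Note that `tension` and `counter_tension` are averages over only the strongly scoring foundations, so a single loud dissenter registers as strongly as several quiet ones.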

#evaluate(action:, context:) ⇒ Object

Evaluates a proposed action against all six moral foundations. Returns a hash with per-foundation scores, the weighted score, the verdict, and any dilemma information.



# File 'lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb', line 20

def evaluate(action:, context:)
  scores = per_foundation_scores(action, context)
  w_score = weighted_score(scores)
  v = verdict(w_score)
  dilemma = detect_dilemma(scores)

  {
    action:         action,
    scores:         scores,
    weighted_score: w_score.round(4),
    verdict:        v,
    dilemma:        dilemma,
    sensitivities:  @sensitivities.transform_values { |s| s.round(4) },
    evaluated_at:   Time.now.utc
  }
end
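Putting the pieces together, a caller receives a hash shaped like the one below. `per_foundation_scores` is not shown on this page, so a stubbed average stands in for the real weighting, and the simplified verdict rule here is an assumption:

```ruby
# Stubbed pipeline mirroring #evaluate's flow: score -> weight -> verdict.
def evaluate_sketch(action, scores)
  weighted = (scores.values.sum / scores.size.to_f).round(4) # stand-in weighting
  {
    action:         action,
    scores:         scores,
    weighted_score: weighted,
    verdict:        weighted.negative? ? :cautioned : :permitted, # simplified
    evaluated_at:   Time.now.utc
  }
end

result = evaluate_sketch('share_user_data', { care: -0.2, fairness: 0.6 })
result[:weighted_score] # => 0.2
result[:verdict]        # => :permitted
```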

#update_sensitivity(foundation, outcome) ⇒ Object

Feedback loop: updates the sensitivity for a foundation based on an observed outcome. outcome is a float in [-1.0, 1.0], where a positive value means the action proved morally good in retrospect.



# File 'lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb', line 81

def update_sensitivity(foundation, outcome)
  return unless @sensitivities.key?(foundation)

  current = @sensitivities[foundation]
  @sensitivities[foundation] = ema(current, outcome.abs.clamp(0.0, 1.0), Constants::FOUNDATION_ALPHA)
end
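The `ema` helper is not shown on this page; assuming it is a standard exponential moving average, the update behaves like this sketch (the `FOUNDATION_ALPHA` value is also assumed):

```ruby
FOUNDATION_ALPHA = 0.2 # assumed smoothing factor

# Assumed definition of the ema helper: a standard exponential moving
# average that blends the new signal into the running value.
def ema(current, signal, alpha)
  ((1 - alpha) * current) + (alpha * signal)
end

sensitivity = 0.5
outcome     = 1.0 # strongly positive outcome in retrospect

# Mirror #update_sensitivity: only the magnitude of the outcome matters.
sensitivity = ema(sensitivity, outcome.abs.clamp(0.0, 1.0), FOUNDATION_ALPHA)
sensitivity.round(4) # 0.8 * 0.5 + 0.2 * 1.0 = 0.6
```

Because the update uses `outcome.abs`, morally charged outcomes in either direction raise a foundation's sensitivity, while outcomes near zero gradually decay it.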

#verdict(score) ⇒ Object

Determines the overall moral verdict from a weighted score.



# File 'lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb', line 49

def verdict(score)
  if score <= Constants::PROHIBITION_THRESHOLD
    :prohibited
  elsif score < Constants::CAUTION_THRESHOLD
    :cautioned
  else
    :permitted
  end
end
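The verdict is a three-way bucketing of the weighted score. A standalone sketch with assumed threshold values (the real ones live in `Constants`):

```ruby
# Assumed thresholds for illustration only.
PROHIBITION_THRESHOLD = -0.5
CAUTION_THRESHOLD     = 0.2

# Scores at or below the prohibition threshold are blocked; scores below
# the caution threshold are flagged; everything else is allowed.
def verdict(score)
  if score <= PROHIBITION_THRESHOLD
    :prohibited
  elsif score < CAUTION_THRESHOLD
    :cautioned
  else
    :permitted
  end
end

verdict(-0.8) # => :prohibited
verdict(0.0)  # => :cautioned
verdict(0.7)  # => :permitted
```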

#weighted_score(scores) ⇒ Object

Computes the weighted sum of per-foundation score × weight × sensitivity, clamped to the allowed moral score range.



# File 'lib/legion/extensions/agentic/social/conscience/helpers/moral_evaluator.rb', line 38

def weighted_score(scores)
  total = 0.0
  Constants::MORAL_FOUNDATIONS.each do |foundation, config|
    score = scores[foundation] || 0.0
    sensitivity = @sensitivities[foundation]
    total += score * config[:weight] * sensitivity
  end
  total.clamp(Constants::MORAL_SCORE_RANGE[:min], Constants::MORAL_SCORE_RANGE[:max])
end