Class: Engram::Extractors::LLMExtractor

Inherits:
Object
Includes:
Ports::Extractor
Defined in:
lib/engram/extractors/llm_extractor.rb

Overview

Derives durable, user-specific facts from a conversation turn via an LLM.

Constant Summary

SYSTEM =
<<~PROMPT
  You extract durable, user-specific facts worth remembering across future sessions.
  Rules:
  - Only stable facts about the user (preferences, attributes, decisions, history).
  - Ignore ephemeral chit-chat, questions, and the assistant's own messages.
  - Normalize each fact to a terse third-person statement (e.g. "User is on the Pro plan").
  - Set confidence in [0,1]; importance in [0,1].
  Return an empty list if there is nothing worth remembering.
PROMPT
SCHEMA =
{
  type: "object",
  properties: {
    facts: {
      type: "array",
      items: {
        type: "object",
        properties: {
          content: {type: "string"},
          kind: {type: "string", enum: %w[semantic episodic preference]},
          importance: {type: "number"},
          confidence: {type: "number"}
        },
        required: %w[content]
      }
    }
  },
  required: %w[facts]
}.freeze
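
As an illustration of the shape SCHEMA describes, the sketch below builds a sample payload and checks it with a small hand-rolled predicate. The sample values, `ALLOWED_KINDS`, and `conforms_to_fact_schema?` are hypothetical names introduced here; the check is a simplification, not a full JSON Schema validator and not part of Engram.

```ruby
# Hypothetical payload, as it might look after JSON parsing (string keys).
sample = {
  "facts" => [
    {"content" => "User is on the Pro plan", "kind" => "semantic",
     "importance" => 0.8, "confidence" => 0.9},
    # Only "content" is required per fact; the other keys may be absent.
    {"content" => "User prefers dark mode", "kind" => "preference"}
  ]
}

# Mirrors the enum on "kind" in SCHEMA.
ALLOWED_KINDS = %w[semantic episodic preference].freeze

# Simplified structural check (assumption: illustrative only).
def conforms_to_fact_schema?(payload)
  facts = payload["facts"]
  return false unless facts.is_a?(Array)

  facts.all? do |fact|
    next false unless fact.is_a?(Hash) && fact["content"].is_a?(String)
    next false unless fact["kind"].nil? || ALLOWED_KINDS.include?(fact["kind"])

    %w[importance confidence].all? { |key| fact[key].nil? || fact[key].is_a?(Numeric) }
  end
end

puts conforms_to_fact_schema?(sample)
```

Note that SCHEMA itself only types importance and confidence as numbers; the [0,1] range is requested in the SYSTEM prompt, not enforced by the schema.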

Instance Method Summary

Constructor Details

#initialize(completion:, embedder:, min_confidence: 0.5) ⇒ LLMExtractor

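Returns a new instance of LLMExtractor. The completion object is called as complete(system:, user:, schema:) and the embedder as embed(content); facts whose confidence falls below min_confidence are discarded during extraction.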



# File 'lib/engram/extractors/llm_extractor.rb', line 39

def initialize(completion:, embedder:, min_confidence: 0.5)
  @completion = completion
  @embedder = embedder
  @min_confidence = min_confidence
end
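
The constructor only stores its collaborators, so any objects that respond to the calls made in #extract should work. A minimal sketch of two test doubles follows; `CannedCompletion` and `LengthEmbedder` are hypothetical names, and the return shape of `complete` is an assumption, since the private helper that unpacks it is not shown on this page.

```ruby
# Hypothetical stand-in for a completion port: returns a canned response
# and ignores the prompt. The response shape is an assumption.
class CannedCompletion
  def initialize(response)
    @response = response
  end

  # Same keyword arguments that #extract passes.
  def complete(system:, user:, schema:)
    @response
  end
end

# Hypothetical stand-in for an embedder: a one-element placeholder vector.
class LengthEmbedder
  def embed(text)
    [text.length.to_f]
  end
end

completion = CannedCompletion.new({"facts" => []})
embedder = LengthEmbedder.new

# With the engram library loaded, these could be wired in as:
#   Engram::Extractors::LLMExtractor.new(
#     completion: completion, embedder: embedder, min_confidence: 0.7
#   )
```

Raising min_confidence above the default of 0.5 makes extraction stricter; lowering it keeps more speculative facts.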

Instance Method Details

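#extract(messages:, scope:) ⇒ Object

Sends the transcript of messages to the completion object together with SYSTEM and SCHEMA, then builds one Engram::Record per returned fact. Facts with blank content or a confidence below min_confidence are dropped. A missing confidence or importance defaults to 1.0, and a missing kind defaults to :semantic. Each record carries an embedding of its content.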



# File 'lib/engram/extractors/llm_extractor.rb', line 45

def extract(messages:, scope:)
  result = @completion.complete(system: SYSTEM, user: transcript(messages), schema: SCHEMA)
  facts(result).filter_map do |fact|
    fact = fact.transform_keys(&:to_s)
    content = fact["content"].to_s.strip
    next if content.empty?
    next if (fact["confidence"] || 1.0).to_f < @min_confidence

    Engram::Record.new(
      content: content,
      scope: scope,
      kind: (fact["kind"] || "semantic").to_sym,
      importance: (fact["importance"] || 1.0).to_f,
      embedding: @embedder.embed(content)
    )
  end
end
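
To make the filtering rules in #extract concrete, the sketch below replays the same steps on hand-written facts without calling an LLM. `SketchRecord`, `SketchEmbedder`, and `build_records` are hypothetical stand-ins introduced for illustration; the real Engram::Record and embedder may differ.

```ruby
# Hypothetical stand-in for Engram::Record (assumption: same keyword fields).
SketchRecord = Struct.new(:content, :scope, :kind, :importance, :embedding,
                          keyword_init: true)

# Hypothetical embedder returning a placeholder vector.
class SketchEmbedder
  def embed(text)
    [text.length.to_f]
  end
end

# Replays the per-fact logic of #extract on an already-parsed fact list.
def build_records(raw_facts, scope:, embedder:, min_confidence: 0.5)
  raw_facts.filter_map do |fact|
    fact = fact.transform_keys(&:to_s)      # accept symbol or string keys
    content = fact["content"].to_s.strip
    next if content.empty?                  # drop blank content
    next if (fact["confidence"] || 1.0).to_f < min_confidence

    SketchRecord.new(
      content: content,
      scope: scope,
      kind: (fact["kind"] || "semantic").to_sym,
      importance: (fact["importance"] || 1.0).to_f,
      embedding: embedder.embed(content)
    )
  end
end

raw = [
  {content: "User is on the Pro plan", confidence: 0.9, importance: 0.8},
  {content: "   ", confidence: 0.9},                  # blank after strip
  {content: "User might like jazz", confidence: 0.2}, # below threshold
  {"content" => "User prefers dark mode", "kind" => "preference"}
]

records = build_records(raw, scope: "user:42", embedder: SketchEmbedder.new)
puts records.map(&:content).inspect
# => ["User is on the Pro plan", "User prefers dark mode"]
```

The last fact has no confidence, so it is treated as 1.0 and kept; it also receives the default importance of 1.0.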