Class: Rubino::Memory::Backends::Sqlite

Inherits:
Rubino::Memory::Backend
Includes:
SqliteGraph
Defined in:
lib/rubino/memory/backends/sqlite.rb

Overview

“Tiny-Zep” memory backend on embedded SQLite (Zep/Graphiti-inspired, minus the graph DB, the server, and the six-LLM-call pipeline).

Three ideas are kept from Zep:

* ATOMIC LLM-extracted facts (one declarative fact per row), via a
  single aux-LLM call per turn that both ADDs new facts and SUPERSEDES
  contradicted ones (Graphiti edge-invalidation, collapsed to 1 call).
* BI-TEMPORAL supersession — a contradicted fact is soft-retired
  (valid_to set), not deleted; "live" memory = valid_to IS NULL, so we
  get temporal correctness without losing provenance.
* HYBRID ranked recall — FTS5/BM25 (+ optional vector KNN) fused with
  Reciprocal Rank Fusion and lightly kind-weighted, top-k under the
  char budget. Graph (1-hop) and recency are tail SUPPLEMENTS that only
  backfill the budget after direct content matches — never outranking
  them. (Optional vector KNN via sqlite-vec when available; see #vector?.)
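The supersession idea above can be sketched in plain Ruby with an in-memory array standing in for the facts table. This is a minimal illustration, not the backend's implementation: `FactLog`, `Fact`, and the `superseded_by` field are hypothetical names, and only `valid_from` / `valid_to` come from the description above.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for the facts table.
Fact = Struct.new(:id, :text, :valid_from, :valid_to, :superseded_by, keyword_init: true)

class FactLog
  def initialize
    @rows = []
  end

  # Insert a new live fact (valid_to stays nil while the fact is current).
  def add(text, now: Time.now)
    fact = Fact.new(id: SecureRandom.uuid, text: text, valid_from: now)
    @rows << fact
    fact
  end

  # Soft-retire a live fact and record which fact replaced it.
  # Returns nil when the id is unknown or already retired.
  def supersede(old_id, new_text, now: Time.now)
    old = @rows.find { |f| f.id == old_id && f.valid_to.nil? }
    return nil unless old

    replacement = add(new_text, now: now)
    old.valid_to = now
    old.superseded_by = replacement.id
    replacement
  end

  # "Live" memory: every row whose valid_to is still nil.
  def live
    @rows.select { |f| f.valid_to.nil? }
  end

  # Full history, including retired rows (provenance is kept).
  def history
    @rows.dup
  end
end

log = FactLog.new
old = log.add("User uses npm")
log.supersede(old.id, "User uses pnpm")
puts log.live.map(&:text).inspect
puts log.history.size
```

The retired row is never deleted, so a reader can still answer "what did we believe before?" while recall only ever sees live rows.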

The injection-defense floor (ThreatScanner + char-budget) is enforced on the write path exactly as Memory::Store does, so no fact can splice tainted or over-budget content into a future system prompt.

Constant Summary

TABLE =
:memory_facts
FTS =
:memory_facts_fts
RRF_K =
60
DEFAULT_K =
20
FTS_WEIGHT =

Weighted-RRF list weights for the DIRECT relevance signals (FTS/BM25 and vector KNN). Graph (1-hop) and recency are no longer fused here — they are tail supplements (see #rank) so they can never outrank a direct content match.

3.0
VECTOR_WEIGHT =
3.0
STOPWORDS =

Trivial words that appear in almost every fact (“user”, “project”) or carry no retrieval signal — excluded from the FTS MATCH so a probe like “what package manager does the user use” doesn’t match every “User …” fact on the word “user”.

%w[
  the a an of to in on at for and or is are was were be been being do does did
  how what where when which who whom whose why this that these those it its
  use uses used user users project projects right now
].to_set.freeze
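As a rough sketch of how such a stopword list might be applied before building an FTS5 MATCH expression: the helper names `match_terms` and `fts_match`, the tokenizer regex, and the choice of joining quoted terms with `OR` are all assumptions for illustration, not the backend's actual query builder.

```ruby
require "set"

STOPWORDS = %w[
  the a an of to in on at for and or is are was were be been being do does did
  how what where when which who whom whose why this that these those it its
  use uses used user users project projects right now
].to_set.freeze

# Hypothetical helper: lowercase, tokenize, and drop stopwords and duplicates.
def match_terms(probe)
  probe.downcase.scan(/[a-z0-9_]+/).reject { |t| STOPWORDS.include?(t) }.uniq
end

# Hypothetical helper: build an FTS5 MATCH string from the surviving terms.
# Returns nil when nothing is left, so the caller can skip the FTS query.
def fts_match(probe)
  terms = match_terms(probe)
  return nil if terms.empty?

  terms.map { |t| %("#{t}") }.join(" OR ")
end

puts fts_match("what package manager does the user use")
# prints: "package" OR "manager"
p fts_match("what does the user use")
# prints: nil
```

Quoting each term keeps it a plain FTS5 string rather than something the query parser could read as an operator.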
USER_KIND =

Maps the backend’s fact `kind` onto Memory::Store’s budget group so a user_profile fact is metered against the user budget and everything else against the shared memory budget, the same split as the default backend.

"user_profile"
KIND_WEIGHT =

Light kind weighting applied after RRF so durable user facts outrank one-off facts on ties.

Hash.new(1.0).merge(
  "user_profile" => 1.3,
  "preference" => 1.2,
  "env" => 1.1
).freeze
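To make the interaction between the list weights and the kind weights concrete, here is a minimal weighted-RRF sketch using the constant values listed above. The `fuse` helper, its arguments, and the 1-based rank convention are assumptions; the real ranking lives in #rank and may differ in detail.

```ruby
RRF_K = 60
FTS_WEIGHT = 3.0
VECTOR_WEIGHT = 3.0
KIND_WEIGHT = Hash.new(1.0).merge(
  "user_profile" => 1.3,
  "preference" => 1.2,
  "env" => 1.1
).freeze

# Hypothetical helper: fts_ids and vector_ids are fact ids ordered best-first;
# kinds maps each id to its fact kind. Returns ids ordered by fused score.
def fuse(fts_ids, vector_ids, kinds)
  scores = Hash.new(0.0)
  [[fts_ids, FTS_WEIGHT], [vector_ids, VECTOR_WEIGHT]].each do |ids, weight|
    ids.each_with_index do |id, index|
      # Weighted reciprocal rank: weight / (k + rank), with rank starting at 1.
      scores[id] += weight / (RRF_K + index + 1)
    end
  end

  # Apply the light kind weighting after fusion, then sort best-first
  # (id is only a deterministic tie-breaker).
  scores.map { |id, score| [id, score * KIND_WEIGHT[kinds[id]]] }
        .sort_by { |id, score| [-score, id] }
        .map(&:first)
end

kinds = { "a" => "note", "b" => "note", "c" => "user_profile", "d" => "preference" }
p fuse(%w[a b c], %w[b d], kinds)
# prints: ["b", "c", "d", "a"]
```

In this example "b" wins because it appears in both lists, while "c" and "d" overtake "a" only through their kind weights, since their fused scores are otherwise nearly equal.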

Constants included from SqliteGraph

SqliteGraph::CO_OCCURS, SqliteGraph::EDGES, SqliteGraph::ENTITIES

Class Method Summary

Instance Method Summary

Methods included from SqliteGraph

#facts_tagged_with, #graph_neighbors, #index_fact_graph, #normalize_entity_name, #resolve_entity, #seed_entities, #supersede_edge, #upsert_edge

Constructor Details

#initialize(config: nil, db: nil, aux_client: nil) ⇒ Sqlite

Returns a new instance of Sqlite.



# File 'lib/rubino/memory/backends/sqlite.rb', line 72

def initialize(config: nil, db: nil, aux_client: nil)
  super(config: config)
  @db = db || Rubino.database.db
  @aux_client = aux_client
end

Class Method Details

.backend_name ⇒ Object



# File 'lib/rubino/memory/backends/sqlite.rb', line 68

def self.backend_name
  "sqlite"
end

Instance Method Details

#available? ⇒ Boolean

FTS5 ships with the sqlite3 gem, so the backend is always available. (Vector mode is a best-effort upgrade gated separately by #vector?.)

Returns:

  • (Boolean)


# File 'lib/rubino/memory/backends/sqlite.rb', line 80

def available?
  true
end

#count ⇒ Object

Count only LIVE facts (valid_to IS NULL) — retired/superseded rows are tombstones the admin surface and #list already hide.



# File 'lib/rubino/memory/backends/sqlite.rb', line 204

def count
  live_dataset.count
end

#delete(id) ⇒ Object



# File 'lib/rubino/memory/backends/sqlite.rb', line 198

def delete(id)
  @db[TABLE].where(Sequel.like(:id, "#{id}%")).delete.positive?
end

#extract(session_id) ⇒ Object

ONE aux-LLM call over the recent turn(s) returns the facts to add and the facts to supersede. Applying the result is pure Ruby: insert the additions (deduplicated and guarded), then retire each superseded row and insert its replacement.



# File 'lib/rubino/memory/backends/sqlite.rb', line 128

def extract(session_id)
  turn = recent_turn_text(session_id)
  return [] if turn.strip.empty?

  result = call_llm(session_id: session_id, turn: turn)
  return [] unless result

  apply(result, session_id)
end

#find(id) ⇒ Object



# File 'lib/rubino/memory/backends/sqlite.rb', line 193

def find(id)
  row = @db[TABLE].where(Sequel.like(:id, "#{id}%")).first
  row && present(row)
end

#forget(kind:, old_text:) ⇒ Object

Hard-delete the first LIVE fact of `kind` whose text includes `old_text` (forget = remove from the record entirely, vs supersede).



# File 'lib/rubino/memory/backends/sqlite.rb', line 116

def forget(kind:, old_text:)
  target = live_dataset.where(kind: normalize_kind(kind))
                       .where(Sequel.like(:text, "%#{old_text}%")).first
  return nil unless target

  @db[TABLE].where(id: target[:id]).delete
  target
end

#list(kind: nil, limit: 20, include_retired: false) ⇒ Object

LIVE facts only by default. A superseded fact is a tombstone, not a current memory, so listing it undecorated next to its replacement presents contradicted data as true and makes the rows disagree with #count/#retrieve (#82). `include_retired: true` opts into the full supersession history (`rubino memory list --all`).



# File 'lib/rubino/memory/backends/sqlite.rb', line 187

def list(kind: nil, limit: 20, include_retired: false)
  ds = (include_retired ? @db[TABLE] : live_dataset).order(Sequel.desc(:created_at)).limit(limit)
  ds = ds.where(kind: normalize_kind(kind)) if kind
  ds.all.map { |r| present(r) }
end

#project_context ⇒ Object



# File 'lib/rubino/memory/backends/sqlite.rb', line 151

def project_context
  return nil unless @config.dig("memory", "project_context_enabled")

  rows = live_dataset.where(kind: %w[project env]).order(Sequel.desc(:created_at)).limit(10).all
  return nil if rows.empty?

  rows.map { |r| r[:text] }.join("\n")
end

#replace(kind:, old_text:, content:) ⇒ Object

Replace the first LIVE fact of `kind` whose text includes `old_text`. Modelled as a supersession so history is preserved.



# File 'lib/rubino/memory/backends/sqlite.rb', line 99

def replace(kind:, old_text:, content:)
  target = live_dataset.where(kind: normalize_kind(kind))
                       .where(Sequel.like(:text, "%#{old_text}%")).first
  return nil unless target

  # Retire first so the old row's chars free up before the new fact is
  # budget-checked (a same-size replace must always fit).
  new_id = SecureRandom.uuid
  retire!(target[:id], new_id)
  insert_fact(text: content, kind: target[:kind],
              entities: parse_entities(target[:entities_json]),
              source_session_id: target[:source_session_id], id: new_id)
  target
end
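The comment about retiring first can be illustrated with a small budget model. This is a sketch under stated assumptions: `BudgetedFacts`, `OverBudget`, and the raise-on-overflow behaviour are hypothetical, since the listing does not show how insert_fact reports a budget failure.

```ruby
# Hypothetical model: only live rows count against the character budget.
class BudgetedFacts
  class OverBudget < StandardError; end

  def initialize(limit)
    @limit = limit
    @rows = []
  end

  def live_texts
    @rows.reject { |r| r[:retired] }.map { |r| r[:text] }
  end

  def live_chars
    live_texts.sum(&:length)
  end

  # Reject the insert when it would push the live total over the limit.
  def insert(text)
    raise OverBudget, "#{live_chars + text.length} > #{@limit}" if live_chars + text.length > @limit

    row = { text: text, retired: false }
    @rows << row
    row
  end

  def retire(row)
    row[:retired] = true
  end

  # Retire first, so the old row's characters are freed before the check.
  def replace(old_row, text)
    retire(old_row)
    insert(text)
  end
end

store = BudgetedFacts.new(20)
old = store.insert("User prefers vim")     # 16 of 20 characters used
store.replace(old, "User prefers helix")   # fits: the 16 characters are freed first
p store.live_texts
# prints: ["User prefers helix"]
```

Inserting the 18-character replacement before retiring the old row would need 34 characters and fail, which is the situation the ordering avoids. The sketch does not restore the old row if the new fact is rejected for some other reason.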

#retrieve(session_id:, query: nil, k: DEFAULT_K) ⇒ Object

HYBRID recall over LIVE facts: FTS5/BM25 on `query` (and vector KNN when available) fused via RRF and kind-weighted as the direct relevance ranking, then graph/recency-supplemented and greedily packed under the memory char budget. Returns rows shaped like the default backend (kind:, content:, …) so the prompt assembler is unchanged.



# File 'lib/rubino/memory/backends/sqlite.rb', line 165

def retrieve(session_id:, query: nil, k: DEFAULT_K)
  ranked = rank(query: query, k: k)
  budget = @config.memory_char_limit
  selected = []
  total = 0
  ranked.each do |row|
    len = row[:text].to_s.length
    break if budget&.positive? && total + len > budget

    selected << present(row)
    total += len
  end
  selected
end
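The packing loop is easy to exercise in isolation. The sketch below copies its logic into a hypothetical standalone `pack` helper (plain hashes instead of database rows, no `present` call) to show two properties that follow from the code as written: packing stops at the first row that would overflow, and a nil or non-positive budget disables the limit.

```ruby
# Hypothetical standalone version of the greedy packing loop.
# rows must already be ordered best-first.
def pack(rows, budget)
  selected = []
  total = 0
  rows.each do |row|
    len = row[:text].to_s.length
    # Stop (not skip) at the first row that would exceed the budget.
    break if budget&.positive? && total + len > budget

    selected << row
    total += len
  end
  selected
end

rows = [
  { text: "a" * 10 },
  { text: "b" * 30 },
  { text: "c" * 5 }
]

p pack(rows, 20).size   # prints: 1 (the 5-char row is not reached)
p pack(rows, 45).size   # prints: 3
p pack(rows, nil).size  # prints: 3 (no budget configured)
```

Because the loop uses `break`, a short low-ranked fact never jumps ahead of a longer, higher-ranked one that did not fit; the result is always a prefix of the ranking.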

#store(kind:, content:, source_session_id: nil, confidence: 1.0, metadata: {}) ⇒ Object

-- WRITE path --



# File 'lib/rubino/memory/backends/sqlite.rb', line 86

def store(kind:, content:, source_session_id: nil, confidence: 1.0, metadata: {})
  insert_fact(
    text: content,
    kind: normalize_kind(kind),
    entities: Array(metadata[:entities]),
    source_session_id: source_session_id,
    confidence: confidence,
    valid_from: metadata[:valid_from]
  )
end

#user_profile ⇒ Object

-- READ path --



# File 'lib/rubino/memory/backends/sqlite.rb', line 140

def user_profile
  return nil unless @config.dig("memory", "user_profile_enabled")

  rows = live_dataset.where(kind: USER_KIND).order(Sequel.desc(:created_at)).all
  return nil if rows.empty?

  text = rows.map { |r| r[:text] }.join("\n")
  limit = @config.memory_user_char_limit
  text.length > limit ? text[0...limit] : text
end