Class: Rubino::Memory::Backends::Sqlite
- Inherits: Rubino::Memory::Backend (ancestry: Object → Rubino::Memory::Backend → Rubino::Memory::Backends::Sqlite)
- Includes: SqliteGraph
- Defined in: lib/rubino/memory/backends/sqlite.rb
Overview
“Tiny-Zep” memory backend on embedded SQLite (Zep/Graphiti-inspired, minus the graph DB, the server, and the six-LLM-call pipeline).
Three ideas are kept from Zep:
* ATOMIC LLM-extracted facts (one declarative fact per row), via a
single aux-LLM call per turn that both ADDs new facts and SUPERSEDES
contradicted ones (Graphiti edge-invalidation, collapsed to 1 call).
* BI-TEMPORAL supersession — a contradicted fact is soft-retired
(valid_to set), not deleted; "live" memory = valid_to IS NULL, so we
get temporal correctness without losing provenance.
* HYBRID ranked recall — FTS5/BM25 (+ optional vector KNN) fused with
Reciprocal Rank Fusion and lightly kind-weighted, top-k under the
char budget. Graph (1-hop) and recency are tail SUPPLEMENTS that only
backfill the budget after direct content matches — never outranking
them. (Optional vector KNN via sqlite-vec when available; see #vector?.)
The injection-defense floor (ThreatScanner + char-budget) is enforced on the write path exactly as Memory::Store does, so no fact can splice tainted or over-budget content into a future system prompt.
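The hybrid recall bullet above names Reciprocal Rank Fusion without showing the arithmetic. The following is a minimal sketch, assuming the common form in which each list contributes weight / (RRF_K + rank) with 1-based ranks, and assuming the kind weight is applied as a multiplier after fusion. The constant values are copied from the constant summary; `fuse`, its argument shapes, and the id tie-break are hypothetical and not taken from sqlite.rb. The graph and recency tail supplements are not modelled.

```ruby
# Constant values as listed in the constant summary.
RRF_K = 60
FTS_WEIGHT = 3.0
VECTOR_WEIGHT = 3.0
KIND_WEIGHT = Hash.new(1.0).merge(
  "user_profile" => 1.3, "preference" => 1.2, "env" => 1.1
).freeze

# Hypothetical helper (not from sqlite.rb).
# ranked_lists: array of [weight, [fact_id, ...]] pairs, best match first.
# kinds:        hash of fact_id => kind string.
def fuse(ranked_lists, kinds)
  scores = Hash.new(0.0)
  ranked_lists.each do |weight, ids|
    ids.each_with_index do |id, index|
      # Ranks are 1-based, so the top hit contributes weight / (RRF_K + 1).
      scores[id] += weight / (RRF_K + index + 1)
    end
  end
  # Assumption: kind weighting multiplies the fused score.
  weighted = scores.map { |id, score| [id, score * KIND_WEIGHT[kinds[id]]] }
  # Assumption: ties are broken by id to keep the order deterministic.
  weighted.sort_by { |id, score| [-score, id] }.map(&:first)
end

fts_hits    = %w[a b c]
vector_hits = %w[b d]
kinds = { "a" => "fact", "b" => "preference", "c" => "user_profile", "d" => "env" }
order = fuse([[FTS_WEIGHT, fts_hits], [VECTOR_WEIGHT, vector_hits]], kinds)
puts order.inspect
```

In this sketch a fact found by both lists ("b") accumulates two contributions, and the kind multiplier lets a `user_profile` fact at rank 3 pass an unweighted fact at rank 1, which is stronger than pure tie-breaking.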
Constant Summary
- TABLE = :memory_facts
- FTS = :memory_facts_fts
- RRF_K = 60
- DEFAULT_K = 20
- FTS_WEIGHT = 3.0
  Weighted-RRF list weights for the DIRECT relevance signals (FTS/BM25 and vector KNN). Graph (1-hop) and recency are no longer fused here; they are tail supplements (see #rank) so they can never outrank a direct content match.
- VECTOR_WEIGHT = 3.0
- STOPWORDS = %w[ the a an of to in on at for and or is are was were be been being do does did how what where when which who whom whose why this that these those it its use uses used user users project projects right now ].to_set.freeze
  Trivial words that appear in almost every fact ("user", "project") or carry no retrieval signal. They are excluded from the FTS MATCH so a probe like "what package manager does the user use" doesn't match every "User …" fact on the word "user".
- USER_KIND = "user_profile"
  Maps the backend's fact `kind` onto Memory::Store's budget group so a user_profile fact is metered against the user budget and everything else against the shared memory budget, the same split as the default backend.
- KIND_WEIGHT = Hash.new(1.0).merge( "user_profile" => 1.3, "preference" => 1.2, "env" => 1.1 ).freeze
  Light kind weighting applied after RRF so durable user facts outrank one-off facts on ties.
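The STOPWORDS entry above describes filtering a probe before it reaches the FTS MATCH. A minimal sketch of that idea follows. The word list is copied from the constant; `fts_terms` and `match_expression` are hypothetical helpers, and joining the surviving terms as quoted strings with OR is an assumption about how a MATCH expression might be built, not the backend's confirmed behaviour.

```ruby
require "set"

STOPWORDS = %w[
  the a an of to in on at for and or is are was were be been being
  do does did how what where when which who whom whose why
  this that these those it its use uses used user users
  project projects right now
].to_set.freeze

# Hypothetical helper: lowercase the probe, split it into word tokens,
# and drop every token that carries no retrieval signal.
def fts_terms(query)
  query.to_s.downcase.scan(/[a-z0-9_]+/).reject { |w| STOPWORDS.include?(w) }.uniq
end

# Hypothetical helper: quote each term and OR them together.
# Returns nil when nothing survives, so the caller can skip FTS entirely.
def match_expression(terms)
  return nil if terms.empty?

  terms.map { |t| %("#{t}") }.join(" OR ")
end

terms = fts_terms("what package manager does the user use")
puts terms.inspect
puts match_expression(terms)
```

Returning nil for an all-stopword probe is a design choice of the sketch: it avoids sending an empty MATCH string to SQLite.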
Constants included from SqliteGraph
SqliteGraph::CO_OCCURS, SqliteGraph::EDGES, SqliteGraph::ENTITIES
Class Method Summary
- .backend_name ⇒ Object
Instance Method Summary
- #available? ⇒ Boolean
  FTS5 ships with the sqlite3 gem, so the backend is always available.
- #count ⇒ Object
  Count only LIVE facts (valid_to IS NULL); retired/superseded rows are tombstones that the admin surface and #list already hide.
- #delete(id) ⇒ Object
- #extract(session_id) ⇒ Object
  ONE aux-LLM call over the recent turn(s), returning the facts to add and the facts to supersede.
- #find(id) ⇒ Object
- #forget(kind:, old_text:) ⇒ Object
  Hard-delete the first LIVE fact of `kind` whose text includes `old_text` (forget = remove from the record entirely, vs supersede).
- #initialize(config: nil, db: nil, aux_client: nil) ⇒ Sqlite (constructor)
  A new instance of Sqlite.
- #list(kind: nil, limit: 20, include_retired: false) ⇒ Object
  LIVE facts only by default. A superseded fact is a tombstone, not a current memory, so listing it undecorated next to its replacement presents contradicted data as true and makes the rows disagree with #count/#retrieve (#82).
- #project_context ⇒ Object
- #replace(kind:, old_text:, content:) ⇒ Object
  Replace the first LIVE fact of `kind` whose text includes `old_text`.
- #retrieve(session_id:, query: nil, k: DEFAULT_K) ⇒ Object
  HYBRID recall over LIVE facts: FTS5/BM25 on `query` (and vector KNN when available) fused via RRF and kind-weighted as the direct relevance ranking, then graph/recency-supplemented and greedily packed under the memory char budget.
- #store(kind:, content:, source_session_id: nil, confidence: 1.0, metadata: {}) ⇒ Object
  -- WRITE path --
- #user_profile ⇒ Object
  -- READ path --
Methods included from SqliteGraph
#facts_tagged_with, #graph_neighbors, #index_fact_graph, #normalize_entity_name, #resolve_entity, #seed_entities, #supersede_edge, #upsert_edge
Constructor Details
Class Method Details
.backend_name ⇒ Object
    # File 'lib/rubino/memory/backends/sqlite.rb', line 68
    def self.backend_name
      "sqlite"
    end
Instance Method Details
#available? ⇒ Boolean
FTS5 ships with the sqlite3 gem, so the backend is always available. (Vector mode is a best-effort upgrade gated separately by #vector?.)
    # File 'lib/rubino/memory/backends/sqlite.rb', line 80
    def available?
      true
    end
#count ⇒ Object
Count only LIVE facts (valid_to IS NULL) — retired/superseded rows are tombstones the admin surface and #list already hide.
    # File 'lib/rubino/memory/backends/sqlite.rb', line 204
    def count
      live_dataset.count
    end
#delete(id) ⇒ Object
    # File 'lib/rubino/memory/backends/sqlite.rb', line 198
    def delete(id)
      @db[TABLE].where(Sequel.like(:id, "#{id}%")).delete.positive?
    end
#extract(session_id) ⇒ Object
ONE aux-LLM call over the recent turn(s), returning the facts to add and the facts to supersede. Applying the result is pure Ruby: insert the adds (deduped + guarded), then retire each superseded row and insert its replacement.

    # File 'lib/rubino/memory/backends/sqlite.rb', line 128
    def extract(session_id)
      turn = recent_turn_text(session_id)
      return [] if turn.strip.empty?

      result = call_llm(session_id: session_id, turn: turn)
      return [] unless result

      apply(result, session_id)
    end
#find(id) ⇒ Object
    # File 'lib/rubino/memory/backends/sqlite.rb', line 193
    def find(id)
      row = @db[TABLE].where(Sequel.like(:id, "#{id}%")).first
      row && present(row)
    end
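Both #find and #delete match on `Sequel.like(:id, "#{id}%")`, so a caller can pass a short id prefix instead of the full id. The sketch below mimics that behaviour over an in-memory array, assuming plain string ids. `IdRow`, `find_by_prefix`, and `delete_by_prefix` are hypothetical names, and `start_with?` is a stand-in for SQL LIKE: it is case-sensitive and does not treat `%` or `_` as wildcards, whereas SQLite's LIKE does both differently.

```ruby
# Hypothetical in-memory stand-in for a memory_facts row.
IdRow = Struct.new(:id, :text, keyword_init: true)

# Mirrors #find: the first row whose id starts with the prefix, or nil.
def find_by_prefix(rows, prefix)
  rows.find { |row| row.id.start_with?(prefix) }
end

# Mirrors #delete: removes EVERY row whose id starts with the prefix
# and reports whether anything was removed.
def delete_by_prefix(rows, prefix)
  before = rows.size
  rows.reject! { |row| row.id.start_with?(prefix) }
  rows.size < before
end

rows = [
  IdRow.new(id: "a1b2c3", text: "User prefers pnpm"),
  IdRow.new(id: "a1ffee", text: "Project targets Ruby 3.3"),
  IdRow.new(id: "b7d0aa", text: "User works in UTC+1")
]

puts find_by_prefix(rows, "a1b").text
puts delete_by_prefix(rows, "a1") # removes both "a1…" rows
puts rows.map(&:id).inspect
```

The asymmetry is worth keeping in mind: a lookup returns one row, while a delete with an ambiguous prefix removes every match, so a longer prefix is the safer input for deletion.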
#forget(kind:, old_text:) ⇒ Object
Hard-delete the first LIVE fact of `kind` whose text includes `old_text` (forget = remove from the record entirely, vs supersede).

    # File 'lib/rubino/memory/backends/sqlite.rb', line 116
    def forget(kind:, old_text:)
      target = live_dataset.where(kind: normalize_kind(kind))
                           .where(Sequel.like(:text, "%#{old_text}%")).first
      return nil unless target

      @db[TABLE].where(id: target[:id]).delete
      target
    end
#list(kind: nil, limit: 20, include_retired: false) ⇒ Object
LIVE facts only by default. A superseded fact is a tombstone, not a current memory, so listing it undecorated next to its replacement presents contradicted data as true and makes the rows disagree with #count/#retrieve (#82). `include_retired: true` opts into the full supersession history (`rubino memory list --all`).

    # File 'lib/rubino/memory/backends/sqlite.rb', line 187
    def list(kind: nil, limit: 20, include_retired: false)
      ds = (include_retired ? @db[TABLE] : live_dataset).order(Sequel.desc(:created_at)).limit(limit)
      ds = ds.where(kind: normalize_kind(kind)) if kind
      ds.all.map { |r| present(r) }
    end
#project_context ⇒ Object
    # File 'lib/rubino/memory/backends/sqlite.rb', line 151
    def project_context
      return nil unless @config.dig("memory", "project_context_enabled")

      rows = live_dataset.where(kind: %w[project env]).order(Sequel.desc(:created_at)).limit(10).all
      return nil if rows.empty?

      rows.map { |r| r[:text] }.join("\n")
    end
#replace(kind:, old_text:, content:) ⇒ Object
Replace the first LIVE fact of `kind` whose text includes `old_text`. Modelled as a supersession so history is preserved.

    # File 'lib/rubino/memory/backends/sqlite.rb', line 99
    def replace(kind:, old_text:, content:)
      target = live_dataset.where(kind: normalize_kind(kind))
                           .where(Sequel.like(:text, "%#{old_text}%")).first
      return nil unless target

      # Retire first so the old row's chars free up before the new fact is
      # budget-checked (a same-size replace must always fit).
      new_id = SecureRandom.uuid
      retire!(target[:id], new_id)
      insert_fact(text: content, kind: target[:kind],
                  entities: parse_entities(target[:entities_json]),
                  source_session_id: target[:source_session_id], id: new_id)
      target
    end
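The retire-then-insert order used by #replace can be illustrated without a database. This is a minimal in-memory sketch, assuming a row is "live" while `valid_to` is nil, as the overview states. `FactRow`, `FactLog`, and the `superseded_by` field are hypothetical; in particular, what `retire!` actually records for the new id is not shown in the source.

```ruby
require "securerandom"

# Hypothetical row shape; only valid_to is taken from the documentation.
FactRow = Struct.new(:id, :kind, :text, :valid_to, :superseded_by, keyword_init: true)

class FactLog
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Live memory = rows that have not been retired.
  def live
    @rows.select { |row| row.valid_to.nil? }
  end

  def add(kind:, text:, id: SecureRandom.uuid)
    row = FactRow.new(id: id, kind: kind, text: text)
    @rows << row
    row
  end

  # Soft-retire the first live match, then insert the replacement.
  # Returns the retired row, or nil when nothing matched.
  def replace(kind:, old_text:, content:)
    target = live.find { |row| row.kind == kind && row.text.include?(old_text) }
    return nil unless target

    new_id = SecureRandom.uuid
    target.valid_to = Time.now.utc   # retire first; the row is kept, not deleted
    target.superseded_by = new_id    # assumption: link to the replacement
    add(kind: kind, text: content, id: new_id)
    target
  end
end

log = FactLog.new
log.add(kind: "preference", text: "User prefers npm")
log.replace(kind: "preference", old_text: "npm", content: "User prefers pnpm")
puts log.live.map(&:text).inspect
puts log.rows.size
```

After the replace, the live view holds only the new fact while the full log still holds both rows, which is the provenance-preserving behaviour the overview describes.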
#retrieve(session_id:, query: nil, k: DEFAULT_K) ⇒ Object
HYBRID recall over LIVE facts: FTS5/BM25 on `query` (and vector KNN when available) fused via RRF and kind-weighted as the direct relevance ranking, then graph/recency-supplemented and greedily packed under the memory char budget. Returns rows shaped like the default backend (kind:, content:, …) so the prompt assembler is unchanged.

    # File 'lib/rubino/memory/backends/sqlite.rb', line 165
    def retrieve(session_id:, query: nil, k: DEFAULT_K)
      ranked = rank(query: query, k: k)
      budget = @config.memory_char_limit
      selected = []
      total = 0
      ranked.each do |row|
        len = row[:text].to_s.length
        break if budget&.positive? && total + len > budget

        selected << present(row)
        total += len
      end
      selected
    end
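The packing loop in #retrieve can be exercised in isolation. The sketch below copies its control flow over plain strings; `pack` is a hypothetical name, and the ranking step is assumed to have already produced the input order.

```ruby
# Greedy packing under a character budget, following the loop in #retrieve.
# ranked_texts: strings in rank order, best first.
# budget:       max total characters; nil or a non-positive value disables the cap.
def pack(ranked_texts, budget)
  selected = []
  total = 0
  ranked_texts.each do |text|
    len = text.to_s.length
    # Stop at the first fact that would overflow; later facts are not tried.
    break if budget&.positive? && total + len > budget

    selected << text
    total += len
  end
  selected
end

puts pack(%w[aaaa bbbbbbbb cc], 10).inspect
puts pack(%w[aaaa bbbbbbbb cc], nil).inspect
```

Because the loop uses `break` rather than `next`, a short lower-ranked fact is dropped once a higher-ranked one fails to fit, which keeps the output a strict prefix of the ranking.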
#store(kind:, content:, source_session_id: nil, confidence: 1.0, metadata: {}) ⇒ Object
-- WRITE path --

    # File 'lib/rubino/memory/backends/sqlite.rb', line 86
    def store(kind:, content:, source_session_id: nil, confidence: 1.0, metadata: {})
      insert_fact(
        text: content,
        kind: normalize_kind(kind),
        entities: Array(metadata[:entities]),
        source_session_id: source_session_id,
        confidence: confidence,
        valid_from: metadata[:valid_from]
      )
    end
#user_profile ⇒ Object
-- READ path --

    # File 'lib/rubino/memory/backends/sqlite.rb', line 140
    def user_profile
      return nil unless @config.dig("memory", "user_profile_enabled")

      rows = live_dataset.where(kind: USER_KIND).order(Sequel.desc(:created_at)).all
      return nil if rows.empty?

      text = rows.map { |r| r[:text] }.join("\n")
      limit = @config.memory_user_char_limit
      text.length > limit ? text[0...limit] : text
    end