Module: Rubino::Memory::SqliteGraph
- Included in:
- Backends::Sqlite
- Defined in:
- lib/rubino/memory/sqlite_graph.rb
Overview
Graph-lite layer for the Sqlite backend (Memory Phase 3b).
A thin mixin over two tables (memory_entities + memory_edges) that turns the per-fact entity tags into a tiny knowledge graph and blends a bounded 1-hop traversal into retrieval. NOT a graph DB — just entity resolution by normalized name and a bounded join over edges.
Edges are populated two ways, both cheap (no extra LLM call beyond the single extraction call the backend already makes):
* DETERMINISTIC co-occurrence — every pair of entities tagged on the
same fact gets a `co_occurs` edge (free, derived from `entities_json`).
* TYPED relations — the extraction LLM optionally returns `edges:
[{src, relation, dst}]` in the SAME structured call, so the typed
graph costs 0 additional calls/turn.
Edges are bi-temporal like facts: a contradicting relation soft-retires the old edge (valid_to set), it is not deleted.
Constant Summary collapse
- ENTITIES =
:memory_entities- EDGES =
:memory_edges- CO_OCCURS =
"co_occurs"
Instance Method Summary collapse
-
#facts_tagged_with(norm_names, limit) ⇒ Object
Live fact ids whose entities_json contains any of the given normalized entity names.
-
#graph_neighbors(query, limit) ⇒ Object
Given query text, find seed entities whose name appears in the query, walk LIVE edges out one hop to neighbor entities, and return the ids of LIVE facts tagged with any seed-or-neighbor entity.
-
#index_fact_graph(fact_id, entities, typed: []) ⇒ Object
Wire the graph for a freshly-inserted fact: upsert its entity nodes, connect every co-occurring pair with a co_occurs edge, and add any typed relations the extractor emitted for this fact.
- #normalize_entity_name(name) ⇒ Object
-
#resolve_entity(name, kind: nil) ⇒ Object
Resolve (find-or-create) an entity node by normalized name, returning its id.
-
#seed_entities(query) ⇒ Object
Entities whose normalized name (or a token of it) appears in the query.
-
#supersede_edge(src, dst, _relation) ⇒ Object
Soft-retire any live typed edge between src->dst whose relation differs, so a contradicting relation supersedes the old one (history kept).
-
#upsert_edge(src, dst, relation, source_fact_id) ⇒ Object
Insert a live edge unless an identical live edge already exists (idempotent).
Instance Method Details
#facts_tagged_with(norm_names, limit) ⇒ Object
Live fact ids whose entities_json contains any of the given normalized entity names. Bounded scan over the live set (small in practice).
142 143 144 145 146 147 148 149 150 151 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 142 def facts_tagged_with(norm_names, limit) wanted = norm_names.to_set return [] if wanted.empty? live_dataset.exclude(entities_json: nil).order(Sequel.desc(:created_at)) .limit(limit * 6).all.filter_map do |row| ents = parse_entities(row[:entities_json]).map { |e| e.to_s.downcase } row[:id] if ents.any? { |e| wanted.include?(e) } end.first(limit) end |
#graph_neighbors(query, limit) ⇒ Object
Given query text, find seed entities whose name appears in the query, walk LIVE edges out one hop to neighbor entities, and return the ids of LIVE facts tagged with any seed-or-neighbor entity. This surfaces facts connected through a relation that pure FTS on the probe would miss. Bounded: capped seeds, single hop, capped fact scan.
114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 114 def graph_neighbors(query, limit) seeds = seed_entities(query) return [] if seeds.empty? # 1-hop: neighbors reachable via a live edge in either direction. neighbor_ids = @db[EDGES] .where(valid_to: nil) .where(Sequel.|({ src_entity_id: seeds }, { dst_entity_id: seeds })) .select_map(%i[src_entity_id dst_entity_id]) .flatten.uniq entity_ids = (seeds + neighbor_ids).uniq return [] if entity_ids.empty? names = @db[ENTITIES].where(id: entity_ids).select_map(:name_norm) facts_tagged_with(names, limit) end |
#index_fact_graph(fact_id, entities, typed: []) ⇒ Object
Wire the graph for a freshly-inserted fact: upsert its entity nodes, connect every co-occurring pair with a co_occurs edge, and add any typed relations the extractor emitted for this fact. Bounded and free of extra LLM calls. ‘typed` is an array of relation, dst hashes.
63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 63 def index_fact_graph(fact_id, entities, typed: []) ids = Array(entities).filter_map { |e| resolve_entity(e) }.uniq ids.combination(2).each { |a, b| upsert_edge(a, b, CO_OCCURS, fact_id) } Array(typed).each do |edge| src = resolve_entity(edge["src"] || edge[:src]) dst = resolve_entity(edge["dst"] || edge[:dst]) rel = (edge["relation"] || edge[:relation]).to_s.strip.downcase next if src.nil? || dst.nil? || src == dst || rel.empty? # A changed typed relation between the SAME pair supersedes the old # one (e.g. "uses postgres" -> "uses sqlite" is handled at the fact # level; here we keep the latest relation label live). supersede_edge(src, dst, rel) upsert_edge(src, dst, rel, fact_id) end end |
#normalize_entity_name(name) ⇒ Object
53 54 55 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 53 def normalize_entity_name(name) name.to_s.strip.downcase.gsub(/\s+/, " ") end |
#resolve_entity(name, kind: nil) ⇒ Object
Resolve (find-or-create) an entity node by normalized name, returning its id. Same name from different facts collapses to one node.
34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 34 def resolve_entity(name, kind: nil) norm = normalize_entity_name(name) return nil if norm.empty? existing = @db[ENTITIES].where(name_norm: norm).first return existing[:id] if existing now = Time.now.utc.iso8601 id = SecureRandom.uuid @db[ENTITIES].insert( id: id, name: name.to_s.strip, name_norm: norm, kind: kind, created_at: now, updated_at: now ) id rescue Sequel::UniqueConstraintViolation # Concurrent insert: re-read the winner. @db[ENTITIES].where(name_norm: norm).get(:id) end |
#seed_entities(query) ⇒ Object
Entities whose normalized name (or a token of it) appears in the query.
133 134 135 136 137 138 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 133 def seed_entities(query) tokens = query.to_s.downcase.scan(/[\p{L}\p{N}]+/).reject { |w| w.length < 2 }.uniq return [] if tokens.empty? @db[ENTITIES].where(name_norm: tokens).select_map(:id).first(8) end |
#supersede_edge(src, dst, _relation) ⇒ Object
Soft-retire any live typed edge between src->dst whose relation differs, so a contradicting relation supersedes the old one (history kept).
101 102 103 104 105 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 101 def supersede_edge(src, dst, _relation) @db[EDGES].where(src_entity_id: src, dst_entity_id: dst, valid_to: nil) .exclude(relation: CO_OCCURS) .update(valid_to: Time.now.utc.iso8601, updated_at: Time.now.utc.iso8601) end |
#upsert_edge(src, dst, relation, source_fact_id) ⇒ Object
Insert a live edge unless an identical live edge already exists (idempotent). Co_occurs edges are undirected in effect: we store the canonical ordering for the pair so the de-dup works both ways.
84 85 86 87 88 89 90 91 92 93 94 95 96 97 |
# File 'lib/rubino/memory/sqlite_graph.rb', line 84 def upsert_edge(src, dst, relation, source_fact_id) a, b = relation == CO_OCCURS ? [src, dst].minmax : [src, dst] return if @db[EDGES].where( src_entity_id: a, dst_entity_id: b, relation: relation, valid_to: nil ).count.positive? now = Time.now.utc.iso8601 @db[EDGES].insert( id: SecureRandom.uuid, src_entity_id: a, dst_entity_id: b, relation: relation, source_fact_id: source_fact_id, valid_from: now, valid_to: nil, superseded_by: nil, created_at: now, updated_at: now ) end |