Class: Phronomy::Agent::Context::Knowledge::VectorStore::InMemory

Inherits:
Base
  • Object
Defined in:
lib/phronomy/agent/context/knowledge/vector_store/in_memory.rb

Overview

Pure-Ruby in-memory vector store using cosine similarity.

Intended for tests, short-lived agents, and Retrieval::Semantic scenarios where the message count is small enough for a linear scan to be fast.

Examples:

store = Phronomy::Agent::Context::Knowledge::VectorStore::InMemory.new
store.add(id: "1", embedding: [0.1, 0.9], metadata: { message: msg })
results = store.search(query_embedding: [0.1, 0.8], k: 3)
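The store's private scoring helper is not shown on this page, so the following is only a minimal standalone sketch of cosine similarity combined with a linear scan. The method names `cosine_similarity` and `top_k`, and the choice to score zero vectors as 0.0, are assumptions made for illustration, not the library's implementation.

```ruby
# Hypothetical stand-in for a cosine-similarity helper.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  # Assumption: a zero-length vector scores 0.0 instead of dividing by zero.
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Linear scan: score every stored vector, then keep the k best.
def top_k(documents, query, k)
  documents
    .map { |id, embedding| {id: id, score: cosine_similarity(query, embedding)} }
    .sort_by { |r| -r[:score] }
    .first(k)
end

docs = {"a" => [1.0, 0.0], "b" => [0.0, 1.0], "c" => [1.0, 1.0]}
ranked = top_k(docs, [1.0, 0.0], 2)
```

Every query touches every stored vector, so the cost grows linearly with the number of documents, which is why the store targets small collections.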

Instance Method Summary

Methods included from AsyncBackend

#add_async, #clear_async, #remove_async, #search_async

Constructor Details

#initialize(dimension: nil) ⇒ InMemory

Returns a new instance of InMemory.

Parameters:

  • dimension (Integer, nil) (defaults to: nil)

    expected embedding dimension. When nil, the dimension is inferred from the first call to #add. For multi-threaded use, pass dimension: explicitly; concurrent first adds are not guaranteed to be race-free.



# File 'lib/phronomy/agent/context/knowledge/vector_store/in_memory.rb', line 23

def initialize(dimension: nil)
  @documents = {}
  @expected_dimension = dimension
end
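The difference between an inferred and an explicit dimension can be sketched with a small stand-in class. `DimensionGuard` and `check!` are hypothetical names, and raising `ArgumentError` on a mismatch is an assumption, since the body of `validate_embedding_dimension!` is not shown here.

```ruby
# Hypothetical stand-in that mirrors only the dimension handling.
class DimensionGuard
  def initialize(dimension: nil)
    @expected_dimension = dimension
  end

  def check!(embedding)
    # The first call fixes the dimension when none was given up front.
    @expected_dimension ||= embedding.size
    return self if embedding.size == @expected_dimension

    # Assumption: the real validator may raise a different error class.
    raise ArgumentError, "expected #{@expected_dimension} dimensions, got #{embedding.size}"
  end
end

inferred = DimensionGuard.new
inferred.check!([0.1, 0.9]) # dimension is now fixed at 2
begin
  inferred.check!([0.1, 0.9, 0.3])
rescue ArgumentError => e
  mismatch = e.message
end

pinned = DimensionGuard.new(dimension: 3) # fixed up front, no inference step
pinned.check!([0.1, 0.9, 0.3])
```

Passing the dimension to the constructor removes the check-then-set step on the first add, which is the step that concurrent first adds could race on.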

Instance Method Details

#add(id:, embedding:, metadata: {}, cancellation_token: nil) ⇒ Object

Parameters:

  • id
  • embedding
  • metadata (defaults to: {})
  • cancellation_token (defaults to: nil)



# File 'lib/phronomy/agent/context/knowledge/vector_store/in_memory.rb', line 33

def add(id:, embedding:, metadata: {}, cancellation_token: nil)
  cancellation_token&.raise_if_cancelled!
  # Establish expected dimension on first add, then validate.
  @expected_dimension ||= embedding.size
  validate_embedding_dimension!(embedding, @expected_dimension)
  @documents[id] = {embedding: embedding, metadata: }
  self
end
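Two consequences of the listing above are that `add` returns the store itself, so calls can be chained, and that adding an existing id replaces the earlier entry. The sketch below uses a hypothetical `TinyStore` that copies only the hash assignment and the `self` return; `metadata_for` is a demo-only reader that the real class does not necessarily provide.

```ruby
# Hypothetical stand-in that copies only the storage step of #add.
class TinyStore
  def initialize
    @documents = {}
  end

  def add(id:, embedding:, metadata: {})
    @documents[id] = {embedding: embedding, metadata: metadata}
    self
  end

  def size
    @documents.size
  end

  # Demo-only reader, not part of the documented API.
  def metadata_for(id)
    @documents.fetch(id)[:metadata]
  end
end

store = TinyStore.new
store.add(id: "1", embedding: [0.1, 0.9], metadata: {v: 1})
     .add(id: "2", embedding: [0.4, 0.6])
     .add(id: "1", embedding: [0.2, 0.8], metadata: {v: 2}) # replaces the first "1"
```

After these calls the stand-in holds two documents, and id "1" carries the metadata from the last add.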

#clear ⇒ Object



# File 'lib/phronomy/agent/context/knowledge/vector_store/in_memory.rb', line 75

def clear
  @documents.clear
  self
end

#remove(id:) ⇒ Object



# File 'lib/phronomy/agent/context/knowledge/vector_store/in_memory.rb', line 70

def remove(id:)
  @documents.delete(id)
  self
end

#search(query_embedding:, k: 5, cancellation_token: nil) ⇒ Array<Hash>

mutant:disable - genuine equivalent mutations: doc.fetch(:embedding) vs doc[:embedding]; score:, metadata: doc.fetch(:metadata) shorthand+fetch vs ; -r.fetch(:score) vs -r[:score]; snapshot = @documents vs .dup is equivalent in single-threaded tests (GVL makes Hash#dup atomic, no behaviour difference under test isolation)

Parameters:

  • query_embedding
  • k (defaults to: 5)
  • cancellation_token (defaults to: nil)

Returns:

  • (Array<Hash>)

    sorted by descending score



# File 'lib/phronomy/agent/context/knowledge/vector_store/in_memory.rb', line 52

def search(query_embedding:, k: 5, cancellation_token: nil)
  cancellation_token&.raise_if_cancelled!
  k = validate_k!(k)
  # search never establishes dimension; validate only when dimension is known.
  validate_embedding_dimension!(query_embedding, @expected_dimension)
  # Take an atomic snapshot before iterating.  Hash#dup is a C-level
  # call that completes without releasing the GVL, so it is atomic with
  # respect to any other Ruby thread.  Iterating the copy instead of
  # @documents directly prevents "can't add a new key into hash during
  # iteration" when a concurrent thread calls #add.
  snapshot = @documents.dup
  results = snapshot.map do |id, doc|
    score = cosine_similarity(query_embedding, doc[:embedding])
    {id: id, score: score, metadata: doc[:metadata]}
  end
  results.sort_by { |r| -r[:score] }.first(k)
end
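The reason for iterating a snapshot can be reproduced without threads. This is a single-threaded sketch of the hash iteration rule only, using a plain hash with made-up keys; it does not demonstrate or prove anything about thread safety.

```ruby
documents = {"1" => [0.1, 0.9], "2" => [0.4, 0.6]}

# Adding a new key while iterating the live hash raises a RuntimeError.
live_error =
  begin
    documents.each { |id, _| documents["new-#{id}"] = [0.0, 1.0] }
    nil
  rescue RuntimeError => e
    e.message
  end

# Iterating a copy leaves the original free to change during the loop.
visited = []
documents.dup.each do |id, _|
  visited << id
  documents["new-#{id}"] = [0.0, 1.0]
end
```

The copy is shallow: the snapshot has its own key table but shares the stored embedding and metadata objects with the original hash.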

#size ⇒ Integer

Returns number of documents stored.

Returns:

  • (Integer)

    number of documents stored



# File 'lib/phronomy/agent/context/knowledge/vector_store/in_memory.rb', line 82

def size
  @documents.size
end