Class: ClaudeMemory::Embeddings::Similarity

Inherits:
Object
Defined in:
lib/claude_memory/embeddings/similarity.rb

Overview

Calculates similarity between embedding vectors. Uses cosine similarity, which for vectors already normalized to unit length reduces to the dot product.

Class Method Summary

Class Method Details

.average_similarity(query_vector, target_vectors) ⇒ Float

Calculates the average similarity of a query vector to multiple target vectors. Useful for multi-concept queries. Returns 0.0 when target_vectors is empty.

Parameters:

  • query_vector (Array<Float>)

    query embedding

  • target_vectors (Array<Array<Float>>)

    target embeddings

Returns:

  • (Float)

    average similarity



# File 'lib/claude_memory/embeddings/similarity.rb', line 52

def self.average_similarity(query_vector, target_vectors)
  return 0.0 if target_vectors.empty?

  similarities = target_vectors.map { |vec| cosine(query_vector, vec) }
  similarities.sum / similarities.size.to_f
end
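
As a worked example of the averaging described above, the following sketch uses a local stand-in module (the name SimilaritySketch is hypothetical and simply mirrors the listings on this page, so the snippet runs without the gem). The vectors are illustrative unit-length values, not real embeddings.

```ruby
# Local stand-in mirroring the listings on this page (hypothetical module name).
module SimilaritySketch
  def self.cosine(vec_a, vec_b)
    return 0.0 if vec_a.nil? || vec_b.nil?
    return 0.0 if vec_a.empty? || vec_b.empty?
    vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
  end

  def self.average_similarity(query_vector, target_vectors)
    return 0.0 if target_vectors.empty?
    similarities = target_vectors.map { |vec| cosine(query_vector, vec) }
    similarities.sum / similarities.size.to_f
  end
end

query = [1.0, 0.0]
targets = [
  [1.0, 0.0],  # identical direction, similarity 1.0
  [0.6, 0.8],  # partial overlap, similarity 0.6
  [0.0, 1.0]   # orthogonal, similarity 0.0
]

# Mean of the three scores: (1.0 + 0.6 + 0.0) / 3
average = SimilaritySketch.average_similarity(query, targets)

# With no targets the method short-circuits to 0.0 instead of dividing by zero.
empty_average = SimilaritySketch.average_similarity(query, [])
```

Note that a single unrelated target pulls the average down, so a query that matches one concept strongly and another not at all scores lower than one that matches both moderately.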

.batch_similarities(query_vector, candidate_vectors) ⇒ Array<Float>

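Calculates similarities between one query and many candidates in a single call. As implemented, this is a convenience wrapper that calls cosine once per candidate, so it does the same work as calling cosine in a loop.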

Parameters:

  • query_vector (Array<Float>)

    query embedding

  • candidate_vectors (Array<Array<Float>>)

    candidate embeddings

Returns:

  • (Array<Float>)

    similarity scores in same order as candidates



# File 'lib/claude_memory/embeddings/similarity.rb', line 64

def self.batch_similarities(query_vector, candidate_vectors)
  candidate_vectors.map { |vec| cosine(query_vector, vec) }
end
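
Because the scores come back in the same order as the candidates, they can be paired with any parallel array. The sketch below assumes a local stand-in module (SimilaritySketch, a hypothetical name mirroring the listings on this page) and made-up labels and vectors.

```ruby
# Local stand-in mirroring the listings on this page (hypothetical module name).
module SimilaritySketch
  def self.cosine(vec_a, vec_b)
    return 0.0 if vec_a.nil? || vec_b.nil?
    return 0.0 if vec_a.empty? || vec_b.empty?
    vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
  end

  def self.batch_similarities(query_vector, candidate_vectors)
    candidate_vectors.map { |vec| cosine(query_vector, vec) }
  end
end

labels = ["cats", "dogs", "taxes"]                 # hypothetical labels
vectors = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]     # illustrative unit vectors
query = [1.0, 0.0]

scores = SimilaritySketch.batch_similarities(query, vectors)

# Pair each label with its score by position, then pick the best match.
scored_labels = labels.zip(scores).to_h
best_label, best_score = scored_labels.max_by { |_label, score| score }
```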

.cosine(vec_a, vec_b) ⇒ Float

Calculates cosine similarity between two vectors. Assumes both vectors are already normalized to unit length, so the result is their dot product. Returns 0.0 if either vector is nil or empty.

Parameters:

  • vec_a (Array<Float>)

    first vector

  • vec_b (Array<Float>)

    second vector

Returns:

  • (Float)

    similarity score clamped to the range 0.0 to 1.0 (negative dot products are returned as 0.0)



# File 'lib/claude_memory/embeddings/similarity.rb', line 13

def self.cosine(vec_a, vec_b)
  return 0.0 if vec_a.nil? || vec_b.nil?
  return 0.0 if vec_a.empty? || vec_b.empty?

  # For normalized vectors, cosine similarity is just the dot product
  dot_product = vec_a.zip(vec_b).sum { |a, b| a * b }

  # Clamp to [0, 1]: absorbs floating point overshoot and maps negative similarities to 0.0
  dot_product.clamp(0.0, 1.0)
end
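
The unit-length assumption matters in practice. The sketch below shows what happens when it is violated, using a local copy of the listing above and a hypothetical normalize helper (not part of the documented class) that scales a vector to length 1.

```ruby
# Hypothetical helper, not part of the documented class: scale a vector to unit length.
def normalize(vec)
  norm = Math.sqrt(vec.sum { |x| x * x })
  return vec if norm.zero?
  vec.map { |x| x / norm }
end

# Local stand-in mirroring the cosine listing above.
def cosine(vec_a, vec_b)
  return 0.0 if vec_a.nil? || vec_b.nil?
  return 0.0 if vec_a.empty? || vec_b.empty?
  vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
end

raw_a = [3.0, 4.0]
raw_b = [4.0, 3.0]

# Without normalization the raw dot product (24.0) is clamped to 1.0,
# which wrongly suggests the vectors point in the same direction.
unnormalized = cosine(raw_a, raw_b)

# After normalization the result is the true cosine, roughly 0.96.
normalized = cosine(normalize(raw_a), normalize(raw_b))

# Opposite directions give a dot product of -1.0, which the clamp reports as 0.0.
opposite = cosine([1.0, 0.0], [-1.0, 0.0])
```

Callers whose embedding source does not guarantee unit-length output would need a normalization step like this before passing vectors in.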

.top_k(query_vector, candidates, k) ⇒ Array<Hash>

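Finds the K candidates most similar to the query vector, sorted from highest to lowest similarity. Returns an empty array when candidates is empty.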

Parameters:

  • query_vector (Array<Float>)

    query embedding

  • candidates (Array<Hash>)

    array of hashes, each with an :embedding key

  • k (Integer)

    number of top results to return

Returns:

  • (Array<Hash>)

    up to K hashes, each with a :candidate key (the original candidate hash) and a :similarity key (its score)



# File 'lib/claude_memory/embeddings/similarity.rb', line 29

def self.top_k(query_vector, candidates, k)
  return [] if candidates.empty?

  # Calculate similarities and score
  scored = candidates.map do |candidate|
    embedding = candidate[:embedding]
    similarity = cosine(query_vector, embedding)

    {
      candidate: candidate,
      similarity: similarity
    }
  end

  # Sort by similarity (highest first) and take top K
  scored.sort_by { |item| -item[:similarity] }.take(k)
end
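
The return shape is easy to misread: each result wraps the original candidate under :candidate rather than merging :similarity into it. A minimal usage sketch follows, assuming a local stand-in module (SimilaritySketch, a hypothetical name mirroring the listings on this page) and candidates that carry a made-up :id key.

```ruby
# Local stand-in mirroring the listings on this page (hypothetical module name).
module SimilaritySketch
  def self.cosine(vec_a, vec_b)
    return 0.0 if vec_a.nil? || vec_b.nil?
    return 0.0 if vec_a.empty? || vec_b.empty?
    vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
  end

  def self.top_k(query_vector, candidates, k)
    return [] if candidates.empty?
    scored = candidates.map do |candidate|
      { candidate: candidate, similarity: cosine(query_vector, candidate[:embedding]) }
    end
    scored.sort_by { |item| -item[:similarity] }.take(k)
  end
end

query = [1.0, 0.0]
candidates = [
  { id: "a", embedding: [0.0, 1.0] },  # orthogonal to the query
  { id: "b", embedding: [1.0, 0.0] },  # identical to the query
  { id: "c", embedding: [0.6, 0.8] }   # partial overlap
]

results = SimilaritySketch.top_k(query, candidates, 2)

# The original hash sits under :candidate; the score sits next to it.
top_ids = results.map { |result| result[:candidate][:id] }
top_scores = results.map { |result| result[:similarity] }
```

Since every candidate is scored and the full list is sorted before the cut, the cost grows with the number of candidates regardless of how small k is.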