Class: ClaudeMemory::Embeddings::Similarity
- Inherits: Object
- Defined in: lib/claude_memory/embeddings/similarity.rb
Overview
Calculates similarity between embedding vectors. Uses cosine similarity, computed as a dot product, to compare vectors that are already normalized to unit length.
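For reference, the standard definition of cosine similarity and its reduction for unit-length vectors can be written as follows. The symbols a and b are illustrative names for the two embedding vectors and are not taken from the source.

```latex
\[
\cos(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i} a_i b_i}{\sqrt{\sum_{i} a_i^2} \, \sqrt{\sum_{i} b_i^2}}
\]
\[
\lVert \mathbf{a} \rVert = \lVert \mathbf{b} \rVert = 1
  \;\Longrightarrow\;
  \cos(\mathbf{a}, \mathbf{b}) = \sum_{i} a_i b_i
\]
```

This is why the class can skip the norm computation: when both inputs have unit length, the denominator is 1 and only the dot product remains.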
Class Method Summary
-
.average_similarity(query_vector, target_vectors) ⇒ Float
Calculate the average similarity of a vector to multiple other vectors. Useful for multi-concept queries.
-
.batch_similarities(query_vector, candidate_vectors) ⇒ Array<Float>
Calculate similarities between one query and many candidates in one call.
-
.cosine(vec_a, vec_b) ⇒ Float
Calculate cosine similarity between two vectors. Assumes vectors are already normalized to unit length.
-
.top_k(query_vector, candidates, k) ⇒ Array<Hash>
Find the top K most similar items.
Class Method Details
.average_similarity(query_vector, target_vectors) ⇒ Float
Calculate the average similarity of a vector to multiple other vectors. Useful for multi-concept queries. Returns 0.0 when target_vectors is empty.
# File 'lib/claude_memory/embeddings/similarity.rb', line 52

def self.average_similarity(query_vector, target_vectors)
  return 0.0 if target_vectors.empty?

  similarities = target_vectors.map { |vec| cosine(query_vector, vec) }
  similarities.sum / similarities.size.to_f
end
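A minimal usage sketch for a multi-concept query follows. `SimilaritySketch` is a hypothetical stand-in that copies the logic of the listings on this page so the example runs without the gem loaded; the two-dimensional vectors are made-up values, not real embeddings.

```ruby
# Hypothetical stand-in mirroring the documented methods (not the gem itself).
module SimilaritySketch
  def self.cosine(vec_a, vec_b)
    return 0.0 if vec_a.nil? || vec_b.nil?
    return 0.0 if vec_a.empty? || vec_b.empty?

    # Dot product of unit vectors, clamped to [0.0, 1.0]
    vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
  end

  def self.average_similarity(query_vector, target_vectors)
    return 0.0 if target_vectors.empty?

    similarities = target_vectors.map { |vec| cosine(query_vector, vec) }
    similarities.sum / similarities.size.to_f
  end
end

# One stored item compared against two "concept" vectors (all unit length).
item     = [0.6, 0.8]
concepts = [[1.0, 0.0], [0.0, 1.0]]

average = SimilaritySketch.average_similarity(item, concepts)
empty   = SimilaritySketch.average_similarity(item, [])

puts average.round(3)
puts empty
```

The item scores 0.6 against the first concept and 0.8 against the second, so the average lands between the two; an empty target list short-circuits to 0.0.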
.batch_similarities(query_vector, candidate_vectors) ⇒ Array<Float>
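Calculate similarities between one query and many candidates in one call. Scores are returned in the same order as candidate_vectors. As listed, the method calls cosine once per candidate, so it is a convenience wrapper rather than a separate, faster code path.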
# File 'lib/claude_memory/embeddings/similarity.rb', line 64

def self.batch_similarities(query_vector, candidate_vectors)
  candidate_vectors.map { |vec| cosine(query_vector, vec) }
end
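Because the result is index-aligned with the input, scores can be paired back with labels using `zip`. The sketch below assumes a hypothetical `SimilaritySketch` stand-in that copies the listings on this page, and the labels and vectors are invented for illustration.

```ruby
# Hypothetical stand-in mirroring the documented methods (not the gem itself).
module SimilaritySketch
  def self.cosine(vec_a, vec_b)
    return 0.0 if vec_a.nil? || vec_b.nil?
    return 0.0 if vec_a.empty? || vec_b.empty?

    vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
  end

  def self.batch_similarities(query_vector, candidate_vectors)
    candidate_vectors.map { |vec| cosine(query_vector, vec) }
  end
end

labels     = %w[a b c]
candidates = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]] # unit-length vectors
query      = [1.0, 0.0]

scores = SimilaritySketch.batch_similarities(query, candidates)

# Pair each label with its score; order matches the candidates array.
ranked     = labels.zip(scores)
best_label = ranked.max_by { |_label, score| score }.first

ranked.each { |label, score| puts "#{label}: #{score.round(3)}" }
puts "best: #{best_label}"
```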
.cosine(vec_a, vec_b) ⇒ Float
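Calculate cosine similarity between two vectors. Assumes both vectors are already normalized to unit length, in which case the dot product equals the cosine. Returns 0.0 if either vector is nil or empty, and clamps the result to [0.0, 1.0], so negative similarities are reported as 0.0.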
# File 'lib/claude_memory/embeddings/similarity.rb', line 13

def self.cosine(vec_a, vec_b)
  return 0.0 if vec_a.nil? || vec_b.nil?
  return 0.0 if vec_a.empty? || vec_b.empty?

  # For normalized vectors, cosine similarity is just the dot product
  dot_product = vec_a.zip(vec_b).sum { |a, b| a * b }

  # Clamp to [0, 1] range (handle floating point errors)
  dot_product.clamp(0.0, 1.0)
end
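Since the method relies on unit-length inputs, raw vectors need to be normalized before they are compared. The sketch below shows one way to do that; `normalize` is a hypothetical helper that is not part of the documented class, and `SimilaritySketch` is a stand-in that copies the listing above so the example runs on its own.

```ruby
# Hypothetical stand-in mirroring the documented cosine method (not the gem itself).
module SimilaritySketch
  def self.cosine(vec_a, vec_b)
    return 0.0 if vec_a.nil? || vec_b.nil?
    return 0.0 if vec_a.empty? || vec_b.empty?

    vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
  end

  # Hypothetical helper: scale a vector to unit length (zero vectors pass through).
  def self.normalize(vec)
    norm = Math.sqrt(vec.sum { |x| x * x })
    return vec if norm.zero?

    vec.map { |x| x / norm }
  end
end

a = SimilaritySketch.normalize([3.0, 4.0])
b = SimilaritySketch.normalize([4.0, 3.0])

same      = SimilaritySketch.cosine(a, a)                  # identical direction
different = SimilaritySketch.cosine(a, b)                  # partial overlap
opposite  = SimilaritySketch.cosine(a, a.map { |x| -x })   # negative dot product, clamped

puts same.round(3)
puts different.round(3)
puts opposite
```

The last call illustrates the clamp: two opposite unit vectors have a dot product of -1.0, which is reported as 0.0.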
.top_k(query_vector, candidates, k) ⇒ Array<Hash>
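Find the top K most similar items. Each candidate is a Hash with an :embedding key. Each result is a Hash with :candidate and :similarity keys, and results are sorted by similarity, highest first.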
# File 'lib/claude_memory/embeddings/similarity.rb', line 29

def self.top_k(query_vector, candidates, k)
  return [] if candidates.empty?

  # Calculate similarities and score
  scored = candidates.map do |candidate|
    embedding = candidate[:embedding]
    similarity = cosine(query_vector, embedding)

    {
      candidate: candidate,
      similarity: similarity
    }
  end

  # Sort by similarity (highest first) and take top K
  scored.sort_by { |item| -item[:similarity] }.take(k)
end
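A short usage sketch shows the expected input and output shapes. `SimilaritySketch` is a hypothetical stand-in that copies the listings on this page, and the `:id` key and the vectors are invented for illustration; the only key the method itself reads is `:embedding`.

```ruby
# Hypothetical stand-in mirroring the documented methods (not the gem itself).
module SimilaritySketch
  def self.cosine(vec_a, vec_b)
    return 0.0 if vec_a.nil? || vec_b.nil?
    return 0.0 if vec_a.empty? || vec_b.empty?

    vec_a.zip(vec_b).sum { |a, b| a * b }.clamp(0.0, 1.0)
  end

  def self.top_k(query_vector, candidates, k)
    return [] if candidates.empty?

    scored = candidates.map do |candidate|
      { candidate: candidate, similarity: cosine(query_vector, candidate[:embedding]) }
    end

    scored.sort_by { |item| -item[:similarity] }.take(k)
  end
end

# Candidates carry arbitrary metadata plus a unit-length :embedding.
candidates = [
  { id: "rust",  embedding: [0.0, 1.0] },
  { id: "ruby",  embedding: [1.0, 0.0] },
  { id: "rails", embedding: [0.8, 0.6] }
]
query = [1.0, 0.0]

results = SimilaritySketch.top_k(query, candidates, 2)
ids     = results.map { |r| r[:candidate][:id] }

results.each { |r| puts "#{r[:candidate][:id]}: #{r[:similarity].round(3)}" }
```

Each result wraps the original candidate Hash untouched, so any metadata stored alongside the embedding is available after ranking.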