Module: HTM::LongTermMemory::HybridSearch

Included in: HTM::LongTermMemory

Defined in: lib/htm/long_term_memory/hybrid_search.rb

Overview

Hybrid search using Reciprocal Rank Fusion (RRF)

Performs three independent searches and merges results:

Vector similarity search for semantic matching
Full-text search for keyword matching
Tag-based search for hierarchical category matching

Results are merged using RRF scoring. Nodes appearing in multiple searches receive boosted scores, making them rank higher.

Tag scoring uses hierarchical depth matching - the more levels of a tag hierarchy that match, the higher the score contribution.
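As a rough illustration of depth matching, the sketch below scores a node tag by the fraction of leading hierarchy levels it shares with a query tag. This is not taken from the module's source: the method name `tag_depth_score`, the colon separator, and the fraction-based score are all assumptions made for the example.

```ruby
# Hypothetical sketch of hierarchical depth matching; not HTM's actual scoring.
# Assumes tags are separator-delimited paths such as "database:postgresql:pgvector".
def tag_depth_score(query_tag, node_tag, separator: ":")
  query_levels = query_tag.split(separator)
  node_levels  = node_tag.split(separator)
  return 0.0 if query_levels.empty?

  # Count leading levels that agree, stopping at the first mismatch.
  matched = query_levels.zip(node_levels).take_while { |q, n| q == n }.length
  matched.to_f / query_levels.length
end

query = "database:postgresql:pgvector"
puts tag_depth_score(query, "database:postgresql:pgvector") # all three levels match
puts tag_depth_score(query, "database:postgresql:indexes")  # two of three levels match
puts tag_depth_score(query, "ai:embeddings")                # no leading level matches
```

Under these assumptions, a node that matches deeper into the hierarchy contributes a larger value than one that only shares the top-level category.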

RRF Formula: score = Σ 1/(k + rank) for each search where node appears
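The formula can be sketched in a few lines of Ruby. The example below is a minimal illustration, not the module's implementation: `rrf_fuse`, `EXAMPLE_RRF_K`, the node ids, and the use of 1-based ranks are assumptions made here.

```ruby
# Illustrative RRF fusion; names are hypothetical, not taken from HTM.
EXAMPLE_RRF_K = 60 # the value the constant summary describes as standard

# ranked_lists: Hash of search name => Array of node ids, best match first.
# Returns [[node_id, score], ...] sorted by descending fused score.
def rrf_fuse(ranked_lists, k = EXAMPLE_RRF_K)
  scores = Hash.new(0.0)
  ranked_lists.each_value do |ids|
    ids.each_with_index do |id, index|
      rank = index + 1 # assumption: ranks start at 1
      scores[id] += 1.0 / (k + rank)
    end
  end
  # Ties are broken by node id so the output order is deterministic.
  scores.sort_by { |id, score| [-score, id] }
end

lists = {
  vector:   [101, 102, 103],
  fulltext: [102, 104],
  tags:     [102, 101]
}
fused = rrf_fuse(lists)
fused.each { |id, score| puts format("%d %.5f", id, score) }
```

Node 102 appears in all three lists, so it accumulates three terms and ranks first, ahead of node 101, which appears in two.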

Results are cached for performance.

Security: All queries use parameterized placeholders to prevent SQL injection.

Constant Summary

MAX_HYBRID_LIMIT = Maximum results to prevent DoS via unbounded queries

RRF_K = RRF constant. Higher values reduce the impact of rank differences; 60 is the standard value from the original RRF paper.

CANDIDATE_MULTIPLIER = Multiplier for the number of candidates fetched from each search. More candidates are fetched than requested so that fusion has enough overlap to work with.
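The sketch below shows one way a limit cap and a candidate multiplier could combine. This page does not show the constants' actual values, so the numbers are placeholders. The helper `candidate_plan`, the lower bound of 1, and the way the two constants are combined are also assumptions. How the multiplier interacts with `prefilter_limit` is not stated here.

```ruby
# Placeholder values for illustration only; the real constants' values
# are not shown in this documentation page.
EXAMPLE_MAX_HYBRID_LIMIT     = 1000
EXAMPLE_CANDIDATE_MULTIPLIER = 3

# Hypothetical helper: cap the requested limit, then size each search's
# candidate pool as a multiple of the capped limit.
def candidate_plan(limit,
                   max_limit: EXAMPLE_MAX_HYBRID_LIMIT,
                   multiplier: EXAMPLE_CANDIDATE_MULTIPLIER)
  capped = limit.clamp(1, max_limit) # lower bound of 1 is an assumption
  { limit: capped, candidates_per_search: capped * multiplier }
end

p candidate_plan(10)
p candidate_plan(50_000) # an oversized request is reduced to the cap
```

Capping before multiplying keeps the per-search candidate count bounded even when a caller passes a very large `limit`.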

Instance Method Summary

#search_hybrid(timeframe:, query:, limit:, embedding_service:, prefilter_limit: 100, metadata: {}) ⇒ Array<Hash>

Hybrid search using Reciprocal Rank Fusion.

Instance Method Details

#search_hybrid(timeframe:, query:, limit:, embedding_service:, prefilter_limit: 100, metadata: {}) ⇒ Array<Hash>

Hybrid search using Reciprocal Rank Fusion

Parameters:

timeframe (Range): Time range to search

query (String): Search query

limit (Integer): Maximum results (capped at MAX_HYBRID_LIMIT)

embedding_service (Object): Service to generate embeddings

prefilter_limit (Integer) (defaults to: 100): Candidates per search

metadata (Hash) (defaults to: {}): Filter by metadata fields

Returns:

(Array<Hash>): Matching nodes