Module: HTM::LongTermMemory::FulltextSearch
- Included in: HTM::LongTermMemory
- Defined in: lib/htm/long_term_memory/fulltext_search.rb
Overview
Full-text search using PostgreSQL tsvector and pg_trgm
Performs keyword-based search using:
- PostgreSQL full-text search (tsvector/tsquery) for stemmed word matching
- Trigram fuzzy matching (pg_trgm) for typos and partial words
- Combined scoring: tsvector matches rank higher, trigram provides fallback
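The combined scoring described above can be sketched in plain Ruby. The page does not show the SQL that computes the score, so this is only one plausible reading: `ScoringSketch`, `combined_score`, and the sample rows are hypothetical, and the real module does this work inside PostgreSQL. The two constants mirror the documented values.

```ruby
# Hypothetical sketch of the documented ranking rule, not the module's SQL.
module ScoringSketch
  TRIGRAM_SIMILARITY_THRESHOLD = 0.1 # mirrors the documented constant
  TSVECTOR_SCORE_BOOST = 1.0         # mirrors the documented constant

  # ts_rank is nil when the row did not match the tsquery.
  # Returns nil when the row should be dropped entirely.
  def self.combined_score(ts_rank:, trigram_similarity:)
    # A tsvector match gets the boost, lifting it above any trigram-only
    # score, because trigram similarity never exceeds 1.0.
    return ts_rank + TSVECTOR_SCORE_BOOST if ts_rank
    # Otherwise fall back to fuzzy similarity, if it clears the threshold.
    return trigram_similarity if trigram_similarity >= TRIGRAM_SIMILARITY_THRESHOLD

    nil
  end
end

rows = [
  { content: 'fuzzy hit',   ts_rank: nil,  trigram_similarity: 0.45 },
  { content: 'stemmed hit', ts_rank: 0.06, trigram_similarity: 0.30 },
  { content: 'noise',       ts_rank: nil,  trigram_similarity: 0.05 }
]

ranked = rows
  .map { |row| row.merge(score: ScoringSketch.combined_score(ts_rank: row[:ts_rank], trigram_similarity: row[:trigram_similarity])) }
  .reject { |row| row[:score].nil? }
  .sort_by { |row| -row[:score] }

ranked.each { |row| puts format('%-12s %.2f', row[:content], row[:score]) }
```

With a boost of 1.0, even a weak stemmed match outranks a strong fuzzy match, which is the ordering the overview describes.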
Results are cached for performance.
Security: All queries use parameterized placeholders to prevent SQL injection.
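As an illustration of the parameterized-placeholder approach, the sketch below keeps user input out of the SQL text and passes it as a separate parameter list. The helper name `build_fulltext_query`, the `nodes` table, and its columns are assumptions made for this example; the module's actual query is not shown on this page.

```ruby
# Hypothetical helper: returns the SQL text and its bind parameters separately.
def build_fulltext_query(query:, limit:)
  sql = <<~SQL
    SELECT id, content
    FROM nodes
    WHERE to_tsvector('english', content) @@ plainto_tsquery('english', $1)
    ORDER BY ts_rank(to_tsvector('english', content), plainto_tsquery('english', $1)) DESC
    LIMIT $2
  SQL
  [sql, [query, limit]]
end

# Hostile input never becomes part of the SQL string.
sql, params = build_fulltext_query(query: "robots'; DROP TABLE nodes; --", limit: 10)

puts sql
puts params.inspect
```

With the pg gem, such a pair would typically be executed as `conn.exec_params(sql, params)`, so the server receives the values separately from the statement.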
Constant Summary
- MAX_FULLTEXT_LIMIT = 1000
  Maximum results to prevent DoS via unbounded queries.
- TRIGRAM_SIMILARITY_THRESHOLD = 0.1
  Minimum trigram similarity threshold (0.0-1.0). Lower = more fuzzy matches, higher = stricter matching.
- TSVECTOR_SCORE_BOOST = 1.0
  Score boost for tsvector matches over trigram matches. Ensures exact word matches rank above fuzzy matches.
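A short sketch of how MAX_FULLTEXT_LIMIT bounds a caller-supplied limit, using the same `to_i.clamp` expression that appears in #search_fulltext. The `LimitSketch` module and its `safe_limit` helper are hypothetical names introduced so the example runs on its own.

```ruby
# Hypothetical wrapper; the constant is redefined so the sketch is standalone.
module LimitSketch
  MAX_FULLTEXT_LIMIT = 1000

  # Coerce to Integer, then pin the value inside 1..MAX_FULLTEXT_LIMIT.
  def self.safe_limit(limit)
    limit.to_i.clamp(1, MAX_FULLTEXT_LIMIT)
  end
end

LimitSketch.safe_limit(50)        # => 50
LimitSketch.safe_limit(0)         # => 1
LimitSketch.safe_limit(-5)        # => 1
LimitSketch.safe_limit(1_000_000) # => 1000
LimitSketch.safe_limit('25')      # => 25
LimitSketch.safe_limit(nil)       # => 1
```

Because `nil.to_i` and non-numeric strings both coerce to 0, malformed input falls back to the lower bound of 1 rather than raising.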
Instance Method Summary
- #search_fulltext(timeframe:, query:, limit:, metadata: {}) ⇒ Array<Hash>
  Full-text search.
Instance Method Details
#search_fulltext(timeframe:, query:, limit:, metadata: {}) ⇒ Array<Hash>
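Runs a full-text search. The limit is clamped to 1..MAX_FULLTEXT_LIMIT, results are cached per combination of search arguments, and the elapsed time in milliseconds is recorded to HTM::Telemetry.search_latency with the attribute 'strategy' => 'fulltext'.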
# File 'lib/htm/long_term_memory/fulltext_search.rb', line 36

def search_fulltext(timeframe:, query:, limit:, metadata: {})
  # Enforce limit to prevent DoS
  safe_limit = limit.to_i.clamp(1, MAX_FULLTEXT_LIMIT)

  # Time the lookup with a monotonic clock so wall-clock adjustments cannot skew it
  start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = @cache.fetch(:fulltext, timeframe, query, safe_limit, metadata) do
    search_fulltext_uncached(
      timeframe: timeframe,
      query: query,
      limit: safe_limit,
      metadata: metadata
    )
  end
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time) * 1000).round
  HTM::Telemetry.search_latency.record(elapsed_ms, attributes: { 'strategy' => 'fulltext' })
  result
end
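The cache-then-measure pattern in this method can be tried in isolation. The sketch below is a minimal stand-in under stated assumptions: `TinyCache` is a hypothetical in-memory cache whose `fetch(*key_parts) { ... }` shape is inferred from the call above, the block returns canned rows instead of querying PostgreSQL, and telemetry recording is left out.

```ruby
# Hypothetical stand-in for the real cache: memoizes a block by its key parts.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the cached value for key_parts, computing it via the block on a miss.
  def fetch(*key_parts)
    return @store[key_parts] if @store.key?(key_parts)

    @store[key_parts] = yield
  end
end

cache = TinyCache.new
calls = 0 # counts how often the "uncached" block actually runs
timeframe = Time.at(0)..Time.at(3600)

lookup = lambda do
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = cache.fetch(:fulltext, timeframe, 'postgres', 10, {}) do
    calls += 1
    [{ 'id' => 1, 'content' => 'PostgreSQL notes' }] # canned rows, no database
  end
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round
  [result, elapsed_ms]
end

first, first_ms = lookup.call
second, second_ms = lookup.call

puts "uncached block ran #{calls} time(s)"
puts "results identical: #{first == second}"
puts "latencies (ms): #{first_ms}, #{second_ms}"
```

Note that the timer wraps the cache lookup, so in this arrangement cache hits are measured too and will pull the recorded latency down.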