Module: HTM::LongTermMemory::FulltextSearch

Included in:
HTM::LongTermMemory
Defined in:
lib/htm/long_term_memory/fulltext_search.rb

Overview

Full-text search using PostgreSQL tsvector and pg_trgm

Performs keyword-based search using:

  • PostgreSQL full-text search (tsvector/tsquery) for stemmed word matching

  • Trigram fuzzy matching (pg_trgm) for typos and partial words

  • Combined scoring: tsvector matches rank higher, trigram provides fallback
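The combined scoring described above can be sketched in plain Ruby. This is a minimal illustration, not the module's actual implementation: the module name `FulltextScoringSketch`, the `merge` method, and the `{ node_id => score }` input shape are all assumptions, and the real scoring is presumably done in SQL. Only the two constant values come from this page.

```ruby
# Hypothetical sketch of "tsvector ranks higher, trigram is the fallback".
module FulltextScoringSketch
  TSVECTOR_SCORE_BOOST = 1.0          # value documented on this page
  TRIGRAM_SIMILARITY_THRESHOLD = 0.1  # value documented on this page

  module_function

  # tsvector_hits / trigram_hits: { node_id => raw_score } (assumed shape)
  def merge(tsvector_hits, trigram_hits)
    scores = {}

    # Trigram matches only count when they reach the similarity threshold.
    trigram_hits.each do |id, similarity|
      next if similarity < TRIGRAM_SIMILARITY_THRESHOLD
      scores[id] = similarity
    end

    # A tsvector match replaces any trigram score and receives the boost,
    # so it sorts above fuzzy-only matches (similarity never exceeds 1.0).
    tsvector_hits.each do |id, rank|
      scores[id] = rank + TSVECTOR_SCORE_BOOST
    end

    scores.sort_by { |_id, score| -score }
          .map { |id, score| { id: id, score: score } }
  end
end
```

With a boost of 1.0, any node matched by tsvector scores at least 1.0, while a trigram-only node scores at most 1.0, which is what keeps word matches ahead of fuzzy ones in this sketch.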

Results are cached, so repeating a search with the same arguments returns the stored result instead of querying the database again.

Security: All queries use parameterized placeholders to prevent SQL injection.
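To illustrate what parameterized placeholders look like for this kind of search, here is a hedged sketch that only builds the SQL text and the parameter list. The method name `build_fulltext_query`, the `nodes` table, and the `content` and `created_at` columns are assumptions for illustration; the module's real query is not shown on this page. The PostgreSQL functions used (`to_tsvector`, `plainto_tsquery`, `ts_rank`) are standard.

```ruby
# Hypothetical query builder: user input travels in params, never in the SQL text.
def build_fulltext_query(query:, timeframe:, limit:)
  sql = <<~SQL
    SELECT id, content,
           ts_rank(to_tsvector('english', content),
                   plainto_tsquery('english', $1)) AS rank
    FROM nodes
    WHERE created_at BETWEEN $2 AND $3
      AND to_tsvector('english', content) @@ plainto_tsquery('english', $1)
    ORDER BY rank DESC
    LIMIT $4
  SQL

  # Values are bound positionally to $1..$4 by the database driver.
  [sql, [query, timeframe.begin, timeframe.end, limit]]
end
```

With the pg gem, a pair like this would typically be passed to `PG::Connection#exec_params(sql, params)`. Because the query string is sent as a bound value, quotes or SQL keywords inside it are treated as data.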

Constant Summary

MAX_FULLTEXT_LIMIT = 1000

    Maximum results to prevent DoS via unbounded queries.

TRIGRAM_SIMILARITY_THRESHOLD = 0.1

    Minimum trigram similarity threshold (0.0-1.0). Lower = more fuzzy matches, higher = stricter matching.

TSVECTOR_SCORE_BOOST = 1.0

    Score boost for tsvector matches over trigram matches. Ensures exact word matches rank above fuzzy matches.
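As a small illustration of how MAX_FULLTEXT_LIMIT bounds a caller-supplied limit, the sketch below applies the same `to_i.clamp` expression that appears in the method source. The names `SKETCH_MAX_FULLTEXT_LIMIT` and `sketch_safe_limit` are hypothetical and exist only for this example.

```ruby
SKETCH_MAX_FULLTEXT_LIMIT = 1000 # same value as the documented constant

# Coerce to Integer, then force the value into 1..SKETCH_MAX_FULLTEXT_LIMIT.
def sketch_safe_limit(limit)
  limit.to_i.clamp(1, SKETCH_MAX_FULLTEXT_LIMIT)
end

sketch_safe_limit(25)     # => 25
sketch_safe_limit(5000)   # => 1000
sketch_safe_limit(-3)     # => 1
sketch_safe_limit(nil)    # => 1  (nil.to_i is 0)
sketch_safe_limit("abc")  # => 1  ("abc".to_i is 0)
```

Note that non-numeric input falls back to the lower bound of 1 rather than raising, because `to_i` returns 0 for `nil` and for strings without a leading number.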

Instance Method Summary

Instance Method Details

#search_fulltext(timeframe:, query:, limit:, metadata: {}) ⇒ Array<Hash>

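Runs a cached full-text search. The limit is clamped to 1..MAX_FULLTEXT_LIMIT, and the elapsed time in milliseconds is recorded to HTM::Telemetry.search_latency with the attribute 'strategy' => 'fulltext'.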

Parameters:

  • timeframe (Range)

    Time range to search

  • query (String)

    Search query

  • limit (Integer)

    Maximum results (converted with to_i and clamped to 1..MAX_FULLTEXT_LIMIT)

  • metadata (Hash) (defaults to: {})

    Filter by metadata fields (default: {})

Returns:

  • (Array<Hash>)

    Matching nodes



# File 'lib/htm/long_term_memory/fulltext_search.rb', line 36

def search_fulltext(timeframe:, query:, limit:, metadata: {})
  # Enforce limit to prevent DoS
  safe_limit = limit.to_i.clamp(1, MAX_FULLTEXT_LIMIT)

  start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = @cache.fetch(:fulltext, timeframe, query, safe_limit, metadata) do
    search_fulltext_uncached(
      timeframe: timeframe,
      query: query,
      limit: safe_limit,
      metadata: metadata
    )
    )
  end
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time) * 1000).round
  HTM::Telemetry.search_latency.record(elapsed_ms, attributes: { 'strategy' => 'fulltext' })
  result
end
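The method above combines two patterns: a cache lookup keyed on the search arguments, and latency measurement with the monotonic clock. The sketch below reproduces both in isolation. `SketchQueryCache` and `timed_ms` are hypothetical stand-ins; the real `@cache` object and `HTM::Telemetry` are not documented on this page, so their behavior here is assumed.

```ruby
# Hypothetical in-memory cache with a fetch(*key_parts) { ... } interface.
class SketchQueryCache
  def initialize
    @store = {}
  end

  # Returns the stored value for these key parts, or runs the block once
  # on a miss and stores its result.
  def fetch(*key_parts)
    return @store[key_parts] if @store.key?(key_parts)

    @store[key_parts] = yield
  end
end

# Runs the block and returns [result, elapsed milliseconds as an Integer].
# CLOCK_MONOTONIC is unaffected by wall-clock adjustments.
def timed_ms
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000).round
  [result, elapsed_ms]
end

cache = SketchQueryCache.new
nodes, ms = timed_ms do
  cache.fetch(:fulltext, (1..5), "ruby", 10, {}) { ["node"] }
end
```

Because the timer wraps the cache lookup, cache hits are measured too, so the recorded latency reflects what the caller experiences rather than only database time.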