Class: Pikuri::VectorDb::Search

Inherits:
Tool
  • Object
Defined in:
lib/pikuri/vector_db/search.rb

Overview

The LLM-facing piece: a Tool subclass exposed as vectordb_search. Agentic search — the agent decides when to retrieve. Composes Embedder + a Backend + optionally a Reranker into one call:

  1. Embed the query via Embedder into a vector.

  2. Backend#query for candidates. Window is CANDIDATE_K when a reranker is configured (we’ll narrow back to FINAL_K), or FINAL_K directly when not (no point over-fetching candidates we’ll just throw away).

  3. Optionally rerank: pass the candidate texts to the Reranker, reorder by Reranker::Hit#index, replace each result’s score with the reranker’s relevance score, then take the top FINAL_K.

  4. Format as a numbered list with source + score + text snippet, capped at SNIPPET_LENGTH chars per chunk.
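Step 4 can be sketched as follows. This is a minimal illustration, not the library's format_observation: the Result struct, its field names (source, score, text), the helper name sketch_format_observation, and the exact line layout are all assumptions made for the example.

```ruby
# Hypothetical stand-in for Backend::Result; the field names are assumptions.
Result = Struct.new(:source, :score, :text, keyword_init: true)

# Mirrors the documented per-result cap.
SNIPPET_LENGTH = 500

# Render results as a numbered list: rank, source, score, then the text
# truncated to SNIPPET_LENGTH characters. The layout is illustrative only.
def sketch_format_observation(results)
  results.each_with_index.map do |result, i|
    snippet = result.text[0, SNIPPET_LENGTH]
    format("%d. %s (score=%.3f)\n%s", i + 1, result.source, result.score, snippet)
  end.join("\n\n")
end

results = [
  Result.new(source: 'docs/deploy.md', score: 0.8123,
             text: 'Migrations run before the app boots.'),
  Result.new(source: 'docs/long.md', score: 0.5, text: 'x' * 600)
]

puts sketch_format_observation(results)
```

With five results at 500 characters each, the snippet text alone comes to about 2,500 characters, which is where the ~2.5 KB page budget comes from.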

Which score is shown

The score in the observation tracks whatever actually determined the ordering, so the number never disagrees with the rank:

  • Vector-only (no reranker, or reranker outage fallback): the Backend::Result cosine similarity, range [0, 1], directly comparable across results.

  • Reranked: the Reranker::Hit relevance score that produced the order. This is a cross-encoder score on a model-dependent scale (it can be negative or unbounded for raw logits; see the yardoc for Reranker::Hit’s score), so it is a relative confidence gradient, not a 0–1 value to threshold on. The two modes therefore put different scales behind one score= label by design: cosine similarity and a rerank score are not comparable, and pikuri shows the score the ordering was actually built from rather than a misleading cosine next to a rerank-determined list.
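The score swap in reranked mode can be illustrated with a small sketch. Result and Hit here are hypothetical stand-ins for Backend::Result and Reranker::Hit (only Hit#index and a score are named in this page; the other fields are assumptions), and apply_rerank is an illustrative helper, not a pikuri method. The sketch sorts hits by score itself, since this page does not say whether the reranker returns them best-first.

```ruby
# Hypothetical stand-ins; only Hit#index and the scores are named in the docs.
Result = Struct.new(:source, :score, :text, keyword_init: true)
Hit = Struct.new(:index, :score, keyword_init: true)

FINAL_K = 5

# Reorder candidates by rerank score (highest first) and replace each
# candidate's cosine score with the rerank score that determined its rank.
def apply_rerank(candidates, hits)
  hits.sort_by { |hit| -hit.score }.map do |hit|
    original = candidates.fetch(hit.index)
    Result.new(source: original.source, score: hit.score, text: original.text)
  end.first(FINAL_K)
end

candidates = [
  Result.new(source: 'a.md', score: 0.91, text: 'alpha'),
  Result.new(source: 'b.md', score: 0.88, text: 'beta'),
  Result.new(source: 'c.md', score: 0.80, text: 'gamma')
]
hits = [
  Hit.new(index: 2, score: 4.2),
  Hit.new(index: 0, score: -1.3),
  Hit.new(index: 1, score: 0.7)
]

reranked = apply_rerank(candidates, hits)
reranked.each { |r| puts "#{r.source} score=#{r.score}" }
```

Note that a.md had the best cosine similarity but ends up last with a negative score: the displayed number follows the rerank order, which is why it should be read as a relative gradient rather than a thresholdable value.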

Reranker outage falls back, not fails

If the reranker’s HTTP call raises (server down, timeout, 5xx, malformed response — all surfaced by Reranker::LlamaServer as RuntimeError), the tool catches it, logs a WARN, and returns the vector-only top FINAL_K from the same candidate set. Reranker is a quality knob, not a correctness one — degraded results beat a tool error the agent has to recover from.
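A minimal sketch of the fallback behaviour described above. The real helper is reranked_or_fallback, whose body is not shown on this page; the rerank(query:, documents:) signature, the DownReranker class, the log message, and the use of plain strings as candidates are assumptions for illustration.

```ruby
require 'logger'

FINAL_K = 5
LOGGER = Logger.new($stderr)

# Hypothetical reranker that simulates an outage. A bare `raise` with a
# message raises RuntimeError, the error class the docs say is surfaced.
class DownReranker
  def rerank(query:, documents:)
    raise 'connection refused'
  end
end

# Try the reranker; on RuntimeError log a warning and return the
# vector-only top FINAL_K from the same candidate set.
def sketch_rerank_or_fallback(reranker:, query:, candidates:)
  reranker.rerank(query: query, documents: candidates)
rescue RuntimeError => e
  LOGGER.warn("reranker failed (#{e.message}); falling back to vector-only order")
  candidates.first(FINAL_K)
end

candidates = (1..50).map { |i| "chunk #{i}" }
fallback = sketch_rerank_or_fallback(
  reranker: DownReranker.new, query: 'how do migrations run?', candidates: candidates
)
puts fallback.inspect
```

Because the candidate window was already over-fetched to CANDIDATE_K, the fallback needs no second backend query: the first FINAL_K candidates are the vector-only answer.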

Why a single-param tool surface

vectordb_search(query:) takes no top_k: and no reranker: toggle. Retrieval depth and rerank choice are host policy decisions baked into the Extension’s configuration; the LLM doesn’t need (or want) to be tuning retrieval parameters mid-conversation. This is the same reasoning the IDEAS.md design called out for keeping the tool surface minimal.

Constant Summary

LOGGER =
Pikuri.logger_for('VectorDb::Search')
FINAL_K =

Returns:

  • (Integer)

    number of results returned to the LLM. Five is the sweet spot most RAG implementations converge on: enough to cover related-but-different angles on a query, few enough to fit in a turn without budget pressure.

5
CANDIDATE_K =

Returns:

  • (Integer)

    over-fetch size when a reranker is configured. Fifty matches the “retrieve-broad-then-narrow” pattern IDEAS.md §“Vector DB / RAG” calls out — the reranker’s cross-encoder sees more candidates than vector search alone would surface and reorders by query-conditional relevance.

50
SNIPPET_LENGTH =

Returns:

  • (Integer)

    per-result text cap in the observation. Keeps the snippet text of a five-result page to roughly 2.5 KB (5 × 500 characters); the agent can re-query with a sharper question if a snippet looks like it cuts off mid-thought.

500
DESCRIPTION =

Returns:

  • (String)

    static description shown to the LLM, opencode-shape (summary + Usage: bullets).

<<~DESC
  Search the indexed document corpus for content relevant to a query.

  Usage:
  - Use when the user asks about facts, definitions, or topics that would live in their indexed documents (notes, docs, knowledge base).
  - Phrase the query as a complete question or topic statement — the embedder reads natural language, not keyword bags.
  - Returns up to #{FINAL_K} results with `source` paths and text snippets; cite results back using the source path.
  - If a query returns nothing useful, the relevant doc may not be in the indexed corpus — the corpus is what the user explicitly indexed at boot.
DESC

Constructor Details

#initialize(embedder:, backend:, reranker: nil) ⇒ Search

Parameters:

  • embedder (#embed)

    anything implementing embed(Array<String>) -> Array<Array<Float>>. Typically Embedder.

  • backend (#query, #upsert, #delete_all, #count)

    any Backend implementation.

  • reranker (#rerank, nil) (defaults to: nil)

    optional. nil skips reranking and retrieves FINAL_K from the backend directly.



# File 'lib/pikuri/vector_db/search.rb', line 107

def initialize(embedder:, backend:, reranker: nil)
  super(
    name: 'vectordb_search',
    description: DESCRIPTION,
    parameters: Pikuri::Tool::Parameters.build { |p|
      p.required_string :query,
                        'Natural-language search query, e.g. ' \
                        '"how does the deployment pipeline handle migrations?" or ' \
                        '"recipe with mushrooms and risotto rice".'
    },
    execute: lambda { |query:|
      Search.execute(
        embedder: embedder, backend: backend, reranker: reranker,
        query: query
      )
    }
  )
end

Class Method Details

.execute(embedder:, backend:, reranker:, query:) ⇒ String

Public so specs can exercise the search pipeline without constructing a Tool wrapper.

Parameters:

  • embedder (#embed)
  • backend (#query)
  • reranker (#rerank, nil)
  • query (String)

Returns:

  • (String)

    formatted observation.



# File 'lib/pikuri/vector_db/search.rb', line 134

def self.execute(embedder:, backend:, reranker:, query:)
  return 'Error: query is empty' if query.nil? || query.strip.empty?

  candidate_k = reranker ? CANDIDATE_K : FINAL_K
  query_vector = embedder.embed([query]).first
  candidates = backend.query(vector: query_vector, top_k: candidate_k)
  return 'No matches found in the indexed corpus.' if candidates.empty?

  final = if reranker
            reranked_or_fallback(reranker: reranker, query: query, candidates: candidates)
          else
            candidates
          end.first(FINAL_K)

  format_observation(final)
end
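Since .execute is public for the benefit of specs, a spec might drive it with duck-typed doubles along these lines. This is a sketch under assumptions: FakeEmbedder, FakeBackend, and the Result struct with its field names are hypothetical, and the real Backend::Result may expose different accessors. The call into pikuri is guarded so the snippet also runs when the gem is not loaded.

```ruby
# Hypothetical stand-in for Backend::Result; field names are assumptions.
Result = Struct.new(:source, :score, :text, keyword_init: true)

# Implements the documented embed(Array<String>) -> Array<Array<Float>> shape.
class FakeEmbedder
  def embed(texts)
    texts.map { |text| [text.length.to_f, 1.0] }
  end
end

# Returns canned results and records the requested window, so a spec can
# check that FINAL_K or CANDIDATE_K was asked for.
class FakeBackend
  attr_reader :last_top_k

  def initialize(results)
    @results = results
  end

  def query(vector:, top_k:)
    @last_top_k = top_k
    @results.first(top_k)
  end
end

backend = FakeBackend.new(
  [
    Result.new(source: 'notes/a.md', score: 0.9, text: 'first chunk'),
    Result.new(source: 'notes/b.md', score: 0.7, text: 'second chunk')
  ]
)

# Only exercised when pikuri is actually loaded.
if defined?(Pikuri::VectorDb::Search)
  puts Pikuri::VectorDb::Search.execute(
    embedder: FakeEmbedder.new, backend: backend, reranker: nil,
    query: 'what is in my notes?'
  )
end
```

With reranker: nil, the backend double should see top_k equal to FINAL_K, which backend.last_top_k makes easy to assert.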