Class: Pikuri::VectorDb::Search
- Inherits: Tool
  - Object
  - Tool
  - Pikuri::VectorDb::Search
- Defined in: lib/pikuri/vector_db/search.rb
Overview
The LLM-facing piece: a Tool subclass exposed as vectordb_search. This is agentic search: the agent decides when to retrieve. It composes an Embedder, a Backend, and optionally a Reranker into one call:
1. Embed the query via Embedder into a vector.
2. Call Backend#query for candidates. The window is CANDIDATE_K when a reranker is configured (the results are narrowed back to FINAL_K afterwards), or FINAL_K directly when not (there is no point over-fetching candidates that would just be thrown away).
3. Optionally rerank: pass the candidate texts to the Reranker, reorder by Reranker::Hit#index, replace each result’s score with the reranker’s relevance score, then take the top FINAL_K.
4. Format as a numbered list with source, score, and a text snippet, capped at SNIPPET_LENGTH chars per chunk.
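The final formatting step is not shown on this page, so the following is only a minimal sketch of what it might look like. The Result struct, its field names, and the exact line layout are assumptions made for illustration; only the numbered-list shape and the SNIPPET_LENGTH cap come from the description above.

```ruby
# Hypothetical stand-in for Backend::Result; the real field names are assumed.
Result = Struct.new(:source, :text, :score, keyword_init: true)

SNIPPET_LENGTH = 500

# Render results as a numbered list: a header line with source and score,
# followed by the snippet, truncated to SNIPPET_LENGTH characters.
def format_observation(results)
  results.each_with_index.map { |result, i|
    snippet = result.text.strip[0, SNIPPET_LENGTH]
    header = "#{i + 1}. source=#{result.source} score=#{format('%.3f', result.score)}"
    "#{header}\n#{snippet}"
  }.join("\n\n")
end

results = [
  Result.new(source: 'notes/deploy.md', text: 'Migrations run before the app boots. ' * 40, score: 0.8123),
  Result.new(source: 'notes/risotto.md', text: 'Toast the rice first.', score: 0.4)
]
puts format_observation(results)
```

Truncating by character count keeps the page size predictable regardless of how long the indexed chunks are.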
Which score is shown
The score in the observation tracks whatever actually determined the ordering, so the number never disagrees with the rank:
- Vector-only (no reranker, or reranker outage fallback): the Backend::Result cosine similarity, range [0, 1], directly comparable across results.
- Reranked: the Reranker::Hit relevance score that produced the order. This is a cross-encoder score on a model-dependent scale (it can be negative or unbounded for raw logits; see the score yardoc on Reranker::Hit), so it is a relative confidence gradient, not a 0–1 value to threshold on.
The two modes therefore put different scales behind one score= label by design: cosine and a rerank score are not comparable, and pikuri shows the one the ordering was actually built from rather than a misleading cosine next to a rerank-determined list.
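As an illustration of the reranked mode, the sketch below reorders candidates by rerank score and overwrites each displayed score with the value that determined the order. The Result and Hit structs, the apply_rerank helper, and the explicit descending sort are assumptions for this example, not the library's actual code.

```ruby
# Hypothetical stand-ins for Backend::Result and Reranker::Hit.
Result = Struct.new(:source, :text, :score, keyword_init: true)
Hit = Struct.new(:index, :score, keyword_init: true)

FINAL_K = 5

# Each hit points back at a candidate through `index`. The cosine score is
# replaced with the rerank score so the number shown always matches the rank.
def apply_rerank(candidates, hits)
  hits.sort_by { |hit| -hit.score }
      .first(FINAL_K)
      .map { |hit|
        original = candidates.fetch(hit.index)
        Result.new(source: original.source, text: original.text, score: hit.score)
      }
end

candidates = [
  Result.new(source: 'a.md', text: 'alpha', score: 0.82),
  Result.new(source: 'b.md', text: 'beta',  score: 0.79),
  Result.new(source: 'c.md', text: 'gamma', score: 0.61)
]
# Raw cross-encoder scores: unbounded, possibly negative.
hits = [
  Hit.new(index: 2, score: 3.4),
  Hit.new(index: 0, score: -1.2),
  Hit.new(index: 1, score: 0.7)
]
apply_rerank(candidates, hits).each { |r| puts "#{r.source} score=#{r.score}" }
```

In this example the candidate with the lowest cosine similarity ends up first, which is why showing the original cosine beside the reranked order would mislead.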
Reranker outage falls back, not fails
If the reranker’s HTTP call raises (server down, timeout, 5xx, malformed response — all surfaced by Reranker::LlamaServer as RuntimeError), the tool catches it, logs a WARN, and returns the vector-only top FINAL_K from the same candidate set. Reranker is a quality knob, not a correctness one — degraded results beat a tool error the agent has to recover from.
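The fallback behaviour can be sketched roughly as follows. This is an assumption-laden illustration: the rerank(query:, documents:) method name, the DownReranker class, and the struct fields are hypothetical, and the real reranked_or_fallback may differ in detail.

```ruby
require 'logger'

FINAL_K = 5
LOGGER = Logger.new($stderr)

# Hypothetical stand-in for Backend::Result.
Result = Struct.new(:source, :text, :score, keyword_init: true)

# Hypothetical reranker whose server is down. `raise` with a bare message
# raises RuntimeError, the error class the description above mentions.
class DownReranker
  def rerank(query:, documents:)
    raise "rerank request failed for #{documents.size} documents"
  end
end

# On success, reorder by rerank score; on RuntimeError, log a warning and
# hand back the candidates in their original vector order.
def reranked_or_fallback(reranker:, query:, candidates:)
  hits = reranker.rerank(query: query, documents: candidates.map(&:text))
  hits.sort_by { |hit| -hit.score }.map { |hit|
    original = candidates.fetch(hit.index)
    Result.new(source: original.source, text: original.text, score: hit.score)
  }
rescue RuntimeError => e
  LOGGER.warn("rerank failed, serving vector-only results: #{e.message}")
  candidates
end

candidates = (1..8).map { |i|
  Result.new(source: "doc#{i}.md", text: "chunk #{i}", score: 1.0 - (i * 0.1))
}
final = reranked_or_fallback(reranker: DownReranker.new, query: 'q', candidates: candidates)
        .first(FINAL_K)
puts final.map(&:source).inspect
```

Because the fallback reuses the candidate set already fetched, no second backend query is needed when the reranker is unavailable.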
Why a single-param tool surface
The tool takes a single parameter, vectordb_search(query:), with no top_k: and no reranker: toggle. Retrieval depth and rerank choice are host policy decisions baked into the Extension’s configuration; the LLM doesn’t need (or want) to be tuning retrieval parameters mid-conversation. This is the same reasoning the IDEAS.md design called out for keeping the tool surface minimal.
Constant Summary
- LOGGER = Pikuri.logger_for('VectorDb::Search')
- FINAL_K = 5
  Number of results returned to the LLM. Five is the sweet spot most RAG implementations converge on: enough to cover related-but-different angles on a query, few enough to fit in a turn without budget pressure.
- CANDIDATE_K = 50
  Over-fetch size when a reranker is configured. Fifty matches the “retrieve-broad-then-narrow” pattern that IDEAS.md §“Vector DB / RAG” calls out: the reranker’s cross-encoder sees more candidates than vector search alone would surface and reorders by query-conditional relevance.
- SNIPPET_LENGTH = 500
  Per-result text cap in the observation. Keeps a five-result page under ~2.5 KB; the agent can re-query with a sharper question if a snippet looks like it cuts off mid-thought.
- DESCRIPTION =
  Static description shown to the LLM, opencode-shape (summary + Usage: bullets).
  <<~DESC
    Search the indexed document corpus for content relevant to a query.

    Usage:
    - Use when the user asks about facts, definitions, or topics that would live in their indexed documents (notes, docs, knowledge base).
    - Phrase the query as a complete question or topic statement — the embedder reads natural language, not keyword bags.
    - Returns up to #{FINAL_K} results with `source` paths and text snippets; cite results back using the source path.
    - If a query returns nothing useful, the relevant doc may not be in the indexed corpus — the corpus is what the user explicitly indexed at boot.
  DESC
Class Method Summary
- .execute(embedder:, backend:, reranker:, query:) ⇒ String
Public so specs can exercise the search pipeline without constructing a Tool wrapper.
Instance Method Summary
Constructor Details
#initialize(embedder:, backend:, reranker: nil) ⇒ Search
# File 'lib/pikuri/vector_db/search.rb', line 107

def initialize(embedder:, backend:, reranker: nil)
  super(
    name: 'vectordb_search',
    description: DESCRIPTION,
    # Single required parameter: the natural-language query.
    parameters: Pikuri::Tool::Parameters.build { |p|
      p.required_string :query,
                        'Natural-language search query, e.g. ' \
                        '"how does the deployment pipeline handle migrations?" or ' \
                        '"recipe with mushrooms and risotto rice".'
    },
    # Delegate to the class method so the pipeline stays testable on its own.
    execute: lambda { |query:|
      Search.execute(
        embedder: embedder, backend: backend, reranker: reranker, query: query
      )
    }
  )
end
Class Method Details
.execute(embedder:, backend:, reranker:, query:) ⇒ String
Public so specs can exercise the search pipeline without constructing a Tool wrapper.
# File 'lib/pikuri/vector_db/search.rb', line 134

def self.execute(embedder:, backend:, reranker:, query:)
  return 'Error: query is empty' if query.nil? || query.strip.empty?

  # Over-fetch only when a reranker will narrow the list afterwards.
  candidate_k = reranker ? CANDIDATE_K : FINAL_K
  query_vector = embedder.([query]).first
  candidates = backend.query(vector: query_vector, top_k: candidate_k)
  return 'No matches found in the indexed corpus.' if candidates.empty?

  final = if reranker
            reranked_or_fallback(reranker: reranker, query: query, candidates: candidates)
          else
            candidates
          end.first(FINAL_K)

  format_observation(final)
end
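Since .execute is public for the benefit of specs, test doubles along the following lines may be enough to drive the pipeline without a real embedding server or vector store. Everything here is hypothetical: FAKE_EMBEDDER, InMemoryBackend, and the Result fields are invented for illustration, and the sketch assumes the embedder is a callable that maps an array of strings to an array of vectors, which this page does not confirm.

```ruby
# Hypothetical stand-in for Backend::Result.
Result = Struct.new(:source, :text, :score, keyword_init: true)

# Toy embedder: counts of 'a' and 'e' plus a constant component, so every
# vector is non-zero and non-negative (cosine then stays within [0, 1]).
FAKE_EMBEDDER = lambda { |texts|
  texts.map { |text| [text.count('a').to_f, text.count('e').to_f, 1.0] }
}

# Brute-force backend double: scores every stored row by cosine similarity.
class InMemoryBackend
  def initialize(embedder)
    @embedder = embedder
    @rows = []
  end

  def add(source:, text:)
    @rows << { source: source, text: text, vector: @embedder.([text]).first }
  end

  def query(vector:, top_k:)
    @rows.map { |row|
      Result.new(source: row[:source], text: row[:text], score: cosine(vector, row[:vector]))
    }.sort_by { |result| -result.score }.first(top_k)
  end

  private

  def cosine(a, b)
    dot = a.zip(b).sum { |x, y| x * y }
    dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
  end
end

backend = InMemoryBackend.new(FAKE_EMBEDDER)
backend.add(source: 'a.md', text: 'banana bandana')
backend.add(source: 'e.md', text: 'eleven tepees')

query_vector = FAKE_EMBEDDER.(['aardvark alabama']).first
top = backend.query(vector: query_vector, top_k: 1)
puts top.first.source
```

With doubles of this shape, a spec could pass reranker: nil to cover the vector-only path and a raising stub to cover the fallback path.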