Module: Pikuri::VectorDb::Reranker
Defined in:
- lib/pikuri/vector_db/reranker.rb
- lib/pikuri/vector_db/reranker/hit.rb
- lib/pikuri/vector_db/reranker/llama_server.rb
Overview
Namespace for rerankers — the optional quality knob that narrows a top-N candidate set (typically 50) down to the top-k the LLM actually sees (typically 5) by scoring each candidate’s relevance to the query with a cross-encoder model. One ships in v1:
- LlamaServer: a Faraday POST to /v1/rerank against a cross-encoder model on a llama.cpp server. This is the same wire format Cohere’s hosted reranker speaks, so adding a Reranker::Cohere later is a thin adapter (see IDEAS.md §“Vector DB / RAG” → “Deferred”).
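As a rough illustration of that wire format, the sketch below builds a request body and parses a response into index/score pairs. It is not the LlamaServer implementation: the field names (`model`, `query`, `documents`, `results`, `index`, `relevance_score`) are assumed from the Cohere-style rerank format, `SketchHit` is a hypothetical stand-in for Hit, and the model name is a placeholder. The HTTP call itself is left out so the example runs without a server.

```ruby
require "json"

# Hypothetical stand-in for Reranker::Hit; only index and score are assumed.
SketchHit = Struct.new(:index, :score, keyword_init: true)

# Build the JSON body for POST /v1/rerank (field names are an assumption).
def build_rerank_body(model:, query:, documents:)
  JSON.generate(model: model, query: query, documents: documents)
end

# Parse a response body into hits sorted by descending score.
def parse_rerank_response(body)
  JSON.parse(body)
      .fetch("results", [])
      .map { |r| SketchHit.new(index: r.fetch("index"), score: r.fetch("relevance_score").to_f) }
      .sort_by { |hit| -hit.score }
end

body = build_rerank_body(
  model: "placeholder-reranker",
  query: "how do I rotate logs?",
  documents: ["log rotation guide", "pasta recipe", "logrotate config"]
)

# A hand-written sample response, not captured from a real server.
sample_response = '{"results":[{"index":0,"relevance_score":0.12},' \
                  '{"index":2,"relevance_score":0.91},' \
                  '{"index":1,"relevance_score":0.4}]}'
hits = parse_rerank_response(sample_response)
```

A Cohere adapter would presumably differ mainly in base URL and authentication, which is why the source calls it thin.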
Reranker protocol
Duck-typed, single method:
- #rerank(query:, documents:): return an Array<Hit> sorted by descending score; one entry per input document (or fewer if the server truncates, but typical reranker behaviour is “score every input”). documents is an Array<String>; Hit.index is the position in that input array. Empty documents short-circuits to [] without an HTTP call.
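Because the protocol is duck-typed, any object with a matching #rerank can stand in for a real reranker, for instance in tests. The sketch below is one such hypothetical test double: `TermOverlapReranker` and `SketchHit` are invented names, and the scoring is plain term overlap, not a cross-encoder. It shows only the contract: keyword arguments, one hit per document, descending score, and the empty-input short-circuit.

```ruby
# Hypothetical stand-in for Reranker::Hit; only index and score are assumed.
SketchHit = Struct.new(:index, :score, keyword_init: true)

# Test double: scores each document by the fraction of query terms it contains.
class TermOverlapReranker
  def rerank(query:, documents:)
    # Empty input short-circuits, mirroring the protocol's "no HTTP call" rule.
    return [] if documents.empty?

    terms = query.downcase.scan(/\w+/).uniq
    hits = documents.each_with_index.map do |doc, index|
      words = doc.downcase.scan(/\w+/)
      matched = terms.count { |term| words.include?(term) }
      score = terms.empty? ? 0.0 : matched.to_f / terms.size
      SketchHit.new(index: index, score: score)
    end

    # Descending score; ties fall back to input order.
    hits.sort_by { |hit| [-hit.score, hit.index] }
  end
end

reranker = TermOverlapReranker.new
hits = reranker.rerank(
  query: "ruby vector search",
  documents: ["cooking pasta at home", "vector search in ruby", "ruby gems"]
)
```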
Why index, not document text
The Search tool already holds the candidate Backend::Results with their text + metadata. Returning bare indices keeps the rerank API minimal and avoids re-shuttling the document text over HTTP just to receive it back — Cohere’s return_documents flag exists exactly for callers that don’t already have the text, which isn’t us. See IDEAS.md §“Vector DB / RAG” for the full rationale (index+score is enough; document+score can be reconstructed by the caller).
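To make the reconstruction concrete, here is a minimal sketch of a caller mapping hits back onto candidates it already holds. `SketchResult` is a hypothetical stand-in for a backend result (the text and metadata fields are assumed from the description above), and `select_top_k` is an invented helper, not part of the library.

```ruby
# Hypothetical stand-ins; field names are assumptions for illustration.
SketchHit = Struct.new(:index, :score, keyword_init: true)
SketchResult = Struct.new(:text, :metadata, keyword_init: true)

# Pair each of the top-k hits with the candidate it points at.
# hits must already be sorted by descending score.
def select_top_k(candidates, hits, k)
  hits.first(k).map { |hit| [candidates.fetch(hit.index), hit.score] }
end

candidates = [
  SketchResult.new(text: "alpha", metadata: { source: "a.md" }),
  SketchResult.new(text: "beta",  metadata: { source: "b.md" }),
  SketchResult.new(text: "gamma", metadata: { source: "c.md" })
]

# The texts sent to the reranker are derived from the candidates in order,
# so hit.index lines up with positions in the candidates array.
documents = candidates.map(&:text)

hits = [
  SketchHit.new(index: 2, score: 0.9),
  SketchHit.new(index: 0, score: 0.5),
  SketchHit.new(index: 1, score: 0.1)
]

top = select_top_k(candidates, hits, 2)
```

Using `fetch` rather than `[]` makes an out-of-range index fail loudly instead of silently producing nil.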
Why nil-skip rather than a NOOP default
Pikuri::VectorDb::Extension accepts reranker: nil as “skip rerank entirely; retrieve top-k from the backend directly.” A Reranker::NOOP that returns the input ordering with synthetic scores was considered and rejected: it doesn’t unify the candidate-window choice (we’d either waste vector lookups by always fetching top-N or branch anyway on whether the reranker is “real”), and the faux scores would pollute observation formatting and debug logs.
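The nil-skip branch can be sketched as follows. Everything here is hypothetical: `retrieve`, `FakeBackend`, its `search(query, limit:)` signature, and `ReverseReranker` are invented for illustration and do not describe how Pikuri::VectorDb::Extension is actually written. The point is only that a nil reranker changes how many candidates are fetched, which a NOOP object could not express without its own branch.

```ruby
# Hypothetical stand-in for Reranker::Hit; only index and score are assumed.
SketchHit = Struct.new(:index, :score, keyword_init: true)

# Fake backend that records how many results each search asked for.
class FakeBackend
  attr_reader :requested_limits

  def initialize(texts)
    @texts = texts
    @requested_limits = []
  end

  def search(_query, limit:)
    @requested_limits << limit
    @texts.first(limit)
  end
end

# Toy reranker that prefers later documents, so reordering is visible.
class ReverseReranker
  def rerank(query:, documents:)
    documents.each_index
             .map { |i| SketchHit.new(index: i, score: i.to_f) }
             .sort_by { |hit| -hit.score }
  end
end

def retrieve(backend:, query:, reranker:, top_k: 5, top_n: 50)
  # No reranker: fetch only top-k, no wasted vector lookups.
  return backend.search(query, limit: top_k) if reranker.nil?

  # With a reranker: fetch the wider top-N window, then narrow to top-k.
  candidates = backend.search(query, limit: top_n)
  hits = reranker.rerank(query: query, documents: candidates)
  hits.first(top_k).map { |hit| candidates.fetch(hit.index) }
end

backend = FakeBackend.new(%w[a b c d e f g])
plain = retrieve(backend: backend, query: "q", reranker: nil, top_k: 2, top_n: 4)
reranked = retrieve(backend: backend, query: "q", reranker: ReverseReranker.new, top_k: 2, top_n: 4)
```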
Defined Under Namespace
Classes: Hit, LlamaServer