Class: Pikuri::Tool::Search::Engines
- Inherits: Object
  - Object
  - Pikuri::Tool::Search::Engines
- Defined in: lib/pikuri/tool/search/engines.rb
Overview
Search-orchestration object: the cascade across configured providers, the result cache, and the Unavailable protocol marker the cascade uses to fall back. The LLM-facing tool itself is built by WebSearch.build, which constructs one of these and wires its #search into a Pikuri::Tool. Each Pikuri::Tool::Search provider module (DuckDuckGo, Brave, Exa) raises Unavailable when it wants the cascade to try the next one.
Provider keys are constructor config, not environment
Brave and Exa are paid and need an API key; DuckDuckGo needs none. An Engines is constructed with the keys it should use (brave_key: / exa_key:, both optional) — pikuri reads no key from the environment, so the only providers in the cascade are DuckDuckGo plus whichever keyed providers the host actually configured. The host sources those keys however it likes (the bundled bin/ examples load a JSON config file by convention); see CLAUDE.md “Environment is not a secret store”.
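As a rough illustration of the host-side wiring described above, the sketch below parses a JSON config and turns it into the keyword arguments the constructor accepts. The helper name `engine_keys_from` and the JSON field names are assumptions made up for this example; they are not pikuri conventions.

```ruby
require 'json'

# Hypothetical helper: maps a JSON config document onto the keyword
# arguments Engines.new accepts (brave_key: / exa_key:). The JSON field
# names "brave_api_key" and "exa_api_key" are assumed for illustration.
def engine_keys_from(json_text)
  config = JSON.parse(json_text)
  {
    brave_key: config['brave_api_key'],
    exa_key: config['exa_api_key']
  }
end

# Only a Brave key is configured here, so exa_key comes back nil and
# Exa would stay out of the cascade.
keys = engine_keys_from('{"brave_api_key": "example-brave-key"}')

# With pikuri loaded, the host would then construct the engine like so:
#   engines = Pikuri::Tool::Search::Engines.new(**keys)
```

Passing a nil key is harmless because the constructor treats nil and blank strings as "not configured".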
Defined Under Namespace
Classes: Unavailable
Constant Summary
- LOGGER =
  Subsystem logger; set its level with PIKURI_LOG_ENGINES (e.g. PIKURI_LOG_ENGINES=debug) or the global PIKURI_LOG.
  Pikuri.logger_for('Engines')
- CACHE =
  Process-shared on-disk cache backing #search’s default. Kept at class level (not per-instance) so every engine dedupes answered queries into one directory; the constructor’s cache: parameter injects a different store for tests. Exposed via the .cache class method so specs can swap it for UrlCache::NULL without touching the instance.
  UrlCache.new(ttl: UrlCache::DEFAULT_TTL, dir: "#{UrlCache::ROOT_DIR}/web_search")
Instance Attribute Summary
- #providers ⇒ Array<#search, #label> (readonly)
  The provider instances this engine cascades across, in declaration order (the cascade itself shuffles them per call).
Class Method Summary
- .cache ⇒ UrlCache, #fetch
  Accessor for CACHE, used as the constructor’s cache: default; specs override this to swap in UrlCache::NULL.
Instance Method Summary
- #initialize(brave_key: nil, exa_key: nil, cache: self.class.cache) ⇒ Engines (constructor)
  Builds the provider cascade once: DuckDuckGo always (no key needed), plus Brave / Exa when their key was supplied (non-blank).
- #search(query, max_results:) ⇒ String
  Run query through the configured providers in random order, falling back to the next one each time a provider raises Unavailable.
Constructor Details
#initialize(brave_key: nil, exa_key: nil, cache: self.class.cache) ⇒ Engines
Builds the provider cascade once: DuckDuckGo always (no key needed), plus Brave / Exa when their key was supplied (non-blank). Each keyed provider is constructed with its key, so from here on every provider is just an object answering #search / #label, and the cascade in #search treats them uniformly.
# File 'lib/pikuri/tool/search/engines.rb', line 72

def initialize(brave_key: nil, exa_key: nil, cache: self.class.cache)
  @providers = [DuckDuckGo.new]
  @providers << Brave.new(api_key: brave_key) unless brave_key.to_s.strip.empty?
  @providers << Exa.new(api_key: exa_key) unless exa_key.to_s.strip.empty?
  @cache = cache
  @last_logged_providers = nil
end
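To make the "non-blank" rule concrete, here is a minimal sketch of the same selection logic using stand-in objects. `KeyedStub` and `build_providers` are hypothetical names for this example only; the blank check (`to_s.strip.empty?`) is the one shown in the constructor source.

```ruby
# Hypothetical stand-in for a provider: just a label and the key it was built with.
KeyedStub = Struct.new(:label, :api_key)

# Mirrors the constructor's selection rule: the keyless provider is always
# present, and a keyed provider joins only when its key is non-blank.
def build_providers(brave_key: nil, exa_key: nil)
  providers = [KeyedStub.new('duckduckgo', nil)]
  providers << KeyedStub.new('brave', brave_key) unless brave_key.to_s.strip.empty?
  providers << KeyedStub.new('exa', exa_key) unless exa_key.to_s.strip.empty?
  providers
end

# A whitespace-only Brave key counts as "not supplied", so only Exa is added.
labels = build_providers(brave_key: '   ', exa_key: 'example-exa-key').map(&:label)
# labels is ["duckduckgo", "exa"]
```

Calling `to_s` first means nil, an empty string, and a whitespace-only string are all treated the same way.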
Instance Attribute Details
#providers ⇒ Array<#search, #label> (readonly)
The provider instances this engine cascades across, in declaration order (the cascade itself shuffles them per call).
# File 'lib/pikuri/tool/search/engines.rb', line 84

def providers
  @providers
end
Class Method Details
.cache ⇒ UrlCache, #fetch
Accessor for CACHE, used as the constructor’s cache: default; specs override this to swap in UrlCache::NULL.
# File 'lib/pikuri/tool/search/engines.rb', line 55

def self.cache
  CACHE
end
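The accessor-as-default pattern gives a test two places to swap the cache: the cache: keyword, or the class method itself. The sketch below shows both on a miniature class. `NullCache` and `MiniEngines` are hypothetical stand-ins (for UrlCache::NULL and Engines respectively), and the only cache interface assumed is a #fetch that takes a key and a block.

```ruby
# Hypothetical stand-in for UrlCache::NULL: never stores, always runs the block.
class NullCache
  def fetch(_key)
    yield
  end
end

# Hypothetical miniature of the pattern used by Engines: a constant holds the
# shared cache, and a class-method accessor supplies the constructor default.
class MiniEngines
  CACHE = Object.new # placeholder for the real on-disk cache

  def self.cache
    CACHE
  end

  attr_reader :cache

  def initialize(cache: self.class.cache)
    @cache = cache
  end
end

null_cache = NullCache.new

# Route 1: inject the cache for a single instance.
injected = MiniEngines.new(cache: null_cache)

# Route 2: override the class-level accessor so every new instance picks it up.
MiniEngines.define_singleton_method(:cache) { null_cache }
defaulted = MiniEngines.new
```

Because the default is evaluated at call time, overriding the accessor affects instances created afterwards without touching the constant.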
Instance Method Details
#search(query, max_results:) ⇒ String
Run query through the configured providers in random order, falling back to the next one each time a provider raises Unavailable. The shuffle spreads load so a single provider isn’t always hit first (and exhausted first); revisit if it stops being the right default.
The query is whitespace-trimmed and runs of whitespace are collapsed to a single space before the cascade runs; a query that is empty after cleaning raises ArgumentError. The winning provider’s Array<Result> is rendered into smolagents-style Markdown here (a “## Search Results” header, then [title](url)\nbody entries joined by blank lines; an empty array becomes “No results found.”), and the rendered Markdown is cached on disk via #initialize’s cache:, keyed by the cleaned query. A cache hit short-circuits the cascade entirely, so once a query is cached, the cooldown state of the provider that originally answered no longer matters. max_results is not part of the cache key, so callers passing a non-default value may get a result rendered with the previously cached size.
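The cleaning and rendering steps can be sketched as follows. The cleaning expression is the one from the method source; the renderer is a guess at the described format, since the private render method is not shown here. The `Result` fields (title, url, body) and the blank line between header and entries are assumptions.

```ruby
# Assumed shape of a search result; the real Result type is not shown in this page.
Result = Struct.new(:title, :url, :body)

# Same normalisation as #search: trim, then collapse whitespace runs.
def clean_query(query)
  query.to_s.strip.gsub(/\s+/, ' ')
end

# Hypothetical renderer following the described format: a header, then
# "[title](url)\nbody" entries separated by blank lines.
def render_results(results)
  return 'No results found.' if results.empty?

  entries = results.map { |r| "[#{r.title}](#{r.url})\n#{r.body}" }
  "## Search Results\n\n#{entries.join("\n\n")}"
end

cleaned = clean_query("  ruby \t web\n search ")
# cleaned is "ruby web search"

markdown = render_results([Result.new('Ruby', 'https://www.ruby-lang.org', 'Official site.')])
```

Since the cleaned string is the cache key, two queries that differ only in spacing share one cache entry.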
If every provider reports temporary unavailability, returns an “Error: …” string instead of raising — same convention as Calculator.calculate, so the agent loop can feed the failure back to the model as the next observation. Any non-Unavailable exception (network error, parser failure, malformed response, bad API key) bubbles up to the caller.
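The fallback behaviour described above can be sketched with stub providers. `FlakyProvider` and `cascade` are hypothetical names, and the local `Unavailable` class stands in for Engines::Unavailable; the real method also involves the cache and logging, which are left out here.

```ruby
# Stand-in for Pikuri::Tool::Search::Engines::Unavailable.
class Unavailable < StandardError; end

# Hypothetical provider stub answering #label and #search.
class FlakyProvider
  attr_reader :label

  def initialize(label, available:)
    @label = label
    @available = available
  end

  def search(query, max_results:)
    raise Unavailable, 'rate limited' unless @available

    ["#{label} result for #{query}"].first(max_results)
  end
end

# Try providers in random order; the first one that answers wins. Only
# Unavailable triggers a fallback, so any other exception propagates.
def cascade(providers, query, max_results:)
  failures = []
  providers.shuffle.each do |provider|
    begin
      return provider.search(query, max_results: max_results)
    rescue Unavailable => e
      failures << "#{provider.label} (#{e.message})"
    end
  end
  "Error: all search providers temporarily unavailable: #{failures.join('; ')}"
end

providers = [
  FlakyProvider.new('duckduckgo', available: false),
  FlakyProvider.new('brave', available: true)
]
answer = cascade(providers, 'ruby', max_results: 5)
# answer is ["brave result for ruby"] whichever provider the shuffle puts first

down = cascade([FlakyProvider.new('duckduckgo', available: false)], 'ruby', max_results: 5)
# down is "Error: all search providers temporarily unavailable: duckduckgo (rate limited)"
```

Returning a string for the all-unavailable case keeps the failure inside the normal return path, which is what lets an agent loop pass it back to the model as an observation.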
# File 'lib/pikuri/tool/search/engines.rb', line 116

def search(query, max_results:)
  cleaned = query.to_s.strip.gsub(/\s+/, ' ')
  raise ArgumentError, 'query is empty' if cleaned.empty?

  current_providers = providers
  log_providers(current_providers)
  hit = true
  result = @cache.fetch(cleaned) do
    hit = false
    failures = []
    results = nil
    chosen = nil
    current_providers.shuffle.each do |provider|
      results = provider.search(cleaned, max_results: max_results)
      chosen = provider
      break
    rescue Unavailable => e
      failures << "#{provider.label} (#{e.message})"
    end
    # Raise so {UrlCache#fetch} does NOT persist the all-unavailable
    # message — otherwise that string would block every future search
    # for this query until the TTL expires. The outer +rescue+ turns
    # the raise back into the calculator-style "Error: …" string.
    chosen or raise Unavailable, "all search providers temporarily unavailable: #{failures.join('; ')}"
    LOGGER.info do
      "engine=#{chosen.label} query=#{cleaned.inspect} results=#{results.size}"
    end
    render(results)
  end
  LOGGER.info { "cache=hit query=#{cleaned.inspect} bytes=#{result.bytesize}" } if hit
  result
rescue Unavailable => e
  "Error: #{e.message}"
end
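The comment inside the cache block relies on #fetch not storing anything when its block raises. A minimal in-memory sketch of that behaviour is below. `MemoryCache` is a hypothetical stand-in, not UrlCache, and the local `Unavailable` class stands in for Engines::Unavailable.

```ruby
# Stand-in for Pikuri::Tool::Search::Engines::Unavailable.
class Unavailable < StandardError; end

# Hypothetical in-memory cache with the same #fetch shape as the one used
# by #search: return the stored value, or run the block and store its result.
class MemoryCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    # If the block raises, the assignment never happens, so nothing is stored.
    @store[key] = yield
  end
end

cache = MemoryCache.new

# First attempt: every provider is down, so the block raises and the
# caller converts the exception into an "Error: ..." string.
outcome =
  begin
    cache.fetch('ruby web search') { raise Unavailable, 'all providers down' }
  rescue Unavailable => e
    "Error: #{e.message}"
  end

# Second attempt: the failure was not cached, so the block runs again.
retried = cache.fetch('ruby web search') { 'rendered markdown' }

# Third attempt: now it is a cache hit, and the block is skipped.
cached = cache.fetch('ruby web search') { 'never evaluated' }
```

Had the error string been returned from inside the block instead, it would have been stored like any other result and served until the entry expired.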