Module: Pikuri::Tool::Search::Brave
Defined in: lib/pikuri/tool/search/brave.rb
Overview
Performs a Brave Search via the official Web Search API and returns the hits as a list of Result rows. Split into a thin HTTP fetch (.search) and a pure parser (.parse) so tests can exercise the parser against fixture JSON without hitting the network. The cascade in Engines.search owns the final Markdown rendering.
Requires a Brave Search API key. Get one at api-dashboard.search.brave.com — the free “Data for Search” tier allows 1 query/sec and ~2k queries/month.
Privacy posture
Brave’s API Privacy Notice states that Search Query Logs are retained for 90 days (billing / troubleshooting) and that Brave does not collect any identifiers that can link a search query to an individual or their devices. Brave publicly commits not to use Search API query data to train its own models, and offers Zero Data Retention, but only on the Enterprise plan, not on the free “Data for Search” tier pikuri defaults to.
Bottom line: of pikuri’s three providers Brave has the cleanest API-level posture — no training-on-queries, no IP linkage, capped 90-day retention by default, real ZDR if you pay for it. Still a logged 90-day window on the cheap tier, so not a substitute for ZDR for genuinely sensitive queries.
Constant Summary
- ENDPOINT = 'https://api.search.brave.com/res/v1/web/search'
  Web Search endpoint.
- DEFAULT_MAX_RESULTS = 10
  Default number of results returned, matching DuckDuckGo::DEFAULT_MAX_RESULTS.
- ENV_KEY = 'BRAVE_SEARCH_API_KEY'
  Env var holding the API key, which .search sends as the X-Subscription-Token header.
- LIMITER = RateLimiter.new(min_interval: 1.0, cooldown: 300.0)
  Free-tier Brave caps at 1 req/sec; the 5-minute cooldown protects the limited monthly quota from being burned on doomed retries when a 429 hits.
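The throttle-plus-cooldown behaviour described for LIMITER can be sketched as follows. This is a minimal illustration, not pikuri's RateLimiter, whose source is not shown on this page. The class name SketchRateLimiter, the SketchUnavailable error, and the injectable clock and sleeper parameters are all assumptions made for the example. Only the min_interval: and cooldown: keywords and the block-taking call come from the documentation above.

```ruby
# Hypothetical stand-in for the error that signals "provider unavailable".
class SketchUnavailable < StandardError; end

# Sketch of a limiter that spaces calls out and opens a circuit after a failure.
# NOT pikuri's RateLimiter: only min_interval:, cooldown: and #call mirror it.
class SketchRateLimiter
  def initialize(min_interval:, cooldown:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @min_interval = min_interval
    @cooldown = cooldown
    @clock = clock       # injectable so the behaviour can be checked without waiting
    @sleeper = sleeper
    @last_call = nil     # time of the most recent attempt
    @open_until = nil    # while set and in the future, calls fail fast
    @mutex = Mutex.new
  end

  # Runs the block, serialising callers so the interval holds across threads.
  def call
    @mutex.synchronize do
      now = @clock.call
      if @open_until && now < @open_until
        raise SketchUnavailable, "cooling down for #{(@open_until - now).round}s"
      end

      if @last_call
        wait = @min_interval - (now - @last_call)
        @sleeper.call(wait) if wait.positive?
      end
      @last_call = @clock.call

      begin
        yield
      rescue SketchUnavailable
        # A rate-limit style failure trips the circuit for the cooldown period.
        @open_until = @clock.call + @cooldown
        raise
      end
    end
  end
end
```

With min_interval: 1.0 and cooldown: 300.0, a second call made immediately after the first waits about one second, and any call made within five minutes of a SketchUnavailable failure raises without running its block, so no quota is spent on it.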
Class Method Summary
- .parse(json, max_results: DEFAULT_MAX_RESULTS) ⇒ Array<Result>
  Parse a Brave Web Search JSON response into a list of Result rows.
- .search(query, max_results: DEFAULT_MAX_RESULTS, api_key: ENV.fetch(ENV_KEY, nil)) ⇒ Array<Result>
  Fetch results for query and return them as an Array<Result>.
Class Method Details
.parse(json, max_results: DEFAULT_MAX_RESULTS) ⇒ Array<Result>
Parse a Brave Web Search JSON response into a list of Result rows. HTML highlight tags (<strong>) inside title and description are stripped via Nokogiri so the output is plain text.
When the response yields zero result nodes, two cases are distinguished: a genuine “no results” payload (recognized search shape with empty mixed.main/top/side — typically a too-narrow query Brave couldn’t match) returns an empty array instead of raising, so Engines.search can render its standard no-results stub. Anything else (unknown layout, structured error) raises with a diagnostic so the failure surfaces.
    # File 'lib/pikuri/tool/search/brave.rb', line 109

    def self.parse(json, max_results: DEFAULT_MAX_RESULTS)
      data = JSON.parse(json)
      results = Array(data.dig('web', 'results')).take(max_results).filter_map do |r|
        href = r['url'].to_s
        next nil if href.empty?

        Result.new(
          url: href,
          title: strip_html(r['title']),
          body: strip_html(r['description'])
        )
      end

      if results.empty?
        return [] if genuine_no_results?(data)

        raise diagnose_empty(data, json)
      end

      results
    end
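The empty-result distinction above can be tried against fixture JSON without any network access. The sketch below is illustrative only. The bodies of genuine_no_results? and diagnose_empty are not shown on this page, so genuine_no_results_sketch? is a guess at the check based on the prose description (a "search"-typed payload whose mixed.main/top/side are all empty). parse_urls_sketch is a simplified stand-in that returns bare URLs instead of Result rows and skips HTML stripping. The three fixtures are hand-written, not captured Brave responses.

```ruby
require 'json'

# Assumed shape check: a search-typed payload whose mixed sections are all empty.
def genuine_no_results_sketch?(data)
  return false unless data.is_a?(Hash) && data['type'] == 'search'

  mixed = data['mixed']
  return false unless mixed.is_a?(Hash)

  %w[main top side].all? { |section| Array(mixed[section]).empty? }
end

# Simplified stand-in for .parse: returns URLs only, no Result rows.
def parse_urls_sketch(json)
  data = JSON.parse(json)
  urls = Array(data.dig('web', 'results')).filter_map do |row|
    url = row['url'].to_s
    url.empty? ? nil : url
  end
  return urls unless urls.empty?
  return [] if genuine_no_results_sketch?(data)

  # Anything else is surfaced rather than silently treated as "no hits".
  raise "unexpected Brave response (top-level keys: #{data.keys.join(', ')})"
end

# Hand-written fixtures (hypothetical, trimmed to the fields the sketch reads).
HIT_FIXTURE   = '{"type":"search","web":{"results":[{"url":"https://example.com"}]}}'
EMPTY_FIXTURE = '{"type":"search","mixed":{"type":"mixed","main":[],"top":[],"side":[]}}'
ERROR_FIXTURE = '{"type":"ErrorResponse","error":{"code":"EXAMPLE"}}'
```

Under these assumptions, the hit fixture yields one URL, the empty fixture yields an empty array, and the error fixture raises, which mirrors the three outcomes documented for .parse.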
.search(query, max_results: DEFAULT_MAX_RESULTS, api_key: ENV.fetch(ENV_KEY, nil)) ⇒ Array<Result>
Fetch results for query and return them as an Array<Result>. Calls are throttled to one per second and circuit-broken for 5 minutes on rate-limit / quota-exhausted responses; see LIMITER. The caller (typically Engines.search) is expected to have already normalized the query and to wrap this in a result cache.
    # File 'lib/pikuri/tool/search/brave.rb', line 70

    def self.search(query, max_results: DEFAULT_MAX_RESULTS, api_key: ENV.fetch(ENV_KEY, nil))
      raise ArgumentError, "Brave Search API key not set (#{ENV_KEY})" if api_key.to_s.strip.empty?

      LIMITER.call do
        response = Faraday.get(
          ENDPOINT,
          { q: query, count: max_results },
          { 'X-Subscription-Token' => api_key, 'Accept' => 'application/json' }
        )
        unless response.success?
          if response.status == 429 || response.status >= 500
            raise Engines::Unavailable, "HTTP #{response.status}"
          end

          raise "Brave Search request failed: #{response.status} #{response.body}"
        end
        parse(response.body, max_results: max_results)
      end
    end
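The status handling in the listing above sorts responses into three outcomes. A minimal sketch of that decision, pulled out as a pure function so it can be checked without Faraday, might look like this. The name classify_status and the symbols it returns are invented for illustration and are not part of pikuri. The 200..299 range reflects what Faraday's Response#success? treats as success.

```ruby
# Hypothetical helper mirroring the branch order in .search:
#   2xx          -> body is handed to the parser
#   429 or 5xx   -> transient / quota problem (Engines::Unavailable in the listing)
#   anything else -> plain RuntimeError carrying status and body
def classify_status(status)
  return :ok if (200..299).cover?(status)
  return :unavailable if status == 429 || status >= 500

  :fatal
end
```

The split matters because only the :unavailable outcomes raise Engines::Unavailable inside the LIMITER block. A 401 from a bad key or a 422 from a malformed request falls into :fatal, which presumably should reach the caller as a real error rather than be treated as a temporary outage.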