Module: Pikuri::Tool::Search::Brave

Defined in:
lib/pikuri/tool/search/brave.rb

Overview

Performs a Brave Search via the official Web Search API and returns the hits as a list of Result rows. Split into a thin HTTP fetch (#search) and a pure parser (#parse) so tests can exercise the parser against fixture JSON without hitting the network. The cascade in Engines.search owns the final Markdown rendering.

Requires a Brave Search API key. Get one at api-dashboard.search.brave.com — the free “Data for Search” tier allows 1 query/sec and ~2k queries/month.

Privacy posture

According to Brave’s API Privacy Notice, Search Query Logs are retained for 90 days (billing / troubleshooting), and Brave does not collect any identifiers that can link a search query to an individual or their devices. Brave publicly commits that the Search API does not use query data to train its own models. It also offers Zero Data Retention, but only on the Enterprise plan, not on the free “Data for Search” tier pikuri defaults to.

Bottom line: of pikuri’s three providers Brave has the cleanest API-level posture — no training-on-queries, no IP linkage, capped 90-day retention by default, real ZDR if you pay for it. Still a logged 90-day window on the cheap tier, so not a substitute for ZDR for genuinely sensitive queries.

Constant Summary

ENDPOINT =

Returns Web Search endpoint.

Returns:

  • (String)

    Web Search endpoint

'https://api.search.brave.com/res/v1/web/search'
DEFAULT_MAX_RESULTS =

Returns default number of results returned, matching DuckDuckGo::DEFAULT_MAX_RESULTS.

Returns:

  • (Integer)

    default number of results returned, matching DuckDuckGo::DEFAULT_MAX_RESULTS

10
ENV_KEY =

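Returns the name of the environment variable holding the API key, which is sent in the X-Subscription-Token header.

Returns:

  • (String)

    name of the environment variable holding the API key, which is sent in the X-Subscription-Token header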

'BRAVE_SEARCH_API_KEY'
LIMITER =

Returns the shared rate limiter. The free tier caps Brave at 1 req/sec; the 5-minute cooldown keeps the limited monthly quota from being burned on doomed retries after a 429.

Returns:

  • (RateLimiter)

    shared rate limiter: 1 req/sec to match the free-tier cap, plus a 5-minute cooldown that keeps the limited monthly quota from being burned on doomed retries after a 429

RateLimiter.new(min_interval: 1.0, cooldown: 300.0)
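
The RateLimiter class itself is not shown on this page. As a rough illustration of what min_interval and cooldown could mean, here is a minimal sketch, assuming the limiter spaces calls out by sleeping and starts its cooldown when the wrapped block raises an "unavailable" error. LimiterSketch, its Unavailable class, and the clock: / sleeper: parameters are hypothetical names introduced for the example, not pikuri's API.

```ruby
# Hypothetical stand-in, NOT pikuri's RateLimiter: a sketch of "minimum
# interval between calls plus a fail-fast cooldown after an unavailable error".
module LimiterSketch
  # Stand-in for Engines::Unavailable.
  class Unavailable < StandardError; end

  class RateLimiter
    # clock and sleeper are injectable (an assumption of this sketch) so the
    # timing logic can be exercised without real waiting.
    def initialize(min_interval:, cooldown:, clock: nil, sleeper: nil)
      @min_interval = min_interval
      @cooldown = cooldown
      @clock = clock || -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
      @sleeper = sleeper || ->(seconds) { sleep(seconds) }
      @last_call = nil
      @cooldown_until = nil
    end

    # Runs the block, waiting first if the previous call was too recent.
    # Raises Unavailable without running the block while cooling down.
    def call
      raise Unavailable, 'rate limiter in cooldown' if cooling_down?

      wait_for_slot
      @last_call = @clock.call
      begin
        yield
      rescue Unavailable
        # The block reported "try again later": start the cooldown window.
        @cooldown_until = @clock.call + @cooldown
        raise
      end
    end

    private

    def cooling_down?
      !@cooldown_until.nil? && @clock.call < @cooldown_until
    end

    def wait_for_slot
      return if @last_call.nil?

      remaining = @min_interval - (@clock.call - @last_call)
      @sleeper.call(remaining) if remaining.positive?
    end
  end
end
```

With min_interval: 1.0 and cooldown: 300.0, a second call made 0.25 s after the first would wait the remaining 0.75 s, and any call within five minutes of a failed one would raise without touching the network.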

Class Method Summary

Class Method Details

.parse(json, max_results: DEFAULT_MAX_RESULTS) ⇒ Array<Result>

Parse a Brave Web Search JSON response into a list of Result rows. HTML highlight tags (<strong>) inside title and description are stripped via Nokogiri so the output is plain text.

When the response yields zero result nodes, two cases are distinguished. A genuine “no results” payload (a recognized search shape with empty mixed.main/top/side, typically a too-narrow query Brave couldn’t match) returns an empty array instead of raising, so Engines.search can render its standard no-results stub. Anything else (unknown layout, structured error) raises with a diagnostic so the failure surfaces.

Parameters:

  • json (String)

    response body from ENDPOINT

  • max_results (Integer) (defaults to: DEFAULT_MAX_RESULTS)

    maximum number of result entries

Returns:

  • (Array<Result>)

    hits, possibly empty on a recognized empty-results payload

Raises:

  • (RuntimeError)

    when the response yields no result entries and is not recognized as a genuine empty-results payload



# File 'lib/pikuri/tool/search/brave.rb', line 109

def self.parse(json, max_results: DEFAULT_MAX_RESULTS)
  data = JSON.parse(json)
  results = Array(data.dig('web', 'results')).take(max_results).filter_map do |r|
    href = r['url'].to_s
    next nil if href.empty?

    Result.new(
      url: href,
      title: strip_html(r['title']),
      body: strip_html(r['description'])
    )
  end

  if results.empty?
    return [] if genuine_no_results?(data)

    raise diagnose_empty(data, json)
  end

  results
end
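
Because the parser is pure, its three outcomes (hits, genuine empty payload, unrecognized payload) can be checked against fixture JSON with no network access. The sketch below is a self-contained approximation rather than the module's real code: BraveParseSketch, its Result struct, the regex-based strip_html (the real one uses Nokogiri), and the body of genuine_no_results? are assumptions made for illustration, as are the fixture payloads.

```ruby
require 'json'

# Hypothetical stand-in for Pikuri::Tool::Search::Brave.parse, stdlib only.
module BraveParseSketch
  # Stand-in for pikuri's Result row.
  Result = Struct.new(:url, :title, :body, keyword_init: true)

  module_function

  # Regex stand-in for the Nokogiri-based tag stripping.
  def strip_html(text)
    text.to_s.gsub(/<[^>]+>/, '').strip
  end

  # Assumed check: a search-typed payload whose mixed.main/top/side are empty.
  def genuine_no_results?(data)
    return false unless data['type'] == 'search'

    mixed = data['mixed']
    mixed.is_a?(Hash) && %w[main top side].all? { |key| Array(mixed[key]).empty? }
  end

  def parse(json, max_results: 10)
    data = JSON.parse(json)
    results = Array(data.dig('web', 'results')).take(max_results).filter_map do |r|
      href = r['url'].to_s
      next if href.empty? # rows without a URL are dropped

      Result.new(url: href, title: strip_html(r['title']), body: strip_html(r['description']))
    end
    return results unless results.empty?
    return [] if genuine_no_results?(data)

    raise "Brave response contained no results: #{json[0, 200]}"
  end

  # Illustrative fixtures, not captured Brave responses.
  HITS = JSON.generate(
    'type' => 'search',
    'web' => { 'results' => [
      { 'url' => 'https://example.com/a',
        'title' => '<strong>Ruby</strong> docs',
        'description' => 'About <strong>Ruby</strong>' },
      { 'url' => '', 'title' => 'dropped', 'description' => 'no url' }
    ] }
  )
  EMPTY = JSON.generate('type' => 'search', 'mixed' => { 'main' => [], 'top' => [], 'side' => [] })
  ERROR = JSON.generate('type' => 'ErrorResponse', 'error' => { 'code' => 'EXAMPLE' })
end

rows = BraveParseSketch.parse(BraveParseSketch::HITS)
puts rows.map(&:title).inspect                               # titles with tags stripped
puts BraveParseSketch.parse(BraveParseSketch::EMPTY).inspect # empty array, no exception
```

Parsing the ERROR fixture raises RuntimeError in this sketch, which mirrors the documented split between a recognized empty payload and anything else.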

.search(query, max_results: DEFAULT_MAX_RESULTS, api_key: ENV.fetch(ENV_KEY, nil)) ⇒ Array<Result>

Fetch results for query and return them as an Array<Result>. Calls are throttled to one per second and circuit-broken for 5 minutes on rate-limit / quota-exhausted responses; see LIMITER. The caller (typically Engines.search) is expected to have already normalized the query and to wrap this in a result cache.

Parameters:

  • query (String)

    search query (already normalized)

  • max_results (Integer) (defaults to: DEFAULT_MAX_RESULTS)

    maximum number of result entries; passed through as Brave’s count (1..20)

  • api_key (String) (defaults to: ENV.fetch(ENV_KEY, nil))

    Brave Search subscription token; defaults to the ENV_KEY environment variable

Returns:

  • (Array<Result>)

    hits, possibly empty when Brave ran the query and matched nothing

Raises:

  • (ArgumentError)

    if no API key is available

  • (Engines::Unavailable)

    when Brave returns HTTP 429 (rate limit / quota exhausted) or 5xx — “try again later” responses the cascade in Engines.search can fall back from. Also raised immediately if LIMITER is in cooldown. Other non-2xx (e.g. 401/403 from a bad API key) bubble up as RuntimeError so config problems stay visible.

  • (RuntimeError)

    for non-2xx responses other than 429 and 5xx (e.g. 401/403), or when the response yields no result entries and is not recognized as a genuine empty-results payload.



# File 'lib/pikuri/tool/search/brave.rb', line 70

def self.search(query, max_results: DEFAULT_MAX_RESULTS, api_key: ENV.fetch(ENV_KEY, nil))
  raise ArgumentError, "Brave Search API key not set (#{ENV_KEY})" if api_key.to_s.strip.empty?

  LIMITER.call do
    response = Faraday.get(
      ENDPOINT,
      { q: query, count: max_results },
      { 'X-Subscription-Token' => api_key, 'Accept' => 'application/json' }
    )
    unless response.success?
      if response.status == 429 || response.status >= 500
        raise Engines::Unavailable, "HTTP #{response.status}"
      end

      raise "Brave Search request failed: #{response.status} #{response.body}"
    end

    parse(response.body, max_results: max_results)
  end
end
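
The error split matters to the caller: Engines::Unavailable means "try another provider", while a RuntimeError such as a rejected API key should stay visible. Engines.search is not shown here, so the following is only a sketch of how a cascade might consume those two signals. CascadeSketch, its Unavailable class, and the lambda providers are hypothetical.

```ruby
# Hypothetical cascade, NOT pikuri's Engines.search: falls back to the next
# provider on Unavailable and lets every other error propagate.
module CascadeSketch
  # Stand-in for Engines::Unavailable.
  class Unavailable < StandardError; end

  module_function

  # providers: Hash of name => callable taking the query and returning hits.
  def search(query, providers)
    skipped = []
    providers.each do |name, provider|
      begin
        return { provider: name, results: provider.call(query), skipped: skipped }
      rescue Unavailable => e
        # "Try again later" from this provider: record it and move on.
        skipped << "#{name}: #{e.message}"
      end
    end
    raise Unavailable, "all providers unavailable (#{skipped.join('; ')})"
  end
end

# Usage with stub providers standing in for Brave.search and a fallback engine.
brave = ->(_query) { raise CascadeSketch::Unavailable, 'HTTP 429' }
duckduckgo = ->(query) { ["hit for #{query}"] }

outcome = CascadeSketch.search('ruby faraday', { brave: brave, duckduckgo: duckduckgo })
puts outcome[:provider] # the provider that answered
puts outcome[:skipped].inspect
```

A provider that raises a plain RuntimeError (for example on a 401) is not rescued in this sketch, so a misconfigured key would stop the cascade instead of being silently masked by the fallback.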