Class: Pikuri::Tool::Search::Exa

Inherits:
Object
Defined in:
lib/pikuri/tool/search/exa.rb

Overview

Performs an Exa search via the official /search endpoint and returns the hits as a list of Result rows. Split into a thin HTTP fetch (#search) and a pure parser (.parse) so tests can exercise the parser against fixture JSON without hitting the network. The cascade in Pikuri::Tool::Search::Engines#search owns the final Markdown rendering.

Instances are constructed with the API key they should use (Exa.new(api_key:)). Engines builds one only when an Exa key was configured, so users who haven’t registered never spend money on it, and then drives it through the same #search / #label interface as every other provider. pikuri reads no key from the environment (see CLAUDE.md “Environment is not a secret store”). Get a key at exa.ai; the service is paid.

The request sets type: “auto” (Exa picks neural vs keyword per query) and contents: { highlights: true }, so each result carries a short neural-ranked snippet. This is the closest analog to Brave’s description field and populates Result#body consistently across providers.
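As an illustration of the request described above, the JSON body can be built and inspected without touching the network. The helper name exa_request_body is hypothetical and not part of pikuri; the keys mirror what #search sends.

```ruby
require 'json'

# Hypothetical helper (not part of pikuri) that builds the same JSON body
# #search posts to the /search endpoint.
def exa_request_body(query, max_results: 10)
  JSON.dump(
    query: query,
    type: 'auto',                   # let Exa choose neural vs keyword
    numResults: max_results,        # upper bound on returned hits
    contents: { highlights: true }  # ask for a snippet per result
  )
end

puts exa_request_body('ruby faraday timeout', max_results: 3)
```

Serializing the body in one place keeps the wire format easy to compare against Exa's API reference when a field needs to change.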

Privacy posture

Exa’s Privacy Policy states Query Data is used to improve our products and technology, including by training and fine-tuning models that power our Services, and the Terms of Service §1.2(c) grant Exa a perpetual and irrevocable, sub-licensable, worldwide license over User Input that can be disclosed to third parties as needed. Business customers under a Master Subscription Agreement / DPA get carve-outs; the default pay-as-you-go API key (which is what pikuri uses) does not.

Bottom line: Exa does not sell queries to data brokers, but it does mine them to train competing models, and the license it claims is effectively “do what we want with this, forever”. If a query would be embarrassing or sensitive in a training set, simply don’t configure an Exa key — Pikuri::Tool::Search::Engines#providers leaves Exa out of the cascade unless its key was supplied to the constructor.

Constant Summary

ENDPOINT =

Search endpoint (POST, JSON body).

Returns:

  • (String)

'https://api.exa.ai/search'
DEFAULT_MAX_RESULTS =

Default number of results returned, matching DuckDuckGo::DEFAULT_MAX_RESULTS.

Returns:

  • (Integer)

10
LIMITER =

Exa is paid and doesn’t aggressively throttle, so no minimum interval is enforced. The 5-minute cooldown still applies on Pikuri::Tool::Search::Engines::Unavailable so the user’s budget isn’t burned on doomed retries while a 429 / 5xx condition persists.

Returns:

  • (RateLimiter)

RateLimiter.new(min_interval: 0.0, cooldown: 300.0)
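RateLimiter itself is not documented on this page. The following is only a sketch of how a limiter with a minimum interval and a cooldown could behave, assuming it yields to a block and opens the cooldown window whenever the block raises Unavailable. SketchLimiter, its clock: parameter, and the top-level Unavailable class are hypothetical stand-ins, not pikuri's implementation.

```ruby
# Stand-in for Pikuri::Tool::Search::Engines::Unavailable.
class Unavailable < StandardError; end

# Hypothetical limiter; pikuri's real RateLimiter may differ internally.
class SketchLimiter
  def initialize(min_interval:, cooldown:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @min_interval = min_interval
    @cooldown = cooldown
    @clock = clock
    @last_call = nil
    @cooldown_until = nil
  end

  def call
    now = @clock.call
    # Fail fast while a previous Unavailable is still cooling down.
    raise Unavailable, 'provider is cooling down' if @cooldown_until && now < @cooldown_until

    # Space out calls; with min_interval 0.0 this never sleeps.
    wait = @last_call ? @min_interval - (now - @last_call) : 0.0
    sleep(wait) if wait.positive?
    @last_call = @clock.call

    begin
      yield
    rescue Unavailable
      # Open the cooldown window, then let the caller see the failure.
      @cooldown_until = @clock.call + @cooldown
      raise
    end
  end
end

limiter = SketchLimiter.new(min_interval: 0.0, cooldown: 300.0)

begin
  limiter.call { raise Unavailable, 'HTTP 429' }
rescue Unavailable => e
  puts "first call failed: #{e.message}"
end

begin
  limiter.call { puts 'not reached during cooldown' }
rescue Unavailable => e
  puts "second call short-circuited: #{e.message}"
end
```

Injecting the clock is a choice made for this sketch so the cooldown can be exercised in tests without waiting five minutes.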

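Class Method Summary

  • .parse(json, max_results: DEFAULT_MAX_RESULTS) ⇒ Array<Result>

    Parse an Exa Search JSON response into a list of Result rows.

Instance Method Summary

  • #initialize(api_key:) ⇒ Exa

    Returns a new instance of Exa.

  • #label ⇒ String

    Short provider label for Pikuri::Tool::Search::Engines logging / fallback messages.

  • #search(query, max_results: DEFAULT_MAX_RESULTS) ⇒ Array<Result>

    Fetch results for query and return them as an Array<Result>.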

Constructor Details

#initialize(api_key:) ⇒ Exa

Returns a new instance of Exa.

Parameters:

  • api_key (String)

    Exa API key. Required and non-blank: pikuri reads no key from the environment — the host supplies it (Pikuri::Tool::Search::Engines only constructs an Exa when a key was configured).

Raises:

  • (ArgumentError)

    if api_key is blank



# File 'lib/pikuri/tool/search/exa.rb', line 63

def initialize(api_key:)
  raise ArgumentError, 'Exa Search API key is blank' if api_key.to_s.strip.empty?

  @api_key = api_key
end
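
The guard above treats nil, an empty string, and a whitespace-only string alike. A minimal sketch of the same check, using a hypothetical blank_key? helper that is not part of pikuri:

```ruby
# Hypothetical helper mirroring the guard in #initialize.
def blank_key?(api_key)
  api_key.to_s.strip.empty?
end

# Made-up candidate values; the last one is a placeholder, not a real key.
[nil, '', "  \n", 'exa-key-123'].each do |candidate|
  puts "#{candidate.inspect} blank? #{blank_key?(candidate)}"
end
```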

Class Method Details

.parse(json, max_results: DEFAULT_MAX_RESULTS) ⇒ Array<Result>

Parse an Exa Search JSON response into a list of Result rows, where body is the first non-empty highlights snippet (empty when Exa returned no highlight for that result — e.g. for navigational results).

When the response yields zero result entries, two cases are distinguished: a genuine “no results” payload (response carries a requestId and an empty results array — Exa ran the query but matched nothing) returns an empty array instead of raising, so Pikuri::Tool::Search::Engines#search can render its standard no-results stub. Anything else (unknown shape, structured error) raises with a diagnostic so the failure surfaces.

Parameters:

  • json (String)

    response body from ENDPOINT

  • max_results (Integer) (defaults to: DEFAULT_MAX_RESULTS)

    maximum number of result entries

Returns:

  • (Array<Result>)

    hits, possibly empty on a recognized empty-results payload

Raises:

  • (RuntimeError)

    when the response yields no result entries and is not recognized as a genuine empty-results payload



# File 'lib/pikuri/tool/search/exa.rb', line 139

def self.parse(json, max_results: DEFAULT_MAX_RESULTS)
  data = JSON.parse(json)
  results = Array(data['results']).take(max_results).filter_map do |r|
    href = r['url'].to_s
    next nil if href.empty?

    Result.new(
      url: href,
      title: clean(r['title']) || href,
      body: first_highlight(r['highlights'])
    )
  end

  if results.empty?
    return [] if genuine_no_results?(data)

    raise diagnose_empty(data, json)
  end

  results
end
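
The private helpers first_highlight and genuine_no_results? are called in the listing but not shown on this page. The sketch below guesses at their behavior from the prose above (first non-empty highlight, and a requestId plus an empty results array); the bodies and the fixture payloads are assumptions, not pikuri's code or real Exa responses.

```ruby
require 'json'

# Hypothetical stand-ins for the private helpers used by .parse.
# Names mirror the listing; bodies are assumptions based on the prose.

# First non-empty highlight snippet, or '' when there is none.
def first_highlight(highlights)
  Array(highlights).map { |h| h.to_s.strip }.find { |h| !h.empty? } || ''
end

# True when Exa ran the query but matched nothing: a requestId is present
# and results is an empty array.
def genuine_no_results?(data)
  data.is_a?(Hash) &&
    !data['requestId'].to_s.empty? &&
    data['results'].is_a?(Array) &&
    data['results'].empty?
end

# Made-up fixtures for illustration only.
empty_payload = JSON.parse('{"requestId":"abc123","results":[]}')
error_payload = JSON.parse('{"error":"invalid request"}')

puts genuine_no_results?(empty_payload)         # prints true
puts genuine_no_results?(error_payload)         # prints false
puts first_highlight([' ', 'Ruby HTTP client']) # prints Ruby HTTP client
```

Because the helpers are pure functions of the decoded JSON, they can be checked against fixture files in the same way as .parse.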

Instance Method Details

#label ⇒ String

Returns short provider label for Pikuri::Tool::Search::Engines logging / fallback messages.

Returns:

  • (String)



# File 'lib/pikuri/tool/search/exa.rb', line 71

def label
  'Exa'
end

#search(query, max_results: DEFAULT_MAX_RESULTS) ⇒ Array<Result>

Fetch results for query and return them as an Array<Result>. Calls are circuit-broken for 5 minutes on rate-limit / unavailable responses; see LIMITER. The caller (typically Pikuri::Tool::Search::Engines#search) is expected to have already normalized the query and to wrap this in a result cache.

Parameters:

  • query (String)

    search query (already normalized)

  • max_results (Integer) (defaults to: DEFAULT_MAX_RESULTS)

    maximum number of result entries; passed through as Exa’s numResults

Returns:

  • (Array<Result>)

    hits, possibly empty when Exa ran the query and matched nothing

Raises:

  • (Engines::Unavailable)

    when Exa returns HTTP 429 (rate limit / quota exhausted) or 5xx — “try again later” responses the cascade in Pikuri::Tool::Search::Engines#search can fall back from. Also raised immediately if LIMITER is in cooldown. Other non-2xx (e.g. 401/403 from a bad API key) bubble up as RuntimeError so config problems stay visible.

  • (RuntimeError)

    for non-rate-limit HTTP failures or when the response shape contains no results and isn’t a recognized empty-results payload.



# File 'lib/pikuri/tool/search/exa.rb', line 95

def search(query, max_results: DEFAULT_MAX_RESULTS)
  LIMITER.call do
    response = Faraday.post(ENDPOINT) do |req|
      req.headers['x-api-key'] = @api_key
      req.headers['Content-Type'] = 'application/json'
      req.headers['Accept'] = 'application/json'
      req.body = JSON.dump(
        query: query,
        type: 'auto',
        numResults: max_results,
        contents: { highlights: true }
      )
    end
    unless response.success?
      if response.status == 429 || response.status >= 500
        raise Engines::Unavailable, "HTTP #{response.status}"
      end

      raise "Exa Search request failed: #{response.status} #{response.body}"
    end

    self.class.parse(response.body, max_results: max_results)
  end
end
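
The status handling in #search splits responses into three outcomes. The hypothetical classify_status helper below restates that branching as a pure function so it can be checked without a network call; the symbols it returns are illustrative and not part of pikuri.

```ruby
# Hypothetical helper restating the branching in #search.
#   :ok          -> 2xx, the body is handed to .parse
#   :unavailable -> 429 or 5xx, raised as Engines::Unavailable (cascade may fall back)
#   :fatal       -> any other status, raised as RuntimeError (e.g. bad API key)
def classify_status(status)
  return :ok if (200..299).cover?(status)
  return :unavailable if status == 429 || status >= 500

  :fatal
end

[200, 401, 403, 429, 500, 503].each do |status|
  puts "#{status} -> #{classify_status(status)}"
end
```

Keeping 401/403 out of the Unavailable bucket matches the stated intent that configuration problems stay visible instead of silently falling through to the next provider.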