Tavily

A lightweight, dependency-free Ruby client for the Tavily API — the web-access layer built for LLMs and AI agents.

It wraps every Tavily endpoint with typed response objects, automatic retries, streaming research, and a granular error hierarchy:

  • Search: search, plus the qna_search and search_context helpers
  • Extract: clean content from one or many URLs
  • Crawl: follow links from a root URL
  • Map: discover a site's URL structure
  • Research: start, poll, or live-stream an asynchronous research task

Built entirely on the Ruby standard library (net/http) — no runtime dependencies.

Installation

Add it to your Gemfile:

gem "tavily"

Then run bundle install. Or install it directly:

gem install tavily

Requires Ruby 3.1+.

Quick start

require "tavily"

client = Tavily::Client.new(api_key: "tvly-YOUR_API_KEY")

response = client.search("who won the 2022 FIFA World Cup?", include_answer: true)

response.answer            # => "Argentina won the 2022 FIFA World Cup..."
response.results.first.url # => "https://en.wikipedia.org/wiki/..."
response.results.first.title
response.credits           # credit usage, when include_usage: true

Get your API key from the Tavily dashboard.

Configuration

You can configure a global default (used by Tavily.search and friends, and as the default for every new client):

Tavily.configure do |config|
  config.api_key          = ENV["TAVILY_API_KEY"] # default: ENV["TAVILY_API_KEY"]
  config.base_url         = "https://api.tavily.com"
  config.timeout          = 60   # read timeout (seconds)
  config.open_timeout     = 10   # connection-open timeout (seconds)
  config.max_retries      = 2    # automatic retries for transient failures
  config.retry_base_delay = 0.5  # base seconds for exponential backoff
  config.proxy            = nil  # "http://user:pass@host:port"
  config.ca_file          = nil  # path to a PEM CA bundle (see "Windows / TLS")
  config.logger           = Logger.new($stdout) # optional request logging
end

Tavily.search("latest ruby release")

Every option can also be overridden per client:

client = Tavily::Client.new(api_key: "tvly-...", timeout: 120, max_retries: 5)

The following environment variables are read automatically: TAVILY_API_KEY, TAVILY_BASE_URL, TAVILY_TIMEOUT, TAVILY_MAX_RETRIES, TAVILY_PROXY, and SSL_CERT_FILE.
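
As a rough illustration of how environment-driven defaults like these can be resolved, the sketch below reads the documented variables from an ENV-like hash and falls back to the defaults shown in the configure block. The helper name tavily_env_defaults is hypothetical and not part of the gem, and the gem's own parsing may differ.

```ruby
# Hypothetical helper (not part of the gem): resolve settings from an
# ENV-like hash, falling back to the defaults documented above.
def tavily_env_defaults(env = ENV)
  {
    api_key:     env["TAVILY_API_KEY"],
    base_url:    env.fetch("TAVILY_BASE_URL", "https://api.tavily.com"),
    timeout:     Integer(env.fetch("TAVILY_TIMEOUT", "60")),
    max_retries: Integer(env.fetch("TAVILY_MAX_RETRIES", "2")),
    proxy:       env["TAVILY_PROXY"],
    ca_file:     env["SSL_CERT_FILE"]
  }
end

# Only the timeout is overridden; everything else uses the default.
settings = tavily_env_defaults({ "TAVILY_TIMEOUT" => "120" })
settings[:timeout]     # => 120
settings[:max_retries] # => 2
```

Passing the environment in as an argument keeps the lookup easy to test without touching the real process environment.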

Endpoints

Search

response = client.search(
  "embedded systems news",
  topic: "news",            # "general" (default), "news", or "finance"
  search_depth: "advanced", # "basic" (default), "advanced", "fast", "ultra-fast"
  max_results: 10,          # 0–20 (default 5)
  time_range: "week",       # "day" | "week" | "month" | "year"
  include_answer: "advanced",   # true/false, "basic", or "advanced"
  include_raw_content: "markdown",
  include_images: true,
  include_domains: ["arxiv.org"],
  exclude_domains: ["example.com"],
  include_usage: true
)

response.query
response.answer
response.results        # => [Tavily::SearchResult, ...]
response.results.first.title
response.results.first.content
response.results.first.score
response.urls           # => convenience array of result URLs
response.images         # => [Tavily::Image, ...] (url + optional description)
response.credits        # => Integer (when include_usage: true)
response.request_id

qna_search — just the answer

client.qna_search("what is the capital of France?")
# => "Paris is the capital of France."

search_context — a RAG-ready context string

Returns a JSON string encoding an array of objects, each with a "url" and a "content" key, trimmed to roughly max_tokens tokens.

context = client.search_context("ruby concurrency", max_tokens: 4000)
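
Because the return value is a JSON string, it can be parsed back into Ruby objects before being placed in a prompt. The sketch below uses a hand-written sample string in place of a real API response, and build_prompt is a hypothetical helper name, not something the gem provides.

```ruby
require "json"

# Sample stand-in for the string returned by search_context
# (illustrative data, not a real API response).
context = '[{"url":"https://example.com/a","content":"Ractors run in parallel."},' \
          '{"url":"https://example.com/b","content":"Threads share the GVL."}]'

# Hypothetical helper: turn the context into a numbered source list
# followed by the user's question.
def build_prompt(question, context_json)
  sources = JSON.parse(context_json)
  lines = sources.each_with_index.map do |source, i|
    "[#{i + 1}] #{source["url"]}\n#{source["content"]}"
  end
  "Answer using only these sources:\n\n#{lines.join("\n\n")}\n\nQuestion: #{question}"
end

prompt = build_prompt("How does Ruby handle concurrency?", context)
```

Numbering the sources makes it possible to ask the model to cite them by index.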

Extract

response = client.extract(
  ["https://docs.tavily.com/welcome", "https://example.com"],
  extract_depth: "advanced", # "basic" (default) or "advanced"
  format: "markdown",        # "markdown" (default) or "text"
  include_images: true,
  include_usage: true
)

response.results        # => [Tavily::ExtractResult, ...]
response.results.first.raw_content
response.failed_results # => [Tavily::FailedResult, ...] (url + error)

A single URL string works too: client.extract("https://example.com"). Up to 20 URLs per request.
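
Since a single request accepts at most 20 URLs, longer lists need to be split into batches first. A minimal sketch using Enumerable#each_slice follows; the URL list is made up for illustration, and the commented-out lines only indicate where client.extract would be called.

```ruby
MAX_URLS_PER_REQUEST = 20 # per-request limit stated above

# Made-up URL list for illustration.
urls = (1..45).map { |n| "https://example.com/page/#{n}" }

# Split the list into chunks of at most 20 URLs each.
batches = urls.each_slice(MAX_URLS_PER_REQUEST).to_a
batches.map(&:size) # => [20, 20, 5]

# With a configured client, each batch could then be sent separately:
# results = batches.flat_map { |batch| client.extract(batch).results }
```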

Crawl

response = client.crawl(
  "https://docs.tavily.com",
  instructions: "Find all pages about pricing", # natural-language guidance
  max_depth: 2,        # 1–5
  max_breadth: 50,     # links per level (1–500)
  limit: 100,          # total links to process
  select_paths: ["/documentation/.*"], # regex allowlist
  exclude_paths: ["/blog/.*"],         # regex blocklist
  extract_depth: "basic"
)

response.base_url
response.results # => [Tavily::CrawlResult, ...] (url, raw_content, favicon)

Map

response = client.map("https://docs.tavily.com", max_depth: 2, limit: 100)

response.base_url
response.results # => ["https://docs.tavily.com/...", ...] (array of URLs)
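
Because results is a plain array of URL strings, the standard library is enough to organize it. The sketch below groups a made-up result list by first path segment using URI; the sample URLs are illustrative and not real API output.

```ruby
require "uri"

# Made-up stand-in for response.results.
urls = [
  "https://docs.tavily.com/documentation/api-reference",
  "https://docs.tavily.com/documentation/quickstart",
  "https://docs.tavily.com/faq",
  "https://docs.tavily.com/"
]

# Group URLs by their first path segment ("/" for the site root).
sections = urls.group_by do |url|
  segment = URI(url).path.split("/").reject(&:empty?).first
  segment ? "/#{segment}" : "/"
end

sections.keys                   # => ["/documentation", "/faq", "/"]
sections["/documentation"].size # => 2
```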

Research (asynchronous)

Research is an asynchronous endpoint. Start a task, then either poll for the result or stream it live.

Create + poll:

task = client.research(
  "Compare the leading Ruby HTTP clients in 2025",
  model: "mini",            # "mini", "pro", or "auto" (default)
  output_length: "standard" # "short", "standard", or "long"
)

task.request_id
task.status        # => "pending"

# Block until it finishes (raises on failure / timeout):
result = client.wait_for_research(task.request_id, poll_interval: 3, timeout: 600)
result.content     # => the final report (String, or Hash if output_schema given)
result.sources     # => [Tavily::ResearchSource, ...] (title, url, favicon)

# Or check once, yourself:
client.research_task(task.request_id).completed?
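
The blocking wait can be pictured as a loop that checks the task until it completes or a deadline passes. Below is a minimal sketch of that pattern, using a stand-in block instead of a real client. The poll_until helper is hypothetical and does not reflect how wait_for_research is actually implemented.

```ruby
# Hypothetical helper: call the block every `interval` seconds until it
# returns a truthy value, or raise once the next wait would pass `timeout`.
def poll_until(interval:, timeout:)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) + interval > deadline
      raise "timed out after #{timeout}s"
    end
    sleep interval
  end
end

# Stand-in for a status check that succeeds on the third attempt.
attempts = 0
report = poll_until(interval: 0.01, timeout: 1) do
  attempts += 1
  attempts >= 3 ? "final report" : nil
end
```

With a real client, the block would call client.research_task(task.request_id) and return the task once completed? is true. A monotonic clock is used so the deadline is unaffected by system clock adjustments.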

To stream live, pass a block to receive Server-Sent Events. Each event is a Tavily::ResearchEvent with an .event name and parsed .data. The stream is OpenAI-compatible: each chat.completion.chunk event carries, under choices[0].delta, either report text (content) or a research step (tool_calls). The stream ends with a done event, or with an error event on failure:

client.research("Latest developments in fusion energy", model: "mini") do |event|
  case event.event
  when "chat.completion.chunk"
    delta = event.data.dig("choices", 0, "delta")
    print delta["content"] if delta && delta["content"]
  when "error"
    warn event.data["error"]
  when "done"
    puts "\n[done]"
  end
end

The gem yields every event exactly as the API sends it, so new event types and fields reach your block without a gem update.
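
To collect the streamed report into a single string rather than printing it, the deltas can be appended to a buffer. The sketch below replays hand-written events through the same case logic. The SampleEvent struct is a stand-in for Tavily::ResearchEvent, and the payloads are invented for illustration.

```ruby
# Stand-in for Tavily::ResearchEvent (only .event and .data are used).
SampleEvent = Struct.new(:event, :data)

# Invented sample events following the shape described above.
events = [
  SampleEvent.new("chat.completion.chunk", { "choices" => [{ "delta" => { "content" => "Fusion " } }] }),
  SampleEvent.new("chat.completion.chunk", { "choices" => [{ "delta" => { "tool_calls" => [] } }] }),
  SampleEvent.new("chat.completion.chunk", { "choices" => [{ "delta" => { "content" => "update." } }] }),
  SampleEvent.new("done", {})
]

report = +""     # mutable buffer for the report text
finished = false

events.each do |event|
  case event.event
  when "chat.completion.chunk"
    # Tool-call chunks have no "content" key, so dig returns nil for them.
    text = event.data.dig("choices", 0, "delta", "content")
    report << text if text
  when "done"
    finished = true
  end
end

report   # => "Fusion update."
finished # => true
```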

You can also request structured output with a JSON Schema:

client.research(
  "Top 3 EVs by range in 2025",
  output_schema: {
    "type" => "object",
    "properties" => { "cars" => { "type" => "array", "items" => { "type" => "string" } } },
    "required" => ["cars"]
  }
)

Response objects

Every endpoint returns a typed object that wraps the raw JSON. Declared fields are exposed as methods, and the full payload is always reachable:

response = client.search("ruby")

response.results.first.title  # typed accessor
response["request_id"]        # raw access by key (String or Symbol)
response.dig("usage", "credits")
response.to_h                 # the complete parsed Hash

Because access falls through to the raw hash, any new field Tavily adds is reachable immediately — and any new request parameter can be passed through as a keyword argument without waiting for a gem update:

client.search("ruby", some_new_param: true) # forwarded straight to the API
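
One way such a fall-through wrapper can be built is sketched below. This is an illustration under assumptions, not the gem's source: the class names RawBacked and SampleResponse and their internals are hypothetical.

```ruby
# Hypothetical wrapper (not the gem's implementation): declared fields get
# reader methods, and everything else stays reachable through the raw hash.
class RawBacked
  # Define a reader method that looks up the field in the raw hash.
  def self.field(name)
    define_method(name) { @raw[name.to_s] }
  end

  def initialize(raw)
    @raw = raw
  end

  # Accept String or Symbol keys.
  def [](key)
    @raw[key.to_s]
  end

  def dig(key, *rest)
    @raw.dig(key.to_s, *rest)
  end

  def to_h
    @raw
  end
end

class SampleResponse < RawBacked
  field :query
  field :request_id
end

response = SampleResponse.new(
  { "query" => "ruby", "request_id" => "abc",
    "usage" => { "credits" => 1 }, "brand_new_field" => 42 }
)

response.query                   # => "ruby"
response[:request_id]            # => "abc"
response.dig("usage", "credits") # => 1
response["brand_new_field"]      # => 42 (no declared accessor needed)
```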

Error handling

Non-2xx responses raise a subclass of Tavily::APIError, which carries the HTTP status, parsed body, and request id:

begin
  client.search("ruby")
rescue Tavily::RateLimitError => e
  retry_later # placeholder for your own logic: back off and reschedule the request
rescue Tavily::APIError => e
  e.status      # => 401
  e.message     # => "[401] Unauthorized: missing or invalid API key."
  e.request_id  # => "..." (quote this in support tickets)
  e.body        # => parsed response body
end

Exception                         Status  Meaning
Tavily::BadRequestError           400     Invalid request or parameter value
Tavily::AuthenticationError       401     Missing or invalid API key
Tavily::ForbiddenError            403     Not permitted (e.g. unsupported URL)
Tavily::NotFoundError             404     Resource not found
Tavily::UnprocessableEntityError  422     Request body failed validation
Tavily::RateLimitError            429     Rate limit exceeded (honors Retry-After)
Tavily::PlanLimitError            432     Plan/key credit quota exceeded
Tavily::PayAsYouGoLimitError      433     Pay-as-you-go limit exceeded
Tavily::ServerError               5xx     Tavily server-side error

Tavily::PlanLimitError and Tavily::PayAsYouGoLimitError share the ancestor Tavily::UsageLimitError. Network problems raise Tavily::TimeoutError or Tavily::ConnectionError, and a missing key raises Tavily::ConfigurationError. Everything ultimately descends from Tavily::Error.
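
The shared ancestors make it possible to rescue at whatever level of detail is needed, as long as more specific classes are listed first. The sketch below uses stand-in classes in a hypothetical ErrorSketch module so that it runs without the gem. It mirrors the relationships described above, but the direct parent of UsageLimitError is an assumption.

```ruby
# Stand-in classes mirroring the relationships described above. The real
# classes live in the Tavily namespace; placing UsageLimitError directly
# under APIError is an assumption made for this sketch.
module ErrorSketch
  class Error < StandardError; end
  class APIError < Error; end
  class UsageLimitError < APIError; end
  class PlanLimitError < UsageLimitError; end
  class PayAsYouGoLimitError < UsageLimitError; end
  class TimeoutError < Error; end
end

# Classify an error, matching from most specific to most general.
def classify(error)
  raise error
rescue ErrorSketch::UsageLimitError
  :out_of_credits
rescue ErrorSketch::APIError
  :api_failure
rescue ErrorSketch::Error
  :client_or_network_failure
end

classify(ErrorSketch::PlanLimitError.new)       # => :out_of_credits
classify(ErrorSketch::PayAsYouGoLimitError.new) # => :out_of_credits
classify(ErrorSketch::APIError.new)             # => :api_failure
classify(ErrorSketch::TimeoutError.new)         # => :client_or_network_failure
```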

Retries

Transient failures (HTTP 408/409/425/429/5xx and network timeouts) are retried automatically up to max_retries times with exponential backoff and jitter. On a 429, the Retry-After header is respected. Streaming research requests are not retried.
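
To make the backoff behaviour concrete, here is a minimal sketch of one common scheme, exponential backoff with "full jitter", where each delay is a random value below an exponentially growing cap. The backoff_delay helper is hypothetical, and the gem's exact formula may differ.

```ruby
# Hypothetical helper: delay before retry number `attempt` (1-based),
# using exponential backoff with full jitter. A Retry-After value, when
# present, takes precedence over the computed delay.
def backoff_delay(attempt, base: 0.5, retry_after: nil, rng: Random.new)
  return retry_after.to_f if retry_after

  cap = base * (2**(attempt - 1)) # 0.5, 1.0, 2.0, ... for attempts 1, 2, 3
  rng.rand * cap                  # random point in [0, cap)
end

backoff_delay(1)                 # somewhere in [0, 0.5)
backoff_delay(3)                 # somewhere in [0, 2.0)
backoff_delay(2, retry_after: 7) # => 7.0
```

The random component spreads retries from many clients over time, so they do not all hit the server at the same instant.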

Windows / TLS certificates

Some Windows Ruby builds (including MSVC builds) ship without a usable default OpenSSL certificate store, which causes certificate verify failed (unable to get local issuer certificate). Point the client at a CA bundle:

Tavily.configure { |c| c.ca_file = 'C:\path\to\cacert.pem' }

Or set it for the whole process before requiring the gem:

rem A bundle ships with Git for Windows, for example:
set SSL_CERT_FILE=C:\Program Files\Git\mingw64\etc\ssl\certs\ca-bundle.crt

You can also download an up-to-date bundle from https://curl.se/ca/cacert.pem.

Development

bin/setup            # bundle install + create .env
bundle exec rake     # run the specs and RuboCop
bin/console          # an IRB session with the gem loaded

The default rake task runs RSpec and RuboCop. The offline suite uses WebMock and makes no network calls.

To run the live integration suite (consumes credits):

TAVILY_LIVE=1 TAVILY_API_KEY=tvly-... bundle exec rspec spec/live_spec.rb

Contributing

Bug reports and pull requests are welcome at https://github.com/main-path/tavily.

License

Released under the MIT License.