Crawlora Ruby SDK
Ruby client for the public Crawlora web-scraping API. It wraps every public endpoint with grouped helpers and a dynamic call interface, plus retries, pagination, middleware hooks, and client-side rate limiting.
- Base URL:
https://api.crawlora.net/api/v1 - Auth: API key (
x-api-key) or JWT (Authorization) - Ruby: 3.0+
- Operation reference:
docs/operations.md· recipes:docs/recipes.md
Install
Published on RubyGems. The current release
is a prerelease (1.5.0.pre.sdk.3), so install it with --pre or pin the
version:
# Gemfile
gem "crawlora", "1.5.0.pre.sdk.3"
gem install crawlora --pre
Quick start
require "crawlora"
# Reads CRAWLORA_API_KEY from the environment if api_key: is omitted.
client = Crawlora.client(api_key: ENV["CRAWLORA_API_KEY"])
result = client.bing.search(q: "web scraping")
result["data"].each { |item| puts item["title"] }
client.close # release pooled keep-alive connections
Pass a block to auto-close:
Crawlora.client do |client|
puts client.amazon.product(asin: "B07FZ8S74R", language: "en_US")
end
Calling operations
Grouped helpers map directly to the API (client.<group>.<method>):
client.youtube.video(id: "dQw4w9WgXcQ")
client.google.search(q: "crawlora", country: "US")
Or call any operation dynamically by its id — handy for metaprogramming:
client.request("bing-search", { q: "web scraping", page: 2 })
# Discover operations:
Crawlora::OPERATION_COUNT # => total operations
Crawlora::GROUPS["bing"] # => { "search" => "bing-search", ... }
Crawlora::OperationId::BING_SEARCH # => "bing-search"
Configuration
Crawlora.client(
api_key: "…",
timeout: 30, # seconds per request
retries: 2, # retry attempts on retryable failures
retry_delay: 0.25, # base backoff (exponential + jitter, honors Retry-After)
request_id: true, # attach an x-request-id to every call
idempotency_keys: true, # stable Idempotency-Key on POST/PATCH
rate_limit: 5, # max requests/second (client-side)
max_concurrency: 4, # max in-flight requests across threads
headers: { "X-Tenant" => "acme" }
)
Per-request overrides use the reserved _-prefixed keyword on grouped calls, or
keyword args on request:
client.bing.search(q: "x", _timeout: 5, _headers: { "X-Trace" => "1" })
client.request("bing-search", { q: "x" }, response_type: "text", retries: 0)
Pagination
# Numeric (page/offset) — stops on the first empty page:
client.paginate_items("airbnb-room-reviews", { id: "123" }, max_pages: 5).each do |review|
puts review["text"]
end
# Cursor mode — supply the cursor param and a next-cursor extractor:
client.paginate("producthunt-leaderboard", {},
cursor_param: "cursor", next_cursor: ->(page) { page["next_cursor"] }) do |page|
puts page["data"]
end
Error handling
begin
client.bing.search(q: "x")
rescue Crawlora::ClientError => e # 4xx
warn "rejected (#{e.status}): #{e.} #{e.code}"
rescue Crawlora::ServerError => e # 5xx
warn "server error: #{e.status}"
rescue Crawlora::NetworkError => e # timeout / transport failure
warn "network: #{e.}"
end
All inherit from Crawlora::Error, which exposes status, code, body,
raw_body, headers, and request_id.
License
MIT. See LICENSE.