synoppy (Ruby)
Give your AI agents the whole web. Synoppy is the web-data layer for AI agents — one key to read, crawl, map, extract, classify & enrich any site, plus screenshots and image scraping. Pure net/http, zero runtime dependencies.
Get a free key → · Docs · synoppy.com
gem install synoppy
Quickstart
require "synoppy"
client = Synoppy::Client.new(api_key: ENV["SYNOPPY_API_KEY"])
# Read any URL -> clean markdown
page = client.read("https://stripe.com/blog", formats: ["markdown"])
puts page["markdown"]
# Crawl a site
site = client.crawl("https://example.com", limit: 25)
puts "#{site["count"]} of #{site["discovered"]} pages"
# AI structured extraction
result = client.extract("https://news.ycombinator.com", prompt: "Return { title, summary, topics }")
p result["data"]
# Brand intelligence
brand = client.enrich("linear.app")
p brand["colors"], brand["fonts"], brand["socials"]
Client.new also accepts base_url: (defaults to https://synoppy.com) and timeout: (seconds).
Methods
read(url, formats:, only_main_content:, timeout_ms:, render:, wait_ms:) — alias scrape
POST /api/scrape. Read a URL into clean markdown / html / text.
page = client.read(
"https://stripe.com/blog",
formats: ["markdown", "html"], # "markdown" | "html" | "text"
only_main_content: true, # strip nav/boilerplate
timeout_ms: 15_000, # per-request fetch budget
render: "auto", # true | false | "auto" — headless browser
wait_ms: 500 # extra wait after load before capture
)
page["markdown"]
page["metadata"]["title"] # title, description, language, siteName, author,
# ogImage, sourceUrl, statusCode, wordCount,
# fetchedAt, rendered, bytesIn
page["renderMs"] # present when rendered
page["latencyMs"]
screenshot(url, full_page:, wait_ms:, timeout_ms:)
POST /api/screenshot. Capture a PNG screenshot, returned as a data URL.
shot = client.screenshot("https://stripe.com", full_page: true, wait_ms: 500)
shot["screenshot"] # "data:image/png;base64,..."
shot["sourceUrl"]
shot["statusCode"]
shot["fullPage"]
May raise Synoppy::Error with code RENDER_UNAVAILABLE (HTTP 503) when the render backend is down.
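Because `screenshot` comes back as a data URL rather than raw bytes, saving it to disk means stripping the `data:image/png;base64,` prefix and decoding the remainder. A minimal sketch, assuming the payload is standard Base64; `decode_data_url` and `save_screenshot` are hypothetical helpers and not part of the gem:

```ruby
# Hypothetical helper (not part of the synoppy gem): split a base64 data URL
# into its MIME type and its decoded bytes.
def decode_data_url(data_url)
  match = data_url.match(/\Adata:(?<mime>[^;,]+);base64,(?<payload>.*)\z/m)
  raise ArgumentError, "not a base64 data URL" unless match

  # unpack1("m") decodes Base64 without requiring any library.
  [match[:mime], match[:payload].unpack1("m")]
end

# Hypothetical helper: write the decoded image to `path` in binary mode.
# Returns the number of bytes written.
def save_screenshot(shot, path)
  _mime, bytes = decode_data_url(shot["screenshot"])
  File.binwrite(path, bytes)
end

# Usage (requires a configured client):
#   shot = client.screenshot("https://stripe.com", full_page: true)
#   save_screenshot(shot, "stripe.png")
```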
crawl(url, limit:)
POST /api/crawl. Crawl a site into one clean page per discovered URL (requires a key). limit is 1–25.
site = client.crawl("https://example.com", limit: 25)
site["domain"]
site["discovered"]
site["count"]
site["pages"] # [{ "url", "title", "markdown", "words" }]
site["credits"]
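Each entry in `pages` carries `url`, `title`, `markdown`, and `words`, so a crawl result can be summarized locally without further requests. A small sketch; `crawl_summary` is a hypothetical helper, and it assumes `words` is an integer word count:

```ruby
# Hypothetical helper: summarize a crawl response hash.
# Assumes each page's "words" value is an integer (nil is treated as 0).
def crawl_summary(site)
  pages = site["pages"] || []
  longest = pages.max_by { |page| page["words"].to_i }
  {
    pages: pages.size,
    total_words: pages.sum { |page| page["words"].to_i },
    longest_url: longest && longest["url"]
  }
end

# Usage (requires a configured client):
#   summary = crawl_summary(client.crawl("https://example.com", limit: 25))
#   puts "#{summary[:pages]} pages, #{summary[:total_words]} words"
```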
map(url) — alias sitemap
POST /api/map. Discover every URL on a domain.
m = client.map("https://example.com")
m["urls"] # string[]
m["count"]
m["source"] # "sitemap" | "links"
m["domain"]
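Since `urls` is a flat array of strings, narrowing it to one section of a site is a local filtering step. A sketch using Ruby's standard `uri` library; `urls_under` is a hypothetical helper, and it assumes the entries are absolute URLs:

```ruby
require "uri"

# Hypothetical helper: keep only URLs whose path is `prefix` or sits below it.
# Entries that fail to parse are skipped rather than raising.
def urls_under(urls, prefix)
  urls.select do |url|
    path = URI.parse(url).path.to_s
    path == prefix || path.start_with?("#{prefix}/")
  rescue URI::InvalidURIError
    false
  end
end

# Usage (requires a configured client):
#   m = client.map("https://example.com")
#   blog_urls = urls_under(m["urls"], "/blog")
```

Matching on `prefix` plus a trailing slash keeps `/blogroll` out of a `/blog` filter.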
extract(url, prompt:)
POST /api/extract. AI-structured JSON extraction (requires a key). instruction: is accepted as an alias for prompt:.
result = client.extract("https://news.ycombinator.com", prompt: "Return { title, summary, topics }")
result["data"]
result["model"]
result["metadata"]
result["truncated"]
result["usage"] # { "inputTokens", "outputTokens" }
classify(url, labels:)
POST /api/classify. Classify a company by industry, or against your own labels (requires a key).
# Default: industry taxonomy
c = client.classify("https://stripe.com")
c["data"]["industry"]
c["data"]["naics_code"] # also naics_title, naics_sector, naics_sector_title, naics_valid
c["data"]["sic_code"] # also sic_title, sic_division, sic_division_title, sic_valid
c["data"]["categories"]
c["data"]["confidence"]
# Labels mode
c = client.classify("https://stripe.com", labels: ["fintech", "ecommerce", "social"])
c["data"]["label"]
c["data"]["matched"]
c["data"]["confidence"]
c["data"]["reasoning"]
enrich(url = nil, domain:, email:) — alias brand
POST /api/brand. Resolve a brand into a full profile. Pass exactly one identifier: a positional url, domain:, or email: (a work email is mapped to its domain).
brand = client.enrich("linear.app")
brand = client.enrich(domain: "linear.app")
brand = client.enrich(email: "jane@linear.app")
brand["name"]
brand["description"]
brand["logo"]
brand["colors"] # string[]
brand["fonts"] # string[]
brand["address"]
brand["socials"] # [{ "label", "url" }]
brand["domain"]
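Because `enrich` takes exactly one identifier, a caller holding a free-form string has to pick the keyword first. A sketch under stated assumptions: `enrich_args` is a hypothetical helper, it treats any input containing `@` as an email, and it only illustrates the client-side choice, not how the service resolves an email to a domain:

```ruby
# Hypothetical helper: turn a free-form identifier into the keyword
# arguments for `enrich`. Anything containing "@" is assumed to be an email.
def enrich_args(input)
  value = input.to_s.strip
  raise ArgumentError, "empty identifier" if value.empty?

  return { email: value } if value.include?("@")

  # Drop a scheme and any path so only the host remains.
  host = value.sub(%r{\Ahttps?://}i, "").split("/").first.to_s
  raise ArgumentError, "no host in #{input.inspect}" if host.empty?

  { domain: host.downcase }
end

# Usage (requires a configured client):
#   brand = client.enrich(**enrich_args("jane@linear.app"))
```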
images(url)
POST /api/images. Pull every image off a page.
imgs = client.images("https://stripe.com")
imgs["count"]
imgs["images"] # [{ "src", "alt", "width", "height" }]
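Pages often include icons and tracking pixels alongside real imagery, so filtering on the reported dimensions can be useful. A minimal sketch; `large_images` is a hypothetical helper, and it assumes `width` and `height` are pixel counts that may be missing for some images:

```ruby
# Hypothetical helper: keep images at least `min_width` x `min_height`.
# Assumes "width"/"height" are pixel counts; missing values count as 0.
def large_images(imgs, min_width: 200, min_height: 200)
  (imgs["images"] || []).select do |img|
    img["width"].to_i >= min_width && img["height"].to_i >= min_height
  end
end

# Usage (requires a configured client):
#   imgs = client.images("https://stripe.com")
#   large_images(imgs, min_width: 600).each { |img| puts img["src"] }
```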
Coming soon
act is not live yet — the method exists but raises NotImplementedError. It will map to /api/act once that endpoint ships.
Credits
Every successful response, from every method, includes creditsUsed (number) and creditsRemaining (number or nil) at the top level:
page = client.read("https://stripe.com")
puts "used #{page["creditsUsed"]}, #{page["creditsRemaining"]} remaining"
shot = client.screenshot("https://stripe.com")
puts "used #{shot["creditsUsed"]}, #{shot["creditsRemaining"]} remaining"
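Since both fields appear on every response, usage can be tracked in one place. A sketch with two hypothetical helpers; it assumes a nil `creditsRemaining` means no balance was reported, which this README does not spell out:

```ruby
# Hypothetical helper: true when the reported balance is below `threshold`.
# Assumption: nil means no balance was reported, so it is never "low".
def credits_low?(response, threshold: 100)
  remaining = response["creditsRemaining"]
  return false if remaining.nil?

  remaining < threshold
end

# Hypothetical helper: total credits consumed across several responses.
def total_credits_used(responses)
  responses.sum { |response| response["creditsUsed"] || 0 }
end

# Usage (requires a configured client):
#   page = client.read("https://stripe.com")
#   warn "credits running low" if credits_low?(page, threshold: 50)
```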
Errors
begin
client.crawl("https://example.com")
rescue Synoppy::Error => e
warn "#{e.code} #{e.status}: #{e.message}"
end
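Transient failures such as `RENDER_UNAVAILABLE` are often worth retrying with a short backoff. A sketch, not part of the gem: `with_retries` is a hypothetical helper that relies only on the error exposing `code`, and the error class is passed in so the helper stays self-contained:

```ruby
# Hypothetical helper: run the block, retrying with exponential backoff when
# it raises `error_class` with a retryable `code`. Other errors, and the
# final failed attempt, are re-raised unchanged.
def with_retries(error_class:, codes: ["RENDER_UNAVAILABLE"], attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    tries += 1
    yield
  rescue error_class => e
    raise unless codes.include?(e.code) && tries < attempts

    sleep(base_delay * (2**(tries - 1))) # 0.5s, 1s, 2s, ... by default
    retry
  end
end

# Usage (requires a configured client):
#   shot = with_retries(error_class: Synoppy::Error) do
#     client.screenshot("https://stripe.com")
#   end
```

Which codes are safe to retry is a judgment call; only `RENDER_UNAVAILABLE` is listed here because it is the one this README describes as a backend outage.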
MIT licensed.