synoppy (Ruby)

Give your AI agents the whole web. Synoppy is the web-data layer for AI agents — one key to read, crawl, map, extract, classify & enrich any site, plus screenshots and image scraping. Pure net/http, zero runtime dependencies.

Get a free key → · Docs · synoppy.com

gem install synoppy

Quickstart

require "synoppy"

client = Synoppy::Client.new(api_key: ENV["SYNOPPY_API_KEY"])

# Read any URL -> clean markdown
page = client.read("https://stripe.com/blog", formats: ["markdown"])
puts page["markdown"]

# Crawl a site
site = client.crawl("https://example.com", limit: 25)
puts "#{site["count"]} of #{site["discovered"]} pages"

# AI structured extraction
result = client.extract("https://news.ycombinator.com", prompt: "Return { title, summary, topics }")
p result["data"]

# Brand intelligence
brand = client.enrich("linear.app")
p brand["colors"], brand["fonts"], brand["socials"]

Client.new also accepts base_url: (defaults to https://synoppy.com) and timeout: (seconds).

Methods

read(url, formats:, only_main_content:, timeout_ms:, render:, wait_ms:) — alias scrape

POST /api/scrape. Read a URL into clean markdown / html / text.

page = client.read(
  "https://stripe.com/blog",
  formats: ["markdown", "html"],   # "markdown" | "html" | "text"
  only_main_content: true,         # strip nav/boilerplate
  timeout_ms: 15_000,              # per-request fetch budget
  render: "auto",                  # true | false | "auto" — headless browser
  wait_ms: 500                     # extra wait after load before capture
)
page["markdown"]
page["metadata"]["title"]          # title, description, language, siteName, author,
                                   # ogImage, sourceUrl, statusCode, wordCount,
                                   # fetchedAt, rendered, bytesIn
page["renderMs"]                   # present when rendered
page["latencyMs"]

screenshot(url, full_page:, wait_ms:, timeout_ms:)

POST /api/screenshot. Capture a PNG screenshot, returned as a data URL.

shot = client.screenshot("https://stripe.com", full_page: true, wait_ms: 500)
shot["screenshot"]   # "data:image/png;base64,..."
shot["sourceUrl"]
shot["statusCode"]
shot["fullPage"]
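
Because the screenshot comes back as a data URL rather than raw bytes, saving it to disk means stripping the prefix and Base64-decoding the rest. The following is a minimal sketch under stated assumptions: decode_data_url is a hypothetical helper, not part of the gem, and the sample payload is only the 8-byte PNG signature standing in for a real shot["screenshot"] value.

```ruby
# Hypothetical helper (not part of the synoppy gem): split a
# "data:<mime>;base64,<payload>" URL into its MIME type and raw bytes.
def decode_data_url(data_url)
  match = data_url.match(/\Adata:(?<mime>[^;,]+);base64,(?<payload>.*)\z/m)
  raise ArgumentError, "not a base64 data URL" unless match

  # "m0" is Ruby's strict Base64 directive; it raises on malformed input.
  [match[:mime], match[:payload].unpack1("m0")]
end

# Stand-in for shot["screenshot"]: just the PNG signature, not a full image.
png_signature = "\x89PNG\r\n\x1A\n".b
sample = "data:image/png;base64," + [png_signature].pack("m0")

mime, bytes = decode_data_url(sample)
puts mime            # image/png
puts bytes.bytesize  # 8

# With a real response, File.binwrite("shot.png", bytes) writes the image.
```

Using String#unpack1 keeps the example free of extra requires, in the same spirit as the gem's zero-dependency design.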

May raise Synoppy::Error with code RENDER_UNAVAILABLE (HTTP 503) when the render backend is down.

crawl(url, limit:)

POST /api/crawl. Crawl a site into one clean page per discovered URL (requires a key). limit accepts values from 1 to 25.

site = client.crawl("https://example.com", limit: 25)
site["domain"]
site["discovered"]
site["count"]
site["pages"]        # [{ "url", "title", "markdown", "words" }]
site["credits"]
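
The pages array is plain Ruby data, so it can be summarized with ordinary Enumerable calls. The sketch below uses a hand-written hash in the documented shape instead of a live response; the values are made up, and it assumes "words" is an integer word count.

```ruby
# Hand-written stand-in for a client.crawl result; values are illustrative.
site = {
  "domain" => "example.com",
  "discovered" => 3,
  "count" => 2,
  "pages" => [
    { "url" => "https://example.com/", "title" => "Home",
      "markdown" => "# Home", "words" => 120 },
    { "url" => "https://example.com/about", "title" => "About",
      "markdown" => "# About", "words" => 80 }
  ]
}

# Total word count across all crawled pages.
total_words = site["pages"].sum { |page| page["words"] }

# The page with the most words.
longest = site["pages"].max_by { |page| page["words"] }

puts "#{site["count"]} pages, #{total_words} words"
puts "longest: #{longest["url"]}"
```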

map(url) — alias sitemap

POST /api/map. Discover every URL on a domain.

m = client.map("https://example.com")
m["urls"]            # string[]
m["count"]
m["source"]          # "sitemap" | "links"
m["domain"]
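
A common next step is to narrow the discovered URLs to one section of the site before reading them. A minimal sketch, assuming the entries in "urls" are absolute URLs; urls_under is a hypothetical helper, and the list below is a hand-written stand-in for m["urls"].

```ruby
require "uri"

# Hypothetical helper: keep only URLs whose path starts with the given prefix.
# URI.parse raises URI::InvalidURIError on malformed input, which this sketch
# does not handle.
def urls_under(urls, prefix)
  urls.select { |url| URI.parse(url).path.start_with?(prefix) }
end

# Stand-in for m["urls"].
urls = [
  "https://example.com/",
  "https://example.com/blog/a",
  "https://example.com/blog/b",
  "https://example.com/pricing"
]

blog_urls = urls_under(urls, "/blog/")
puts blog_urls
```

Each entry in blog_urls could then be passed to client.read one at a time.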

extract(url, prompt:)

POST /api/extract. AI-structured JSON extraction (requires a key). instruction: is accepted as an alias for prompt:.

result = client.extract("https://news.ycombinator.com", prompt: "Return { title, summary, topics }")
result["data"]
result["model"]
result["metadata"]
result["truncated"]
result["usage"]      # { "inputTokens", "outputTokens" }

classify(url, labels:)

POST /api/classify. Classify a company by industry, or against your own labels (requires a key).

# Default: industry taxonomy
c = client.classify("https://stripe.com")
c["data"]["industry"]
c["data"]["naics_code"]      # also naics_title, naics_sector, naics_sector_title, naics_valid
c["data"]["sic_code"]        # also sic_title, sic_division, sic_division_title, sic_valid
c["data"]["categories"]
c["data"]["confidence"]

# Labels mode
c = client.classify("https://stripe.com", labels: ["fintech", "ecommerce", "social"])
c["data"]["label"]
c["data"]["matched"]
c["data"]["confidence"]
c["data"]["reasoning"]

enrich(url = nil, domain:, email:) — alias brand

POST /api/brand. Resolve a brand into a full profile. Pass exactly one of a positional url, domain:, or email: (a work email is mapped to its domain).

brand = client.enrich("linear.app")
brand = client.enrich(domain: "linear.app")
brand = client.enrich(email: "jane@linear.app")

brand["name"]
brand["description"]
brand["logo"]
brand["colors"]      # string[]
brand["fonts"]       # string[]
brand["address"]
brand["socials"]     # [{ "label", "url" }]
brand["domain"]

images(url)

POST /api/images. Pull every image off a page.

imgs = client.images("https://stripe.com")
imgs["count"]
imgs["images"]       # [{ "src", "alt", "width", "height" }]
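
Pages often carry icons and tracking pixels alongside real content images, so filtering by size is a typical follow-up. The sketch below is illustrative only: large_images is a hypothetical helper, the data is hand-written, and it assumes "width" can be nil when a dimension is unknown.

```ruby
# Hypothetical helper: keep images at least min_width pixels wide.
# nil.to_i is 0, so images with an unknown width are dropped.
def large_images(images, min_width:)
  images.select { |img| img["width"].to_i >= min_width }
end

# Hand-written stand-in for a client.images result.
imgs = {
  "count" => 3,
  "images" => [
    { "src" => "https://example.com/hero.png", "alt" => "Hero",
      "width" => 1200, "height" => 600 },
    { "src" => "https://example.com/icon.png", "alt" => "",
      "width" => 32, "height" => 32 },
    { "src" => "https://example.com/unknown.png", "alt" => "",
      "width" => nil, "height" => nil }
  ]
}

large = large_images(imgs["images"], min_width: 400)
puts large.map { |img| img["src"] }
```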

Coming soon

act is not live yet — the method exists but raises NotImplementedError. It will map to /api/act once that endpoint ships.
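
One Ruby detail matters here: the built-in NotImplementedError descends from ScriptError, not StandardError, so a bare rescue will not catch it. The sketch below assumes the gem raises that built-in class; PlaceholderClient is a stand-in written for this example, not the gem's code.

```ruby
# Stand-in for a client whose act method is a placeholder. It only mimics
# the documented behavior of raising NotImplementedError.
class PlaceholderClient
  def act(*)
    raise NotImplementedError, "act is not available yet"
  end
end

def try_act(client)
  client.act("https://example.com")
rescue NotImplementedError => e
  # Named explicitly: a bare `rescue` only covers StandardError.
  "skipped: #{e.message}"
end

puts try_act(PlaceholderClient.new)
puts NotImplementedError.ancestors.include?(StandardError)  # false
```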

Credits

Every successful response, from every method, includes creditsUsed (a number) and creditsRemaining (a number or nil) at the top level:

page = client.read("https://stripe.com")
puts "used #{page["creditsUsed"]}, #{page["creditsRemaining"]} remaining"

shot = client.screenshot("https://stripe.com")
puts "used #{shot["creditsUsed"]}, #{shot["creditsRemaining"]} remaining"
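
Since creditsRemaining can be nil, interpolating it directly leaves a gap in the message. A small guard avoids that. This is a sketch with a hypothetical helper, credits_line, and hand-written hashes in place of live responses; what nil signifies is not covered here, so the helper simply reports it as unknown.

```ruby
# Hypothetical helper: format the credit fields, tolerating a nil remainder.
def credits_line(response)
  used = response["creditsUsed"]
  remaining = response["creditsRemaining"]

  if remaining.nil?
    "used #{used}, remaining unknown"
  else
    "used #{used}, #{remaining} remaining"
  end
end

# Hand-written stand-ins for two responses.
puts credits_line({ "creditsUsed" => 1, "creditsRemaining" => 499 })
puts credits_line({ "creditsUsed" => 2, "creditsRemaining" => nil })
```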

Errors

API failures raise Synoppy::Error, which exposes code, status, and message:

begin
  client.crawl("https://example.com")
rescue Synoppy::Error => e
  warn "#{e.code} #{e.status}: #{e.message}"
end
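
Transient failures such as RENDER_UNAVAILABLE are natural candidates for a retry with backoff. The following is a sketch, not a feature of the gem: with_retries is a hypothetical helper, DemoError is a stand-in for Synoppy::Error so the example runs without the gem, and it assumes the error's code is a string.

```ruby
# Hypothetical helper: rerun the block when it raises error_class with a code
# listed in retry_codes, doubling the delay after each failed attempt.
def with_retries(error_class:, retry_codes:, attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    tries += 1
    yield
  rescue error_class => e
    # Give up when out of attempts or when the code is not retryable.
    raise if tries >= attempts || !retry_codes.include?(e.code)

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Stand-in for Synoppy::Error, exposing only the code reader used above.
class DemoError < StandardError
  attr_reader :code

  def initialize(message, code:)
    super(message)
    @code = code
  end
end

calls = 0
result = with_retries(error_class: DemoError,
                      retry_codes: ["RENDER_UNAVAILABLE"],
                      base_delay: 0) do
  calls += 1
  raise DemoError.new("render backend down", code: "RENDER_UNAVAILABLE") if calls < 3

  "ok after #{calls} calls"
end

puts result  # ok after 3 calls
```

With the gem loaded, the same helper could wrap a real call by passing error_class: Synoppy::Error and putting client.screenshot(...) in the block.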

MIT licensed.