Class: ContextDev::Resources::Web

Inherits: Object

Defined in: lib/context_dev/resources/web.rb

Instance Method Summary

#extract_fonts(direct_url: nil, domain: nil, timeout_ms: nil, request_options: {}) ⇒ ContextDev::Models::WebExtractFontsResponse

Some parameter documentation has been truncated; see Models::WebExtractFontsParams for more details.

#extract_styleguide(direct_url: nil, domain: nil, timeout_ms: nil, request_options: {}) ⇒ ContextDev::Models::WebExtractStyleguideResponse

Some parameter documentation has been truncated; see Models::WebExtractStyleguideParams for more details.

#initialize(client:) ⇒ Web (constructor, private)

Returns a new instance of Web.

#screenshot(direct_url: nil, domain: nil, full_screenshot: nil, page: nil, prioritize: nil, request_options: {}) ⇒ ContextDev::Models::WebScreenshotResponse

Some parameter documentation has been truncated; see Models::WebScreenshotParams for more details.

#web_crawl_md(url:, follow_subdomains: nil, include_images: nil, include_links: nil, max_depth: nil, max_pages: nil, shorten_base64_images: nil, url_regex: nil, use_main_content_only: nil, request_options: {}) ⇒ ContextDev::Models::WebWebCrawlMdResponse

Some parameter documentation has been truncated; see Models::WebWebCrawlMdParams for more details.

#web_scrape_html(url:, max_age_ms: nil, request_options: {}) ⇒ ContextDev::Models::WebWebScrapeHTMLResponse

Some parameter documentation has been truncated; see Models::WebWebScrapeHTMLParams for more details.

#web_scrape_images(url:, request_options: {}) ⇒ ContextDev::Models::WebWebScrapeImagesResponse

Scrapes all images from the given URL.

#web_scrape_md(url:, include_images: nil, include_links: nil, max_age_ms: nil, shorten_base64_images: nil, use_main_content_only: nil, request_options: {}) ⇒ ContextDev::Models::WebWebScrapeMdResponse

Some parameter documentation has been truncated; see Models::WebWebScrapeMdParams for more details.

#web_scrape_sitemap(domain:, max_links: nil, request_options: {}) ⇒ ContextDev::Models::WebWebScrapeSitemapResponse

Some parameter documentation has been truncated; see Models::WebWebScrapeSitemapParams for more details.

Constructor Details

#initialize(client:) ⇒ `Web`

This method is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.

Returns a new instance of Web.

Parameters:

client (ContextDev::Client)



# File 'lib/context_dev/resources/web.rb', line 270

def initialize(client:)
  @client = client
end

Instance Method Details

#extract_fonts(direct_url: nil, domain: nil, timeout_ms: nil, request_options: {}) ⇒ `ContextDev::Models::WebExtractFontsResponse`

Some parameter documentation has been truncated; see Models::WebExtractFontsParams for more details.

Scrape font information from a website including font families, usage statistics, fallbacks, and element/word counts.

Parameters:

direct_url (String) —

A specific URL to fetch fonts from directly, bypassing domain resolution (e.g.,
domain (String) —

Domain name to extract fonts from (e.g., ‘example.com’, ‘google.com’). The domai
timeout_ms (Integer) —

Optional timeout in milliseconds for the request. If the request takes longer th
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebExtractFontsResponse)

See Also:

Models::WebExtractFontsParams

# File 'lib/context_dev/resources/web.rb', line 25

def extract_fonts(params = {})
  parsed, options = ContextDev::WebExtractFontsParams.dump_request(params)
  query = ContextDev::Internal::Util.encode_query_params(parsed)
  @client.request(
    method: :get,
    path: "web/fonts",
    query: query.transform_keys(direct_url: "directUrl", timeout_ms: "timeoutMS"),
    model: ContextDev::Models::WebExtractFontsResponse,
    options: options
  )
end
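
The `transform_keys` call in the source above renames the snake_case Ruby parameter names to the camelCase names used in the query string. A minimal sketch of that renaming in isolation, assuming Ruby 3.0 or later (where `Hash#transform_keys` accepts a mapping hash). The parameter values are made up for illustration and no request is sent:

```ruby
# Illustrative values only; this hash stands in for the encoded query.
params = {domain: "example.com", timeout_ms: 5_000}

# Keys listed in the mapping are renamed; keys without an entry are kept as-is.
wire_query = params.transform_keys(direct_url: "directUrl", timeout_ms: "timeoutMS")

puts wire_query.keys.inspect
```

Here `domain` passes through unchanged because the mapping has no entry for it, while `timeout_ms` becomes the string key `"timeoutMS"`.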

#extract_styleguide(direct_url: nil, domain: nil, timeout_ms: nil, request_options: {}) ⇒ `ContextDev::Models::WebExtractStyleguideResponse`

Some parameter documentation has been truncated; see Models::WebExtractStyleguideParams for more details.

Extract a comprehensive design system from a website including colors, typography, spacing, shadows, and UI components.

Parameters:

direct_url (String) —

A specific URL to fetch the styleguide from directly, bypassing domain resolutio
domain (String) —

Domain name to extract styleguide from (e.g., ‘example.com’, ‘google.com’). The
timeout_ms (Integer) —

Optional timeout in milliseconds for the request. If the request takes longer th
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebExtractStyleguideResponse)

See Also:

Models::WebExtractStyleguideParams

# File 'lib/context_dev/resources/web.rb', line 56

def extract_styleguide(params = {})
  parsed, options = ContextDev::WebExtractStyleguideParams.dump_request(params)
  query = ContextDev::Internal::Util.encode_query_params(parsed)
  @client.request(
    method: :get,
    path: "web/styleguide",
    query: query.transform_keys(direct_url: "directUrl", timeout_ms: "timeoutMS"),
    model: ContextDev::Models::WebExtractStyleguideResponse,
    options: options
  )
end

#screenshot(direct_url: nil, domain: nil, full_screenshot: nil, page: nil, prioritize: nil, request_options: {}) ⇒ `ContextDev::Models::WebScreenshotResponse`

Some parameter documentation has been truncated; see Models::WebScreenshotParams for more details.

Capture a screenshot of a website.

Parameters:

direct_url (String) —

A specific URL to screenshot directly, bypassing domain resolution (e.g., ‘https
domain (String) —

Domain name to take screenshot of (e.g., ‘example.com’, ‘google.com’). The domai
full_screenshot (Symbol, ContextDev::Models::WebScreenshotParams::FullScreenshot) —

Optional parameter to determine screenshot type. If ‘true’, takes a full page sc
page (Symbol, ContextDev::Models::WebScreenshotParams::Page) —

Optional parameter to specify which page type to screenshot. If provided, the sy
prioritize (Symbol, ContextDev::Models::WebScreenshotParams::Prioritize) —

Optional parameter to prioritize screenshot capture. If ‘speed’, optimizes for f
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebScreenshotResponse)

See Also:

Models::WebScreenshotParams

# File 'lib/context_dev/resources/web.rb', line 90

def screenshot(params = {})
  parsed, options = ContextDev::WebScreenshotParams.dump_request(params)
  query = ContextDev::Internal::Util.encode_query_params(parsed)
  @client.request(
    method: :get,
    path: "web/screenshot",
    query: query.transform_keys(direct_url: "directUrl", full_screenshot: "fullScreenshot"),
    model: ContextDev::Models::WebScreenshotResponse,
    options: options
  )
end

#web_crawl_md(url:, follow_subdomains: nil, include_images: nil, include_links: nil, max_depth: nil, max_pages: nil, shorten_base64_images: nil, url_regex: nil, use_main_content_only: nil, request_options: {}) ⇒ `ContextDev::Models::WebWebCrawlMdResponse`

Some parameter documentation has been truncated; see Models::WebWebCrawlMdParams for more details.

Performs a crawl starting from a given URL, extracts page content as Markdown, and returns results for all crawled pages.

Parameters:

url (String) —

The starting URL for the crawl (must include http:// or https:// protocol)
follow_subdomains (Boolean) —

When true, follow links on subdomains of the starting URL’s domain (e.g. docs.ex
include_images (Boolean) —

Include image references in the Markdown output
include_links (Boolean) —

Preserve hyperlinks in the Markdown output
max_depth (Integer) —

Maximum link depth from the starting URL (0 = only the starting page)
max_pages (Integer) —

Maximum number of pages to crawl. Hard cap: 500.
shorten_base64_images (Boolean) —

Truncate base64-encoded image data in the Markdown output
url_regex (String) —

Regex pattern. Only URLs matching this pattern will be followed and scraped.
use_main_content_only (Boolean) —

Extract only the main content, stripping headers, footers, sidebars, and navigat
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebWebCrawlMdResponse)

See Also:

Models::WebWebCrawlMdParams

# File 'lib/context_dev/resources/web.rb', line 133

def web_crawl_md(params)
  parsed, options = ContextDev::WebWebCrawlMdParams.dump_request(params)
  @client.request(
    method: :post,
    path: "web/crawl",
    body: parsed,
    model: ContextDev::Models::WebWebCrawlMdResponse,
    options: options
  )
end
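
The parameter list above states a few constraints that can be checked before a crawl is requested: `url` must include the http:// or https:// protocol, `max_pages` has a hard cap of 500, and a `max_depth` of 0 means only the starting page. The following is a minimal sketch of such a pre-flight check. `crawl_param_errors` is a hypothetical helper, not part of the ContextDev gem, and treating a negative `max_depth` as invalid is an assumption drawn from the "0 = only the starting page" wording:

```ruby
# Hypothetical helper (not part of the ContextDev gem): returns a list of
# problems found in a #web_crawl_md parameter hash, or an empty list.
def crawl_param_errors(params, max_pages_cap: 500)
  errors = []

  # Documented: the starting URL must include http:// or https://.
  url = params[:url]
  unless url.is_a?(String) && url.match?(%r{\Ahttps?://}i)
    errors << "url must start with http:// or https://"
  end

  # Documented: hard cap of 500 pages.
  max_pages = params[:max_pages]
  if max_pages && max_pages > max_pages_cap
    errors << "max_pages exceeds the hard cap of #{max_pages_cap}"
  end

  # Assumption: depth counts links from the starting page, so it cannot be negative.
  max_depth = params[:max_depth]
  errors << "max_depth must not be negative" if max_depth && max_depth.negative?

  errors
end

good = {url: "https://example.com/docs", max_depth: 1, max_pages: 25, use_main_content_only: true}
bad = {url: "example.com/docs", max_pages: 501}

puts crawl_param_errors(good).inspect
puts crawl_param_errors(bad).inspect
```

A hash that passes the check could then be passed to `#web_crawl_md` as keyword arguments. The server remains the authority on what it accepts, so this check is only a convenience.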

#web_scrape_html(url:, max_age_ms: nil, request_options: {}) ⇒ `ContextDev::Models::WebWebScrapeHTMLResponse`

Some parameter documentation has been truncated; see Models::WebWebScrapeHTMLParams for more details.

Scrapes the given URL and returns the raw HTML content of the page.

Parameters:

url (String) —

Full URL to scrape (must include http:// or https:// protocol)
max_age_ms (Integer) —

Return a cached result if a prior scrape for the same parameters exists and is y
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebWebScrapeHTMLResponse)

See Also:

Models::WebWebScrapeHTMLParams

# File 'lib/context_dev/resources/web.rb', line 160

def web_scrape_html(params)
  parsed, options = ContextDev::WebWebScrapeHTMLParams.dump_request(params)
  query = ContextDev::Internal::Util.encode_query_params(parsed)
  @client.request(
    method: :get,
    path: "web/scrape/html",
    query: query.transform_keys(max_age_ms: "maxAgeMs"),
    model: ContextDev::Models::WebWebScrapeHTMLResponse,
    options: options
  )
end

#web_scrape_images(url:, request_options: {}) ⇒ `ContextDev::Models::WebWebScrapeImagesResponse`

Scrapes all images from the given URL. Extracts images from img, svg, picture/source, link, and video elements including inline SVGs, base64 data URIs, and standard URLs.

Parameters:

url (String) —

Full URL to scrape images from (must include http:// or https:// protocol)
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebWebScrapeImagesResponse)

See Also:

Models::WebWebScrapeImagesParams

# File 'lib/context_dev/resources/web.rb', line 185

def web_scrape_images(params)
  parsed, options = ContextDev::WebWebScrapeImagesParams.dump_request(params)
  query = ContextDev::Internal::Util.encode_query_params(parsed)
  @client.request(
    method: :get,
    path: "web/scrape/images",
    query: query,
    model: ContextDev::Models::WebWebScrapeImagesResponse,
    options: options
  )
end
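
Each method on this page follows the same pattern as the source above: it builds a request description and hands it to the injected client. The sketch below reproduces that pattern with stand-in classes so the resulting request can be inspected without network access. `RecordingClient` and `WebSketch` are hypothetical names defined here for illustration. The sketch skips the gem's `dump_request` and `encode_query_params` steps and does not use the gem at all:

```ruby
# Hypothetical stand-in for the client: records requests instead of sending them.
class RecordingClient
  attr_reader :requests

  def initialize
    @requests = []
  end

  def request(**req)
    @requests << req
    :stubbed_response
  end
end

# Hypothetical, simplified stand-in for the resource class documented here.
class WebSketch
  def initialize(client:)
    @client = client
  end

  # Mirrors the shape of the web_scrape_images source: a GET with the URL as a query parameter.
  def web_scrape_images(url:)
    @client.request(method: :get, path: "web/scrape/images", query: {url: url})
  end
end

client = RecordingClient.new
web = WebSketch.new(client: client)
result = web.web_scrape_images(url: "https://example.com")

puts client.requests.first[:path]
```

The same stand-in approach may be useful in tests of calling code, since the resource only depends on the client responding to `request`.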

#web_scrape_md(url:, include_images: nil, include_links: nil, max_age_ms: nil, shorten_base64_images: nil, use_main_content_only: nil, request_options: {}) ⇒ `ContextDev::Models::WebWebScrapeMdResponse`

Some parameter documentation has been truncated; see Models::WebWebScrapeMdParams for more details.

Scrapes the given URL into LLM-usable Markdown.

Parameters:

url (String) —

Full URL to scrape into LLM usable Markdown (must include http:// or https:// pr
include_images (Boolean) —

Include image references in Markdown output
include_links (Boolean) —

Preserve hyperlinks in Markdown output
max_age_ms (Integer) —

Return a cached result if a prior scrape for the same parameters exists and is y
shorten_base64_images (Boolean) —

Shorten base64-encoded image data in the Markdown output
use_main_content_only (Boolean) —

Extract only the main content of the page, excluding headers, footers, sidebars,
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebWebScrapeMdResponse)

See Also:

Models::WebWebScrapeMdParams

# File 'lib/context_dev/resources/web.rb', line 221

def web_scrape_md(params)
  parsed, options = ContextDev::WebWebScrapeMdParams.dump_request(params)
  query = ContextDev::Internal::Util.encode_query_params(parsed)
  @client.request(
    method: :get,
    path: "web/scrape/markdown",
    query: query.transform_keys(
      include_images: "includeImages",
      include_links: "includeLinks",
      max_age_ms: "maxAgeMs",
      shorten_base64_images: "shortenBase64Images",
      use_main_content_only: "useMainContentOnly"
    ),
    model: ContextDev::Models::WebWebScrapeMdResponse,
    options: options
  )
end

#web_scrape_sitemap(domain:, max_links: nil, request_options: {}) ⇒ `ContextDev::Models::WebWebScrapeSitemapResponse`

Some parameter documentation has been truncated; see Models::WebWebScrapeSitemapParams for more details.

Crawl an entire website’s sitemap and return all discovered page URLs.

Parameters:

domain (String) —

Domain to build a sitemap for
max_links (Integer) —

Maximum number of links to return from the sitemap crawl. Defaults to 10,000. Mi
request_options (ContextDev::RequestOptions, Hash{Symbol=>Object}, nil)

Returns:

(ContextDev::Models::WebWebScrapeSitemapResponse)

See Also:

Models::WebWebScrapeSitemapParams

# File 'lib/context_dev/resources/web.rb', line 255

def web_scrape_sitemap(params)
  parsed, options = ContextDev::WebWebScrapeSitemapParams.dump_request(params)
  query = ContextDev::Internal::Util.encode_query_params(parsed)
  @client.request(
    method: :get,
    path: "web/scrape/sitemap",
    query: query.transform_keys(max_links: "maxLinks"),
    model: ContextDev::Models::WebWebScrapeSitemapResponse,
    options: options
  )
end
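
Unlike the scrape methods, which take a full `url:` including the protocol, `#web_scrape_sitemap` takes a `domain:`. When only a full URL is at hand, the host can be extracted with Ruby's standard `uri` library. This is a minimal sketch. `domain_from_url` is a hypothetical helper, not part of the ContextDev gem, and it assumes the API expects a bare host such as 'example.com', as in the `domain` examples elsewhere on this page:

```ruby
require "uri"

# Hypothetical helper (not part of the ContextDev gem): extracts a lowercase
# host from a full URL, raising if the string has no host component.
def domain_from_url(url)
  host = URI.parse(url).host
  raise ArgumentError, "no host found in #{url.inspect}" if host.nil?

  host.downcase
end

sitemap_params = {domain: domain_from_url("https://Docs.Example.com/guide?page=2")}
puts sitemap_params[:domain]
```

A string without a protocol, such as "example.com", has no host component when parsed this way, so the helper raises rather than guessing.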
