Class: Html2rss::RequestService::BotasaurusContract
- Inherits: Object
  - Object
  - Html2rss::RequestService::BotasaurusContract
- Defined in: lib/html2rss/request_service/botasaurus_contract.rb
Overview
Maps html2rss request/response handling to the botasaurus-scrape-api contract.
Defined Under Namespace
Classes: ParsedResponse
Constant Summary
- DEFAULT_OPTIONS = { navigation_mode: 'auto', max_retries: 2, headless: false }.freeze
  Default Botasaurus scrape options when no explicit config is provided.
- OPTION_KEYS = %i[navigation_mode max_retries wait_for_selector wait_timeout_seconds block_images block_images_and_css wait_for_complete_page_load headless proxy user_agent window_size lang].freeze
  Allowlisted request.botasaurus keys forwarded to upstream.
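The two constants suggest a filter-then-merge flow: keep only allowlisted keys, then let them override the defaults. The following is a minimal sketch under that assumption. The `allowlisted` helper and the sample `user_options` hash are hypothetical and are not part of html2rss; only the two constants are copied from this page.

```ruby
# Constants copied from the summary above.
DEFAULT_OPTIONS = { navigation_mode: 'auto', max_retries: 2, headless: false }.freeze
OPTION_KEYS = %i[
  navigation_mode max_retries wait_for_selector wait_timeout_seconds
  block_images block_images_and_css wait_for_complete_page_load
  headless proxy user_agent window_size lang
].freeze

# Hypothetical helper (an assumption, not the gem's code): symbolize keys
# and drop anything that is not on the allowlist.
def allowlisted(options)
  options.transform_keys(&:to_sym).slice(*OPTION_KEYS)
end

# Sample input: one string key, one symbol key, one key outside the allowlist.
user_options = { 'max_retries' => 5, headless: true, cookies: 'secret' }

filtered = allowlisted(user_options)
effective = DEFAULT_OPTIONS.merge(filtered)
# filtered keeps :max_retries and :headless; :cookies is dropped.
# effective still carries navigation_mode: 'auto' from the defaults.
```

Filtering before merging means an unknown key can never reach the upstream payload, while any allowlisted key the caller omits falls back to its default.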
Instance Method Summary
- #initialize(url:, options: {}) ⇒ BotasaurusContract (constructor)
  A new instance of BotasaurusContract.
- #parse_response(transport_response) ⇒ ParsedResponse
- #request_payload ⇒ Hash
  Payload for POST /scrape.
Constructor Details
#initialize(url:, options: {}) ⇒ BotasaurusContract
Returns a new instance of BotasaurusContract.
# File 'lib/html2rss/request_service/botasaurus_contract.rb', line 128
def initialize(url:, options: {})
  @url = url
  @options =
end
Instance Method Details
#parse_response(transport_response) ⇒ ParsedResponse
Parses the transport response body as JSON and wraps it, together with the transport status, in a ParsedResponse. Raises BotasaurusConnectionFailed when the body is not valid JSON or is not a JSON object.
# File 'lib/html2rss/request_service/botasaurus_contract.rb', line 141
def parse_response(transport_response)
  payload = JSON.parse(transport_response.body.to_s)
  raise BotasaurusConnectionFailed, 'Botasaurus response must be a JSON object' unless payload.is_a?(Hash)

  ParsedResponse.new(payload:, transport_status: transport_response.status)
rescue JSON::ParserError => error
  raise BotasaurusConnectionFailed, "Botasaurus response JSON parse failed: #{error.}"
end
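To illustrate the two failure paths of the parsing step, here is a self-contained sketch. `BotasaurusConnectionFailed`, `ParsedResponse`, and `TransportResponse` are simplified stand-ins defined only for this example; the real classes may differ. The use of `error.message` in the rescue branch and the sample JSON field name `html` are assumptions.

```ruby
require 'json'

# Stand-ins for classes named on this page (assumptions, not the gem's definitions).
class BotasaurusConnectionFailed < StandardError; end
ParsedResponse = Struct.new(:payload, :transport_status, keyword_init: true)
TransportResponse = Struct.new(:body, :status, keyword_init: true)

def parse_response(transport_response)
  payload = JSON.parse(transport_response.body.to_s)
  # Valid JSON that is not an object (for example an array) is rejected.
  raise BotasaurusConnectionFailed, 'Botasaurus response must be a JSON object' unless payload.is_a?(Hash)

  ParsedResponse.new(payload: payload, transport_status: transport_response.status)
rescue JSON::ParserError => error
  # Assumption: the parser's own message is appended to the error text.
  raise BotasaurusConnectionFailed, "Botasaurus response JSON parse failed: #{error.message}"
end

# A well-formed object parses into a ParsedResponse.
ok = parse_response(TransportResponse.new(body: '{"html":"<p>hi</p>"}', status: 200))

# Collect the error messages produced by a non-object and by malformed input.
failures = ['[1, 2]', 'not json'].map do |body|
  parse_response(TransportResponse.new(body: body, status: 200))
  nil
rescue BotasaurusConnectionFailed => e
  e.message
end
```

Both failure modes surface as the same error class, so a caller only needs one rescue clause around the transport call.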
#request_payload ⇒ Hash
Returns payload for POST /scrape.
# File 'lib/html2rss/request_service/botasaurus_contract.rb', line 134
def request_payload
  DEFAULT_OPTIONS.merge().merge(url: url.to_s)
end
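As a rough illustration of how such a payload could become the body of a POST /scrape request, consider the sketch below. It assumes the caller's options are what gets merged over the defaults, takes `url` and `options` as plain arguments instead of instance state, and uses a made-up target URL. No request is sent; the code only builds the request object.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Defaults copied from the constant summary on this page.
DEFAULT_OPTIONS = { navigation_mode: 'auto', max_retries: 2, headless: false }.freeze

# Assumption: caller options override the defaults; :url is merged last,
# so an options hash cannot replace the target URL.
def request_payload(url, options = {})
  DEFAULT_OPTIONS.merge(options).merge(url: url.to_s)
end

payload = request_payload(URI('https://example.com/feed'), { max_retries: 4 })

# Build (but do not send) the POST /scrape request with a JSON body.
request = Net::HTTP::Post.new('/scrape', 'Content-Type' => 'application/json')
request.body = JSON.generate(payload)
```

Calling `url.to_s` keeps the payload JSON-friendly even when the caller passes a `URI` object instead of a string.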