Class: Radioactive::Fetcher
- Inherits: Object
  - Object
  - Radioactive::Fetcher
- Defined in: lib/radioactive/fetcher.rb
Constant Summary
- REDIRECT_STATUSES =
[301, 302, 303, 307, 308].freeze
- RESERVED_HEADERS =
%w[host user-agent accept-encoding].freeze
- CHUNK_SIZE =
16 * 1024
- DEFAULT_USER_AGENT =
"Radioactive/#{Radioactive::VERSION}"
- CONNECT_FALLBACK_ERRORS =
Connect-phase transport failures that justify trying the next pinned candidate. Anything else (TLS handshake failure, read timeout, post-connect ECONNRESET) is terminal — the server engaged with us, so silently retrying against a different IP would mask real problems and risks duplicating side-effecting requests in a future non-GET world.
[ Errno::EHOSTUNREACH, Errno::ENETUNREACH, Errno::ECONNREFUSED, Net::OpenTimeout ].freeze
- NUMERIC_ONLY_HOST =
Single-label hosts that are entirely digits or 0x-prefix hex are not valid RFC 1123 hostnames; they’re SSRF-bypass attempts that some libc getaddrinfo implementations historically resolved as IPs.
/\A(\d+|0x[\da-f]+)\z/i
- HEADER_INVALID_CHAR =
CR, LF, and NUL are illegal in HTTP header names and values (RFC 9110); caller-supplied input containing any of them is treated as a header-injection attempt.
/[\r\n\0]/
- DEFAULTS =
{
  schemes: %w[http https].freeze,
  max_size: 2_097_152,
  open_timeout: 5,
  read_timeout: 10,
  total_timeout: 30,
  max_redirects: 3,
  accept_encoding: "identity",
  user_agent: DEFAULT_USER_AGENT,
  private_ranges: AddressCheck::DEFAULT_PRIVATE_RANGES,
  allow_private: false,
  allow_credentials: false,
  headers: {}.freeze
}.freeze
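As a rough illustration of how the guard constants above might be applied, the sketch below copies NUMERIC_ONLY_HOST, HEADER_INVALID_CHAR, and CONNECT_FALLBACK_ERRORS into a standalone script. The helper names (`numeric_only_host?`, `header_injection?`, `connect_with_fallback`) are hypothetical and not part of the documented API, and the fallback loop assumes the caller passes a block that performs the actual connect. It is a sketch of the described policy, not the library's implementation.

```ruby
require "net/http" # loads net/protocol, which defines Net::OpenTimeout

# Local copies of the documented constants so the sketch runs standalone.
NUMERIC_ONLY_HOST = /\A(\d+|0x[\da-f]+)\z/i
HEADER_INVALID_CHAR = /[\r\n\0]/
CONNECT_FALLBACK_ERRORS = [
  Errno::EHOSTUNREACH, Errno::ENETUNREACH, Errno::ECONNREFUSED, Net::OpenTimeout
].freeze

# Hypothetical helper: true for single-label hosts made only of digits
# or 0x-prefixed hex, e.g. "2130706433" or "0x7f000001".
def numeric_only_host?(host)
  NUMERIC_ONLY_HOST.match?(host)
end

# Hypothetical helper: true when a header name or value contains CR, LF, or NUL.
def header_injection?(name, value)
  HEADER_INVALID_CHAR.match?(name.to_s) || HEADER_INVALID_CHAR.match?(value.to_s)
end

# Hypothetical helper: yields each candidate address in order and returns the
# block's result for the first one that connects. Only connect-phase errors
# move on to the next candidate; any other exception propagates immediately.
def connect_with_fallback(addresses)
  last_error = nil
  addresses.each do |addr|
    begin
      return yield(addr)
    rescue *CONNECT_FALLBACK_ERRORS => e
      last_error = e
    end
  end
  raise(last_error || ArgumentError.new("no candidate addresses"))
end
```

With these helpers, `numeric_only_host?("example.com")` is false while `numeric_only_host?("0x7f000001")` is true, and a block that raises `Errno::ECONNREFUSED` for the first address lets the loop continue to the second.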
Instance Method Summary
- #fetch(url, **call_opts) ⇒ Object
- #initialize(**opts) ⇒ Fetcher (constructor)
  A new instance of Fetcher.
- #open(url, **call_opts) ⇒ Object
  No-block form returns a StringIO of the fully-buffered body (size-capped at max_size; matches `URI.open` semantics).
Constructor Details
#initialize(**opts) ⇒ Fetcher
Returns a new instance of Fetcher.
# File 'lib/radioactive/fetcher.rb', line 54
def initialize(**opts)
  validate_opts!(opts)
  # Caller-supplied options override DEFAULTS key by key.
  @opts = DEFAULTS.merge(opts)
  # Fall back to Resolv and MonotonicClock when none are supplied.
  @resolver = opts[:resolver] || Resolv
  @clock = opts[:clock] || MonotonicClock
end
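The option handling in the constructor can be sketched in isolation. The class below is a stand-in, not Radioactive::Fetcher: `SketchFetcher` and its `effective` method are hypothetical names, the defaults are a trimmed copy of DEFAULTS, and the way per-call options are layered on top is an assumption, since the listing does not show how `call_opts` are merged.

```ruby
# Trimmed, local copy of the documented DEFAULTS for illustration only.
SKETCH_DEFAULTS = {
  max_size: 2_097_152,
  open_timeout: 5,
  read_timeout: 10,
  max_redirects: 3
}.freeze

# Hypothetical stand-in class showing the DEFAULTS.merge(opts) pattern.
class SketchFetcher
  attr_reader :opts

  def initialize(**opts)
    # Hash#merge returns a new hash, so the frozen defaults are never mutated.
    @opts = SKETCH_DEFAULTS.merge(opts)
  end

  # Assumption: per-call options take precedence over instance options.
  def effective(**call_opts)
    @opts.merge(call_opts)
  end
end

fetcher = SketchFetcher.new(max_size: 1024)
fetcher.opts[:max_size]                       # instance override
fetcher.effective(read_timeout: 2)[:read_timeout] # per-call override
```

Keeping the defaults frozen and merging into a fresh hash means one instance's overrides cannot leak into another.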
Instance Method Details
#fetch(url, **call_opts) ⇒ Object
# File 'lib/radioactive/fetcher.rb', line 61
def fetch(url, **call_opts)
  # Accumulate the streamed chunks into a single in-memory body.
  body = String.new(capacity: CHUNK_SIZE)
  # The local variable's name was lost in extraction; `meta` is a placeholder.
  meta = run_streaming(url, call_opts) { |chunk| body << chunk }
  Result.new(
    url: meta[:url],
    final_url: meta[:final_url],
    status: meta[:status],
    headers: meta[:headers],
    body: body,
    hops: meta[:hops]
  )
end
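The buffering pattern used by #fetch can be tried without a network. In the sketch below, `stream_capped` is a hypothetical stand-in for the private `run_streaming`, `BodyTooLarge` is an invented error class, and enforcing the cap while streaming is an assumption about how max_size is applied; none of these names come from the library.

```ruby
require "stringio"

CHUNK_SIZE = 16 * 1024

# Hypothetical error class for this sketch only.
class BodyTooLarge < StandardError; end

# Stand-in for run_streaming: reads `io` in CHUNK_SIZE pieces, yields each
# piece, and aborts as soon as the running total exceeds max_size.
def stream_capped(io, max_size:)
  total = 0
  while (chunk = io.read(CHUNK_SIZE))
    total += chunk.bytesize
    raise BodyTooLarge, "body exceeds #{max_size} bytes" if total > max_size
    yield chunk
  end
  total
end

# Mirrors the accumulation step in #fetch: append every chunk to one String.
def buffer_body(io, max_size:)
  body = String.new(capacity: CHUNK_SIZE)
  stream_capped(io, max_size: max_size) { |chunk| body << chunk }
  body
end

body = buffer_body(StringIO.new("x" * 40_000), max_size: 50_000)
body.bytesize # 40_000 bytes, delivered in three chunks
```

Checking the total before yielding means an oversized response is rejected without the extra chunk ever reaching the buffer.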
#open(url, **call_opts) ⇒ Object
No-block form returns a StringIO of the fully-buffered body (size-capped at max_size; matches `URI.open` semantics). Block form streams chunks straight to a Tempfile and yields it rewound, so peak memory per fetch is ~CHUNK_SIZE rather than max_size, which is useful for high-concurrency or low-RAM callers.
# File 'lib/radioactive/fetcher.rb', line 78
def open(url, **call_opts)
  # No block: buffer the whole body in memory and wrap it in a StringIO.
  return StringIO.new(fetch(url, **call_opts).body) unless block_given?

  io = Tempfile.new("radioactive")
  io.binmode
  begin
    run_streaming(url, call_opts) { |chunk| io.write(chunk) }
    io.rewind
    yield io
  ensure
    # Always close and delete the tempfile, even if the fetch or the block raises.
    io.close
    io.unlink
  end
end
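The block form's spool-to-disk behavior can be reproduced with the standard library alone. This is a minimal sketch assuming a readable IO as the chunk source in place of the private `run_streaming`; `spool_to_tempfile` is a hypothetical name.

```ruby
require "stringio"
require "tempfile"

CHUNK_SIZE = 16 * 1024

# Hypothetical stand-in for the block form of #open: copies `source` into a
# binary Tempfile one chunk at a time, yields the rewound file, and removes
# it afterwards whether or not the block raises.
def spool_to_tempfile(source)
  io = Tempfile.new("sketch")
  io.binmode
  begin
    while (chunk = source.read(CHUNK_SIZE))
      io.write(chunk)
    end
    io.rewind
    yield io
  ensure
    io.close
    io.unlink
  end
end

# Only one chunk is held in memory at a time while the file is written.
byte_count = spool_to_tempfile(StringIO.new("a" * 100_000)) do |file|
  file.read.bytesize
end
```

Because the tempfile is unlinked in `ensure`, the yielded file should not be used after the block returns; callers that need the data later would have to copy it inside the block.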