Module: Rubino::Security::UrlSafety
- Defined in:
- lib/rubino/security/url_safety.rb
Overview
SSRF (Server-Side Request Forgery) guard for outbound HTTP(S) requests made by web tools (webfetch / websearch).
Ported from Hermes' 'tools/url_safety.py'. The threat model: a malicious prompt, skill, or fetched page tricks the agent into requesting an internal resource (cloud metadata endpoints such as 169.254.169.254 / IMDS, localhost services, or private-network hosts) to exfiltrate instance credentials or pivot inside the network.
The guard, mirroring Hermes:
* rejects any scheme that is not http/https (W-3),
* RESOLVES the hostname and blocks if ANY answer is loopback,
private, link-local (incl. IMDS), CGNAT, reserved, multicast,
unique-local, or unspecified,
* keeps an always-blocked floor of cloud-metadata IPs/hostnames that
applies even to literal-IP URLs,
* rejects secrets embedded in the URL (userinfo / token query params),
* is DNS-rebinding aware: it returns the resolved IPs so the caller
can PIN the connection to an address that was actually validated,
instead of re-resolving (and trusting) the hostname at connect time,
* is applied on EVERY redirect hop (the caller re-validates each
Location target rather than trusting it).
Fails closed: DNS failures, parse errors, and unexpected exceptions block the request.
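The resolve-then-check behaviour described above can be sketched as follows. This is a minimal illustration, not the module's implementation: `SketchBlockedError`, `SKETCH_NON_PUBLIC`, `SYSTEM_RESOLVER`, and `sketch_resolve_and_check` are hypothetical names, the range list is a small subset, and the resolver is injectable only so the logic can be exercised without network access.

```ruby
require "ipaddr"
require "socket"

# Hypothetical error class standing in for BlockedURLError.
class SketchBlockedError < StandardError; end

# Small illustrative subset of non-public ranges (assumption, not the full list).
SKETCH_NON_PUBLIC = %w[10.0.0.0/8 127.0.0.0/8 169.254.0.0/16 ::1/128]
                    .map { |cidr| IPAddr.new(cidr) }.freeze

# Default resolver: ask the system resolver for every address of the host.
SYSTEM_RESOLVER = lambda do |host|
  Addrinfo.getaddrinfo(host, nil, nil, :STREAM).map(&:ip_address).uniq
end

# Resolve once, reject if ANY answer is non-public, and return the
# validated addresses so the caller can connect to one of them directly.
def sketch_resolve_and_check(host, resolver: SYSTEM_RESOLVER)
  answers = resolver.call(host)
  raise SketchBlockedError, "Blocked: no addresses" if answers.empty?

  answers.each do |text|
    ip = IPAddr.new(text)
    hit = SKETCH_NON_PUBLIC.any? { |net| net.family == ip.family && net.include?(ip) }
    raise SketchBlockedError, "Blocked: non-public address" if hit
  end
  answers
rescue SketchBlockedError
  raise
rescue StandardError
  # Fail closed: DNS errors, parse errors, anything unexpected.
  raise SketchBlockedError, "Blocked: resolution failed"
end
```

A caller would then connect to one of the returned addresses rather than to the hostname, for instance by setting `Net::HTTP#ipaddr=` before starting the session, so that a second DNS lookup at connect time cannot swap in an unvalidated address.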
Defined Under Namespace
Classes: BlockedURLError
Constant Summary
- ALLOWED_SCHEMES =
%w[http https].freeze
- BLOCKED_HOSTNAMES =
Hostnames blocked regardless of DNS resolution — cloud metadata endpoints an attacker could use to steal instance credentials.
%w[ metadata.google.internal metadata.goog ].freeze
- ALWAYS_BLOCKED_NETWORKS =
Networks blocked regardless of any toggle: the link-local range (where most cloud metadata services live) and its IPv4-mapped IPv6 form, plus the metadata addresses that sit outside that range (Alibaba Cloud, AWS over IPv6). These have no legitimate agent target.
[
  IPAddr.new("169.254.0.0/16"),     # AWS/GCP/Azure/DO/Oracle IMDS + ECS task metadata
  IPAddr.new("100.100.100.200/32"), # Alibaba Cloud metadata
  IPAddr.new("fd00:ec2::254/128")   # AWS metadata (IPv6)
].freeze
- BLOCKED_NETWORKS =
Ranges blocked as non-public. ipaddr’s loopback?/private?/link_local? cover most of these, but CGNAT (RFC 6598) and the benchmark range are not flagged by those predicates, so we list ranges explicitly to keep parity with Hermes and avoid relying on stdlib predicate coverage.
[
  # IPv4
  IPAddr.new("0.0.0.0/8"),          # "this network" / unspecified
  IPAddr.new("10.0.0.0/8"),         # private
  IPAddr.new("100.64.0.0/10"),      # CGNAT / shared address space (RFC 6598)
  IPAddr.new("127.0.0.0/8"),        # loopback
  IPAddr.new("169.254.0.0/16"),     # link-local (incl. IMDS)
  IPAddr.new("172.16.0.0/12"),      # private
  IPAddr.new("192.0.0.0/24"),       # IETF protocol assignments
  IPAddr.new("192.0.2.0/24"),       # TEST-NET-1
  IPAddr.new("192.168.0.0/16"),     # private
  IPAddr.new("198.18.0.0/15"),      # benchmarking
  IPAddr.new("198.51.100.0/24"),    # TEST-NET-2
  IPAddr.new("203.0.113.0/24"),     # TEST-NET-3
  IPAddr.new("224.0.0.0/4"),        # multicast
  IPAddr.new("240.0.0.0/4"),        # reserved
  IPAddr.new("255.255.255.255/32"), # broadcast
  # IPv6
  IPAddr.new("::/128"),             # unspecified
  IPAddr.new("::1/128"),            # loopback
  IPAddr.new("::ffff:0:0/96"),      # IPv4-mapped (checked via embedded v4 too)
  IPAddr.new("64:ff9b::/96"),       # NAT64
  IPAddr.new("100::/64"),           # discard-only
  IPAddr.new("2001:db8::/32"),      # documentation
  IPAddr.new("fc00::/7"),           # unique-local
  IPAddr.new("fe80::/10")           # link-local
].freeze
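To illustrate why explicit ranges are listed instead of relying on the stdlib predicates, the sketch below checks an address against a handful of the ranges above. It is an assumption-laden illustration: `SKETCH_RANGES` and `sketch_blocked_ip?` are hypothetical names, only a subset of ranges is included, and the IPv4-mapped unwrapping shown here is one possible approach rather than the module's actual code.

```ruby
require "ipaddr"
require "socket"

# Illustrative subset of the documented ranges (assumption: not the full list).
SKETCH_RANGES = %w[
  10.0.0.0/8 100.64.0.0/10 127.0.0.0/8 169.254.0.0/16 198.18.0.0/15
  ::1/128 fc00::/7
].map { |cidr| IPAddr.new(cidr) }.freeze

# True when the address falls in any listed range. IPv4-mapped IPv6
# addresses (::ffff:a.b.c.d) are unwrapped to the embedded IPv4 address
# first, so they cannot be used to dodge the IPv4 ranges.
def sketch_blocked_ip?(text)
  ip = IPAddr.new(text)
  ip = IPAddr.new(ip.to_i & 0xffff_ffff, Socket::AF_INET) if ip.ipv6? && ip.ipv4_mapped?
  SKETCH_RANGES.any? { |net| net.family == ip.family && net.include?(ip) }
rescue IPAddr::Error
  true # fail closed on anything that does not parse as an IP
end

cgnat = IPAddr.new("100.64.0.1")
# None of the stdlib predicates flag the CGNAT address...
puts [cgnat.private?, cgnat.loopback?, cgnat.link_local?].inspect
# ...but the explicit range list does.
puts sketch_blocked_ip?("100.64.0.1")
```

Comparing `net.family` before calling `include?` keeps the check explicit about address families instead of depending on how a given `ipaddr` version treats mixed-family comparisons.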
Class Method Summary
- .always_blocked?(value) ⇒ Boolean
True when the literal IP / hostname is in the always-blocked floor (cloud metadata).
- .safe?(url) ⇒ Boolean
Boolean wrapper for call sites that only need a yes/no (e.g. search backend selection).
- .validate!(url) ⇒ Object
Validate a URL for outbound fetching.
Class Method Details
.always_blocked?(value) ⇒ Boolean
True when the literal IP / hostname is in the always-blocked floor (cloud metadata). Used to reject literal-IP metadata targets even before DNS resolution. Never raises.
# File 'lib/rubino/security/url_safety.rb', line 120
def always_blocked?(value)
  host = normalize_host(value)
  return true if BLOCKED_HOSTNAMES.include?(host)

  ip = safe_ipaddr(host)
  return false if ip.nil?

  ip_always_blocked?(ip)
rescue StandardError
  false
end
.safe?(url) ⇒ Boolean
Boolean wrapper for call sites that only need a yes/no (e.g. search backend selection). Never raises.
# File 'lib/rubino/security/url_safety.rb', line 110
def safe?(url)
  validate!(url)
  true
rescue BlockedURLError, StandardError
  false
end
.validate!(url) ⇒ Object
Validate a URL for outbound fetching. Returns a frozen Hash:
{ uri:, host:, port:, addresses: [validated IP strings] }
so the caller can connect to a pinned, already-validated IP. Raises BlockedURLError (with a safe message) on any violation.
# File 'lib/rubino/security/url_safety.rb', line 94
def validate!(url)
  uri = parse(url)
  assert_scheme!(uri)
  assert_no_secrets!(uri)

  host = normalize_host(uri.host)
  raise BlockedURLError, "Blocked: URL has no host" if host.empty?

  assert_hostname_not_blocked!(host)
  addresses = resolve_and_check!(host)

  { uri: uri, host: host, port: uri.port, addresses: addresses }.freeze
end
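The private helpers called by validate! are not shown on this page. As a rough idea of what the pre-resolution steps (scheme, host, and embedded-secret checks) might involve, here is a standalone sketch using only the stdlib. Everything in it is an assumption: `sketch_url_problem` is a hypothetical function, and the parameter names in `ASSUMED_SECRET_PARAMS` are guesses, not the module's actual list.

```ruby
require "uri"

# Hypothetical list of query parameter names treated as secrets (assumption).
ASSUMED_SECRET_PARAMS = %w[token access_token api_key apikey secret password].freeze

# Returns a short reason string when the URL should be rejected before any
# DNS lookup, or nil when these checks pass.
def sketch_url_problem(url)
  uri = URI.parse(url)
  return "scheme not allowed" unless %w[http https].include?(uri.scheme.to_s.downcase)
  return "no host" if uri.host.to_s.empty?
  return "credentials in userinfo" if uri.userinfo

  keys = URI.decode_www_form(uri.query.to_s).map { |key, _value| key.downcase }
  return "secret-like query parameter" if (keys & ASSUMED_SECRET_PARAMS).any?

  nil
rescue URI::InvalidURIError, ArgumentError
  "unparseable URL" # fail closed on parse errors
end
```

Returning a fixed reason string instead of echoing the URL keeps any embedded credential out of logs and error messages, which is in the spirit of the "safe message" noted for BlockedURLError.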