Module: Rubino::Security::UrlSafety

Defined in:
lib/rubino/security/url_safety.rb

Overview

SSRF (Server-Side Request Forgery) guard for outbound HTTP(S) requests made by web tools (webfetch / websearch).

Ported from Hermes' `tools/url_safety.py`. The threat model: a malicious prompt, skill, or fetched page tricks the agent into requesting an internal resource (cloud metadata endpoints such as 169.254.169.254 / IMDS, localhost services, or private-network hosts) to exfiltrate instance credentials or pivot inside the network.

The guard, mirroring Hermes:

* rejects any scheme that is not http/https (W-3),
* RESOLVES the hostname and blocks if ANY answer is loopback,
  private, link-local (incl. IMDS), CGNAT, reserved, multicast,
  unique-local, or unspecified,
* keeps an always-blocked floor of cloud-metadata IPs/hostnames that
  fire even for literal-IP URLs,
* rejects secrets embedded in the URL (userinfo / token query params),
* is DNS-rebinding aware: it returns the resolved IPs so the caller
  can PIN the connection to an address that was actually validated,
  instead of re-resolving (and trusting) the hostname at connect time,
* is applied on EVERY redirect hop (the caller re-validates each
  Location target rather than trusting it).
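
The "block if ANY answer is non-public" rule can be sketched as follows. This is a minimal illustration, not the module's private resolver: `resolve_and_check_sketch`, the abbreviated `SKETCH_DENY` list, and the injectable `resolver:` argument are all assumptions introduced here so the example can run without network access.

```ruby
require "ipaddr"
require "resolv"

# Abbreviated deny list for illustration only; the module's full list is longer.
SKETCH_DENY = [
  IPAddr.new("10.0.0.0/8"),
  IPAddr.new("127.0.0.0/8"),
  IPAddr.new("169.254.0.0/16"),
  IPAddr.new("::1/128")
].freeze

# Hypothetical helper: resolve the host, reject if any single answer is
# non-public, and return the answers so the caller can pin to one of them.
def resolve_and_check_sketch(host, resolver: ->(h) { Resolv.getaddresses(h) })
  answers = resolver.call(host)
  raise ArgumentError, "blocked: no DNS answers" if answers.empty?

  answers.each do |text|
    ip = IPAddr.new(text)              # raises on unparseable answers (fails closed)
    ip = ip.native if ip.ipv4_mapped?  # check the embedded IPv4 address
    if SKETCH_DENY.any? { |net| net.family == ip.family && net.include?(ip) }
      raise ArgumentError, "blocked: non-public address"
    end
  end
  answers
end
```

One public answer next to one loopback answer is still rejected, which is what prevents an attacker from mixing a harmless record with an internal one.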

Fails closed: DNS failures, parse errors, and unexpected exceptions block the request.
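
As a rough illustration of the scheme check, the secrets check, and the fail-closed rule, the sketch below returns a reason string when a URL should be blocked and `nil` otherwise. The function name `precheck_sketch` and the parameter names in `SKETCH_SECRET_PARAMS` are assumptions; this document does not show which query parameters the module actually treats as secrets.

```ruby
require "uri"

SKETCH_SCHEMES = %w[http https].freeze
# Assumed parameter names, for illustration only.
SKETCH_SECRET_PARAMS = %w[token access_token api_key].freeze

# Hypothetical pre-resolution check: returns nil when the URL passes,
# otherwise a short reason. Any parsing problem counts as a block.
def precheck_sketch(url)
  uri = URI.parse(url)
  return "scheme not allowed" unless SKETCH_SCHEMES.include?(uri.scheme.to_s.downcase)
  return "credentials in URL" if uri.userinfo

  names = URI.decode_www_form(uri.query.to_s).map { |name, _value| name.downcase }
  return "secret-looking query parameter" if names.any? { |n| SKETCH_SECRET_PARAMS.include?(n) }

  nil
rescue StandardError
  "unparseable URL" # fail closed: an error never lets the request through
end
```

The `rescue` at the end is the important design choice: the default outcome of an unexpected error is a block, not a pass.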

Defined Under Namespace

Classes: BlockedURLError

Constant Summary

ALLOWED_SCHEMES =
%w[http https].freeze
BLOCKED_HOSTNAMES =

Hostnames blocked regardless of DNS resolution — cloud metadata endpoints an attacker could use to steal instance credentials.

%w[
  metadata.google.internal
  metadata.goog
].freeze
ALWAYS_BLOCKED_NETWORKS =

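Networks blocked regardless of any toggle: the link-local range (where the AWS, GCP, Azure, DigitalOcean, and Oracle metadata services live), plus the Alibaba Cloud and AWS IPv6 metadata addresses, which sit outside that range. The IPv4-mapped IPv6 form of the link-local range is blocked as well. These have no legitimate agent target.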

[
  IPAddr.new("169.254.0.0/16"),       # AWS/GCP/Azure/DO/Oracle IMDS + ECS task metadata
  IPAddr.new("100.100.100.200/32"),   # Alibaba Cloud metadata
  IPAddr.new("fd00:ec2::254/128")     # AWS metadata (IPv6)
].freeze
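
A minimal sketch of how a literal address might be tested against this floor, including the IPv4-mapped IPv6 form. The helper name `floor_blocked_sketch?` is hypothetical, and unwrapping mapped addresses with `IPAddr#native` is an assumption about the approach, not a statement of what the private `ip_always_blocked?` helper does.

```ruby
require "ipaddr"

# Copy of the always-blocked networks listed above.
SKETCH_FLOOR = [
  IPAddr.new("169.254.0.0/16"),
  IPAddr.new("100.100.100.200/32"),
  IPAddr.new("fd00:ec2::254/128")
].freeze

# Hypothetical helper: true when the text is a literal IP inside the floor.
# Non-IP input (for example a hostname) simply returns false.
def floor_blocked_sketch?(text)
  ip = IPAddr.new(text)
  ip = ip.native if ip.ipv4_mapped? # ::ffff:a.b.c.d becomes a.b.c.d
  SKETCH_FLOOR.any? { |net| net.family == ip.family && net.include?(ip) }
rescue IPAddr::Error
  false
end
```

Comparing address families before calling `include?` keeps the check independent of how a given `ipaddr` version treats mixed IPv4/IPv6 comparisons.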
BLOCKED_NETWORKS =

Ranges blocked as non-public. ipaddr’s loopback?/private?/link_local? cover most of these, but CGNAT (RFC 6598) and the benchmark range are not flagged by those predicates, so we list ranges explicitly to keep parity with Hermes and avoid relying on stdlib predicate coverage.

[
  # IPv4
  IPAddr.new("0.0.0.0/8"),            # "this network" / unspecified
  IPAddr.new("10.0.0.0/8"),           # private
  IPAddr.new("100.64.0.0/10"),        # CGNAT / shared address space (RFC 6598)
  IPAddr.new("127.0.0.0/8"),          # loopback
  IPAddr.new("169.254.0.0/16"),       # link-local (incl. IMDS)
  IPAddr.new("172.16.0.0/12"),        # private
  IPAddr.new("192.0.0.0/24"),         # IETF protocol assignments
  IPAddr.new("192.0.2.0/24"),         # TEST-NET-1
  IPAddr.new("192.168.0.0/16"),       # private
  IPAddr.new("198.18.0.0/15"),        # benchmarking
  IPAddr.new("198.51.100.0/24"),      # TEST-NET-2
  IPAddr.new("203.0.113.0/24"),       # TEST-NET-3
  IPAddr.new("224.0.0.0/4"),          # multicast
  IPAddr.new("240.0.0.0/4"),          # reserved
  IPAddr.new("255.255.255.255/32"),   # broadcast
  # IPv6
  IPAddr.new("::/128"),               # unspecified
  IPAddr.new("::1/128"),              # loopback
  IPAddr.new("::ffff:0:0/96"),        # IPv4-mapped (checked via embedded v4 too)
  IPAddr.new("64:ff9b::/96"),         # NAT64
  IPAddr.new("100::/64"),             # discard-only
  IPAddr.new("2001:db8::/32"),        # documentation
  IPAddr.new("fc00::/7"),             # unique-local
  IPAddr.new("fe80::/10")             # link-local
].freeze
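
The gap in the stdlib predicates can be checked directly. The sketch below compares `loopback?` / `private?` / `link_local?` with an explicit range test for a CGNAT address and a benchmarking address; the helper names and the two-entry `SKETCH_EXPLICIT` list are illustrative, not part of the module.

```ruby
require "ipaddr"

# Two of the explicit ranges from the list above.
SKETCH_EXPLICIT = [
  IPAddr.new("100.64.0.0/10"), # CGNAT (RFC 6598)
  IPAddr.new("198.18.0.0/15")  # benchmarking
].freeze

def flagged_by_predicates?(ip)
  ip.loopback? || ip.private? || ip.link_local?
end

def in_explicit_range?(ip)
  SKETCH_EXPLICIT.any? { |net| net.family == ip.family && net.include?(ip) }
end

%w[100.64.0.1 198.18.0.1 10.0.0.1].each do |text|
  ip = IPAddr.new(text)
  puts format("%-12s predicates=%-5s explicit=%s",
              text, flagged_by_predicates?(ip), in_explicit_range?(ip))
end
```

The CGNAT and benchmarking addresses are caught only by the explicit ranges, while 10.0.0.1 is caught by `private?`, which is why the module lists every range rather than trusting the predicates alone.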

Class Method Summary

Class Method Details

.always_blocked?(value) ⇒ Boolean

True when the literal IP / hostname is in the always-blocked floor (cloud metadata). Used to reject literal-IP metadata targets even before DNS resolution. Never raises.

Returns:

  • (Boolean)


# File 'lib/rubino/security/url_safety.rb', line 120

def always_blocked?(value)
  host = normalize_host(value)
  return true if BLOCKED_HOSTNAMES.include?(host)

  ip = safe_ipaddr(host)
  return false if ip.nil?

  ip_always_blocked?(ip)
rescue StandardError
  false
end
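
The listing relies on a private `normalize_host` helper that is not shown here. As a guess at the kind of normalization a hostname blocklist needs, the sketch below lowercases the host, strips IPv6 brackets, and drops a trailing dot, so that `Metadata.Google.Internal.` still matches. The name `normalize_host_sketch` and every step in it are assumptions, not the module's actual behavior.

```ruby
# Copy of BLOCKED_HOSTNAMES from the constant summary.
SKETCH_BLOCKED_HOSTS = %w[metadata.google.internal metadata.goog].freeze

# Hypothetical normalization: nil-safe, case-insensitive, bracket- and
# trailing-dot-insensitive.
def normalize_host_sketch(value)
  value.to_s.strip.downcase
       .delete_prefix("[").delete_suffix("]")
       .chomp(".")
end

def hostname_blocked_sketch?(value)
  SKETCH_BLOCKED_HOSTS.include?(normalize_host_sketch(value))
end
```

Without a step like this, a fully qualified name with a trailing dot or mixed case would resolve to the same metadata service while slipping past an exact string comparison.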

.safe?(url) ⇒ Boolean

Boolean wrapper for call sites that only need a yes/no (e.g. search backend selection). Never raises.

Returns:

  • (Boolean)


# File 'lib/rubino/security/url_safety.rb', line 110

def safe?(url)
  validate!(url)
  true
rescue BlockedURLError, StandardError
  false
end

.validate!(url) ⇒ Hash

Validate a URL for outbound fetching. Returns a frozen Hash:

{ uri:, host:, port:, addresses: [validated IP strings] }

so the caller can connect to a pinned, already-validated IP. Raises BlockedURLError (with a safe message) on any violation.

Raises:

  • (BlockedURLError)

# File 'lib/rubino/security/url_safety.rb', line 94

def validate!(url)
  uri = parse(url)
  assert_scheme!(uri)
  assert_no_secrets!(uri)

  host = normalize_host(uri.host)
  raise BlockedURLError, "Blocked: URL has no host" if host.empty?

  assert_hostname_not_blocked!(host)
  addresses = resolve_and_check!(host)

  { uri: uri, host: host, port: uri.port, addresses: addresses }.freeze
end
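
A hedged sketch of how a caller might use the returned hash: pin the connection to a validated address and send every redirect target back through validation. `pinned_http` and `fetch_with_revalidation` are hypothetical helpers, not part of Rubino, and the `validator:` argument stands in for any callable that behaves like `validate!` (returns the hash or raises). `Net::HTTP#ipaddr=` is the standard-library setter used for the pinning.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build (but do not start) a client that will connect to
# an already-validated IP. The client is still constructed with the hostname.
def pinned_http(validated)
  http = Net::HTTP.new(validated[:host], validated[:port])
  http.use_ssl = validated[:uri].scheme == "https"
  http.ipaddr = validated[:addresses].first # avoid a second, unchecked DNS lookup
  http
end

# Hypothetical fetch loop: each hop is validated before any connection is made.
def fetch_with_revalidation(url, validator:, max_hops: 5)
  max_hops.times do
    validated = validator.call(url) # raises when the hop is blocked
    response = pinned_http(validated).request_get(validated[:uri].request_uri)
    return response unless response.is_a?(Net::HTTPRedirection)

    # Resolve a relative Location against the current URL, then loop.
    url = URI.join(validated[:uri].to_s, response["location"].to_s).to_s
  end
  raise "too many redirects"
end
```

In real use the validator would be something like `Rubino::Security::UrlSafety.method(:validate!)`. Because the validator runs at the top of every iteration, a redirect to an internal address is rejected before a socket is opened.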