Module: Woods::Util::HostGuard
- Defined in:
- lib/woods/util/host_guard.rb
Overview
Shared host-header / URL-host canonicalization used by MCP::OriginGuard and the Storage::VectorStore::Qdrant URL validator.
Both components need to reject numeric IPv4 notations that `URI` and `getaddrinfo` accept but `IPAddr` does not: hex (`0x7f000001`), bare integer (`2130706433`), octal (`017700000001` or `0177.0.0.1`), short-form (`127.1`), and mixed-radix (`0x7f.0.0.1`). Keeping the logic in one place prevents drift between the two defenses, which previously had slightly different regex lists.
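To make the gap concrete, the sketch below feeds each notation from the overview to `IPAddr.new`. The helper `ipaddr_parses?` and the constant `NON_CANONICAL_LOOPBACK` are illustrative names, not part of HostGuard, and the behavior assumes a reasonably current `ipaddr` from the Ruby standard library.

```ruby
require 'ipaddr'

# Hypothetical helper (not part of HostGuard): true when IPAddr can parse
# the string, false when it raises InvalidAddressError.
def ipaddr_parses?(host)
  IPAddr.new(host)
  true
rescue IPAddr::InvalidAddressError
  false
end

# The notations named in the overview. Per the overview, `getaddrinfo`
# accepts these even though IPAddr does not.
NON_CANONICAL_LOOPBACK = %w[
  0x7f000001
  2130706433
  017700000001
  0177.0.0.1
  127.1
  0x7f.0.0.1
].freeze

NON_CANONICAL_LOOPBACK.each do |host|
  puts format('%-14s IPAddr parses? %s', host, ipaddr_parses?(host))
end

# The canonical dotted quad is the only form here that IPAddr accepts.
puts format('%-14s IPAddr parses? %s', '127.0.0.1', ipaddr_parses?('127.0.0.1'))
```

Because `IPAddr` raises on all six forms, an allow/deny check built only on `IPAddr` ranges cannot classify them, which is why the overview treats them as something to reject before any parsing.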
Constant Summary
- NUMERIC_HOST_BYPASS =
Non-canonical numeric IPv4 forms that legitimate clients never emit but `getaddrinfo` will happily resolve. Rejecting the form is safer than trying to intuit the intended IPv4 address.
Regexp.union(
  /\A0x[0-9a-f]+\z/,      # hex: `0x7f000001`
  /\A\d+\z/,              # bare integer: `2130706433`
  /\A0[0-7]+\z/,          # bare octal: `017700000001`
  /\A\d+\.\d+\z/,         # short-form two-part: `127.1`
  /\A\d+\.\d+\.\d+\z/     # short-form three-part: `127.0.1`
).freeze
- SUSPICIOUS_OCTET =
Octets inside a four-part dotted form that tag the form as non-canonical: a leading zero (octal interpretation) or a `0x` prefix (hex interpretation).
Regexp.union(
  /\A0\d+\z/,             # leading-zero octal: `0177`
  /\A0x[0-9a-f]+\z/       # hex octet: `0x7f`
).freeze
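The two constants divide the work: `NUMERIC_HOST_BYPASS` judges a whole host string, while `SUSPICIOUS_OCTET` judges one octet of a four-part form. The sketch below copies both constants at top level so it runs standalone and prints which sample hosts each one catches. The sample lists are illustrative.

```ruby
# Copies of the two constants as listed above, so the snippet is self-contained.
NUMERIC_HOST_BYPASS = Regexp.union(
  /\A0x[0-9a-f]+\z/,
  /\A\d+\z/,
  /\A0[0-7]+\z/,
  /\A\d+\.\d+\z/,
  /\A\d+\.\d+\.\d+\z/
).freeze

SUSPICIOUS_OCTET = Regexp.union(
  /\A0\d+\z/,
  /\A0x[0-9a-f]+\z/
).freeze

# Whole-host forms: hex, bare integer, bare octal, two-part, three-part.
%w[0x7f000001 2130706433 017700000001 127.1 127.0.1].each do |host|
  puts "#{host}: bypass form? #{host.match?(NUMERIC_HOST_BYPASS)}"
end

# Four-part forms never match NUMERIC_HOST_BYPASS; they are judged per octet.
%w[0177.0.0.1 0x7f.0.0.1 127.0.0.1].each do |host|
  flagged = host.split('.').select { |octet| octet.match?(SUSPICIOUS_OCTET) }
  puts "#{host}: bypass form? #{host.match?(NUMERIC_HOST_BYPASS)}, " \
       "suspicious octets: #{flagged.inspect}"
end
```

A lone `0` octet is not flagged, because the leading-zero pattern requires at least one digit after the zero. That keeps ordinary addresses such as `127.0.0.1` out of the suspicious set.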
Class Method Summary
- .canonicalize(host) ⇒ String
Canonicalize a host string: downcase, strip port, strip the FQDN trailing dot, drop IPv6 brackets.
- .suspicious_numeric_host?(canonical) ⇒ Boolean
Does this canonicalized host smuggle a private IP via a notation that `IPAddr.new` won't parse? Callers should reject any match rather than try to resolve it.
Class Method Details
.canonicalize(host) ⇒ String
Canonicalize a host string: downcase, strip port, strip the FQDN trailing dot, drop IPv6 brackets. Returns a plain host.
# File 'lib/woods/util/host_guard.rb', line 41
def canonicalize(host)
  host.to_s.downcase.sub(/:\d+\z/, '').sub(/\.\z/, '').delete('[]')
end
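A short usage sketch may help show what each step of the chain does. `HostGuardSketch` is a hypothetical stand-in module holding a copy of the method body above, used only so the snippet runs without the Woods library. The input values are made up for illustration.

```ruby
# Standalone copy of the method shown above. HostGuardSketch is a
# hypothetical name, not the real Woods::Util::HostGuard module.
module HostGuardSketch
  module_function

  def canonicalize(host)
    host.to_s.downcase.sub(/:\d+\z/, '').sub(/\.\z/, '').delete('[]')
  end
end

[
  'Example.COM:8080',  # mixed case plus port
  'example.com.',      # FQDN trailing dot
  '[::1]:3000',        # bracketed IPv6 literal plus port
  '0X7F000001:6333',   # upper-case hex prefix plus port
  nil                  # to_s turns nil into an empty string
].each do |raw|
  puts "#{raw.inspect} -> #{HostGuardSketch.canonicalize(raw).inspect}"
end
```

The five inputs come out as `example.com`, `example.com`, `::1`, `0x7f000001`, and the empty string. Downcasing before any pattern check is what lets the lowercase-only hex patterns in the constants catch `0X7F000001`. An unbracketed IPv6 literal such as `::1` would lose its last group to the port-stripping step (the result is `:`), so the sketch assumes IPv6 hosts arrive bracketed, as they do in Host headers and URLs.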
.suspicious_numeric_host?(canonical) ⇒ Boolean
Does this canonicalized host smuggle a private IP via a notation that `IPAddr.new` won't parse? Callers should reject any match rather than try to resolve it.
# File 'lib/woods/util/host_guard.rb', line 51
def suspicious_numeric_host?(canonical)
  return true if canonical.match?(NUMERIC_HOST_BYPASS)

  four_octet = canonical.match(/\A(\w+)\.(\w+)\.(\w+)\.(\w+)\z/)
  return false unless four_octet

  four_octet.captures.any? { |octet| octet.match?(SUSPICIOUS_OCTET) }
end
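Put together, a caller would canonicalize first and then reject on a match. The sketch below is one way that flow could look, not the actual code in MCP::OriginGuard or the Qdrant URL validator. `HostGuardSketch` and `host_allowed?` are hypothetical names, and the module body copies the constants and methods listed on this page so the snippet runs standalone.

```ruby
# Hypothetical stand-in for Woods::Util::HostGuard, copied from the
# listings above so the example has no external dependencies.
module HostGuardSketch
  NUMERIC_HOST_BYPASS = Regexp.union(
    /\A0x[0-9a-f]+\z/,
    /\A\d+\z/,
    /\A0[0-7]+\z/,
    /\A\d+\.\d+\z/,
    /\A\d+\.\d+\.\d+\z/
  ).freeze

  SUSPICIOUS_OCTET = Regexp.union(
    /\A0\d+\z/,
    /\A0x[0-9a-f]+\z/
  ).freeze

  module_function

  def canonicalize(host)
    host.to_s.downcase.sub(/:\d+\z/, '').sub(/\.\z/, '').delete('[]')
  end

  def suspicious_numeric_host?(canonical)
    return true if canonical.match?(NUMERIC_HOST_BYPASS)

    four_octet = canonical.match(/\A(\w+)\.(\w+)\.(\w+)\.(\w+)\z/)
    return false unless four_octet

    four_octet.captures.any? { |octet| octet.match?(SUSPICIOUS_OCTET) }
  end
end

# Hypothetical caller: reject outright on a match instead of resolving.
# Only the notation check is covered here; any further policy (for example
# an IPAddr-based private-range check) is assumed to live elsewhere.
def host_allowed?(raw_host)
  canonical = HostGuardSketch.canonicalize(raw_host)
  !HostGuardSketch.suspicious_numeric_host?(canonical)
end

%w[0X7F000001:6333 0177.0.0.1 0x7f.0.0.1 127.1:80 example.com 127.0.0.1].each do |raw|
  puts "#{raw}: #{host_allowed?(raw) ? 'passes notation check' : 'rejected'}"
end
```

The first four samples are rejected, while `example.com` and `127.0.0.1` pass. A canonical dotted quad is not flagged, because the method only targets notations that `IPAddr.new` cannot parse. Deciding whether a parseable address is private is assumed to be the caller's job.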