Module: Rubino::Config::Validator
- Defined in: lib/rubino/config/validator.rb
Overview
Set-time schema validation for `config set` (#327). The Defaults hash is the authoritative schema: a key the schema doesn't know, or a value whose type/format can't match the seeded default, is REJECTED at write time with a clear ConfigurationError (and a non-zero exit via ConfigCommand). Previously such a value was persisted with a green ✓ and only blew up later, either as a runtime crash or as a deterministic provider 4xx that the agent then retried for ~85s.
Two checks, intentionally narrow (a false positive that blocks legitimate config would do more harm than the original bug):
* unknown key: the path doesn't exist in Defaults AND isn't under an open-map section (providers.<name>, quick_commands, …) where arbitrary child keys are expected.
* type / format: when the schema seeds a NON-nil scalar default at the leaf, the coerced value must be the same coarse type (numeric / boolean / string). A nil default carries no type, so its leaf is type-unconstrained. A *_url leaf additionally must parse as an http(s) URL.
Constant Summary
- PROVIDER_TEMPLATE =
A per-provider leaf (providers.<name>.<leaf>) is type-checked against the openai provider template, the canonical OpenAI-compatible provider shape, so `providers.minimax.request_timeout_seconds "soon"` is still rejected.
"openai"
- EXTRA_TOP_LEVEL_SECTIONS =
Top-level sections that are REAL config namespaces but intentionally not seeded in Defaults (e.g. read at point-of-use with a fallback, or kept unseeded on purpose). They are valid roots, so a key under them is not an "unknown key" even though Defaults has no entry. The unknown-key check is deliberately SHALLOW: it only rejects a typo'd TOP-LEVEL section (the `foo.bar.baz` footgun from #327). The schema's leaves are intentionally incomplete (model.api_key, display.reasoning, …), so a deeper check would reject legitimate-but-unseeded keys. Type/format checks still apply to every known leaf regardless.
* sessions: read at point-of-use (Commands::Handlers::Sessions reads sessions.list_limit) with a fallback, not seeded.
* oauth: read at boot (OAuth::Registry reads oauth.providers.*) with a fallback, not seeded; without it `config set oauth.providers.github.client_id …` was rejected with the misleading "not a config section" typo error (H3).
%w[sessions oauth].freeze
- RANGES =
Closed numeric ranges for the obvious bounded keys (#392b). A value that type-checks as a number but falls outside its range used to be persisted with a green ✓ (e.g. `model.temperature 9.9`) and only manifested as a provider 4xx or nonsensical behaviour at call time. Keyed by the LEAF name so the same bound applies wherever the key appears (model.* and the per-provider/aux mirrors). Bounds are inclusive.
Keyed by FULL dotted path when the leaf name is ambiguous: the bare `threshold` is a 0..1 ratio for compression.threshold but an identical-call COUNT (>= 2) for doom_loop.threshold (#414). Keying the count by leaf name applied the 0..1 ratio bound to it, rejecting the shipped default of 5. #range_for tries the full path first, then the leaf.
{
  "temperature" => (0.0..2.0),
  "threshold" => (0.0..1.0),
  "target_ratio" => (0.0..1.0),
  "doom_loop.threshold" => (2..Float::INFINITY)
}.freeze
- POSITIVE_INT_LEAVES =
Leaves that must be a POSITIVE integer when set: a turn/iteration cap of 0 or negative is nonsense (the turn could never run a single iteration) and used to persist with a green ✓ then silently degrade to “unbounded” / the default at runtime. Keyed by leaf name. A nil (clearing the key) is left to the type-unconstrained nil-default path.
%w[max_turns max_tool_iterations].freeze
- ENUMS =
Leaves whose value must be one of a fixed, known SET of strings (#392b follow-up). A garbage value used to persist with a green ✓ then have the runtime fail SAFE to a default (e.g. confirm_policy → :dangerous_only), so the ✓ LIED about what was stored. Reject an unknown value at set time with a clear message naming the valid choices, mirroring the model.temperature range check. Keyed by leaf name so the same set applies wherever the key appears. A nil (clearing the key) is allowed; it restores the built-in default.
The other enum-ish leaves with a SMALL, FIXED, statically-known value set where garbage silently degraded at runtime, swept in alongside confirm_policy:
* approvals.mode: ApprovalPolicy::MODES (manual|auto|skip); an unknown mode fell through to the manual fallback.
* thinking.effort: ReasoningPrefs::EFFORTS (off|low|medium|high); an unknown effort resolved to nil → the medium default.
NOT swept (avoid false positives): memory.backend (pluggable; its set is registered at runtime and already rejected with an actionable error at use time) and jobs.mode (treated as inline vs "anything else", not a closed set).
Keyed by FULL dotted path when the leaf name is ambiguous (approvals.mode vs the unconstrained jobs.mode), else by leaf name. #check_enum! tries the full path first, then the leaf.
{
  "security.confirm_policy" => %w[dangerous_only confirm_all],
  "approvals.mode" => %w[manual auto skip],
  "thinking.effort" => %w[off low medium high]
}.freeze
- MCP_SERVER_ARRAY_LEAVES =
Leaves that MUST be a list (JSON/YAML array) when set. `mcp.servers` is an open map (defaults to {} with no per-server template), so its array leaves resolve to a :__absent__ default and check_type! skips them. As a result, `config set mcp.servers.x.args "run server"` was accepted with a green ✓ and stored as a scalar string, only rejected later at MCP startup (Manager#validate_stdio_args!, #499). Reject a non-array at set time so the user learns immediately to use the JSON-array syntax (#420), e.g. `config set mcp.servers.x.args '["run","server"]'`. Matched STRUCTURALLY (mcp.servers.<name>.<leaf>) since the server name is dynamic. A nil (clearing the key) is allowed. Keyed by the leaf name.
%w[args].freeze
- REMOVED_KEYS =
Config keys that were REMOVED and are no longer honored (item 7). A config.yml that still carries one isn't an "unknown key" (its top-level section is real) and isn't a type error, so the generic checks miss it, and the old key would be silently ignored. Map the FULL dotted path to a bespoke warning that names the removed key and its replacement, surfaced at load + in `rubino doctor` so the user migrates instead of believing a stale setting still takes effect.
{
  "security.require_confirmation_for_shell" =>
    "`security.require_confirmation_for_shell` is removed — use " \
    "`security.confirm_policy` (dangerous_only|confirm_all)"
}.freeze
Class Method Summary
- .check_enum!(key_path, keys, value) ⇒ Object
An ENUM leaf must be one of its known values when set.
- .check_mcp_array!(key_path, keys, value) ⇒ Object
An `mcp.servers.<name>.args` leaf must be a list when set (#499).
- .check_positive_int!(key_path, keys, value) ⇒ Object
A turn/iteration cap leaf must be a POSITIVE integer when set.
- .check_range!(key_path, keys, value) ⇒ Object
Range-check a bounded numeric leaf (#392b).
- .check_type!(key_path, keys, value, default) ⇒ Object
- .check_url_format!(key_path, keys, value) ⇒ Object
A *_url / base_url leaf must be a real http(s) URL when a non-empty value is given. This is exactly the providers.<name>.base_url "not a url" footgun from #327, which otherwise persisted fine and only failed as a connection error at the first model call.
- .coarse_type(value) ⇒ Object
- .dig_default(keys) ⇒ Object
- .each_leaf(node, prefix = [], &block) ⇒ Object
Yields [keys_array, leaf_value] for every scalar/array leaf in a nested config hash.
- .known_top_level?(section) ⇒ Boolean
A top-level section is known when Defaults seeds it or it's one of the documented intentionally-unseeded namespaces.
- .leaf_default(keys) ⇒ Object
The seeded default at this exact path, or the sentinel :__absent__ when the schema has no such leaf.
- .range_for(keys) ⇒ Object
The RANGE for a leaf: full dotted path first (so an ambiguous leaf name like `threshold` gets its path-specific bound), then the bare leaf name.
- .reject_unknown_key!(key_path, keys) ⇒ Object
- .seeded_default?(keys, value) ⇒ Boolean
True when value at keys equals the schema's seeded default, i.e. an untouched leaf, not a user edit.
- .valid_http_url?(str) ⇒ Boolean
- .validate!(key_path, keys, value) ⇒ Object
- .warnings(raw) ⇒ Object
NON-raising counterpart of #validate!, run at LOAD time over a HAND-EDITED config (F8).
Class Method Details
.check_enum!(key_path, keys, value) ⇒ Object
An ENUM leaf must be one of its known values when set. A garbage value used to persist with a green ✓ and then have the runtime quietly fall back to an (often SAFE-but-different) default, so the ✓ misrepresented what took effect. Reject it with a clear message naming the valid choices. A nil (clearing the key) is allowed: it restores the built-in default.
# File 'lib/rubino/config/validator.rb', line 347

def check_enum!(key_path, keys, value)
  allowed = ENUMS[keys.join(".")] || ENUMS[keys.last.to_s]
  return unless allowed

  coerced = Writer.coerce_value(value)
  return if coerced.nil?
  return if allowed.include?(coerced.to_s)

  raise ConfigurationError,
        "invalid value for '#{key_path}': #{value.inspect} — must be one of " \
        "#{allowed.map(&:inspect).join(", ")}"
end
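The lookup order (full dotted path first, then bare leaf name) can be sketched in isolation. This is a minimal standalone illustration, not the module's code: ENUM_SKETCH copies the documented ENUMS entries, `enum_ok?` is a hypothetical helper, and the value is assumed to be already coerced, since Writer.coerce_value is not shown in this page.

```ruby
# Standalone sketch; ENUM_SKETCH mirrors the documented ENUMS table.
ENUM_SKETCH = {
  "security.confirm_policy" => %w[dangerous_only confirm_all],
  "approvals.mode" => %w[manual auto skip],
  "thinking.effort" => %w[off low medium high]
}.freeze

# Hypothetical helper: true when an already-coerced value would pass.
def enum_ok?(keys, coerced)
  allowed = ENUM_SKETCH[keys.join(".")] || ENUM_SKETCH[keys.last.to_s]
  return true unless allowed   # not an enum leaf, nothing to check
  return true if coerced.nil?  # clearing the key is allowed

  allowed.include?(coerced.to_s)
end

puts enum_ok?(%w[approvals mode], "auto") # known value
puts enum_ok?(%w[approvals mode], "yolo") # unknown value
puts enum_ok?(%w[jobs mode], "yolo")      # jobs.mode is unconstrained
```

Because every entry is keyed by full path, the bare leaf `mode` has no entry, which is why jobs.mode stays unconstrained.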
.check_mcp_array!(key_path, keys, value) ⇒ Object
An `mcp.servers.<name>.args` leaf must be a list when set (#499). `mcp.servers` is an open map with no per-server template, so the leaf has a :__absent__ default and check_type! skips it. A scalar string was therefore accepted with a green ✓ and only rejected later at MCP startup. Reject it HERE, at set time, pointing at the JSON-array syntax (#420). Matched structurally (mcp.servers.<name>.<leaf>) since the server name is dynamic. A nil (clearing the key) is allowed.
# File 'lib/rubino/config/validator.rb', line 330

def check_mcp_array!(key_path, keys, value)
  return unless keys.length >= 4 && keys[0] == "mcp" && keys[1] == "servers"
  return unless MCP_SERVER_ARRAY_LEAVES.include?(keys.last.to_s)

  coerced = Writer.coerce_value(value)
  return if coerced.nil? || coerced.is_a?(Array)

  raise ConfigurationError,
        "invalid value for '#{key_path}': expected a list, got #{value.inspect}. " \
        "Use a JSON array, e.g. config set #{key_path} '[\"run\", \"server\"]'"
end
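The structural match can be tried without the rest of the module. The sketch below is an assumption-laden illustration: ARRAY_LEAVES_SKETCH copies the documented MCP_SERVER_ARRAY_LEAVES, both helpers are hypothetical names, and values are passed in already coerced.

```ruby
# Standalone sketch; the leaf list copies the documented MCP_SERVER_ARRAY_LEAVES.
ARRAY_LEAVES_SKETCH = %w[args].freeze

# Hypothetical helper: does the path have the shape mcp.servers.<name>.<leaf>?
def mcp_array_leaf?(keys)
  keys.length >= 4 && keys[0] == "mcp" && keys[1] == "servers" &&
    ARRAY_LEAVES_SKETCH.include?(keys.last.to_s)
end

# Hypothetical helper: would an already-coerced value pass the list check?
def mcp_array_ok?(keys, coerced)
  return true unless mcp_array_leaf?(keys)

  coerced.nil? || coerced.is_a?(Array)
end

puts mcp_array_ok?(%w[mcp servers x args], %w[run server]) # a list
puts mcp_array_ok?(%w[mcp servers x args], "run server")   # a scalar string
puts mcp_array_ok?(%w[mcp servers x command], "run")       # not an array leaf
```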
.check_positive_int!(key_path, keys, value) ⇒ Object
A turn/iteration cap leaf must be a POSITIVE integer when set. A 0 or negative cap is nonsense (the turn could never run a single iteration); it used to persist with a green ✓ then silently degrade to unbounded / the default at runtime. A nil (clearing the key, “nil”/“null”) is allowed — it means “use the built-in default”.
# File 'lib/rubino/config/validator.rb', line 311

def check_positive_int!(key_path, keys, value)
  return unless POSITIVE_INT_LEAVES.include?(keys.last.to_s)

  coerced = Writer.coerce_value(value)
  return if coerced.nil?
  return if coerced.is_a?(Numeric) && coerced.positive? && coerced == coerced.to_i

  raise ConfigurationError,
        "invalid value for '#{key_path}': #{value.inspect} — must be a positive " \
        "integer (a 0 or negative cap would never let the turn run)"
end
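The acceptance predicate on its own, as a small sketch. `positive_int?` is a hypothetical name for the condition in the listing above, applied to already-coerced values.

```ruby
# Hypothetical helper wrapping the predicate from the listing above.
def positive_int?(coerced)
  coerced.is_a?(Numeric) && coerced.positive? && coerced == coerced.to_i
end

puts positive_int?(5)    # a positive integer
puts positive_int?(0)    # zero is not positive
puts positive_int?(-3)   # negative
puts positive_int?(2.5)  # fractional
puts positive_int?(3.0)  # whole-valued float
puts positive_int?("10") # an uncoerced string is not Numeric
```

One consequence of comparing against `to_i` instead of testing for Integer: a whole-valued float such as 3.0 satisfies the predicate.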
.check_range!(key_path, keys, value) ⇒ Object
Range-check a bounded numeric leaf (#392b). Only applies when the leaf name has a declared RANGE and the coerced value is numeric — a non-number is already caught by check_type!, and a nil (clearing the key) is left to the type-unconstrained nil-default path. Out of range is a hard reject with a clear message + non-zero exit, like the other set-time footguns.
# File 'lib/rubino/config/validator.rb', line 287

def check_range!(key_path, keys, value)
  range = range_for(keys)
  return unless range

  coerced = Writer.coerce_value(value)
  return unless coerced.is_a?(Numeric)
  return if range.cover?(coerced)

  raise ConfigurationError,
        "invalid value for '#{key_path}': #{coerced} is out of range " \
        "(expected #{range.begin}..#{range.end})"
end
.check_type!(key_path, keys, value, default) ⇒ Object
# File 'lib/rubino/config/validator.rb', line 241

def check_type!(key_path, keys, value, default)
  coerced = Writer.coerce_value(value)

  # A leaf with a declared RANGE is numeric by definition, so enforce its
  # type even when the seeded default is nil (e.g. model.temperature is now
  # nil to inherit the provider default, #414; `temperature banana`
  # must still be rejected, not silently stored as a string). A nil clears
  # the key and is allowed.
  if range_for(keys)
    return if coerced.nil? || coerced.is_a?(Numeric)

    raise ConfigurationError,
          "invalid value for '#{key_path}': expected number, got #{value.inspect}"
  end

  # Otherwise a nil default carries no type signal; leave it unconstrained.
  return if default.nil?

  expected = coarse_type(default)
  actual = coarse_type(coerced)
  return if expected == actual
  # Numbers written as strings already coerced; an int is a fine float.
  return if expected == :number && actual == :number

  raise ConfigurationError,
        "invalid value for '#{key_path}': expected #{expected} " \
        "(default #{default.inspect}), got #{value.inspect}"
end
.check_url_format!(key_path, keys, value) ⇒ Object
A *_url / base_url leaf must be a real http(s) URL when a non-empty value is given — exactly the providers.<name>.base_url “not a url” footgun from #327, which otherwise persisted fine and only failed as a connection error at the first model call.
# File 'lib/rubino/config/validator.rb', line 364

def check_url_format!(key_path, keys, value)
  leaf = keys.last.to_s
  return unless leaf == "base_url" || leaf.end_with?("_url")

  str = value.to_s.strip
  return if str.empty? || %w[nil null].include?(str.downcase)
  return if valid_http_url?(str)

  raise ConfigurationError,
        "invalid value for '#{key_path}': '#{value}' is not a valid http(s) URL"
end
.coarse_type(value) ⇒ Object
# File 'lib/rubino/config/validator.rb', line 270

def coarse_type(value)
  case value
  when Numeric then :number
  when true, false then :boolean
  when String then :string
  when nil then :nil
  when Array then :array
  when Hash then :hash
  else :other
  end
end
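A short sketch of how the coarse buckets make the type comparison forgiving between integers and floats while still separating numbers, booleans and strings. The method body is copied from the listing above; `same_coarse_type?` is a hypothetical helper and the sample values are illustrative only.

```ruby
# Copy of the listing above so the sketch runs on its own.
def coarse_type(value)
  case value
  when Numeric then :number
  when true, false then :boolean
  when String then :string
  when nil then :nil
  when Array then :array
  when Hash then :hash
  else :other
  end
end

# Hypothetical helper: compare a seeded default with a coerced value.
def same_coarse_type?(default, coerced)
  coarse_type(default) == coarse_type(coerced)
end

puts same_coarse_type?(30, 2.5)     # Integer default, Float value
puts same_coarse_type?(true, "yes") # boolean default, string value
puts same_coarse_type?("gpt", 4)    # string default, numeric value
```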
.dig_default(keys) ⇒ Object
# File 'lib/rubino/config/validator.rb', line 215

def dig_default(keys)
  node = Defaults::MODULE_DEFAULTS
  keys.each do |k|
    return :__absent__ unless node.is_a?(Hash) && node.key?(k)

    node = node[k]
  end
  node
end
.each_leaf(node, prefix = [], &block) ⇒ Object
Yields [keys_array, leaf_value] for every scalar/array leaf in a nested config hash. Open-map sections (providers.<name>, quick_commands, …) are walked the same way; the unknown-key check is intentionally shallow (top-level only) so their arbitrary child keys are not flagged.
# File 'lib/rubino/config/validator.rb', line 193

def each_leaf(node, prefix = [], &block)
  node.each do |k, v|
    keys = prefix + [k.to_s]
    if v.is_a?(Hash)
      each_leaf(v, keys, &block)
    else
      block.call(keys, v)
    end
  end
end
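To see what the walk yields, the method can be run on a small nested hash. The method body is copied from the listing above; RAW_SKETCH is made-up sample data, not a real config.

```ruby
# Copy of the listing above so the sketch runs on its own.
def each_leaf(node, prefix = [], &block)
  node.each do |k, v|
    keys = prefix + [k.to_s]
    if v.is_a?(Hash)
      each_leaf(v, keys, &block)
    else
      block.call(keys, v)
    end
  end
end

# Made-up sample data for illustration.
RAW_SKETCH = {
  "model" => { "temperature" => 0.7 },
  "mcp" => { "servers" => { "x" => { "args" => %w[run server] } } }
}.freeze

LEAVES = []
each_leaf(RAW_SKETCH) { |keys, value| LEAVES << [keys.join("."), value] }
LEAVES.each { |path, value| puts "#{path} = #{value.inspect}" }
```

An Array is treated as a leaf and not descended into, and an empty Hash yields nothing at all.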
.known_top_level?(section) ⇒ Boolean
A top-level section is known when Defaults seeds it or it’s one of the documented intentionally-unseeded namespaces. Everything settable hangs off one of these, so an UNKNOWN first segment is the typo we reject.
# File 'lib/rubino/config/validator.rb', line 236

def known_top_level?(section)
  Defaults::MODULE_DEFAULTS.key?(section) ||
    EXTRA_TOP_LEVEL_SECTIONS.include?(section)
end
.leaf_default(keys) ⇒ Object
The seeded default at this exact path, or the sentinel :__absent__ when the schema has no such leaf. providers.<name>.<leaf> resolves against the openai template so a custom provider’s known leaves still type-check.
# File 'lib/rubino/config/validator.rb', line 207

def leaf_default(keys)
  value = dig_default(keys)
  return value unless value == :__absent__
  return value unless keys.length >= 3 && keys.first == "providers"

  dig_default(["providers", PROVIDER_TEMPLATE, *keys[2..]])
end
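The template fallback is easiest to follow against a tiny stand-in schema. In this sketch DEFAULTS_SKETCH is a hypothetical substitute for Defaults::MODULE_DEFAULTS (the value 60 is invented for illustration), and the two `_sketch` methods mirror the listings above.

```ruby
# Hypothetical stand-in for Defaults::MODULE_DEFAULTS; the 60 is invented.
DEFAULTS_SKETCH = {
  "providers" => {
    "openai" => { "request_timeout_seconds" => 60 }
  }
}.freeze

def dig_default_sketch(keys)
  node = DEFAULTS_SKETCH
  keys.each do |k|
    return :__absent__ unless node.is_a?(Hash) && node.key?(k)

    node = node[k]
  end
  node
end

def leaf_default_sketch(keys)
  value = dig_default_sketch(keys)
  return value unless value == :__absent__
  return value unless keys.length >= 3 && keys.first == "providers"

  # Retry the same leaf under the "openai" template provider.
  dig_default_sketch(["providers", "openai", *keys[2..]])
end

p leaf_default_sketch(%w[providers minimax request_timeout_seconds])
p leaf_default_sketch(%w[providers minimax no_such_leaf])
p leaf_default_sketch(%w[foo bar])
```

The first call finds no `minimax` entry and resolves through the template; the other two stay :__absent__, which is what lets the caller tell "no schema entry" apart from a nil default.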
.range_for(keys) ⇒ Object
The RANGE for a leaf: full dotted path first (so an ambiguous leaf name like `threshold` gets its path-specific bound), then the bare leaf name.
# File 'lib/rubino/config/validator.rb', line 302

def range_for(keys)
  RANGES[keys.join(".")] || RANGES[keys.last.to_s]
end
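The path-before-leaf lookup resolves the `threshold` ambiguity as follows. This is a standalone sketch: RANGES_SKETCH copies the documented RANGES table and `range_for_sketch` mirrors the listing above.

```ruby
# Standalone sketch; RANGES_SKETCH mirrors the documented RANGES table.
RANGES_SKETCH = {
  "temperature" => (0.0..2.0),
  "threshold" => (0.0..1.0),
  "target_ratio" => (0.0..1.0),
  "doom_loop.threshold" => (2..Float::INFINITY)
}.freeze

def range_for_sketch(keys)
  RANGES_SKETCH[keys.join(".")] || RANGES_SKETCH[keys.last.to_s]
end

# doom_loop.threshold hits the full-path entry, so a count of 5 is in range.
puts range_for_sketch(%w[doom_loop threshold]).cover?(5)
# compression.threshold falls back to the bare leaf, a 0..1 ratio.
puts range_for_sketch(%w[compression threshold]).cover?(5)
# Bounds are inclusive.
puts range_for_sketch(%w[model temperature]).cover?(2.0)
# A leaf with no declared range returns nil.
p range_for_sketch(%w[model name])
```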
.reject_unknown_key!(key_path, keys) ⇒ Object
# File 'lib/rubino/config/validator.rb', line 225

def reject_unknown_key!(key_path, keys)
  return if known_top_level?(keys.first)

  raise ConfigurationError,
        "unknown config key '#{key_path}': '#{keys.first}' is not a config " \
        "section. Run 'rubino config show' to see the valid sections"
end
.seeded_default?(keys, value) ⇒ Boolean
True when value at keys equals the schema’s seeded default — i.e. an untouched leaf, not a user edit. :__absent__ (no such default) is never a match, so an unknown key is still flagged.
# File 'lib/rubino/config/validator.rb', line 182

def seeded_default?(keys, value)
  default = leaf_default(keys)
  return false if default == :__absent__

  default == value
end
.valid_http_url?(str) ⇒ Boolean
# File 'lib/rubino/config/validator.rb', line 376

def valid_http_url?(str)
  uri = URI.parse(str)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end
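A quick way to see which inputs this predicate accepts, using only Ruby's standard `uri` library. The method body is copied from the listing above; the sample URLs are illustrative.

```ruby
require "uri"

# Copy of the listing above so the sketch runs on its own.
def valid_http_url?(str)
  uri = URI.parse(str)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

puts valid_http_url?("https://api.example.com/v1") # https parses as a URI::HTTP subclass
puts valid_http_url?("not a url")                  # spaces make URI.parse raise
puts valid_http_url?("ftp://example.com")          # parses, but not as http(s)
puts valid_http_url?("banana")                     # parses as a generic URI with no scheme
```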
.validate!(key_path, keys, value) ⇒ Object
# File 'lib/rubino/config/validator.rb', line 129

def validate!(key_path, keys, value)
  default = leaf_default(keys)
  reject_unknown_key!(key_path, keys) if default == :__absent__
  check_type!(key_path, keys, value, default) unless default == :__absent__
  check_mcp_array!(key_path, keys, value)
  check_range!(key_path, keys, value)
  check_positive_int!(key_path, keys, value)
  check_enum!(key_path, keys, value)
  check_url_format!(key_path, keys, value)
end
.warnings(raw) ⇒ Object
NON-raising counterpart of #validate!, run at LOAD time over a HAND-EDITED config (F8). #validate! only fires at `config set`, so a config.yml edited by hand with an unknown key or a wrong-typed value loaded SILENTLY and only blew up later (a runtime crash, or a provider 4xx the agent retries for ~85s). This walks every leaf of the RAW on-disk hash and returns a list of human-readable WARNING strings (the same checks #validate! makes), surfaced at boot and by `rubino doctor`, never a crash. raw is the user's config.yml as loaded (Loader#raw_config), NOT merged with defaults (defaults are valid by construction).
# File 'lib/rubino/config/validator.rb', line 149

def warnings(raw)
  return [] unless raw.is_a?(Hash)

  out = []
  each_leaf(raw) do |keys, value|
    # A REMOVED key (item 7) is flagged regardless of its value: it is no
    # longer honored, so we warn the moment it is PRESENT, before the
    # seeded-default skip (a removed key has no seeded default anyway).
    if (msg = REMOVED_KEYS[keys.join(".")])
      out << msg
      next
    end

    # Skip a leaf still at its seeded default: `setup` writes the FULL
    # default config to disk, so every default value is present in the raw
    # hash. Those are valid by construction. Only a leaf the user CHANGED
    # can be a hand-edit mistake.
    next if seeded_default?(keys, value)

    key_path = keys.join(".")
    validate!(key_path, keys, value)
  rescue ConfigurationError => e
    out << e.message
  rescue StandardError
    # A surprising shape must never crash a load; skip that leaf.
    nil
  end
  out
end
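The collect-instead-of-raise pattern can be sketched without the rest of the module. Everything below is a simplified stand-in: SketchConfigError, REMOVED_SKETCH, `validate_sketch!` and `warnings_sketch` are hypothetical names, the input is an already-flattened path-to-value hash (no leaf walk, no seeded-default skip), and the only validation rule is a numeric model.temperature.

```ruby
# Hypothetical error class standing in for ConfigurationError.
class SketchConfigError < StandardError; end

# Shortened stand-in for REMOVED_KEYS.
REMOVED_SKETCH = {
  "security.require_confirmation_for_shell" =>
    "`security.require_confirmation_for_shell` is removed; use `security.confirm_policy`"
}.freeze

# Stand-in for validate!: rejects only a non-numeric model.temperature.
def validate_sketch!(key_path, value)
  return unless key_path == "model.temperature"
  return if value.nil? || value.is_a?(Numeric)

  raise SketchConfigError,
        "invalid value for '#{key_path}': expected number, got #{value.inspect}"
end

# Collects messages and never raises to the caller.
def warnings_sketch(flat)
  flat.each_with_object([]) do |(key_path, value), out|
    if (msg = REMOVED_SKETCH[key_path])
      out << msg # removed keys are flagged whatever their value
      next
    end
    validate_sketch!(key_path, value)
  rescue SketchConfigError => e
    out << e.message
  end
end

puts warnings_sketch(
  "model.temperature" => "banana",
  "security.require_confirmation_for_shell" => true,
  "model.name" => "example"
)
```

The `rescue` clauses attached directly to a `do ... end` block require Ruby 2.5 or newer; this applies to the sketch and to the listing above alike.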