Module: Rigor::Builtins::RegexRefinement
- Defined in: lib/rigor/builtins/regex_refinement.rb
Overview
Maps a curated table of canonical regex sub-patterns onto the imported refinement carriers Rigor already ships (`decimal-int-string`, `hex-int-string`, `octal-int-string`, `lowercase-string`, `uppercase-string`, `numeric-string`). See `docs/type-specification/imported-built-in-types.md` for the registry the refinements come from and `docs/MILESTONES.md` § “v0.1.1 — Planned” Track 1 slice 1 for the binding scope of this recogniser.
The intended consumer is `Inference::Narrowing.analyse_match_write`: given `if /(?<year>\d+)/ =~ str; year; end`, the v0.1.0 baseline narrows `year` to plain `String`; v0.1.1 introspects the regex source and narrows further whenever the named-capture body matches one of the rows in RULES (here, to `decimal-int-string`).
Recognised body shapes (each row admits the `+` quantifier and the bounded `{n}` / `{n,m}` forms with `n >= 1`):
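For readers unfamiliar with the Ruby feature being narrowed, the following plain-Ruby sketch shows how a regex literal with a named capture on the left-hand side of `=~` assigns a local variable. It does not use Rigor; `extract_year` is a hypothetical helper introduced only for illustration.

```ruby
# Plain Ruby, no Rigor required. A regex literal on the left-hand side of =~
# assigns each named capture to a local variable of the same name.
def extract_year(str)
  if /(?<year>\d+)/ =~ str
    # `year` is a String here. Because the capture body is \d+, it is also a
    # non-empty run of decimal digits, which is the fact the recogniser exploits.
    year
  end
end

extract_year("released in 2024") # => "2024"
extract_year("no digits")        # => nil
```

The baseline type for `year` inside the branch is `String`; the refinement described above only adds the knowledge that the string consists of decimal digits.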
- `\d` -> decimal-int-string
- `\h` -> hex-int-string
- `[0-9a-fA-F]` -> hex-int-string
- `[0-9a-f]`, `[0-9A-F]` -> hex-int-string
- `[0-7]` -> octal-int-string
- `[a-z]` -> lowercase-string
- `[A-Z]` -> uppercase-string
- `[[:digit:]]` -> numeric-string
Anything outside the table returns `nil` so the calling narrowing site falls back to its previous behaviour (plain `String`). Arbitrary regex semantic equivalence is undecidable, so the table is intentionally a small audited set of canonical shapes rather than a general equivalence checker.
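The table lookup could be sketched roughly as follows. This is an illustration under stated assumptions, not Rigor's implementation: the real `RULES` constant is not shown on this page, so `RegexRefinementSketch`, its `QUANT` and `RULES` constants, and `carrier_name_for` are all hypothetical, and the sketch returns carrier names as strings instead of `Rigor::Type` objects.

```ruby
# Hypothetical stand-in for the recogniser; not Rigor's actual RULES table.
module RegexRefinementSketch
  # Quantifier suffix shared by every row: `+`, `{n}`, or `{n,m}`.
  QUANT = /(?:\+|\{\d+(?:,\d+)?\})/

  # Each row pairs an anchored pattern over the capture body's *source text*
  # with the name of the carrier it maps to.
  RULES = [
    [/\A\\d#{QUANT}\z/,               "decimal-int-string"],
    [/\A\\h#{QUANT}\z/,               "hex-int-string"],
    [/\A\[0-9a-fA-F\]#{QUANT}\z/,     "hex-int-string"],
    [/\A\[0-9a-f\]#{QUANT}\z/,        "hex-int-string"],
    [/\A\[0-9A-F\]#{QUANT}\z/,        "hex-int-string"],
    [/\A\[0-7\]#{QUANT}\z/,           "octal-int-string"],
    [/\A\[a-z\]#{QUANT}\z/,           "lowercase-string"],
    [/\A\[A-Z\]#{QUANT}\z/,           "uppercase-string"],
    [/\A\[\[:digit:\]\]#{QUANT}\z/,   "numeric-string"]
  ].freeze

  # Returns the carrier name for a recognised body, or nil otherwise.
  # Bounded forms still need the lower-bound check that .valid_bounds?
  # performs; this sketch deliberately skips it.
  def self.carrier_name_for(body)
    return nil if body.nil? || body.empty?

    rule = RULES.find { |pattern, _| pattern.match?(body) }
    rule&.last
  end
end

RegexRefinementSketch.carrier_name_for('\d+')      # => "decimal-int-string"
RegexRefinementSketch.carrier_name_for('[0-7]{3}') # => "octal-int-string"
RegexRefinementSketch.carrier_name_for('\w+')      # => nil (not in the table)
RegexRefinementSketch.carrier_name_for('\d*')      # => nil (`*` admits the empty string)
```

Matching against the literal source text keeps the recogniser syntactic: a body such as `[0-9]+` is semantically a decimal-digit run, yet it falls through to `nil` because it is not one of the listed shapes.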
Class Method Summary
- .for_capture_body(body) ⇒ Rigor::Type?
  The matching imported refinement carrier, or `nil` if `body` is not a recognised shape.
- .valid_bounds?(body) ⇒ Boolean
  Filters the bounded-quantifier forms to ones whose lower bound is at least 1 and whose upper bound (if any) is at least the lower bound.
Class Method Details
.for_capture_body(body) ⇒ Rigor::Type?
Returns the matching imported refinement carrier, or `nil` if `body` is not a recognised shape.
    # File 'lib/rigor/builtins/regex_refinement.rb', line 75
    def for_capture_body(body)
      return nil if body.nil? || body.empty?

      rule = RULES.find { |pattern, _| pattern.match?(body) }
      return nil if rule.nil?
      return nil unless valid_bounds?(body)

      Type::Combinator.public_send(rule.last)
    end
.valid_bounds?(body) ⇒ Boolean
Filters the bounded-quantifier forms to ones whose lower bound is at least 1 and whose upper bound (if any) is at least the lower bound. Without this, `\d{0,5}` would be accepted even though it admits the empty string, which is not a valid `decimal-int-string`.
    # File 'lib/rigor/builtins/regex_refinement.rb', line 90
    def valid_bounds?(body)
      m = BOUND_RE.match(body)
      return true if m.nil?

      low = Integer(m[1])
      return false if low < 1

      high = m[2] && Integer(m[2])
      return true if high.nil?

      low <= high
    end
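To make the bound filter concrete, here is a self-contained sketch of the same logic. `BOUND_RE` is not shown on this page, so the definition below is an assumption (a trailing `{n}` or `{n,m}` with the two numbers captured), and `BoundsSketch` is a hypothetical module name. The sketch also passes an explicit base to `Integer`, which the listing above does not do.

```ruby
module BoundsSketch
  # Assumed shape of BOUND_RE: a trailing {n} or {n,m}, capturing n and m.
  BOUND_RE = /\{(\d+)(?:,(\d+))?\}\z/

  def self.valid_bounds?(body)
    m = BOUND_RE.match(body)
    return true if m.nil? # no bounded quantifier (e.g. `+`), nothing to check

    # Explicit base 10 so a bound written with a leading zero is not read as octal.
    low = Integer(m[1], 10)
    return false if low < 1 # {0,...} admits the empty string

    high = m[2] && Integer(m[2], 10)
    return true if high.nil? # exact-count form {n}

    low <= high
  end
end

BoundsSketch.valid_bounds?('\d+')     # => true
BoundsSketch.valid_bounds?('\d{4}')   # => true
BoundsSketch.valid_bounds?('\d{2,4}') # => true
BoundsSketch.valid_bounds?('\d{0,5}') # => false (lower bound below 1)
BoundsSketch.valid_bounds?('\d{3,2}') # => false (upper bound below lower bound)
```

Rejecting `{0,m}` is what keeps the refinement sound: a carrier such as `decimal-int-string` requires at least one digit, so any quantifier that can match zero characters must fall back to plain `String`.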