Class: LexerKit::DFA::CharClassCollector
- Inherits: Object
  - Object
  - LexerKit::DFA::CharClassCollector
- Includes: RegexAST
- Defined in: lib/lexer_kit/dfa/char_class_collector.rb
Overview
CharClassCollector collects the items of a character class and builds the appropriate AST for them. It separates byte and codepoint handling from the parser's control flow. Case folding is handled at the NFA layer, not here.
Instance Method Summary
- #add_item(item) ⇒ Object
  Add a single item (byte or codepoint).
- #add_range(start_item, end_item) ⇒ Object
  Add a range of items.
- #initialize ⇒ CharClassCollector (constructor)
  A new instance of CharClassCollector.
- #to_ast(negated:, meta:) ⇒ Object
  Build the final AST.
Constructor Details
#initialize ⇒ CharClassCollector
Returns a new instance of CharClassCollector.
    # File 'lib/lexer_kit/dfa/char_class_collector.rb', line 11
    def initialize
      @byte_ranges = []
      @codepoint_ranges = []
    end
Instance Method Details
#add_item(item) ⇒ Object
Add a single item (byte or codepoint).
    # File 'lib/lexer_kit/dfa/char_class_collector.rb', line 17
    def add_item(item)
      if item[:type] == :byte
        @byte_ranges << [item[:value], item[:value]]
      else
        @codepoint_ranges << [item[:value], item[:value]]
      end
    end
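The shape of an item is only implied by the listing: a Hash with a :type and a :value, where :byte selects the byte list and anything else is treated as a codepoint. The following is a minimal sketch under those assumptions. ItemSketch is a hypothetical stand-in, not the real class; the attr_reader line and the :codepoint symbol are added here for illustration only.

```ruby
# Hypothetical stand-in that mirrors only the add_item listing;
# it is not the real LexerKit::DFA::CharClassCollector.
class ItemSketch
  # Readers are an addition for inspection; the documented class does not list them.
  attr_reader :byte_ranges, :codepoint_ranges

  def initialize
    @byte_ranges = []
    @codepoint_ranges = []
  end

  # A single item is stored as a one-element range [value, value].
  def add_item(item)
    if item[:type] == :byte
      @byte_ranges << [item[:value], item[:value]]
    else
      @codepoint_ranges << [item[:value], item[:value]]
    end
  end
end

collector = ItemSketch.new
collector.add_item({ type: :byte, value: 0x61 })        # 'a' as a raw byte
collector.add_item({ type: :codepoint, value: 0x3042 }) # :codepoint is an assumed type name

p collector.byte_ranges      # => [[97, 97]]
p collector.codepoint_ranges # => [[12354, 12354]]
```

Storing a single item as a degenerate range means later stages only ever deal with one representation, ranges, regardless of how the item was added.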
#add_range(start_item, end_item) ⇒ Object
Add a range of items.
    # File 'lib/lexer_kit/dfa/char_class_collector.rb', line 26
    def add_range(start_item, end_item)
      raise ArgumentError, "mixed byte and multibyte range in char class" if start_item[:type] != end_item[:type]

      if start_item[:type] == :byte
        @byte_ranges << [start_item[:value], end_item[:value]]
      else
        start_cp = start_item[:value]
        end_cp = end_item[:value]
        raise ArgumentError, "invalid multibyte range" if start_cp > end_cp

        @codepoint_ranges << [start_cp, end_cp]
      end
    end
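The two error paths in add_range can be exercised with a small sketch. RangeSketch copies the method body from the listing into a hypothetical stand-in class. The error_message helper, the attr_reader line, and the :codepoint type symbol are assumptions made for illustration. Note that, as listed, the byte branch stores the pair without an ordering check, so only codepoint ranges are rejected when reversed.

```ruby
# Hypothetical stand-in; the add_range body is copied from the listing.
class RangeSketch
  attr_reader :byte_ranges, :codepoint_ranges # added for inspection only

  def initialize
    @byte_ranges = []
    @codepoint_ranges = []
  end

  def add_range(start_item, end_item)
    raise ArgumentError, "mixed byte and multibyte range in char class" if start_item[:type] != end_item[:type]

    if start_item[:type] == :byte
      @byte_ranges << [start_item[:value], end_item[:value]]
    else
      start_cp = start_item[:value]
      end_cp = end_item[:value]
      raise ArgumentError, "invalid multibyte range" if start_cp > end_cp

      @codepoint_ranges << [start_cp, end_cp]
    end
  end
end

# Hypothetical helper: run the block and return the ArgumentError message, or nil.
def error_message
  yield
  nil
rescue ArgumentError => e
  e.message
end

byte = ->(v) { { type: :byte, value: v } }
cp   = ->(v) { { type: :codepoint, value: v } } # assumed type symbol

ranges = RangeSketch.new
ranges.add_range(byte.(0x30), byte.(0x39)) # '0'..'9', accepted
ranges.add_range(byte.(0x39), byte.(0x30)) # reversed bytes, also accepted as listed

puts error_message { ranges.add_range(byte.(0x61), cp.(0x3042)) } # mixed types
puts error_message { ranges.add_range(cp.(0x3093), cp.(0x3042)) } # reversed codepoints
p ranges.byte_ranges
```

Whether a reversed byte range is caught elsewhere (for example in the parser or in a private helper) cannot be determined from this page.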
#to_ast(negated:, meta:) ⇒ Object
Build the final AST.
    # File 'lib/lexer_kit/dfa/char_class_collector.rb', line 41
    def to_ast(negated:, meta:)
      validate_negated_multibyte!(negated)
      ascii_ast = build_ascii_ast(negated, meta)
      utf8_ast = build_utf8_ast(meta)
      combine_asts(ascii_ast, utf8_ast, negated, meta)
    end
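The private helpers called by to_ast are not shown on this page, so the following is only a sketch of the pipeline's shape: validate, build the byte half, build the codepoint half, combine. Every helper body in PipelineSketch is an assumption, the helpers are assumed to receive meta, and plain Hashes stand in for RegexAST nodes. In particular, the guess that a negated class containing codepoint ranges is rejected is inferred from the helper's name alone.

```ruby
# Hypothetical sketch of the to_ast pipeline. Every helper body here is an
# assumption; the real private helpers are not shown in this documentation.
class PipelineSketch
  def initialize(byte_ranges, codepoint_ranges)
    @byte_ranges = byte_ranges
    @codepoint_ranges = codepoint_ranges
  end

  def to_ast(negated:, meta:)
    validate_negated_multibyte!(negated)
    ascii_ast = build_ascii_ast(negated, meta)
    utf8_ast = build_utf8_ast(meta)
    combine_asts(ascii_ast, utf8_ast, negated, meta)
  end

  private

  # Assumption: negating a class that contains codepoint ranges is rejected.
  def validate_negated_multibyte!(negated)
    return unless negated && !@codepoint_ranges.empty?

    raise ArgumentError, "negated multibyte char class (assumed restriction)"
  end

  # Assumption: byte ranges become one node; nil when there are none.
  def build_ascii_ast(negated, meta)
    return nil if @byte_ranges.empty?

    { node: :byte_class, ranges: @byte_ranges, negated: negated, meta: meta }
  end

  # Assumption: codepoint ranges become one node; nil when there are none.
  def build_utf8_ast(meta)
    return nil if @codepoint_ranges.empty?

    { node: :codepoint_class, ranges: @codepoint_ranges, meta: meta }
  end

  # Assumption: a single half is returned as is; otherwise the halves are
  # wrapped in an alternation node.
  def combine_asts(ascii_ast, utf8_ast, negated, meta)
    parts = [ascii_ast, utf8_ast].compact
    return parts.first if parts.size == 1

    { node: :alt, children: parts, negated: negated, meta: meta }
  end
end

sketch = PipelineSketch.new([[0x61, 0x7a]], [[0x3042, 0x3093]])
p sketch.to_ast(negated: false, meta: nil)
```

Splitting the build into a byte half and a codepoint half keeps the single-byte case simple, while multibyte ranges can be expanded separately before the two results are merged.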