Class: Ucode::Parsers::ExtractedProperties
Defined in: lib/ucode/parsers/extracted_properties.rb
Overview
Generic range/value parser for the files under `extracted/` (DerivedGeneralCategory, DerivedJoiningGroup, DerivedLineBreak, DerivedNumericType, …).
Format is uniform across every file (UAX #44):
XXXX..YYYY; value
XXXX; value
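As a rough illustration of the format above, the following is a minimal standalone sketch and not the library's implementation. The helper name `parse_extracted_line` is hypothetical, and the `#` comment stripping reflects the usual UCD file layout rather than anything stated on this page.

```ruby
# Hypothetical helper (not part of Ucode): turns one line of an extracted/
# file into a [first, last, value] triple, or nil for blank/comment lines.
def parse_extracted_line(line)
  # Drop a trailing "# ..." comment, then surrounding whitespace.
  data = line.split("#", 2).first.to_s.strip
  return nil if data.empty?

  cp_field, value = data.split(";", 2).map(&:strip)
  return nil if value.nil? || value.empty?

  # "XXXX..YYYY" is a range; a bare "XXXX" is a single code point.
  first_hex, last_hex = cp_field.split("..", 2)
  first = Integer(first_hex, 16) # raises ArgumentError on malformed hex
  last  = last_hex ? Integer(last_hex, 16) : first

  [first, last, value]
end

parse_extracted_line("0041..005A    ; Lu # LATIN CAPITAL LETTER A..Z")
# => [65, 90, "Lu"]
parse_extracted_line("00AD          ; Cf # SOFT HYPHEN")
# => [173, 173, "Cf"]
```

A single code point is represented here as a range whose first and last values coincide, which keeps the triple shape uniform for callers.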
The parser is intentionally dumb: it yields `(first, last, value)` triples without knowing what the value means. The Coordinator dispatches by source file name (DerivedGeneralCategory.txt → CodePoint#general_category, etc.). This decoupling means a new extracted file adds one line to the Coordinator, not a new parser.
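The dispatch-by-file-name idea could look roughly like the sketch below. The table and the `property_for` helper are hypothetical; only the DerivedGeneralCategory.txt mapping is stated on this page, and the other attribute names are assumptions made for illustration.

```ruby
# Hypothetical dispatch table, not the Coordinator's actual code.
PROPERTY_FOR_FILE = {
  "DerivedGeneralCategory.txt" => :general_category, # stated in the docs
  "DerivedJoiningGroup.txt"    => :joining_group,    # assumed attribute name
  "DerivedLineBreak.txt"       => :line_break,       # assumed attribute name
  "DerivedNumericType.txt"     => :numeric_type      # assumed attribute name
}.freeze

# Looks up the CodePoint attribute that a source file feeds.
def property_for(path)
  PROPERTY_FOR_FILE.fetch(File.basename(path)) do |name|
    raise ArgumentError, "no property registered for #{name}"
  end
end

property_for("extracted/DerivedGeneralCategory.txt") # => :general_category
```

Under this design, supporting another extracted file means adding one entry to the table while the parser stays unchanged.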
Ranges are NOT expanded: yielding one record per code point would flood the stream for large CJK ranges (the CJK Unified Ideographs block, 4E00..9FFF, alone covers 20,992 code points). The Coordinator expands lazily if needed.
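One way a consumer might expand ranges lazily is sketched below. `each_codepoint` is a hypothetical helper that takes plain `[first, last, value]` arrays; how the Coordinator actually expands ranges is not shown on this page.

```ruby
# Hypothetical helper: expands [first, last, value] triples one code point
# at a time. Without a block it returns an Enumerator, so nothing is
# materialized until the caller asks for it.
def each_codepoint(triples)
  return enum_for(:each_codepoint, triples) unless block_given?

  triples.each do |first, last, value|
    (first..last).each { |cp| yield cp, value }
  end
  nil
end

triples = [[0x4E00, 0x9FFF, "Lo"]]

# Only the first two code points of the range are ever generated here.
each_codepoint(triples).first(2)
# => [[19968, "Lo"], [19969, "Lo"]]
```

Because the expansion is driven by the caller, a range of any size costs nothing until its code points are actually consumed.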
Direct Known Subclasses
Defined Under Namespace
Classes: Tuple
Class Method Summary
Methods inherited from Base
each_line, parse_codepoint_or_range, parse_field, parse_hex_cp
Class Method Details
.each_record(path) ⇒ Object
    # File 'lib/ucode/parsers/extracted_properties.rb', line 45

    def each_record(path)
      return enum_for(:each_record, path) unless block_given?

      each_line(path) do |line|
        fields = line.fields
        next if fields.length < 2

        range = parse_codepoint_or_range(fields[0])
        value = fields[1]
        next if value.nil? || value.empty?

        yield build_tuple(range, value)
      end

      nil
    end
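The two calling conventions of `each_record` (with a block it yields and returns nil; without one it returns an Enumerator) can be tried out with the standalone stand-in below. It is only a sketch: `each_record_sketch` and `SketchTuple` are hypothetical, the tuple field names are assumed, and the stand-in takes pre-split fields instead of a file path so that it runs without the library.

```ruby
# Stand-in for illustration only; mirrors the control flow of each_record.
SketchTuple = Struct.new(:first_cp, :last_cp, :value) # field names assumed

def each_record_sketch(rows)
  return enum_for(:each_record_sketch, rows) unless block_given?

  rows.each do |fields|
    next if fields.length < 2          # line had no "; value" part

    first, last = fields[0].split("..").map { |hex| hex.to_i(16) }
    value = fields[1]
    next if value.nil? || value.empty? # skip records without a value

    yield SketchTuple.new(first, last || first, value)
  end
  nil
end

rows = [["0041..005A", "Lu"], ["00AD", "Cf"], ["FFFF"], ["0000", ""]]

# Block form: records are yielded one by one and the call returns nil.
seen = []
result = each_record_sketch(rows) { |t| seen << [t.first_cp, t.last_cp, t.value] }

# Blockless form: an Enumerator, so Enumerable methods can be chained.
values = each_record_sketch(rows).map(&:value)
```

Rows with fewer than two fields or with an empty value are skipped silently, matching the two `next` guards in the method above.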