Class: Ucode::Parsers::ExtractedProperties
Defined in: lib/ucode/parsers/extracted_properties.rb
Overview
Generic range/value parser for the files under extracted/
(DerivedGeneralCategory, DerivedJoiningGroup, DerivedLineBreak,
DerivedNumericType, …).
Format is uniform across every file (UAX #44):
XXXX..YYYY; value
XXXX; value
The parser is intentionally dumb: it yields (first, last, value)
triples without knowing what the value means. The Coordinator
dispatches by source file name (DerivedGeneralCategory.txt →
CodePoint#general_category, etc.). This decoupling means a new
extracted file adds one line to the Coordinator, not a new parser.
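The dispatch-by-file-name idea can be pictured as a lookup table. This is only a sketch, not the Coordinator's actual code: the constant `DISPATCH`, the helper `attribute_for`, and every attribute name other than `general_category` are assumptions made for illustration.

```ruby
# Hypothetical dispatch table: source file name => CodePoint attribute.
# Only the DerivedGeneralCategory entry is named in the overview;
# the other attribute names are assumed.
DISPATCH = {
  "DerivedGeneralCategory.txt" => :general_category,
  "DerivedJoiningGroup.txt"    => :joining_group, # assumed attribute name
  "DerivedLineBreak.txt"       => :line_break     # assumed attribute name
}.freeze

# Returns the attribute that a file's values map to, or nil if the
# file is not registered in the table.
def attribute_for(path)
  DISPATCH[File.basename(path)]
end

attribute_for("extracted/DerivedGeneralCategory.txt")
```

Under this arrangement, supporting a new extracted file would mean adding one entry to the table while the parser stays unchanged.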
Ranges are NOT expanded — yielding per-codepoint would explode the stream for CJK ranges. The Coordinator expands lazily if needed.
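The line format and the unexpanded-range behaviour described above can be sketched without the library. This is a minimal stand-in, not the class's implementation: `Triple`, `each_triple`, and `SAMPLE` are hypothetical names, and the real parser delegates to the inherited `each_line` and `parse_codepoint_or_range` helpers instead of splitting strings itself.

```ruby
# Hypothetical stand-in for the parser's output; the real Tuple class may differ.
Triple = Struct.new(:first, :last, :value)

# Parses lines of the form "XXXX..YYYY; value" or "XXXX; value".
# Trailing comments (from '#') and blank lines are skipped.
# Ranges are yielded as (first, last) pairs and never expanded.
def each_triple(text)
  return enum_for(:each_triple, text) unless block_given?

  text.each_line do |raw|
    line = raw.sub(/#.*/, "").strip
    next if line.empty?

    cp_field, value = line.split(";").map(&:strip)
    next if cp_field.nil? || cp_field.empty?
    next if value.nil? || value.empty?

    first, last = cp_field.split("..").map { |hex| hex.to_i(16) }
    yield Triple.new(first, last || first, value)
  end
  nil
end

SAMPLE = <<~TXT
  # DerivedGeneralCategory-style sample
  0041..005A    ; Lu # [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
  00AA          ; Lo #       FEMININE ORDINAL INDICATOR
TXT

triples = each_triple(SAMPLE).to_a
```

A consumer that needs individual code points can expand a triple on demand with `(t.first..t.last).each`, which keeps the stream itself at one record per input line.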
Direct Known Subclasses
Defined Under Namespace
Classes: Tuple
Class Method Summary
Methods inherited from Base
each_line, parse_codepoint_or_range, parse_field, parse_hex_cp
Class Method Details
.each_record(path) ⇒ Object
    # File 'lib/ucode/parsers/extracted_properties.rb', line 45

    def each_record(path)
      return enum_for(:each_record, path) unless block_given?

      each_line(path) do |line|
        fields = line.fields
        next if fields.length < 2

        range = parse_codepoint_or_range(fields[0])
        value = fields[1]
        next if value.nil? || value.empty?

        yield build_tuple(range, value)
      end
      nil
    end