Class: Ucode::Parsers::ExtractedProperties

Inherits:
Base
  • Object
Defined in:
lib/ucode/parsers/extracted_properties.rb

Overview

Generic range/value parser for the files under 'extracted/' (DerivedGeneralCategory, DerivedJoiningGroup, DerivedLineBreak, DerivedNumericType, …).

Format is uniform across every file (UAX #44):

XXXX..YYYY; value
XXXX; value
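
To make the format concrete, here is a minimal, self-contained sketch of turning one such line into a (first, last, value) triple. The helper name `parse_extracted_line` is hypothetical and not part of Ucode, which does this work through the inherited `each_line` and `parse_codepoint_or_range`. The sketch also assumes that `#` starts a trailing comment, as in the published UCD data files.

```ruby
# Hypothetical helper (not part of Ucode): parse one data line of the
# "XXXX..YYYY; value" / "XXXX; value" format into [first, last, value].
# Returns nil for blank lines, comment-only lines, and lines without a value.
def parse_extracted_line(line)
  body = line.sub(/#.*/, "").strip # assumption: '#' begins a trailing comment
  return nil if body.empty?

  range, value = body.split(";").map(&:strip)
  return nil if range.nil? || range.empty?
  return nil if value.nil? || value.empty?

  first, last = range.split("..", 2)
  last ||= first # a single code point is a range of length one
  [first.to_i(16), last.to_i(16), value]
end

parse_extracted_line("0041..005A    ; Lu # L&  [26] LATIN CAPITAL LETTER A..Z")
# => [65, 90, "Lu"]
parse_extracted_line("00AA          ; Lo")
# => [170, 170, "Lo"]
```

Only the second field is read as the value, which mirrors the `fields[1]` access in `each_record`.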

The parser is intentionally dumb: it yields '(first, last, value)' triples without knowing what the value means. The Coordinator dispatches by source file name (DerivedGeneralCategory.txt → CodePoint#general_category, etc.). This decoupling means a new extracted file adds one line to the Coordinator, not a new parser.
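
The dispatch-by-file-name idea can be sketched as a lookup table. This is an illustration only: `PROPERTY_BY_FILE` and `property_for` are hypothetical names, the real Coordinator may be organized differently, and only the DerivedGeneralCategory mapping is taken from the description above.

```ruby
# Hypothetical dispatch table (not the actual Coordinator code).
# Supporting a new extracted file means adding one entry here.
PROPERTY_BY_FILE = {
  "DerivedGeneralCategory.txt" => :general_category,
  "DerivedLineBreak.txt"       => :line_break # assumed attribute name
}.freeze

# Map a source path to the CodePoint attribute its values belong to.
def property_for(path)
  PROPERTY_BY_FILE.fetch(File.basename(path)) do |name|
    raise ArgumentError, "no property registered for #{name}"
  end
end

property_for("extracted/DerivedGeneralCategory.txt")
# => :general_category
```

Raising on an unknown file name, rather than skipping it silently, is a design choice made for this sketch.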

Ranges are NOT expanded — yielding per-codepoint would explode the stream for CJK ranges. The Coordinator expands lazily if needed.
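
Lazy expansion on the consumer side could look like the following sketch. The method `each_codepoint` is hypothetical and not the Coordinator's actual API. It shows that a triple can be expanded on demand, so a large range costs nothing until someone iterates it.

```ruby
# Hypothetical consumer-side expansion of one (first, last, value) triple.
# Without a block it returns an Enumerator, so code points are produced
# only as they are requested.
def each_codepoint(first, last, value)
  return enum_for(:each_codepoint, first, last, value) unless block_given?

  (first..last).each { |cp| yield cp, value }
  nil
end

# A range such as 4E00..9FFF covers 20,992 code points; taking the first
# two does not materialize the rest.
each_codepoint(0x4E00, 0x9FFF, "Lo").first(2)
# => [[19968, "Lo"], [19969, "Lo"]]
```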

Direct Known Subclasses

Auxiliary

Defined Under Namespace

Classes: Tuple

Class Method Summary

Methods inherited from Base

each_line, parse_codepoint_or_range, parse_field, parse_hex_cp

Class Method Details

.each_record(path) ⇒ Object



# File 'lib/ucode/parsers/extracted_properties.rb', line 45

def each_record(path)
  return enum_for(:each_record, path) unless block_given?

  each_line(path) do |line|
    fields = line.fields
    next if fields.length < 2

    range = parse_codepoint_or_range(fields[0])
    value = fields[1]
    next if value.nil? || value.empty?

    yield build_tuple(range, value)
  end

  nil
end