Class: Ucode::Parsers::NamedSequences

Inherits:
Base
  • Object
Defined in:
lib/ucode/parsers/named_sequences.rb

Overview

Parses NamedSequences.txt, the data file of named multi-codepoint sequences.

Format (UAX #44):

cp1 cp2 cp3 ...; Name

The first field is a space-separated list of hex codepoints; the second is the human-readable name.
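As a rough illustration of the format described above, the following standalone sketch splits one data line into its codepoints and name. The helper `split_named_sequence` and the sample line are hypothetical and not part of Ucode; the sketch assumes the field order stated above and that `#` starts a comment.

```ruby
# Hypothetical helper, not part of Ucode: splits one data line into
# codepoints and a name, assuming the field order described above.
def split_named_sequence(line)
  # Drop any trailing comment and surrounding whitespace.
  data = line.split("#", 2).first.to_s.strip
  return nil if data.empty?

  sequence_field, name = data.split(";", 2).map(&:strip)
  return nil if name.nil? || name.empty?

  # Each space-separated token is a hexadecimal codepoint.
  { name: name, codepoints: sequence_field.split.map { |hex| hex.to_i(16) } }
end

record = split_named_sequence("0023 FE0F 20E3; KEYCAP NUMBER SIGN")
```

Comment-only lines and lines without a name yield `nil` in this sketch, which mirrors the skipping behaviour described for the parser.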

Class Method Summary

Methods inherited from Base

each_line, parse_codepoint_or_range, parse_field, parse_hex_cp

Class Method Details

.each_record(path) ⇒ Object

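Yields one Models::NamedSequence per non-comment line, skipping lines that have fewer than two fields or an empty name. When called with a block it returns nil; when called without one it returns an Enumerator (created with enum_for) that reads records on demand.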

# File 'lib/ucode/parsers/named_sequences.rb', line 19

def each_record(path)
  return enum_for(:each_record, path) unless block_given?

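  each_line(path) do |line|
    fields = line.fields
    # Skip lines that do not carry both a sequence field and a name field.
    next if fields.length < 2

    sequence_field = fields[0]
    name = fields[1]
    # Skip records whose name is missing or empty.
    next if name.nil? || name.empty?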

    yield Models::NamedSequence.new(
      name: name,
      codepoint_ids: parse_sequence(sequence_field)
    )
  end

  nil
end
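The block-or-Enumerator pattern used by each_record can be tried in isolation. The sketch below is a stand-in for illustration only: `each_demo_record` is a hypothetical method that takes an array of strings rather than a file path, and it yields plain arrays instead of Models::NamedSequence objects.

```ruby
# Stand-in for illustration only; the real method reads from a file path.
def each_demo_record(lines)
  # Without a block, hand back an Enumerator that re-invokes this method.
  return enum_for(:each_demo_record, lines) unless block_given?

  lines.each do |line|
    sequence_field, name = line.split(";", 2).map(&:strip)
    next if name.nil? || name.empty?

    yield [sequence_field, name]
  end

  nil
end

lines = ["0023 FE0F 20E3; KEYCAP NUMBER SIGN", "malformed line"]

# Block form: records are yielded one at a time and the call returns nil.
seen = []
result = each_demo_record(lines) { |_sequence, name| seen << name }

# Blockless form: an Enumerator that can be chained or materialized.
names = each_demo_record(lines).map { |_sequence, name| name }
```

Returning an Enumerator in the blockless case lets callers chain methods such as `map`, `select`, or `first(n)` without the parser needing separate entry points.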