Class: Ucode::Parsers::StandardizedVariants

Inherits:
Base
  • Object
Defined in:
lib/ucode/parsers/standardized_variants.rb

Overview

Parses `StandardizedVariants.txt` (variation selector sequences).

Format (UAX #44):

base_cp VS_cp; description; [contexts]; # trailing comment

`base_cp` + `variation_selector_id` is the key; `description` is the visual result; `contexts` (optional) is a space-separated list of shaping contexts (e.g. `no-break`).
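The field layout above can be illustrated with a minimal, standalone sketch. It does not use the Ucode classes: `parse_variant_line` is a hypothetical helper written only for this example, and the way it strips comments and splits fields is an assumption about how `Base` behaves, not a copy of it.

```ruby
# Hypothetical helper (not part of Ucode): splits one data line of the form
#   base_cp VS_cp; description; [contexts]; # trailing comment
# into a Hash, or returns nil when the line carries no usable record.
def parse_variant_line(line)
  # Drop the trailing comment, then surrounding whitespace.
  data = line.sub(/#.*\z/m, "").strip
  return nil if data.empty?

  fields = data.split(";").map(&:strip)
  sequence = fields[0].to_s.split(/\s+/)
  description = fields[1]
  return nil if sequence.length < 2 || description.nil? || description.empty?

  {
    base_id: format("U+%04X", sequence[0].to_i(16)),
    variation_selector_id: format("U+%04X", sequence[1].to_i(16)),
    description: description,
    # A missing or empty third field yields an empty list of contexts.
    contexts: fields[2].to_s.split(/\s+/)
  }
end

record = parse_variant_line("0030 FE00; short diagonal stroke form; # DIGIT ZERO")
puts record[:base_id]
puts record[:variation_selector_id]
```

Returning `nil` for comment-only or incomplete lines mirrors the `next if ...` guards in `each_record`, which skip such lines instead of raising.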

Class Method Summary

Methods inherited from Base

each_line, parse_codepoint_or_range, parse_field, parse_hex_cp

Class Method Details

.each_record(path) ⇒ Object



# File 'lib/ucode/parsers/standardized_variants.rb', line 18

def each_record(path)
  # Without a block, return an Enumerator over the same records.
  return enum_for(:each_record, path) unless block_given?

  each_line(path) do |line|
    fields = line.fields
    next if fields.length < 2

    sequence_field = fields[0]
    description = fields[1]
    # Skip lines that have no description.
    next if description.nil? || description.empty?

    # The first field holds the base code point and the variation selector,
    # separated by whitespace.
    sequence = sequence_field.to_s.split(/\s+/).reject(&:empty?)
    next if sequence.length < 2

    base = parse_hex_cp(sequence[0])
    vs = parse_hex_cp(sequence[1])

    yield Models::StandardizedVariant.new(
      base_id: format("U+%04X", base),
      variation_selector_id: format("U+%04X", vs),
      description: description,
      # The third field (shaping contexts) is optional.
      contexts: parse_contexts(fields[2])
    )
  end

  nil
end
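
The `enum_for` guard on the first line of `each_record` gives the method two calling styles. The sketch below shows both using `DemoParser`, a hypothetical stand-in that reads from an array of strings instead of a file; it is not the Ucode parser, and its record shape (a plain Hash) is an assumption made for the example.

```ruby
# Hypothetical stand-in for the parser, used only to show the calling convention.
module DemoParser
  def self.each_record(lines)
    # Same guard as the real method: no block means "give me an Enumerator".
    return enum_for(:each_record, lines) unless block_given?

    lines.each do |line|
      sequence, description = line.split(";").map(&:strip)
      # Skip lines without a description field.
      next if description.nil? || description.empty?

      yield({ sequence: sequence, description: description })
    end

    nil
  end
end

lines = ["0030 FE00; short diagonal stroke form", "malformed"]

# Block form: records are yielded one at a time and the call returns nil.
result = DemoParser.each_record(lines) { |record| puts record[:description] }

# Blockless form: an Enumerator, so Enumerable methods can be chained.
descriptions = DemoParser.each_record(lines).map { |record| record[:description] }
```

With the real class, the same pattern would presumably read `Ucode::Parsers::StandardizedVariants.each_record(path)`, with or without a block, where `path` points at a local copy of `StandardizedVariants.txt`.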