Class: Ucode::Parsers::SpecialCasing

Inherits:
Base
  • Object
Defined in:
lib/ucode/parsers/special_casing.rb

Overview

Parses `SpecialCasing.txt`: context-sensitive case mappings.

Format (UAX #44):

cp; lower; title; upper; [conditions;] # name

The `lower`/`title`/`upper` fields are either empty or a space-separated list of hex codepoints. `conditions` is a space-separated list of context identifiers (`Final_Sigma`, `After_I`) and/or locale codes (`tr`, `az`). Filtering by condition is the consumer’s job.
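As an illustration of the format and of consumer-side filtering, here is a minimal, self-contained sketch. It does not use the ucode classes: `Rule`, `parse_hex_list`, and `parse_special_casing_line` are hypothetical names introduced only for this example, and the sample lines are written in the format shown above.

```ruby
# Hypothetical stand-in, not the ucode implementation.
Rule = Struct.new(:codepoint, :lower, :title, :upper, :conditions, keyword_init: true)

# Turn "0053 0073" into [0x53, 0x73]; an empty or missing field becomes [].
def parse_hex_list(field)
  field.to_s.split.map { |hex| hex.to_i(16) }
end

# Parse one line; returns nil for blank, comment-only, or short lines.
def parse_special_casing_line(line)
  data = line.split("#", 2).first.to_s      # drop the trailing "# name" comment
  fields = data.split(";").map(&:strip)
  return nil if fields.length < 4

  Rule.new(
    codepoint: fields[0].to_i(16),
    lower: parse_hex_list(fields[1]),
    title: parse_hex_list(fields[2]),
    upper: parse_hex_list(fields[3]),
    conditions: fields[4].to_s.split         # optional fifth field
  )
end

sample = [
  "# comment-only line",
  "00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S",
  "03A3; 03C2; 03A3; 03A3; Final_Sigma; # GREEK CAPITAL LETTER SIGMA",
  "0130; 0069; 0130; 0130; tr; # LATIN CAPITAL LETTER I WITH DOT ABOVE"
]

rules = sample.map { |line| parse_special_casing_line(line) }.compact

# The consumer decides which conditions apply.
unconditional = rules.select { |rule| rule.conditions.empty? }
turkish = rules.select { |rule| rule.conditions.include?("tr") }
```

Here `unconditional` holds only the sharp-s rule, and `turkish` holds only the rule tagged `tr`. A real consumer would also need to evaluate context conditions such as `Final_Sigma` against the surrounding text, which this sketch leaves out.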

Class Method Summary

Methods inherited from Base

each_line, parse_codepoint_or_range, parse_field, parse_hex_cp

Class Method Details

.each_record(path) ⇒ Object

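Yields one SpecialCasingRule per non-comment line that has at least four fields; shorter lines are skipped. Returns nil when given a block. When called without a block, returns an Enumerator (built with enum_for, so the file is not read until the enumerator is consumed).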



# File 'lib/ucode/parsers/special_casing.rb', line 22

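def each_record(path)
  # Without a block, return an Enumerator over the same records.
  return enum_for(:each_record, path) unless block_given?

  each_line(path) do |line|
    fields = line.fields
    # Skip lines that lack the four required fields (cp, lower, title, upper).
    next if fields.length < 4

    cp = parse_hex_cp(fields[0])

    yield Models::SpecialCasingRule.new(
      codepoint: cp,
      lower_ids: parse_mapping(fields[1]),
      title_ids: parse_mapping(fields[2]),
      upper_ids: parse_mapping(fields[3]),
      # fields[4] is the optional condition list.
      conditions: parse_conditions(fields[4]),
      comment: line.comment
    )
  end

  # The block form returns nil.
  nil
end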
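
The block/enumerator contract in the listing above can be tried in isolation. The following is a sketch only: `DemoParser` is a hypothetical stand-in that iterates an in-memory array instead of a file, and it yields raw field arrays rather than `SpecialCasingRule` objects.

```ruby
# Hypothetical stand-in that mirrors the block/enumerator contract of
# each_record; it is not part of ucode.
module DemoParser
  def self.each_record(lines)
    # Same guard as the listing: no block means "hand back an Enumerator".
    return enum_for(:each_record, lines) unless block_given?

    lines.each do |line|
      fields = line.split(";").map(&:strip)
      next if fields.length < 4   # skip short lines

      yield fields.first(4)
    end

    nil
  end
end

lines = [
  "00DF; 00DF; 0053 0073; 0053 0053;",
  "malformed",
  "0130; 0069; 0130; 0130; tr;"
]

# Block form: records are yielded one by one and the call returns nil.
seen = []
result = DemoParser.each_record(lines) { |fields| seen << fields.first }

# Blockless form: an Enumerator that supports the usual Enumerable methods.
enum = DemoParser.each_record(lines)
codes = enum.map(&:first)
```

After this runs, `result` is `nil`, and both `seen` and `codes` contain `"00DF"` and `"0130"`; the malformed line is skipped. Note that `enum_for` produces a plain `Enumerator`, so a caller who wants lazy chaining would call `.lazy` on it.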