Class: Ucode::Parsers::ScriptExtensions
Defined in: lib/ucode/parsers/script_extensions.rb
Overview
Parses `ScriptExtensions.txt`, which lists additional scripts per codepoint.
Format (UAX #44):
XXXX..XXXX ; Latn Grek Cyrl # trailing comment
A codepoint can be associated with many scripts. The parser yields one Tuple per (codepoint, script_code) pair; the Coordinator merges these into CodePoint#script_extensions.
`script_code` is the ISO 15924 4-letter code already present in the source file (e.g. `Latn`, `Grek`). No alias resolution is needed.
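To make the format concrete, here is a minimal, self-contained sketch of how one data line can be expanded into (codepoint, script_code) pairs. It does not use the library's Base helpers: `Pair` and `expand_line` are hypothetical names introduced only for this illustration, the comment handling is simplified, and the input line is made up rather than taken from the real data file.

```ruby
# Hypothetical stand-in for the parser's Tuple; not the library's class.
Pair = Struct.new(:cp, :script_code)

# Expands one ScriptExtensions.txt-style data line into Pair records.
# Returns an empty array for blank, comment-only, or incomplete lines.
def expand_line(line)
  data = line.split("#", 2).first.to_s.strip # drop the trailing comment
  return [] if data.empty?

  range_field, codes_field = data.split(";", 2).map(&:strip)
  return [] if codes_field.nil? || codes_field.empty?

  first, last = range_field.split("..", 2)
  last ||= first # a single codepoint is a range of length one
  codes = codes_field.split(/\s+/)

  (first.hex..last.hex).flat_map do |cp|
    codes.map { |code| Pair.new(cp, code) }
  end
end

# Illustrative input only, not real Unicode data.
pairs = expand_line("0041..0042 ; Latn Grek # example line")
pairs.each { |pair| puts format("U+%04X %s", pair.cp, pair.script_code) }
```

A two-codepoint range with two script codes therefore produces four pairs, one per combination.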
Defined Under Namespace
Classes: Tuple
Class Method Summary
- .each_record(path) ⇒ Object
Yields one Tuple per (codepoint, script_code) pair.
Methods inherited from Base
each_line, parse_codepoint_or_range, parse_field, parse_hex_cp
Class Method Details
.each_record(path) ⇒ Object
Yields one Tuple per (codepoint, script_code) pair. When called without a block, returns an Enumerator that produces the records on demand; when called with a block, returns nil.
    # File 'lib/ucode/parsers/script_extensions.rb', line 29

    def each_record(path)
      return enum_for(:each_record, path) unless block_given?

      each_line(path) do |line|
        fields = line.fields
        next if fields.length < 2

        codes_field = fields[1]
        next if codes_field.nil? || codes_field.empty?

        range = parse_codepoint_or_range(fields[0])
        codes = codes_field.split(/\s+/)

        each_cp(range) do |cp|
          codes.each do |code|
            yield Tuple.new(cp: cp, script_code: code)
          end
        end
      end

      nil
    end
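The block/Enumerator contract and the later merge step can be sketched without the library. The example below is an assumption-laden illustration: `Tuple` is a local Struct standing in for the real class (assumed to expose `cp` and `script_code`), `each_sample_record` is a hypothetical source that mimics the shape of `each_record` over made-up data, and `merge_script_extensions` is one possible way to group codes per codepoint, not necessarily how the Coordinator does it.

```ruby
# Local stand-in for Ucode::Parsers::ScriptExtensions::Tuple (assumed shape).
Tuple = Struct.new(:cp, :script_code, keyword_init: true)

# Hypothetical record source with the same block/Enumerator contract
# as each_record; the data is illustrative, not real Unicode data.
def each_sample_record
  return enum_for(:each_sample_record) unless block_given?

  [[0x0041, "Latn"], [0x0041, "Grek"], [0x0042, "Latn"]].each do |cp, code|
    yield Tuple.new(cp: cp, script_code: code)
  end
  nil
end

# Groups script codes by codepoint, keeping first-seen order and
# skipping duplicates. One possible merge strategy, assumed here.
def merge_script_extensions(records)
  records.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |tuple, acc|
    codes = acc[tuple.cp]
    codes << tuple.script_code unless codes.include?(tuple.script_code)
  end
end

# Without a block, the source returns an Enumerator that can be passed around.
merged = merge_script_extensions(each_sample_record)
merged.each { |cp, codes| puts format("U+%04X %s", cp, codes.join(" ")) }
```

Returning an Enumerator when no block is given lets callers chain `first`, `select`, or `each_with_object` without reading the whole file up front.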