Class: Ucode::Index

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/ucode/index.rb

Overview

Sorted, run-length-encoded lookup table over Unicode codepoints.

One Index answers “what <thing> does codepoint N belong to?” for one property (block or script). Lookup is O(log n) in the number of ranges, via `bsearch_index`.

Two ways to construct:

- `Index.from_triples([[first, last, name], ...])`
- `Index.load(path)` from a YAML file previously written by `#save`.

The YAML form is the dependency-free alternative to SQLite: it offers the same query API with simpler operations. Pick whichever fits the deployment.
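To make "run-length-encoded" concrete, here is a minimal sketch of how per-codepoint property values could be collapsed into the `[first, last, name]` triples that `Index.from_triples` accepts. The helper name `run_length_triples` and the sample data are assumptions for illustration only; they are not part of the library.

```ruby
# Hypothetical helper (not part of Ucode): collapse a codepoint => name
# mapping into [first_cp, last_cp, name] triples, merging adjacent
# codepoints that share the same name.
def run_length_triples(names_by_codepoint)
  triples = []
  names_by_codepoint.sort.each do |codepoint, name|
    last = triples.last
    if last && last[2] == name && last[1] + 1 == codepoint
      last[1] = codepoint # extend the current run
    else
      triples << [codepoint, codepoint, name] # start a new run
    end
  end
  triples
end

sample = { 0x41 => "Latin", 0x42 => "Latin", 0x43 => "Latin",
           0x391 => "Greek", 0x392 => "Greek" }
p run_length_triples(sample)
# => [[65, 67, "Latin"], [913, 914, "Greek"]]
```

Five codepoints shrink to two entries here; the saving grows with the length of each run.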

Constructor Details

#initialize(entries) ⇒ Index

Returns a new instance of Index.



# File 'lib/ucode/index.rb', line 21

def initialize(entries)
  @entries = entries.sort
end

Instance Attribute Details

#entries ⇒ Object (readonly)

Returns the value of attribute entries.



# File 'lib/ucode/index.rb', line 25

def entries
  @entries
end

Class Method Details

.from_triples(triples) ⇒ Index

Build an Index from raw [first_cp, last_cp, name] triples.

Parameters:

  • triples (Array<Array(Integer, Integer, String)>)

Returns:

  • (Index)



# File 'lib/ucode/index.rb', line 81

def self.from_triples(triples)
  new(triples.map { |first, last, name| RangeEntry.new(first, last, name) })
end
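Because the constructor calls `entries.sort`, the entries must be comparable with one another. `RangeEntry` is not shown on this page, so the Struct below is only a stand-in that assumes ordering by `first_cp`. The sketch shows the effect of building from unsorted triples.

```ruby
# Stand-in for the library's RangeEntry (an assumption; the real class
# is defined elsewhere and may differ).
RangeEntry = Struct.new(:first_cp, :last_cp, :name) do
  # Order by starting codepoint so Array#sort can arrange the entries.
  def <=>(other)
    first_cp <=> other.first_cp
  end
end

triples = [
  [0x0080, 0x00FF, "Latin-1 Supplement"],
  [0x0000, 0x007F, "Basic Latin"]
]

# Same mapping step as from_triples, followed by the constructor's sort.
entries = triples.map { |first, last, name| RangeEntry.new(first, last, name) }.sort
p entries.map(&:name)
# => ["Basic Latin", "Latin-1 Supplement"]
```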

.load(path) ⇒ Index

Load from a YAML file previously written by #save.

Parameters:

  • path (String, Pathname)

Returns:

  • (Index)



# File 'lib/ucode/index.rb', line 73

def self.load(path)
  hashes = YAML.load_file(path)
  new(hashes.map { |h| RangeEntry.from_h(h) })
end

Instance Method Details

#each(&block) ⇒ Object



# File 'lib/ucode/index.rb', line 27

def each(&block)
  @entries.each(&block)
end

#each_overlapping(first, last, &block) ⇒ Enumerator<RangeEntry>?

Enumerate every range whose [first_cp, last_cp] overlaps the inclusive query range. Returns an Enumerator when called without a block.

Parameters:

  • first (Integer)
  • last (Integer)

Returns:

  • (Enumerator<RangeEntry>, nil)



# File 'lib/ucode/index.rb', line 48

def each_overlapping(first, last, &block)
  return enum_for(:each_overlapping, first, last) unless block_given?

  start_idx = bsearch_first_overlap(first)
  return if start_idx.nil?

  @entries[start_idx..].each do |entry|
    break if entry.first_cp > last

    yield entry if entry.last_cp >= first
  end
end
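The private helper `bsearch_first_overlap` is not shown on this page. One plausible reading, assuming the entries are sorted and non-overlapping, is a find-minimum binary search for the first entry that ends at or after the query start. The class `OverlapSketch` and the Struct `Run` below are illustrative stand-ins, not the library's code.

```ruby
# Stand-in entry type (assumption; the real RangeEntry is defined elsewhere).
Run = Struct.new(:first_cp, :last_cp, :name)

class OverlapSketch
  def initialize(entries)
    @entries = entries.sort_by(&:first_cp)
  end

  def each_overlapping(first, last)
    return enum_for(:each_overlapping, first, last) unless block_given?

    start_idx = bsearch_first_overlap(first)
    return if start_idx.nil?

    @entries[start_idx..].each do |entry|
      break if entry.first_cp > last # past the query range, stop scanning

      yield entry
    end
  end

  private

  # Assumed behaviour: index of the first entry whose last_cp >= first,
  # or nil when every entry ends before the query starts.
  def bsearch_first_overlap(first)
    @entries.bsearch_index { |entry| entry.last_cp >= first }
  end
end

index = OverlapSketch.new([
  Run.new(0, 127, "Basic Latin"),
  Run.new(128, 255, "Latin-1 Supplement"),
  Run.new(256, 383, "Latin Extended-A")
])

p index.each_overlapping(100, 200).map(&:name)
# => ["Basic Latin", "Latin-1 Supplement"]
```

The binary search skips everything before the query in O(log n); the linear scan then touches only the overlapping entries plus at most one more.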

#lookup(codepoint) ⇒ String?

Returns the name of the range covering `codepoint`, or nil.

Parameters:

  • codepoint (Integer)

Returns:

  • (String, nil)

    the name of the range covering `codepoint`, or nil



# File 'lib/ucode/index.rb', line 37

def lookup(codepoint)
  idx = bsearch_index(codepoint)
  idx && @entries[idx].name
end
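`#lookup` delegates to a private `bsearch_index(codepoint)` helper whose body is not shown here. A minimal sketch of how such a helper might work uses Ruby's built-in `Array#bsearch_index` in find-any mode, assuming sorted, non-overlapping ranges. The names `Span` and `span_index` are hypothetical.

```ruby
# Stand-in entry type (assumption; the real RangeEntry is defined elsewhere).
Span = Struct.new(:first_cp, :last_cp, :name)

# Hypothetical helper: find-any binary search. The block returns 0 on a hit,
# a negative number when the target lies to the left of the probed entry,
# and a positive number when it lies to the right.
def span_index(entries, codepoint)
  entries.bsearch_index do |entry|
    if codepoint < entry.first_cp
      -1
    elsif codepoint > entry.last_cp
      1
    else
      0
    end
  end
end

spans = [Span.new(0x0000, 0x007F, "Basic Latin"),
         Span.new(0x0400, 0x04FF, "Cyrillic")]

idx = span_index(spans, 0x0416)
p idx && spans[idx].name    # => "Cyrillic"
p span_index(spans, 0x0100) # => nil (falls in a gap between ranges)
```

The `idx && ...` guard mirrors the method above: a codepoint in a gap yields nil rather than raising.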

#save(path) ⇒ void

This method returns an undefined value.

Serialize to a YAML file.

Parameters:

  • path (String, Pathname)


# File 'lib/ucode/index.rb', line 64

def save(path)
  File.open(path, "w") do |file|
    YAML.dump(@entries.map(&:to_h), file)
  end
end
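The save/load pair round-trips plain hashes rather than entry objects. The exact keys depend on `RangeEntry#to_h`, which is not shown on this page, so the string-keyed rows below are an assumed shape used only to illustrate the round trip with Ruby's standard `yaml` library.

```ruby
require "yaml"
require "tempfile"

# Assumed on-disk shape: one hash per range. The real keys come from
# RangeEntry#to_h and may differ (for example, symbol keys).
rows = [
  { "first_cp" => 0, "last_cp" => 127, "name" => "Basic Latin" },
  { "first_cp" => 128, "last_cp" => 255, "name" => "Latin-1 Supplement" }
]

# Write to a temporary file, then read it back, as #save and .load would.
round_tripped = Tempfile.create(["index", ".yml"]) do |file|
  YAML.dump(rows, file)
  file.flush
  YAML.load_file(file.path)
end

p round_tripped == rows # => true
```

Dumping plain hashes instead of custom objects also keeps the file readable under the safe-loading default of newer Psych versions, which reject arbitrary classes unless they are explicitly permitted.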

#size ⇒ Object



# File 'lib/ucode/index.rb', line 31

def size
  @entries.size
end