Class: Rangeable::DisjointSet
- Inherits:
-
Object
- Object
- Rangeable::DisjointSet
- Defined in:
- lib/rangeable/disjoint_set.rb
Overview
Sorted, disjoint, non-adjacent list of ‘Interval` for a single element.
Maintains RFC §5.1 (I1) invariant:
* sorted by lo strictly ascending
* any two adjacent entries (lo1, hi1), (lo2, hi2) satisfy hi1 + 1 < lo2
(no overlap, no integer adjacency)
* lo <= hi for every entry
‘insert` follows RFC §6.1 cleaner variant (containment idempotent fast-path) and signals to the caller whether the mutation actually changed the canonical state. `remove` follows RFC §6.6 with the same mutated-vs-idempotent distinction (both use the same return symbols so the upstream `Rangeable` only writes one version-bump branch).
Constant Summary collapse
- MUTATED =
Result codes returned from ‘insert` / `remove_subrange`. The caller (Rangeable) uses these to decide whether to bump the container-level version counter.
:mutated- IDEMPOTENT =
:idempotent
Class Method Summary collapse
-
.append_or_merge(out, iv) ⇒ Object
private
Helper for ‘merge_disjoint_lists`: append `iv` to `out`, collapsing if it overlaps or is integer-adjacent to `out.last`.
-
.from_entries(entries) ⇒ Object
Build a new DisjointSet directly from a known-canonical Array<Interval>.
-
.intersect_disjoint_lists(l1, l2) ⇒ Object
§6.11 intersect helper: pairwise intersection via two-pointer sweep.
-
.merge_disjoint_lists(l1, l2) ⇒ Object
§6.10 union helper: merge two (I1)-canonical lists into one (I1)-canonical list via two-pointer sweep + adjacency-collapse.
-
.subtract_disjoint_lists(l_a, l_b) ⇒ Object
§6.12 difference helper: subtract every interval in ‘l_b` from every interval in `l_a`.
Instance Method Summary collapse
- #each(&block) ⇒ Object
- #empty? ⇒ Boolean
-
#entries ⇒ Object
Read-only access to the underlying Array<Interval>.
-
#initialize ⇒ DisjointSet
constructor
A new instance of DisjointSet.
-
#insert(lo, hi) ⇒ Object
Insert ‘[lo, hi]` into the set, performing union-with-merge per RFC §6.1.
-
#remove_subrange(start_, end_) ⇒ Object
Subtract ‘[start_, end_]` from the set (§6.6 sweep+splice).
-
#replace_entries(new_entries) ⇒ Object
Replace the live entries with the given (I1)-canonical list.
- #size ⇒ Object
-
#to_pairs ⇒ Object
Returns a frozen snapshot of the merged intervals as ‘[[lo, hi], …]`.
Constructor Details
#initialize ⇒ DisjointSet
Returns a new instance of DisjointSet.
26 27 28 |
# File 'lib/rangeable/disjoint_set.rb', line 26 def initialize @entries = [] end |
Class Method Details
.append_or_merge(out, iv) ⇒ Object
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Helper for ‘merge_disjoint_lists`: append `iv` to `out`, collapsing if it overlaps or is integer-adjacent to `out.last`.
263 264 265 266 267 268 269 270 271 272 273 |
# File 'lib/rangeable/disjoint_set.rb', line 263 def self.append_or_merge(out, iv) if out.empty? || out.last.hi + 1 < iv.lo out << iv else # Overlap or adjacency: extend the last entry's hi if needed. last = out.last if iv.hi > last.hi out[-1] = Interval.new(last.lo, iv.hi) end end end |
.from_entries(entries) ⇒ Object
Build a new DisjointSet directly from a known-canonical Array<Interval>. Used by set-op result construction; bypasses the ‘insert` cost path.
149 150 151 152 153 |
# File 'lib/rangeable/disjoint_set.rb', line 149 def self.from_entries(entries) ds = new ds.replace_entries(entries.dup) ds end |
.intersect_disjoint_lists(l1, l2) ⇒ Object
§6.11 intersect helper: pairwise intersection via two-pointer sweep. No adjacency-collapse needed (Lemma 6.11.A). ‘O(|l1| + |l2|)`.
193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 |
# File 'lib/rangeable/disjoint_set.rb', line 193 def self.intersect_disjoint_lists(l1, l2) out = [] i = 0 j = 0 n1 = l1.size n2 = l2.size while i < n1 && j < n2 a = l1[i] b = l2[j] lo = a.lo > b.lo ? a.lo : b.lo hi = a.hi < b.hi ? a.hi : b.hi out << Interval.new(lo, hi) if lo <= hi if a.hi <= b.hi i += 1 else j += 1 end end out end |
.merge_disjoint_lists(l1, l2) ⇒ Object
§6.10 union helper: merge two (I1)-canonical lists into one (I1)-canonical list via two-pointer sweep + adjacency-collapse. ‘O(|l1| + |l2|)`. Either input may be empty.
165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 |
# File 'lib/rangeable/disjoint_set.rb', line 165 def self.merge_disjoint_lists(l1, l2) out = [] i = 0 j = 0 n1 = l1.size n2 = l2.size while i < n1 && j < n2 if l1[i].lo <= l2[j].lo append_or_merge(out, l1[i]) i += 1 else append_or_merge(out, l2[j]) j += 1 end end while i < n1 append_or_merge(out, l1[i]) i += 1 end while j < n2 append_or_merge(out, l2[j]) j += 1 end out end |
.subtract_disjoint_lists(l_a, l_b) ⇒ Object
§6.12 difference helper: subtract every interval in ‘l_b` from every interval in `l_a`. Two-pointer “in-flight current” sweep. `O(|l_a| + |l_b|)`.
216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 |
# File 'lib/rangeable/disjoint_set.rb', line 216 def self.subtract_disjoint_lists(l_a, l_b) out = [] i = 0 j = 0 n_a = l_a.size n_b = l_b.size current_lo = nil current_hi = nil while i < n_a if current_lo.nil? current_lo = l_a[i].lo current_hi = l_a[i].hi end # Skip l_b entries strictly before the current entry. j += 1 while j < n_b && l_b[j].hi < current_lo if j == n_b || l_b[j].lo > current_hi # No more l_b cuts on this entry; commit and advance i. out << Interval.new(current_lo, current_hi) i += 1 current_lo = nil current_hi = nil next end # l_b[j] overlaps [current_lo, current_hi]; cut. # Left residual: (current_lo, l_b[j].lo - 1). Underflow-safe: only # taken when l_b[j].lo > current_lo, so l_b[j].lo > Int.min. out << Interval.new(current_lo, l_b[j].lo - 1) if l_b[j].lo > current_lo if l_b[j].hi < current_hi # Right residual remains as the new current; advance j. # Overflow-safe: l_b[j].hi < current_hi ≤ Int.max. current_lo = l_b[j].hi + 1 j += 1 else # l_b[j] swallows the rest of the current entry; advance i. i += 1 current_lo = nil current_hi = nil end end out end |
Instance Method Details
#each(&block) ⇒ Object
36 37 38 |
# File 'lib/rangeable/disjoint_set.rb', line 36 def each(&block) @entries.each(&block) end |
#empty? ⇒ Boolean
40 41 42 |
# File 'lib/rangeable/disjoint_set.rb', line 40 def empty? @entries.empty? end |
#entries ⇒ Object
Read-only access to the underlying Array<Interval>. Returned reference is the live storage — callers that mutate it violate (I1). Used by the set-op two-pointer sweeps in ‘Rangeable` to avoid an Array allocation on every union/intersect/difference call.
52 53 54 |
# File 'lib/rangeable/disjoint_set.rb', line 52 def entries @entries end |
#insert(lo, hi) ⇒ Object
Insert ‘[lo, hi]` into the set, performing union-with-merge per RFC §6.1. Returns `MUTATED` if the canonical state changed (caller should bump version), `IDEMPOTENT` if the insert was absorbed by an existing entry (caller MUST NOT bump version, per Test #21 and Lemma 6.5.B).
60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 |
# File 'lib/rangeable/disjoint_set.rb', line 60 def insert(lo, hi) raise ArgumentError, "lo (#{lo}) > hi (#{hi})" if lo > hi # Step 4 of §6.1: bsearch for the leftmost touch candidate. # Predicate: `iv.hi + 1 >= lo`. We use `iv.hi + 1` (not `lo - 1`) to # avoid Integer underflow at lo == Int.min boundaries (§4.7 C5). i0 = @entries.bsearch_index { |iv| iv.hi + 1 >= lo } || @entries.size # Step 5: collect contiguous touch entries while `entries[i].lo <= hi + 1`. to_merge_end = i0 while to_merge_end < @entries.size && @entries[to_merge_end].lo <= hi + 1 to_merge_end += 1 end merge_count = to_merge_end - i0 # Step 6: containment idempotent fast-path. If we touch exactly one # existing entry that fully covers [lo, hi], this insert is a no-op. # MUST NOT mutate, MUST NOT bump version. if merge_count == 1 existing = @entries[i0] if existing.lo <= lo && hi <= existing.hi return IDEMPOTENT end end # Step 7: real mutation path. Compute merged bounds, splice in. new_lo = lo new_hi = hi if merge_count.positive? first = @entries[i0] last = @entries[to_merge_end - 1] new_lo = first.lo if first.lo < new_lo new_hi = last.hi if last.hi > new_hi end merged = Interval.new(new_lo, new_hi) # `slice!` removes the touched range; then we splice the merged # interval at i0. Equivalent to the §6.1 reference pseudocode but # avoids repeated O(m_e) shifts that delete_at would cause in a loop. @entries.slice!(i0, merge_count) if merge_count.positive? @entries.insert(i0, merged) MUTATED end |
#remove_subrange(start_, end_) ⇒ Object
Subtract ‘[start_, end_]` from the set (§6.6 sweep+splice). Returns `MUTATED` if any entry was changed; `IDEMPOTENT` if no overlap. Caller is responsible for the eager-prune (§4.10 N1) check via `empty?`.
106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 |
# File 'lib/rangeable/disjoint_set.rb', line 106 def remove_subrange(start_, end_) raise ArgumentError, "lo (#{start_}) > hi (#{end_})" if start_ > end_ # Step 3 of §6.6: leftmost entry where `iv.hi >= start_`. Strict # overlap; adjacency does NOT subtract anything. i = @entries.bsearch_index { |iv| iv.hi >= start_ } || @entries.size # Step 4: quick-exit. Either no entry crosses or past the cut window. return IDEMPOTENT if i == @entries.size || @entries[i].lo > end_ # Step 5: sweep all overlapping entries; produce 0..2 residuals each. to_replace_start = i replacements = [] while i < @entries.size && @entries[i].lo <= end_ iv = @entries[i] # Left residual exists iff iv.lo < start_; underflow-safe since # start_ > iv.lo >= Int.min ⇒ start_ - 1 well-defined (§6.6). replacements << Interval.new(iv.lo, start_ - 1) if iv.lo < start_ # Right residual exists iff end_ < iv.hi; overflow-safe since # end_ < iv.hi <= Int.max ⇒ end_ + 1 well-defined. replacements << Interval.new(end_ + 1, iv.hi) if end_ < iv.hi i += 1 end to_replace_end = i # Step 6/7: splice. `slice!` then `insert` mirrors the reference # `replace_subrange` and avoids quadratic shifts. @entries.slice!(to_replace_start, to_replace_end - to_replace_start) @entries.insert(to_replace_start, *replacements) unless replacements.empty? MUTATED end |
#replace_entries(new_entries) ⇒ Object
Replace the live entries with the given (I1)-canonical list. Used by set-op pipelines that compute the canonical result list externally (e.g. ‘merge_disjoint_lists`) and want to drop it in without re-running `insert`. Caller MUST guarantee the list satisfies (I1.1)–(I1.3); we do not re-verify (this is the internal hot path).
143 144 145 |
# File 'lib/rangeable/disjoint_set.rb', line 143 def replace_entries(new_entries) @entries = new_entries end |
#size ⇒ Object
44 45 46 |
# File 'lib/rangeable/disjoint_set.rb', line 44 def size @entries.size end |
#to_pairs ⇒ Object
Returns a frozen snapshot of the merged intervals as ‘[[lo, hi], …]`. The list satisfies RFC §5.1 (I1).
32 33 34 |
# File 'lib/rangeable/disjoint_set.rb', line 32 def to_pairs @entries.map(&:to_a) end |