Class: Canon::Diff::SourceLocator

Inherits:
Object
Defined in:
lib/canon/diff/source_locator.rb

Overview

Locates serialized content within source text and maps character offsets to line/column positions. Used during DiffNode enrichment (Phase 1).

The SourceLocator uses String#index on the full source text (not LCS on lines) to find where a DiffNode’s serialized content appears. It then maps the character offset to a line number and column position using a pre-built line offset map.

Examples:

line_map = SourceLocator.build_line_map("line1\nline2\nline3")
SourceLocator.locate("line2", "line1\nline2\nline3", line_map)
# => { char_offset: 6, line_number: 1, col: 0 }
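The choice of a character-level search over line comparison matters when the fragment starts mid-line. The following is a minimal sketch using only core String methods, not the SourceLocator API itself; the sample text and variable names are made up for illustration.

```ruby
# Illustrative sample data; not taken from the library.
text = "<root>\n  <item>a</item>\n</root>"
fragment = "<item>a</item>"

# Whole-line comparison misses the fragment because of indentation and the newline.
found_as_line = text.lines.include?(fragment)

# A character search finds it regardless of where it sits on the line.
char_offset = text.index(fragment)

# Derive the 0-indexed line and column from the text before the match.
prefix = text[0...char_offset]
line_number = prefix.count("\n")
col = prefix.length - ((prefix.rindex("\n") || -1) + 1)

puts({ char_offset: char_offset, line_number: line_number, col: col }.inspect)
```

Counting newlines in the prefix rescans the text on every lookup, which is presumably why the class pre-builds a line offset map instead.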

Class Method Details

.build_line_map(text) ⇒ Array<Hash>

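Build a line offset map from source text. Each entry records the start and end character offset of a line. The end_offset is exclusive and counts the line's trailing newline, so it equals the next line's start_offset.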

Parameters:

  • text (String)

    the full source text

Returns:

  • (Array<Hash>)

    array of { start_offset:, end_offset: } hashes, one per line (0-indexed); empty if text is nil or empty



# File 'lib/canon/diff/source_locator.rb', line 24

def self.build_line_map(text)
  return [] if text.nil? || text.empty?

  map = []
  offset = 0
  text.each_line do |line|
    line_end = offset + line.length
    map << { start_offset: offset, end_offset: line_end }
    offset = line_end
  end
  map
end
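To make the shape of the map concrete, here is a small standalone sketch that applies the same each_line accumulation to the three-line string from the class example. It does not call the library; the local variable names are illustrative.

```ruby
text = "line1\nline2\nline3"

# Same accumulation as build_line_map: each line's length includes its "\n".
offset = 0
line_map = text.each_line.map do |line|
  entry = { start_offset: offset, end_offset: offset + line.length }
  offset = entry[:end_offset]
  entry
end

line_map.each_with_index do |entry, idx|
  # end_offset is exclusive, so an exclusive range recovers the line.
  puts "#{idx}: #{entry.inspect} #{text[entry[:start_offset]...entry[:end_offset]].inspect}"
end
```

The entries come out as 0...6, 6...12 and 12...17: the first two lines are six characters long including the newline, and the last line has no newline.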

.locate(substring, text, line_map, start_from: nil) ⇒ Hash?

Locate the first occurrence of a substring within source text (at or after start_from, when given) and return its position.

Parameters:

  • substring (String)

    the content to find (e.g., serialized_before)

  • text (String)

    the full source text

  • line_map (Array<Hash>)

    pre-built line offset map

  • start_from (Integer, nil) (defaults to: nil)

    character offset to start searching from

Returns:

  • (Hash, nil)

    { char_offset:, line_number:, col: } (line_number and col are 0-indexed), or nil if not found



# File 'lib/canon/diff/source_locator.rb', line 44

def self.locate(substring, text, line_map, start_from: nil)
  return nil if substring.nil? || substring.empty?
  return nil if text.nil? || line_map.empty?

  char_offset = if start_from
                  text.index(substring, start_from)
                else
                  text.index(substring)
                end
  return nil if char_offset.nil?

  line_idx = find_line_for_offset(char_offset, line_map)
  return nil if line_idx.nil?

  col = char_offset - line_map[line_idx][:start_offset]

  { char_offset: char_offset, line_number: line_idx, col: col }
end
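The listing above relies on find_line_for_offset, whose source is not shown on this page. The sketch below is a hypothetical stand-in, assuming a binary search over the exclusive end offsets; the real helper may work differently.

```ruby
# Hypothetical stand-in for the helper used by locate; an assumption, not the library's code.
def find_line_for_offset(offset, line_map)
  return nil if offset.negative?

  # end_offset values increase monotonically, so find-minimum bsearch applies.
  line_map.bsearch_index { |entry| offset < entry[:end_offset] }
end

text = "line1\nline2\nline3"
line_map = [
  { start_offset: 0, end_offset: 6 },
  { start_offset: 6, end_offset: 12 },
  { start_offset: 12, end_offset: 17 }
]

char_offset = text.index("ne2")
line_idx = find_line_for_offset(char_offset, line_map)
col = char_offset - line_map[line_idx][:start_offset]

puts({ char_offset: char_offset, line_number: line_idx, col: col }.inspect)
```

With this stand-in, an offset at or past the end of the text yields nil, which matches the nil check that locate performs after the lookup.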

.locate_all(substring, text, line_map) ⇒ Array<Hash>

Locate all occurrences of a substring within source text, including overlapping ones (the search resumes one character after each match).

Parameters:

  • substring (String)

    the content to find

  • text (String)

    the full source text

  • line_map (Array<Hash>)

    pre-built line offset map

Returns:

  • (Array<Hash>)

    array of { char_offset:, line_number:, col: } hashes



# File 'lib/canon/diff/source_locator.rb', line 69

def self.locate_all(substring, text, line_map)
  return [] if substring.nil? || substring.empty?
  return [] if text.nil? || line_map.empty?

  results = []
  offset = 0

  while (pos = text.index(substring, offset))
    line_idx = find_line_for_offset(pos, line_map)
    break if line_idx.nil?

    col = pos - line_map[line_idx][:start_offset]
    results << { char_offset: pos, line_number: line_idx, col: col }
    offset = pos + 1
  end

  results
end
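Because the loop advances by one character rather than by the length of the match, overlapping occurrences are reported. The standalone sketch below isolates that scanning behavior with plain String#index; the sample text is illustrative and the non-overlapping variant is shown only for contrast.

```ruby
text = "aaa\naa"
needle = "aa"

# Same stepping as locate_all: resume one character after each match.
overlapping = []
offset = 0
while (pos = text.index(needle, offset))
  overlapping << pos
  offset = pos + 1
end

# Contrast: skipping the whole match would drop the overlapping hit at 1.
disjoint = []
offset = 0
while (pos = text.index(needle, offset))
  disjoint << pos
  offset = pos + needle.length
end

puts "overlapping: #{overlapping.inspect}, disjoint: #{disjoint.inspect}"
```

Callers that need non-overlapping matches would have to filter the results themselves.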