Module: Ace::Review::Atoms::DiffBoundaryFinder

Defined in:
lib/ace/review/atoms/diff_boundary_finder.rb

Overview

Pure functions for parsing unified diffs into file blocks.

Parses `diff --git` format diffs and extracts individual file blocks with their paths and content. Used by the chunked strategy to split large diffs at file boundaries.

Thread-safe: This module uses only class methods with no mutable instance state. All methods are pure functions that can be safely called concurrently from multiple threads.

Examples:

Basic usage

blocks = DiffBoundaryFinder.parse(diff_text)
#=> [
#     { path: "lib/foo.rb", content: "diff --git...", lines: 45, change_type: :modified },
#     { path: "test/foo_test.rb", content: "diff --git...", lines: 30, change_type: :modified }
#   ]

Constant Summary

DIFF_HEADER_PATTERN =

Pattern to match the start of a file diff block. Matches: diff --git a/path/to/file b/path/to/file

/^diff --git a\/(.+?) b\/(.+?)$/
NEW_FILE_PATTERN =

Pattern to detect new file mode

/^new file mode/
DELETED_FILE_PATTERN =

Pattern to detect deleted file mode

/^deleted file mode/
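
To illustrate what the three patterns match, the sketch below copies them into local constants and applies them to hand-written sample lines. The short names HEADER, NEW_FILE and DELETED_FILE and the sample paths are assumptions made for this example only; they are not part of the module.

```ruby
# Local copies of the documented patterns, so the snippet runs on its own.
HEADER       = /^diff --git a\/(.+?) b\/(.+?)$/
NEW_FILE     = /^new file mode/
DELETED_FILE = /^deleted file mode/

# Hand-written sample header for a renamed file (not from a real repository).
header_line = "diff --git a/lib/old_name.rb b/lib/new_name.rb\n"
match = HEADER.match(header_line)

source_path      = match[1] # the 'a/' side
destination_path = match[2] # the 'b/' side, which parse stores as :path

is_added   = NEW_FILE.match?("new file mode 100644\n")
is_deleted = DELETED_FILE.match?("deleted file mode 100644\n")

# An added content line starts with "+", so the anchored header pattern skips it.
content_is_header = HEADER.match?("+diff --git a/x b/x\n")

puts source_path, destination_path, is_added, is_deleted, content_is_header
```

Because every pattern is anchored with ^, diff content that merely quotes a header or mode line (prefixed with "+", "-" or a space) is not mistaken for a real one.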

Class Method Summary

Class Method Details

.file_count(diff_text) ⇒ Integer

Count the number of files in a diff

Examples:

DiffBoundaryFinder.file_count(diff)
#=> 5

Parameters:

  • diff_text (String, nil)

    The unified diff text

Returns:

  • (Integer)

    Number of files in the diff



# File 'lib/ace/review/atoms/diff_boundary_finder.rb', line 111

def self.file_count(diff_text)
  return 0 if diff_text.nil? || diff_text.empty?

  diff_text.scan(DIFF_HEADER_PATTERN).length
end
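
A minimal sketch of how the count behaves on a small input. The helper count_files is a hypothetical stand-in that mirrors the method body above so the snippet runs without the gem, and the sample diff is hand-written.

```ruby
# Local copy of DIFF_HEADER_PATTERN so the snippet is self-contained.
HEADER = /^diff --git a\/(.+?) b\/(.+?)$/

# Hypothetical stand-in that mirrors the documented file_count body.
def count_files(diff_text)
  return 0 if diff_text.nil? || diff_text.empty?

  diff_text.scan(HEADER).length
end

# Hand-written two-file diff; the last line is added content that quotes a header.
sample = <<~DIFF
  diff --git a/lib/foo.rb b/lib/foo.rb
  --- a/lib/foo.rb
  +++ b/lib/foo.rb
  @@ -1 +1 @@
  -old
  +new
  diff --git a/README.md b/README.md
  --- a/README.md
  +++ b/README.md
  @@ -1 +1,2 @@
   Title
  +diff --git a/fake b/fake
DIFF

puts count_files(sample) # two real headers; the quoted one is not counted
puts count_files(nil)
puts count_files("")
```

Counting with scan avoids building the full block hashes, which is presumably why the method does not go through parse.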

.file_paths(diff_text) ⇒ Array<String>

Parse and return just the file paths from a diff

Examples:

DiffBoundaryFinder.file_paths(diff)
#=> ["lib/foo.rb", "test/foo_test.rb"]

Parameters:

  • diff_text (String, nil)

    The unified diff text to parse

Returns:

  • (Array<String>)

    List of file paths in the diff



# File 'lib/ace/review/atoms/diff_boundary_finder.rb', line 99

def self.file_paths(diff_text)
  parse(diff_text).map { |block| block[:path] }
end

.group_by_directory(blocks) ⇒ Hash<String, Array<Hash>>

Group file blocks by directory

Examples:

DiffBoundaryFinder.group_by_directory(blocks)
#=> { "lib/atoms" => [...], "test/atoms" => [...] }

Parameters:

  • blocks (Array<Hash>)

    Array of file blocks from .parse

Returns:

  • (Hash<String, Array<Hash>>)

    Files grouped by directory



# File 'lib/ace/review/atoms/diff_boundary_finder.rb', line 125

def self.group_by_directory(blocks)
  blocks.group_by do |block|
    File.dirname(block[:path])
  end
end
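
The grouping can be tried on plain hashes without a real diff. In the sketch below the blocks are hand-written and carry only the keys needed for the illustration; the file names are assumptions.

```ruby
# Hand-written blocks shaped like parse output (content and lines omitted for brevity).
blocks = [
  { path: "lib/atoms/finder.rb",       change_type: :modified },
  { path: "lib/atoms/splitter.rb",     change_type: :added },
  { path: "test/atoms/finder_test.rb", change_type: :modified },
  { path: "README.md",                 change_type: :modified }
]

# Same grouping expression as the method body shown above.
grouped = blocks.group_by { |block| File.dirname(block[:path]) }

grouped.each do |dir, files|
  puts "#{dir}: #{files.map { |f| File.basename(f[:path]) }.join(', ')}"
end
```

Note that File.dirname returns "." for a path with no directory component, so top-level files such as README.md are grouped under the "." key.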

.parse(diff_text) ⇒ Array<Hash>

Parse a unified diff into individual file blocks

Examples:

DiffBoundaryFinder.parse(diff)
#=> [{ path: "lib/foo.rb", content: "diff --git...", lines: 45, change_type: :modified }]

Parameters:

  • diff_text (String, nil)

    The unified diff text to parse

Returns:

  • (Array<Hash>)

    Array of file blocks, each with:

    • :path [String] - File path (uses ‘b/’ side)

    • :content [String] - Full diff content for this file

    • :lines [Integer] - Number of lines in the diff block

    • :change_type [Symbol] - :added, :deleted, or :modified



# File 'lib/ace/review/atoms/diff_boundary_finder.rb', line 45

def self.parse(diff_text)
  return [] if diff_text.nil? || diff_text.empty?

  blocks = []
  current_block = nil
  current_lines = []

  diff_text.each_line do |line|
    if (match = DIFF_HEADER_PATTERN.match(line))
      # Save the previous block if exists
      if current_block
        current_block[:content] = current_lines.join
        current_block[:lines] = current_lines.length
        blocks << current_block
      end

      # Start a new block
      current_block = {
        path: match[2],  # Use the 'b/' side (destination path)
        content: "",
        lines: 0,
        change_type: :modified  # Default, may be updated below
      }
      current_lines = [line]
    elsif current_block
      current_lines << line

      # Detect change type from mode lines
      if NEW_FILE_PATTERN.match?(line)
        current_block[:change_type] = :added
      elsif DELETED_FILE_PATTERN.match?(line)
        current_block[:change_type] = :deleted
      end
    end
  end

  # Don't forget the last block
  if current_block
    current_block[:content] = current_lines.join
    current_block[:lines] = current_lines.length
    blocks << current_block
  end

  blocks
end
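
For readers who want to experiment with the block-splitting idea outside the gem, here is a compact sketch of the same boundary detection built on Enumerable#slice_before. It is not the module's implementation: sketch_parse and the short constant names are hypothetical, and the sample diff is hand-written (real git output would also contain index lines). One deliberate difference: if a block somehow contained both mode lines, this sketch prefers :deleted, whereas the loop above keeps whichever mode line appears last.

```ruby
# Local copies of the documented patterns.
HEADER       = /^diff --git a\/(.+?) b\/(.+?)$/
NEW_FILE     = /^new file mode/
DELETED_FILE = /^deleted file mode/

# Hypothetical alternative to parse: start a new slice at every header line,
# then drop any leading slice that does not begin with a header (preamble text).
def sketch_parse(diff_text)
  return [] if diff_text.nil? || diff_text.empty?

  slices = diff_text.each_line.slice_before { |line| HEADER.match?(line) }
  slices = slices.select { |lines| HEADER.match?(lines.first) }

  slices.map do |lines|
    change_type =
      if lines.any? { |l| DELETED_FILE.match?(l) }
        :deleted
      elsif lines.any? { |l| NEW_FILE.match?(l) }
        :added
      else
        :modified
      end

    {
      path: HEADER.match(lines.first)[2], # 'b/' side, as in parse
      content: lines.join,
      lines: lines.length,
      change_type: change_type
    }
  end
end

sample = <<~DIFF
  diff --git a/lib/new.rb b/lib/new.rb
  new file mode 100644
  --- /dev/null
  +++ b/lib/new.rb
  @@ -0,0 +1 @@
  +puts "hi"
  diff --git a/lib/old.rb b/lib/old.rb
  deleted file mode 100644
  --- a/lib/old.rb
  +++ /dev/null
  @@ -1 +0,0 @@
  -puts "bye"
DIFF

blocks = sketch_parse(sample)
blocks.each { |b| puts "#{b[:path]} #{b[:change_type]} #{b[:lines]}" }
```

Since each block keeps its lines verbatim, joining the :content values of all blocks reproduces the input from the first header onward.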