Class: Ace::Review::Molecules::Strategies::ChunkedStrategy

Inherits:
Object
Defined in:
lib/ace/review/molecules/strategies/chunked_strategy.rb

Overview

Chunked strategy - splits large diffs at file boundaries

This strategy parses diffs into file blocks and groups them into chunks that fit within the model’s context window. It never splits a file mid-diff, maintaining atomic file boundaries.

Features:

  • File-boundary chunking (never splits within a file)

  • Summary header listing the changed files (formatting varies with file count; see the SUMMARY_THRESHOLD_* constants)

  • Overflow handling for files larger than context limit

  • Metadata tracking for chunk index and totals
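
The file-boundary grouping described above can be sketched as a greedy packer. This is a minimal illustration and not the class's private implementation, which is not shown on this page: `FileBlock`, `estimate_tokens` (assuming roughly 4 characters per token), and `group_into_chunks` are hypothetical names. The overflow rule used here, where an oversized file becomes its own chunk, is also an assumption.

```ruby
# Hypothetical stand-in for a parsed file block: path plus raw diff text.
FileBlock = Struct.new(:path, :text)

# Rough token estimate (assumption: about 4 characters per token).
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Greedily pack whole file blocks into chunks of at most `budget` tokens.
# A block that alone exceeds the budget is emitted as its own chunk,
# so a file is never split mid-diff.
def group_into_chunks(blocks, budget)
  chunks = []
  current = []
  used = 0

  blocks.each do |block|
    cost = estimate_tokens(block.text)
    # Close the current chunk when adding this block would exceed the budget.
    if !current.empty? && used + cost > budget
      chunks << current
      current = []
      used = 0
    end
    current << block
    used += cost
  end

  chunks << current unless current.empty?
  chunks
end

blocks = [
  FileBlock.new("lib/a.rb", "x" * 400),    # about 100 tokens
  FileBlock.new("lib/b.rb", "x" * 400),    # about 100 tokens
  FileBlock.new("lib/big.rb", "x" * 2000), # about 500 tokens, over budget
  FileBlock.new("lib/c.rb", "x" * 200)     # about 50 tokens
]

chunks = group_into_chunks(blocks, 250)
chunks.each_with_index do |chunk, i|
  puts "chunk #{i}: #{chunk.map(&:path).join(', ')}"
end
```

With a budget of 250 tokens, the two small files share a chunk, the oversized file stands alone, and the last file starts a new chunk.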

Examples:

Basic usage

strategy = ChunkedStrategy.new(max_tokens_per_chunk: 100_000)
if strategy.can_handle?(subject, 128_000)
  units = strategy.prepare(subject, context)
  # units = [
  #   { content: "Summary...\n\ndiff...", metadata: { strategy: :chunked, chunk_index: 0, ... } },
  #   { content: "Summary...\n\ndiff...", metadata: { strategy: :chunked, chunk_index: 1, ... } }
  # ]
end

Constant Summary

DEFAULT_MAX_TOKENS = 100_000

    Default maximum tokens per chunk (leaving room for prompts/output)

SUMMARY_RESERVE_TOKENS = 2_000

    Reserve tokens for summary header

SUMMARY_THRESHOLD_FULL = 20
SUMMARY_THRESHOLD_GROUPED = 100

    File count thresholds for summary formatting
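
The page does not show how the two file-count thresholds are applied. The sketch below is one plausible reading, offered purely as an assumption: a full list up to the first threshold, a grouped list up to the second, and counts only beyond that. The method `summary_mode`, the mode names, and the inclusive comparisons are all hypothetical.

```ruby
SUMMARY_THRESHOLD_FULL = 20
SUMMARY_THRESHOLD_GROUPED = 100

# Hypothetical tier selection. The tiers and the inclusive boundaries are a
# guess at how the thresholds might be used, not documented behavior.
def summary_mode(file_count)
  if file_count <= SUMMARY_THRESHOLD_FULL
    :full          # list every changed file
  elsif file_count <= SUMMARY_THRESHOLD_GROUPED
    :grouped       # e.g. group files by directory
  else
    :counts_only   # only report totals
  end
end

[5, 20, 21, 100, 101].each do |count|
  puts "#{count} files -> #{summary_mode(count)}"
end
```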

Constructor Details

#initialize(config = {}) ⇒ ChunkedStrategy

Returns a new instance of ChunkedStrategy.

Parameters:

  • config (Hash) (defaults to: {})

    Strategy configuration

Options Hash (config):

  • :max_tokens_per_chunk (Integer)

    Maximum tokens per chunk

  • :include_change_summary (Boolean)

    Include file summary (default: true)



# File 'lib/ace/review/molecules/strategies/chunked_strategy.rb', line 45

def initialize(config = {})
  # Normalize keys to symbols for consistent access (supports YAML string keys)
  @config = normalize_config_keys(config)
  @max_tokens_per_chunk = @config[:max_tokens_per_chunk] || DEFAULT_MAX_TOKENS
  @include_change_summary = @config.fetch(:include_change_summary, true)
end
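
The constructor relies on `normalize_config_keys`, whose body is not shown on this page. A minimal sketch of what such a normalizer might look like follows, using the hypothetical name `normalize_keys`. It also illustrates why `fetch` is used for the boolean option while `||` is enough for the integer one.

```ruby
# Hypothetical normalizer: converts string keys (as loaded from YAML)
# to symbols. The real normalize_config_keys may differ.
def normalize_keys(config)
  (config || {}).transform_keys(&:to_sym)
end

yaml_style = { "max_tokens_per_chunk" => 50_000, "include_change_summary" => false }
config = normalize_keys(yaml_style)

# `||` is fine here: a missing (nil) value falls back to the default.
max_tokens = config[:max_tokens_per_chunk] || 100_000

# fetch keeps an explicit false; `config[:include_change_summary] || true`
# would silently turn false back into true.
include_summary = config.fetch(:include_change_summary, true)

puts max_tokens       # 50000
puts include_summary  # false
```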

Instance Method Details

#can_handle?(subject, model_context_limit) ⇒ Boolean

Check if this strategy can handle the given subject

Returns true if the subject is non-empty, model_context_limit is a positive number, and the subject contains at least one parseable diff file block. Subject size is not a limiting factor, because larger diffs are split into multiple review units.

Examples:

strategy.can_handle?("diff --git...", 128_000)  #=> true
strategy.can_handle?("not a diff", 128_000)     #=> false

Parameters:

  • subject (String)

    The review subject text (expected to be a diff)

  • model_context_limit (Integer)

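    Model’s token limit. Must be positive, or the method returns false; its magnitude does not otherwise affect the result.

Returns:

  • (Boolean)

    true if the subject contains at least one diff file block and the limit is positive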



# File 'lib/ace/review/molecules/strategies/chunked_strategy.rb', line 65

def can_handle?(subject, model_context_limit)
  return false if subject.nil? || subject.empty?
  return false if model_context_limit.nil? || model_context_limit <= 0

  # Check if subject looks like a unified diff
  Atoms::DiffBoundaryFinder.file_count(subject) > 0
end
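
`Atoms::DiffBoundaryFinder.file_count` is not documented on this page. As a rough sketch of the same check, the example below counts `diff --git` headers at the start of a line. The helper names `diff_file_count` and `looks_like_diff?` are hypothetical, and the real finder may recognize other diff formats.

```ruby
# Hypothetical file counter: counts "diff --git " headers at line starts.
def diff_file_count(text)
  text.to_s.scan(/^diff --git /).length
end

# Mirrors the guard clauses of can_handle? with the hypothetical counter.
def looks_like_diff?(subject, model_context_limit)
  return false if subject.nil? || subject.empty?
  return false if model_context_limit.nil? || model_context_limit <= 0

  diff_file_count(subject) > 0
end

sample = <<~DIFF
  diff --git a/lib/foo.rb b/lib/foo.rb
  --- a/lib/foo.rb
  +++ b/lib/foo.rb
  @@ -1 +1 @@
  -old
  +new
DIFF

puts looks_like_diff?(sample, 128_000)        # true
puts looks_like_diff?("not a diff", 128_000)  # false
puts looks_like_diff?(sample, 0)              # false
```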

#prepare(subject, context = {}) ⇒ Array<Hash>

Prepare the subject for review by splitting into chunks

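Parses the diff into file blocks and groups them into chunks that fit within the configured token limit. The per-chunk budget is max_tokens_per_chunk minus the estimated summary size and SUMMARY_RESERVE_TOKENS, with a floor of 1,000 tokens. A nil or empty subject, or one with no parseable file blocks, is not chunked; it is routed to a single-chunk fallback (single_chunk_empty or single_chunk_passthrough).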

Examples:

Return format

[{
  content: "## Changes Summary\n...\n\ndiff --git...",
  metadata: {
    strategy: :chunked,
    chunk_index: 0,
    total_chunks: 2,
    files: ["lib/foo.rb", "lib/bar.rb"]
  }
}, ...]

Parameters:

  • subject (String)

    The review subject text (diff)

  • context (Hash) (defaults to: {})

    Review context (the method body shown for #prepare does not read these options)

Options Hash (context):

  • :system_prompt (String)

    Base system prompt for the reviewer

  • :user_prompt (String)

    User instructions or focus areas

  • :model (String)

    Model identifier

  • :model_context_limit (Integer)

    Token limit for the model

  • :preset (Hash)

    Full preset configuration

  • :file_list (Array<String>)

    List of files being reviewed

Returns:

  • (Array<Hash>)

    Array of review units, each with :content and :metadata
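
As a usage sketch, a caller might walk the returned units and label each one from its metadata. The units below are literal stand-ins in the documented return shape, with abbreviated content. The label format is illustrative only.

```ruby
# Literal units in the documented return shape (contents abbreviated).
units = [
  { content: "## Changes Summary\n...",
    metadata: { strategy: :chunked, chunk_index: 0, total_chunks: 2,
                files: ["lib/foo.rb", "lib/bar.rb"] } },
  { content: "## Changes Summary\n...",
    metadata: { strategy: :chunked, chunk_index: 1, total_chunks: 2,
                files: ["lib/baz.rb"] } }
]

# chunk_index is zero-based, so add 1 for a human-readable label.
labels = units.map do |unit|
  meta = unit[:metadata]
  "chunk #{meta[:chunk_index] + 1}/#{meta[:total_chunks]}: #{meta[:files].join(', ')}"
end

puts labels
```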



# File 'lib/ace/review/molecules/strategies/chunked_strategy.rb', line 98

def prepare(subject, context = {})
  return single_chunk_empty(subject) if subject.nil? || subject.empty?

  # Parse the diff into file blocks
  blocks = Atoms::DiffBoundaryFinder.parse(subject)
  return single_chunk_passthrough(subject) if blocks.empty?

  # Build the file summary once (used in all chunks)
  summary = @include_change_summary ? build_summary(blocks) : ""

  # Calculate available tokens per chunk (minus summary overhead)
  summary_tokens = Atoms::TokenEstimator.estimate(summary)
  available_tokens = @max_tokens_per_chunk - summary_tokens - SUMMARY_RESERVE_TOKENS

  # Guard against non-positive available tokens
  # If summary exceeds budget, use minimum of 1000 tokens to ensure some content
  minimum_available = 1_000
  available_tokens = [available_tokens, minimum_available].max

  # Group blocks into chunks
  chunks = build_chunks(blocks, available_tokens)

  # Format each chunk with summary and metadata
  total_chunks = chunks.length
  chunks.each_with_index.map do |chunk_blocks, index|
    build_review_unit(chunk_blocks, summary, index, total_chunks)
  end
end
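
The token budget arithmetic in the listing above can be checked in isolation. The helper `available_tokens` is a hypothetical extraction of those two lines, and the summary sizes passed in are made-up inputs.

```ruby
MAX_TOKENS_PER_CHUNK = 100_000  # same value as DEFAULT_MAX_TOKENS
SUMMARY_RESERVE_TOKENS = 2_000
MINIMUM_AVAILABLE = 1_000       # same floor as minimum_available in prepare

# Budget left for diff content after the summary and its reserve,
# never dropping below the floor.
def available_tokens(max_tokens, summary_tokens)
  [max_tokens - summary_tokens - SUMMARY_RESERVE_TOKENS, MINIMUM_AVAILABLE].max
end

puts available_tokens(MAX_TOKENS_PER_CHUNK, 1_500)  # 96500
puts available_tokens(3_000, 2_500)                 # 1000 (raw value is -1500)
```

In the second case the floor applies, so the summary plus 1,000 tokens of content can exceed the configured 3,000-token maximum. A very small max_tokens_per_chunk therefore acts as a soft limit rather than a hard one.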

#strategy_name ⇒ Symbol

Strategy name for logging and debugging

Returns:

  • (Symbol)

    :chunked



# File 'lib/ace/review/molecules/strategies/chunked_strategy.rb', line 130

def strategy_name
  :chunked
end