Class: Woods::Chunking::Chunk

Inherits:
Object
Defined in:
lib/woods/chunking/chunk.rb

Overview

A single semantic chunk extracted from an ExtractedUnit.

Chunks represent meaningful subsections of a code unit — associations, callbacks, validations, individual actions, etc. Each chunk is independently embeddable and retrievable, with a back-reference to its parent unit.

Examples:

chunk = Chunk.new(
  content: "has_many :posts\nhas_many :comments",
  chunk_type: :associations,
  parent_identifier: "User",
  parent_type: :model
)
chunk.token_count  # => 9
chunk.identifier   # => "User#associations"

Constructor Details

#initialize(content:, chunk_type:, parent_identifier:, parent_type:, metadata: {}) ⇒ Chunk

Returns a new instance of Chunk.

Parameters:

  • content (String)

    The chunk’s source code or text

  • chunk_type (Symbol)

    Semantic type (:summary, :associations, :callbacks, etc.)

  • parent_identifier (String)

    Identifier of the parent ExtractedUnit

  • parent_type (Symbol)

    Type of the parent unit (:model, :controller, etc.)

  • metadata (Hash) (defaults to: {})

    Optional chunk-specific metadata



# File 'lib/woods/chunking/chunk.rb', line 31

def initialize(content:, chunk_type:, parent_identifier:, parent_type:, metadata: {})
  @content = content
  @chunk_type = chunk_type
  @parent_identifier = parent_identifier
  @parent_type = parent_type
  @metadata = metadata
end

Instance Attribute Details

#chunk_type ⇒ Object (readonly)

Returns the value of attribute chunk_type.



# File 'lib/woods/chunking/chunk.rb', line 24

def chunk_type
  @chunk_type
end

#content ⇒ Object (readonly)

Returns the value of attribute content.



# File 'lib/woods/chunking/chunk.rb', line 24

def content
  @content
end

#metadata ⇒ Object (readonly)

Returns the value of attribute metadata.



# File 'lib/woods/chunking/chunk.rb', line 24

def metadata
  @metadata
end

#parent_identifier ⇒ Object (readonly)

Returns the value of attribute parent_identifier.



# File 'lib/woods/chunking/chunk.rb', line 24

def parent_identifier
  @parent_identifier
end

#parent_type ⇒ Object (readonly)

Returns the value of attribute parent_type.



# File 'lib/woods/chunking/chunk.rb', line 24

def parent_type
  @parent_type
end

Instance Method Details

#content_hash ⇒ String

SHA-256 hex digest of the content, memoized, for change detection.

Returns:

  • (String)


# File 'lib/woods/chunking/chunk.rb', line 49

def content_hash
  @content_hash ||= Digest::SHA256.hexdigest(content)
end
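
As a rough illustration of how a content digest supports change detection, the sketch below compares a stored digest against the digest of the current content. The `changed?` helper and the `stored_hash` variable are hypothetical and not part of the Woods API; only the `Digest::SHA256.hexdigest` call is taken from the method above.

```ruby
require "digest"

# Hypothetical helper (not part of Woods): true when the content
# no longer matches a previously stored digest.
def changed?(stored_hash, content)
  stored_hash != Digest::SHA256.hexdigest(content)
end

old_content = "has_many :posts"
stored_hash = Digest::SHA256.hexdigest(old_content)

changed?(stored_hash, old_content)                        # identical content
changed?(stored_hash, "has_many :posts\nhas_many :tags")  # edited content
```

Under this assumption, a caller could skip re-embedding any chunk whose digest matches the stored one.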

#empty? ⇒ Boolean

Whether the chunk has no meaningful content.

Returns:

  • (Boolean)


# File 'lib/woods/chunking/chunk.rb', line 63

def empty?
  content.nil? || content.strip.empty?
end
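
One plausible use of this predicate is dropping blank chunks before embedding. The sketch below uses a stand-in `FakeChunk` struct (hypothetical, not the real `Chunk` class) that copies the `empty?` logic shown above.

```ruby
# Stand-in with the same empty? logic as above; not the real Chunk class.
FakeChunk = Struct.new(:content) do
  def empty?
    content.nil? || content.strip.empty?
  end
end

chunks = [
  FakeChunk.new("validates :email, presence: true"),
  FakeChunk.new("  \n\t"), # whitespace only
  FakeChunk.new(nil)       # missing content
]

# Keep only chunks that carry meaningful content.
kept = chunks.reject(&:empty?)
```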

#identifier ⇒ String

Unique identifier combining parent and chunk type.

Returns:

  • (String)


# File 'lib/woods/chunking/chunk.rb', line 56

def identifier
  "#{parent_identifier}##{chunk_type}"
end

#to_h ⇒ Hash

Serialize to hash for JSON output.

Returns:

  • (Hash)


# File 'lib/woods/chunking/chunk.rb', line 70

def to_h
  {
    content: content,
    chunk_type: chunk_type,
    parent_identifier: parent_identifier,
    parent_type: parent_type,
    identifier: identifier,
    token_count: token_count,
    content_hash: content_hash,
    metadata: metadata
  }
end
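
To show what the JSON round trip does to this hash, the sketch below builds a hash by hand with the same keys as `to_h` and passes it through Ruby's standard `json` library. The sample values are illustrative assumptions rather than output from a real `Chunk`.

```ruby
require "json"
require "digest"

content = "has_many :posts"

# Hand-built hash shaped like the one to_h returns; values are illustrative.
chunk_hash = {
  content: content,
  chunk_type: :associations,
  parent_identifier: "User",
  parent_type: :model,
  identifier: "User#associations",
  token_count: (content.length / 4.0).ceil,
  content_hash: Digest::SHA256.hexdigest(content),
  metadata: {}
}

json = JSON.generate(chunk_hash)
restored = JSON.parse(json)

restored["chunk_type"] # symbol values come back as strings
restored.keys.first    # keys come back as strings too
```

Because symbols do not survive the round trip, code that reads the JSON back would need to convert `chunk_type` and `parent_type` to symbols itself if it depends on them.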

#token_count ⇒ Integer

Estimated token count, using the project convention of roughly four characters per token, rounded up.

Returns:

  • (Integer)


# File 'lib/woods/chunking/chunk.rb', line 42

def token_count
  @token_count ||= (content.length / 4.0).ceil
end
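
The arithmetic behind the estimate can be checked in isolation. The `estimate_tokens` helper below is a hypothetical stand-in that repeats the formula from the method above without memoization; it is a heuristic, not a real tokenizer.

```ruby
# Hypothetical stand-in for the formula above: about four characters
# per token, rounded up. String#length counts characters, not bytes.
def estimate_tokens(content)
  (content.length / 4.0).ceil
end

estimate_tokens("has_many :posts\nhas_many :comments") # => 9 (34 characters)
estimate_tokens("abcde")                               # => 2
estimate_tokens("")                                    # => 0
```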