Class: Woods::Chunking::Chunk
- Inherits:
- Object (Object → Woods::Chunking::Chunk)
- Defined in:
- lib/woods/chunking/chunk.rb
Overview
A single semantic chunk extracted from an ExtractedUnit.
Chunks represent meaningful subsections of a code unit — associations, callbacks, validations, individual actions, etc. Each chunk is independently embeddable and retrievable, with a back-reference to its parent unit.
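As a rough illustration of the description above, the sketch below builds one chunk and reads back its parent reference. The class body is a stand-in condensed from the source listing on this page so the example runs without the library loaded. The values 'User', :model and :associations are hypothetical and not taken from the library.

```ruby
# Stand-in condensed from the source listing on this page.
# In a real project, load the library instead of redefining the class.
module Woods
  module Chunking
    class Chunk
      attr_reader :content, :chunk_type, :parent_identifier, :parent_type, :metadata

      def initialize(content:, chunk_type:, parent_identifier:, parent_type:, metadata: {})
        @content = content
        @chunk_type = chunk_type
        @parent_identifier = parent_identifier
        @parent_type = parent_type
        @metadata = metadata
      end

      def identifier
        "#{parent_identifier}##{chunk_type}"
      end

      def empty?
        content.nil? || content.strip.empty?
      end
    end
  end
end

# Hypothetical values, chosen only for illustration.
chunk = Woods::Chunking::Chunk.new(
  content: "has_many :posts\nbelongs_to :account",
  chunk_type: :associations,
  parent_identifier: 'User',
  parent_type: :model
)

puts chunk.identifier        # prints "User#associations"
puts chunk.parent_identifier # prints "User"
puts chunk.empty?            # prints "false"
```

The metadata keyword is optional and defaults to an empty hash, so only the four other keywords are required.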
Instance Attribute Summary
- #chunk_type ⇒ Object (readonly): Returns the value of attribute chunk_type.
- #content ⇒ Object (readonly): Returns the value of attribute content.
- #metadata ⇒ Object (readonly): Returns the value of attribute metadata.
- #parent_identifier ⇒ Object (readonly): Returns the value of attribute parent_identifier.
- #parent_type ⇒ Object (readonly): Returns the value of attribute parent_type.
Instance Method Summary
- #content_hash ⇒ String: SHA256 hash of content for change detection.
- #empty? ⇒ Boolean: Whether the chunk has no meaningful content.
- #identifier ⇒ String: Unique identifier combining parent and chunk type.
- #initialize(content:, chunk_type:, parent_identifier:, parent_type:, metadata: {}) ⇒ Chunk (constructor): A new instance of Chunk.
- #to_h ⇒ Hash: Serialize to hash for JSON output.
- #token_count ⇒ Integer: Estimated token count using project convention.
Constructor Details
#initialize(content:, chunk_type:, parent_identifier:, parent_type:, metadata: {}) ⇒ Chunk
Returns a new instance of Chunk.
# File 'lib/woods/chunking/chunk.rb', line 31
def initialize(content:, chunk_type:, parent_identifier:, parent_type:, metadata: {})
  @content = content
  @chunk_type = chunk_type
  @parent_identifier = parent_identifier
  @parent_type = parent_type
  @metadata = metadata
end
Instance Attribute Details
#chunk_type ⇒ Object (readonly)
Returns the value of attribute chunk_type.
# File 'lib/woods/chunking/chunk.rb', line 24
def chunk_type
  @chunk_type
end
#content ⇒ Object (readonly)
Returns the value of attribute content.
# File 'lib/woods/chunking/chunk.rb', line 24
def content
  @content
end
#metadata ⇒ Object (readonly)
Returns the value of attribute metadata.
# File 'lib/woods/chunking/chunk.rb', line 24
def metadata
  @metadata
end
#parent_identifier ⇒ Object (readonly)
Returns the value of attribute parent_identifier.
# File 'lib/woods/chunking/chunk.rb', line 24
def parent_identifier
  @parent_identifier
end
#parent_type ⇒ Object (readonly)
Returns the value of attribute parent_type.
# File 'lib/woods/chunking/chunk.rb', line 24
def parent_type
  @parent_type
end
Instance Method Details
#content_hash ⇒ String
SHA256 hex digest of content, memoized after the first call, for change detection.
# File 'lib/woods/chunking/chunk.rb', line 49
def content_hash
  @content_hash ||= Digest::SHA256.hexdigest(content)
end
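One way the hash might be used for change detection is to compare a stored digest against the digest of freshly extracted content. The sketch below calls Digest::SHA256 directly. The helper name changed? and the sample strings are hypothetical and not part of the library.

```ruby
require 'digest'

# Hypothetical helper: true when the content no longer matches the stored digest.
def changed?(stored_hash, content)
  stored_hash != Digest::SHA256.hexdigest(content)
end

original = 'validates :email, presence: true'
stored = Digest::SHA256.hexdigest(original)

puts stored.length                          # prints "64" (hex characters)
puts changed?(stored, original)             # prints "false"
puts changed?(stored, "#{original}, uniqueness: true") # prints "true"
```

Because #content_hash memoizes its result with ||=, mutating the content string in place after the first call would leave the cached digest stale. Treating content as immutable avoids that.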
#empty? ⇒ Boolean
Whether the chunk has no meaningful content.
# File 'lib/woods/chunking/chunk.rb', line 63
def empty?
  content.nil? || content.strip.empty?
end
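The check treats nil, the empty string, and whitespace-only content alike. A small sketch of the same predicate applied to raw strings follows. The lambda name blank_content and the sample inputs are assumptions for illustration.

```ruby
# Same predicate as #empty?, written as a lambda over raw content.
blank_content = ->(content) { content.nil? || content.strip.empty? }

# Illustrative inputs: one real snippet and three "empty" variants.
candidates = ['has_many :posts', "   \n", '', nil]

kept = candidates.reject(&blank_content)
puts kept.inspect # prints ["has_many :posts"]
```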
#identifier ⇒ String
Unique identifier combining parent and chunk type, formatted as "parent_identifier#chunk_type".
# File 'lib/woods/chunking/chunk.rb', line 56
def identifier
  "#{parent_identifier}##{chunk_type}"
end
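Since the identifier is built from only two fields, two chunks sharing both parent and chunk type would produce the same string. Whether the library guarantees distinct chunk types per parent is not shown on this page. The sketch below uses a hypothetical helper and made-up chunk types to show how such collisions could be detected.

```ruby
# Mirrors the interpolation used by #identifier.
def chunk_identifier(parent_identifier, chunk_type)
  "#{parent_identifier}##{chunk_type}"
end

# Hypothetical chunk types; the last entry is a deliberate duplicate.
ids = [
  chunk_identifier('UsersController', :action_index),
  chunk_identifier('UsersController', :action_show),
  chunk_identifier('UsersController', :action_show)
]

# Count occurrences and keep identifiers seen more than once.
duplicates = ids.tally.select { |_id, count| count > 1 }.keys
puts duplicates.inspect # prints ["UsersController#action_show"]
```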
#to_h ⇒ Hash
Serialize to hash for JSON output.
# File 'lib/woods/chunking/chunk.rb', line 70
def to_h
  {
    content: content,
    chunk_type: chunk_type,
    parent_identifier: parent_identifier,
    parent_type: parent_type,
    identifier: identifier,
    token_count: token_count,
    content_hash: content_hash,
    metadata: metadata
  }
end
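To show what the JSON output might look like, the sketch below builds a hash with the same keys as #to_h and round-trips it through the standard json library. All values are illustrative. The point is that symbol keys and symbol values come back as strings after parsing.

```ruby
require 'json'
require 'digest'

content = 'validates :email, presence: true'

# Hash shaped like the return value of #to_h; values are illustrative only.
chunk_hash = {
  content: content,
  chunk_type: :validations,
  parent_identifier: 'User',
  parent_type: :model,
  identifier: 'User#validations',
  token_count: (content.length / 4.0).ceil,
  content_hash: Digest::SHA256.hexdigest(content),
  metadata: {}
}

json = JSON.generate(chunk_hash)
restored = JSON.parse(json)

puts restored['chunk_type']  # prints "validations" (a String, no longer a Symbol)
puts restored['token_count'] # prints "8"
```

Passing symbolize_names: true to JSON.parse restores symbol keys, though symbol values such as chunk_type still come back as strings.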
#token_count ⇒ Integer
Estimated token count using the project convention of about four characters per token, rounded up.
# File 'lib/woods/chunking/chunk.rb', line 42
def token_count
  @token_count ||= (content.length / 4.0).ceil
end
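The estimate is plain arithmetic: character count divided by 4.0, rounded up. The sketch below reproduces it as a standalone method. Unlike #empty?, the listed #token_count does not guard against nil content, so the nil-safe variant is a hypothetical addition and not library behavior.

```ruby
# Same arithmetic as #token_count, applied to a raw string.
def estimate_tokens(content)
  (content.length / 4.0).ceil
end

puts estimate_tokens('has_many :posts') # 15 chars / 4.0 = 3.75, prints "4"
puts estimate_tokens('')                # prints "0"

# Hypothetical nil-safe variant: nil.to_s is "", so nil counts as zero tokens.
def safe_estimate_tokens(content)
  estimate_tokens(content.to_s)
end

puts safe_estimate_tokens(nil)          # prints "0"
```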