Class: Parse::Retrieval::Chunk

Inherits:
Object
Defined in:
lib/parse/retrieval/chunk.rb

Overview

A single retrieved passage: one chunk of one source document, carrying the document's vector-search score and (optionally projected) source record.

Produced by retrieve. Because embedding is one-vector-per-record (see Core::EmbedManaged), every chunk split from a document shares that document's single score — the chunking is presentation-only, applied after retrieval.
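The "one score per document" behaviour can be illustrated with a small sketch. The `Chunk` class below is a stand-in that mirrors the source listed on this page, and `split_into_chunks`, its fixed-width slicing, and the `"objectId"`/`"body"` record keys are hypothetical; the real producer may split text quite differently.

```ruby
# Stand-in for Parse::Retrieval::Chunk, reduced to what this sketch needs.
class Chunk
  attr_reader :id, :score, :content, :source, :metadata

  def initialize(id:, content:, source:, score: nil, metadata: {})
    @id = id.to_s
    @score = score
    @content = content
    @source = source
    @metadata = metadata # assumption: plain assignment
    freeze
  end
end

# Hypothetical splitter: fixed-width slices, applied after retrieval.
def split_into_chunks(record, score, width: 20)
  pieces = record.fetch("body").scan(/.{1,#{width}}/m)
  pieces.each_with_index.map do |text, index|
    Chunk.new(
      id: "#{record.fetch("objectId")}##{index}",
      score: score, # the document-level score, copied onto every chunk
      content: text,
      source: record,
      metadata: { chunk_index: index, chunk_count: pieces.size }
    )
  end
end

record = { "objectId" => "abc123", "body" => "one vector per record, many chunks per record" }
chunks = split_into_chunks(record, 0.87)

puts chunks.map(&:id).inspect         # one id per slice
puts chunks.map(&:score).uniq.inspect # a single distinct score
```

Because the score belongs to the document, ranking chunks of the same document against each other by `score` carries no information; `metadata[:chunk_index]` is the only ordering among siblings.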

Constructor Details

#initialize(id:, content:, source:, score: nil, metadata: {}) ⇒ Chunk

Returns a new instance of Chunk.

Parameters:

  • id (String)
  • score (Float, nil) (defaults to: nil)
  • content (String)
  • source (Hash)
  • metadata (Hash) (defaults to: {})


# File 'lib/parse/retrieval/chunk.rb', line 38

def initialize(id:, content:, source:, score: nil, metadata: {})
  @id = id.to_s
  @score = score
  @content = content
  @source = source
  @metadata = metadata # assumption: the right-hand side is missing from this listing
  freeze
end
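A short sketch of what the constructor does to its arguments. The class here is a stand-in copied from the listing above (with `@metadata` assumed to be a plain assignment), and the argument values are invented for illustration.

```ruby
# Stand-in for Parse::Retrieval::Chunk, mirroring the constructor shown above.
class Chunk
  attr_reader :id, :score, :content, :source, :metadata

  def initialize(id:, content:, source:, score: nil, metadata: {})
    @id = id.to_s
    @score = score
    @content = content
    @source = source
    @metadata = metadata # assumption: plain assignment
    freeze
  end
end

chunk = Chunk.new(id: 42, content: "hello", source: { "objectId" => "42" })

puts chunk.id.inspect    # id is coerced with to_s
puts chunk.score.inspect # score defaults to nil
puts chunk.frozen?       # the constructor ends with freeze

begin
  # Reassigning an instance variable on a frozen object raises.
  chunk.instance_variable_set(:@score, 1.0)
rescue FrozenError => e
  puts "cannot mutate: #{e.class}"
end

# freeze is shallow: the Hash passed as source is not frozen by this constructor.
chunk.source["note"] = "still writable"
puts chunk.source.key?("note")
```

Callers that need deep immutability would have to freeze `source` themselves before passing it in.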

Instance Attribute Details

#content ⇒ String (readonly)

Returns the chunk text.

Returns:

  • (String)

    the chunk text.



# File 'lib/parse/retrieval/chunk.rb', line 30

class Chunk
  attr_reader :id, :score, :content, :source, :metadata

  # @param id [String]
  # @param score [Float, nil]
  # @param content [String]
  # @param source [Hash]
  # @param metadata [Hash]
  def initialize(id:, content:, source:, score: nil, metadata: {})
    @id = id.to_s
    @score = score
    @content = content
    @source = source
    @metadata = metadata # assumption: the right-hand side is missing from this listing
    freeze
  end

  # @return [Hash] plain-Hash form for tool output / JSON.
  def to_h
    {
      id: @id,
      score: @score,
      content: @content,
      source: @source,
      metadata: @metadata,
    }
  end

  # Value equality on the identifying triple — convenient for tests
  # and de-duplication. `source`/`metadata` are intentionally not
  # part of identity.
  def ==(other)
    other.is_a?(Chunk) &&
      other.id == @id &&
      other.score == @score &&
      other.content == @content
  end
  alias eql? ==

  def hash
    [@id, @score, @content].hash
  end
end

#id ⇒ String (readonly)

Returns a stable synthetic chunk id of the form "<objectId>#<index>".

Returns:

  • (String)

    a stable synthetic chunk id of the form "<objectId>#<index>".




#metadata ⇒ Hash (readonly)

Returns presentation metadata: :chunk_index, :chunk_count, :chunks_truncated, and any producer-supplied signals (e.g. :token_chunking_degraded).

Returns:

  • (Hash)

    presentation metadata: :chunk_index, :chunk_count, :chunks_truncated, and any producer-supplied signals (e.g. :token_chunking_degraded).




#score ⇒ Float? (readonly)

Returns the parent document's Atlas vectorSearchScore, already quantized when the caller requested it.

Returns:

  • (Float, nil)

    the parent document's Atlas vectorSearchScore, already quantized when the caller requested it.




#source ⇒ Hash (readonly)

Returns the parent document record. When the producer supplied a source_transform: (the agent tool does, projecting through field_allowlist), this is the projected/redacted form.

Returns:

  • (Hash)

    the parent document record. When the producer supplied a source_transform: (the agent tool does, projecting through field_allowlist), this is the projected/redacted form.




Instance Method Details

#==(other) ⇒ Object Also known as: eql?

Value equality on the identifying triple — convenient for tests and de-duplication. source/metadata are intentionally not part of identity.



# File 'lib/parse/retrieval/chunk.rb', line 61

def ==(other)
  other.is_a?(Chunk) &&
    other.id == @id &&
    other.score == @score &&
    other.content == @content
end
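The following sketch shows which fields take part in the comparison. The class is a stand-in reproducing the `==` listed above, and the sample values are made up.

```ruby
# Stand-in for Parse::Retrieval::Chunk with the equality method shown above.
class Chunk
  attr_reader :id, :score, :content, :source, :metadata

  def initialize(id:, content:, source:, score: nil, metadata: {})
    @id = id.to_s
    @score = score
    @content = content
    @source = source
    @metadata = metadata # assumption: plain assignment
    freeze
  end

  def ==(other)
    other.is_a?(Chunk) &&
      other.id == @id &&
      other.score == @score &&
      other.content == @content
  end
end

a = Chunk.new(id: "doc1#0", score: 0.9, content: "alpha",
              source: { "title" => "A" }, metadata: { chunk_index: 0 })
# Same id, score, and content; different source and metadata.
b = Chunk.new(id: "doc1#0", score: 0.9, content: "alpha",
              source: { "title" => "redacted" }, metadata: { chunk_index: 0, chunks_truncated: true })
# Same id and content, different score.
c = Chunk.new(id: "doc1#0", score: 0.4, content: "alpha", source: {})

puts a == b        # source and metadata are ignored
puts a == c        # score is part of the triple
puts a == "doc1#0" # non-Chunk values never compare equal
```

Note that two chunks differing only in how their `source` was projected still compare equal, which is usually what a test wants but may surprise code that de-duplicates across differently redacted results.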

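#hash ⇒ Object

Returns a hash code computed from the same triple (id, score, content) that #== compares, so chunks that are equal also hash identically.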



# File 'lib/parse/retrieval/chunk.rb', line 69

def hash
  [@id, @score, @content].hash
end
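Since `eql?` is aliased to `==` and `hash` uses the same fields, chunks behave as value objects in `Array#uniq` and as Hash keys. A minimal sketch, using a stand-in class that copies the methods listed on this page and invented sample data:

```ruby
# Stand-in for Parse::Retrieval::Chunk with ==, eql?, and hash as listed above.
class Chunk
  attr_reader :id, :score, :content, :source, :metadata

  def initialize(id:, content:, source:, score: nil, metadata: {})
    @id = id.to_s
    @score = score
    @content = content
    @source = source
    @metadata = metadata # assumption: plain assignment
    freeze
  end

  def ==(other)
    other.is_a?(Chunk) &&
      other.id == @id &&
      other.score == @score &&
      other.content == @content
  end
  alias eql? ==

  def hash
    [@id, @score, @content].hash
  end
end

chunks = [
  Chunk.new(id: "doc1#0", score: 0.9, content: "alpha", source: { "rev" => 1 }),
  Chunk.new(id: "doc1#0", score: 0.9, content: "alpha", source: { "rev" => 2 }),
  Chunk.new(id: "doc1#1", score: 0.9, content: "beta", source: { "rev" => 1 }),
]

# Array#uniq relies on hash and eql?, so the first two collapse into one entry.
unique = chunks.uniq
puts unique.map(&:id).inspect

# Hash lookup uses the same pair of methods.
seen = { chunks[0] => :first }
puts seen.key?(chunks[1])
```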

#to_h ⇒ Hash

Returns plain-Hash form for tool output / JSON.

Returns:

  • (Hash)

    plain-Hash form for tool output / JSON.



# File 'lib/parse/retrieval/chunk.rb', line 48

def to_h
  {
    id: @id,
    score: @score,
    content: @content,
    source: @source,
    metadata: @metadata,
  }
end
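A sketch of feeding `to_h` into JSON serialisation with Ruby's standard `json` library. The class is a stand-in copied from this page and the sample values are hypothetical.

```ruby
require "json"

# Stand-in for Parse::Retrieval::Chunk with the to_h method shown above.
class Chunk
  attr_reader :id, :score, :content, :source, :metadata

  def initialize(id:, content:, source:, score: nil, metadata: {})
    @id = id.to_s
    @score = score
    @content = content
    @source = source
    @metadata = metadata # assumption: plain assignment
    freeze
  end

  def to_h
    {
      id: @id,
      score: @score,
      content: @content,
      source: @source,
      metadata: @metadata,
    }
  end
end

chunk = Chunk.new(id: "doc1#0", score: 0.9, content: "alpha",
                  source: { "title" => "A" }, metadata: { chunk_index: 0 })

# Symbol keys are written out as JSON strings.
json = JSON.generate(chunk.to_h)
puts json

# Parsing the text back yields String keys throughout.
parsed = JSON.parse(json)
puts parsed["metadata"]["chunk_index"]
```

`to_h` returns the same `source` and `metadata` objects the chunk holds, not copies, so a caller that mutates the returned Hash's nested values also changes what the chunk exposes.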