Class: Ollama::MultimodalInput

Inherits:
Object
Defined in:
lib/ollama/multimodal_input.rb

Overview

Typed input builder for multimodal chat messages.

Handles modality ordering (image/audio before text for Gemma 4), rejects modalities unsupported by the active model profile, and converts the typed part list into a plain message hash.

Usage:

input = MultimodalInput.build(
  [
    { type: :image, data: image_bytes, token_budget: 560 },
    { type: :text,  data: "Summarize this chart." }
  ],
  profile: profile
)
message = input.to_message  # { role: "user", content: "...", images: [...] }

Constant Summary

SUPPORTED_TYPES =
%i[text image audio].freeze

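Instance Attribute Summary

  • #parts ⇒ Object (readonly)

    Returns the value of attribute parts.

Class Method Summary

  • .build(inputs, profile:) ⇒ MultimodalInput

    Build and reorder inputs for a given profile.

Instance Method Summary

  • #add(part, profile: nil) ⇒ Object

    Add a single input part, validating type and profile support.

  • #initialize ⇒ MultimodalInput

    Returns a new instance of MultimodalInput.

  • #reorder!(order) ⇒ self

    Reorder parts by the profile's preferred modality order.

  • #to_message(role: "user") ⇒ Hash

    Build a message hash from the typed parts.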

Constructor Details

#initialize ⇒ MultimodalInput

Returns a new instance of MultimodalInput.



# File 'lib/ollama/multimodal_input.rb', line 24

def initialize
  @parts = []
end

Instance Attribute Details

#parts ⇒ Object (readonly)

Returns the value of attribute parts.



# File 'lib/ollama/multimodal_input.rb', line 22

def parts
  @parts
end

Class Method Details

.build(inputs, profile:) ⇒ MultimodalInput

Build and reorder inputs for a given profile.

Parameters:

  • inputs (Array<Hash>)

    Each hash: { type:, data:, token_budget: (optional) }

  • profile (ModelProfile)

Returns:

  • (MultimodalInput)



# File 'lib/ollama/multimodal_input.rb', line 32

def self.build(inputs, profile:)
  obj = new
  inputs.each { |part| obj.add(part, profile: profile) }
  obj.reorder!(profile.modality_order)
  obj
end
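
The flow of .build (validate each part, then sort by the profile's modality order) can be sketched without loading the library. This is a minimal sketch under stated assumptions: StubProfile is a hypothetical stand-in for ModelProfile that models only the methods used above, and build_parts is an illustrative free function, not part of the library.

```ruby
# Hypothetical stand-in for ModelProfile; only the methods .build relies on are modelled.
StubProfile = Struct.new(:model_name, :modality_order, keyword_init: true) do
  def supports_modality?(type)
    modality_order.include?(type)
  end
end

# Illustrative re-statement of .build on plain hashes (not the library code itself).
def build_parts(inputs, profile:)
  parts = inputs.map do |part|
    type = part[:type].to_sym
    # Simplified check: the library raises UnsupportedCapabilityError at this point.
    raise ArgumentError, "model does not support #{type}" unless profile.supports_modality?(type)

    # compact drops token_budget when it was not supplied.
    { type: type, data: part[:data], token_budget: part[:token_budget] }.compact
  end
  # Types missing from the order list sort last, as in #reorder!.
  parts.sort_by { |part| profile.modality_order.index(part[:type]) || profile.modality_order.length }
end

profile = StubProfile.new(model_name: "example-model", modality_order: %i[image audio text])
ordered = build_parts(
  [
    { type: :text,  data: "Summarize this chart." },
    { type: :image, data: "raw-bytes", token_budget: 560 }
  ],
  profile: profile
)
p ordered.map { |part| part[:type] } # prints [:image, :text]
```

The text part was passed first but ends up after the image because the stub profile lists :image ahead of :text.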

Instance Method Details

#add(part, profile: nil) ⇒ Object

Add a single input part, validating type and profile support.

Parameters:

  • part (Hash)

    { type: Symbol, data: Object, token_budget: Integer (optional) }

  • profile (ModelProfile, nil) (defaults to: nil)


# File 'lib/ollama/multimodal_input.rb', line 42

def add(part, profile: nil)
  type = part[:type].to_sym
  unless SUPPORTED_TYPES.include?(type)
    raise ArgumentError, "Unsupported input type: #{type}. Must be one of: #{SUPPORTED_TYPES.join(", ")}"
  end

  if profile && !profile.supports_modality?(type)
    raise UnsupportedCapabilityError,
          "Model '#{profile.model_name}' does not support #{type} input"
  end

  @parts << { type: type, data: part[:data], token_budget: part[:token_budget] }.compact
end
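
The two checks in #add fail in different ways: an unknown type raises ArgumentError, while a known type that the profile rejects raises UnsupportedCapabilityError. The sketch below shows how a caller might tell them apart. It assumes hypothetical stand-ins (TextOnlyProfile, a locally defined UnsupportedCapabilityError, and a validate_part helper that mirrors #add) so that it runs on its own.

```ruby
# Hypothetical stand-ins so the sketch runs without the library loaded.
class UnsupportedCapabilityError < StandardError; end

SUPPORTED = %i[text image audio].freeze

TextOnlyProfile = Struct.new(:model_name) do
  def supports_modality?(type)
    type == :text
  end
end

# Mirrors the two checks in #add: the type list first, then profile support.
def validate_part(part, profile: nil)
  type = part[:type].to_sym
  raise ArgumentError, "Unsupported input type: #{type}" unless SUPPORTED.include?(type)

  if profile && !profile.supports_modality?(type)
    raise UnsupportedCapabilityError, "Model '#{profile.model_name}' does not support #{type} input"
  end

  { type: type, data: part[:data], token_budget: part[:token_budget] }.compact
end

profile = TextOnlyProfile.new("text-only-model")

begin
  validate_part({ type: :video, data: "..." }, profile: profile)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end

begin
  validate_part({ type: :image, data: "..." }, profile: profile)
rescue UnsupportedCapabilityError => e
  puts "rejected: #{e.message}"
end

# String type names are accepted because the type is passed through to_sym.
p validate_part({ type: "text", data: "hello" }, profile: profile)
```

When profile is nil, only the type list is checked, so any of the three supported types is accepted.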

#reorder!(order) ⇒ self

Reorder parts in place by the profile’s preferred modality order. Types that do not appear in order are placed last.

Parameters:

  • order (Array<Symbol>)

Returns:

  • (self)


# File 'lib/ollama/multimodal_input.rb', line 59

def reorder!(order)
  @parts.sort_by! { |p| order.index(p[:type]) || 999 }
  self
end
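
Ruby documents sort_by! as not guaranteed to be stable, so parts of the same type may not keep their insertion order after #reorder!. Where that order matters (for example, several text parts that are later joined), one option is to add the original index as a tie-breaker. The helper below, stable_reorder, is a hypothetical illustration and not part of the library.

```ruby
# Illustrative helper, not part of the library: orders parts by modality while
# keeping insertion order among parts of the same type.
def stable_reorder(parts, order)
  parts.each_with_index
       .sort_by { |part, idx| [order.index(part[:type]) || order.length, idx] }
       .map(&:first)
end

parts = [
  { type: :text,  data: "first question" },
  { type: :image, data: "img-a" },
  { type: :text,  data: "second question" },
  { type: :image, data: "img-b" }
]

p stable_reorder(parts, %i[image audio text]).map { |part| part[:data] }
# prints ["img-a", "img-b", "first question", "second question"]
```

The sort key is a two-element array, so the modality rank decides first and the original position only breaks ties.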

#to_message(role: "user") ⇒ Hash

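Build a message hash from the typed parts (role defaults to "user"). Image parts are collected into the :images key, which is omitted when there are no images; text parts are joined with newlines into :content. Audio parts are accepted by #add but are not included in the returned hash.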

Parameters:

  • role (String) (defaults to: "user")

Returns:

  • (Hash)


# File 'lib/ollama/multimodal_input.rb', line 68

def to_message(role: "user")
  text  = @parts.select { |p| p[:type] == :text  }.map { |p| p[:data] }.join("\n")
  imgs  = @parts.select { |p| p[:type] == :image }.map { |p| p[:data] }

  msg = { role: role, content: text }
  msg[:images] = imgs unless imgs.empty?
  msg
end
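
The conversion can be tried on a plain array of parts. The sketch below restates the method body as a free function, message_from_parts, which is a hypothetical name used only for illustration; the sample data strings are placeholders, not real image or audio bytes.

```ruby
# Illustrative re-statement of #to_message on a plain array of parts.
def message_from_parts(parts, role: "user")
  text = parts.select { |part| part[:type] == :text }.map { |part| part[:data] }.join("\n")
  imgs = parts.select { |part| part[:type] == :image }.map { |part| part[:data] }

  msg = { role: role, content: text }
  msg[:images] = imgs unless imgs.empty? # key is omitted for text-only input
  msg
end

parts = [
  { type: :image, data: "img-bytes", token_budget: 560 },
  { type: :audio, data: "audio-bytes" },
  { type: :text,  data: "Summarize this chart." },
  { type: :text,  data: "Keep it short." }
]

# Two text parts become one newline-joined string; the audio part is not carried over.
p message_from_parts(parts)

# Text-only input with a non-default role yields a hash without an :images key.
p message_from_parts([{ type: :text, data: "hi" }], role: "system")
```

Note that token_budget is not part of the resulting hash either. The Ollama REST API expects entries in images to be base64-encoded strings; whether this library encodes raw bytes elsewhere is not shown in this file, so callers passing raw bytes may need to check that step.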