Class: Ollama::MultimodalInput
- Inherits: Object
- Inheritance chain: Object > Ollama::MultimodalInput
- Defined in: lib/ollama/multimodal_input.rb
Overview
Typed input builder for multimodal chat messages.
Handles modality ordering (image/audio before text for Gemma 4), rejects modalities unsupported by the active model profile, and converts the typed part list into a plain message hash.
Usage:
input = MultimodalInput.build(
  [
    { type: :image, data: image_bytes, token_budget: 560 },
    { type: :text, data: "Summarize this chart." }
  ],
  profile: profile
)
input.to_message # => { role: "user", content: "...", images: [...] }
Constant Summary
- SUPPORTED_TYPES = %i[text image audio].freeze
Instance Attribute Summary
- #parts ⇒ Object (readonly)
  Returns the value of attribute parts.
Class Method Summary
- .build(inputs, profile:) ⇒ MultimodalInput
  Build and reorder inputs for a given profile.
Instance Method Summary
- #add(part, profile: nil) ⇒ Object
  Add a single input part, validating type and profile support.
- #initialize ⇒ MultimodalInput (constructor)
  A new instance of MultimodalInput.
- #reorder!(order) ⇒ self
  Reorder parts by the profile’s preferred modality order.
- #to_message(role: "user") ⇒ Hash
  Build a user message hash from the typed parts.
Constructor Details
#initialize ⇒ MultimodalInput
Returns a new instance of MultimodalInput.
# File 'lib/ollama/multimodal_input.rb', line 24
def initialize
  @parts = []
end
Instance Attribute Details
#parts ⇒ Object (readonly)
Returns the value of attribute parts.
# File 'lib/ollama/multimodal_input.rb', line 22
def parts
  @parts
end
Class Method Details
.build(inputs, profile:) ⇒ MultimodalInput
Build and reorder inputs for a given profile.
# File 'lib/ollama/multimodal_input.rb', line 32
def self.build(inputs, profile:)
  obj = new
  inputs.each { |part| obj.add(part, profile: profile) }
  obj.reorder!(profile.modality_order)
  obj
end
Instance Method Details
#add(part, profile: nil) ⇒ Object
Add a single input part, validating type and profile support.
# File 'lib/ollama/multimodal_input.rb', line 42
def add(part, profile: nil)
  type = part[:type].to_sym
  unless SUPPORTED_TYPES.include?(type)
    raise ArgumentError, "Unsupported input type: #{type}. Must be one of: #{SUPPORTED_TYPES.join(", ")}"
  end

  if profile && !profile.supports_modality?(type)
    raise UnsupportedCapabilityError, "Model '#{profile.model_name}' does not support #{type} input"
  end

  @parts << { type: type, data: part[:data], token_budget: part[:token_budget] }.compact
end
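The two checks in #add fail differently: an unknown type raises ArgumentError whether or not a profile is given, while a known type that the profile rejects raises UnsupportedCapabilityError. The following is a minimal standalone sketch of that flow, not the gem's code path. StubProfile, validate_part, and the locally defined UnsupportedCapabilityError are hypothetical stand-ins introduced only for illustration.

```ruby
# Standalone sketch of the validation flow in #add; it does not load the gem.
SUPPORTED_TYPES = %i[text image audio].freeze

# Stand-in for the library's error class (assumption: the real one is defined in the gem).
class UnsupportedCapabilityError < StandardError; end

# Hypothetical profile stub exposing the two methods that #add calls.
StubProfile = Struct.new(:model_name, :modalities) do
  def supports_modality?(type)
    modalities.include?(type)
  end
end

# Mirrors the checks in #add and returns the normalized part hash.
def validate_part(part, profile: nil)
  type = part[:type].to_sym
  unless SUPPORTED_TYPES.include?(type)
    raise ArgumentError, "Unsupported input type: #{type}"
  end
  if profile && !profile.supports_modality?(type)
    raise UnsupportedCapabilityError, "Model '#{profile.model_name}' does not support #{type} input"
  end
  # compact drops :token_budget when it is nil
  { type: type, data: part[:data], token_budget: part[:token_budget] }.compact
end

text_only = StubProfile.new("text-only-model", %i[text])

# A string type is converted to a symbol, and the nil budget is dropped.
accepted = validate_part({ type: "text", data: "hi" }, profile: text_only)
# => { type: :text, data: "hi" }

# A supported type that the profile rejects raises the capability error.
begin
  validate_part({ type: :image, data: "bytes" }, profile: text_only)
rescue UnsupportedCapabilityError => e
  puts e.message
end
```

Because the profile check is skipped when profile is nil, calling #add without a profile only validates the type against SUPPORTED_TYPES.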
#reorder!(order) ⇒ self
Reorder parts by the profile’s preferred modality order.
# File 'lib/ollama/multimodal_input.rb', line 59
def reorder!(order)
  @parts.sort_by! { |p| order.index(p[:type]) || 999 }
  self
end
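The sort key in #reorder! is the position of each part's type in the order array, with 999 as the fallback for types the array does not list. The sketch below applies the same key to plain hashes so the effect is visible without loading the gem. The order array here is a hypothetical example, not a real profile's value. The second half shows one possible tie-breaker for callers who need parts of the same type to keep their original sequence, since Ruby does not guarantee that sort_by is stable.

```ruby
# Standalone sketch of the sort key used by #reorder!; plain hashes, no gem required.
parts = [
  { type: :text,  data: "Summarize this chart." },
  { type: :audio, data: "audio-bytes" },
  { type: :image, data: "image-bytes" }
]

# Hypothetical preferred order that omits :audio.
order = %i[image text]

# Types missing from `order` get the fallback key 999 and sort last.
sorted = parts.sort_by { |p| order.index(p[:type]) || 999 }
sorted_types = sorted.map { |p| p[:type] }
# => [:image, :text, :audio]

# sort_by is not guaranteed to be stable, so parts sharing a type may change
# relative order. Adding the original index as a tie-breaker preserves it.
two_texts = [
  { type: :text,  data: "first" },
  { type: :image, data: "image-bytes" },
  { type: :text,  data: "second" }
]
stable = two_texts.each_with_index
                  .sort_by { |p, i| [order.index(p[:type]) || 999, i] }
                  .map(&:first)
stable_data = stable.map { |p| p[:data] }
# => ["image-bytes", "first", "second"]
```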
#to_message(role: "user") ⇒ Hash
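Build a user message hash from the typed parts. Image parts are collected into the :images key (omitted when there are none), and text parts are joined with newlines into :content. Audio parts are not included in the resulting hash.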
# File 'lib/ollama/multimodal_input.rb', line 68
def to_message(role: "user")
  text = @parts.select { |p| p[:type] == :text }.map { |p| p[:data] }.join("\n")
  imgs = @parts.select { |p| p[:type] == :image }.map { |p| p[:data] }
  msg = { role: role, content: text }
  msg[:images] = imgs unless imgs.empty?
  msg
end
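To make the resulting hash shape concrete, here is a standalone sketch that applies the same select/map/join steps to plain part hashes. message_from is a hypothetical helper written for this illustration, not part of the library, and the data values are placeholders.

```ruby
# Standalone sketch of the hash shape produced by #to_message; no gem required.
def message_from(parts, role: "user")
  text = parts.select { |p| p[:type] == :text }.map { |p| p[:data] }.join("\n")
  imgs = parts.select { |p| p[:type] == :image }.map { |p| p[:data] }
  msg = { role: role, content: text }
  msg[:images] = imgs unless imgs.empty?
  msg
end

with_image = message_from([
  { type: :image, data: "image-bytes" },
  { type: :text,  data: "Summarize this chart." },
  { type: :text,  data: "Keep it short." }
])
# => { role: "user", content: "Summarize this chart.\nKeep it short.", images: ["image-bytes"] }

# Audio parts match neither filter, so they do not appear in the hash,
# and :images is omitted entirely when there are no image parts.
audio_only = message_from([{ type: :audio, data: "audio-bytes" }])
# => { role: "user", content: "" }
```

One consequence worth noting: a part list containing only audio yields a message with empty content, so callers that rely on audio input may need to handle that modality separately.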