Class: Raix::MultimodalContentAdapter
- Inherits: Object
  - Object
  - Raix::MultimodalContentAdapter
- Defined in: lib/raix/multimodal_content_adapter.rb
Overview
Translates OpenAI-style multimodal content arrays (a `text` part plus one or more `image_url` parts) into a RubyLLM::Content so that images survive the trip to the provider.
RubyLLM's `add_message`/`ask` treat a raw array of OpenAI content hashes as plain text, so an `{ type: "image_url", image_url: { url: … } }` part is silently dropped and a vision model receives text only. See github.com/OlympiaAI/raix/issues/51
Anything that is not an array of hashes containing at least one `image_url` part is returned untouched, so existing text completions are unaffected.
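The pass-through rule above can be sketched in plain Ruby. This is only an illustration of the described behaviour, not the library's code: `multimodal_with_images?` is a hypothetical name, and the sketch checks both symbol and string keys by hand so that it runs without any gems.

```ruby
# Hypothetical helper (not part of Raix): true only for an array of hashes
# that contains at least one part whose type is "image_url".
def multimodal_with_images?(content)
  return false unless content.is_a?(Array) && content.all? { |part| part.is_a?(Hash) }

  content.any? do |part|
    type = part[:type] || part["type"] # accept symbol or string keys
    type.to_s == "image_url"
  end
end

text_only = "Describe a sunset."
mixed = [
  { type: "text", text: "What is in this picture?" },
  { type: "image_url", image_url: { url: "https://example.com/cat.png" } }
]

multimodal_with_images?(text_only) # => false, so the content would be returned untouched
multimodal_with_images?(mixed)     # => true, so the content would be translated
```

A plain string, an array of text-only parts, and an empty array all fall on the "returned untouched" side of this check.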
Class Method Summary
- .translate(content) ⇒ Object
Instance Method Summary
- #initialize(content) ⇒ MultimodalContentAdapter (constructor)
  A new instance of MultimodalContentAdapter.
- #translate ⇒ Object
Constructor Details
#initialize(content) ⇒ MultimodalContentAdapter
Returns a new instance of MultimodalContentAdapter.
# File 'lib/raix/multimodal_content_adapter.rb', line 24
def initialize(content)
  @content = content
end
Class Method Details
.translate(content) ⇒ Object
# File 'lib/raix/multimodal_content_adapter.rb', line 20
def self.translate(content)
  new(content).translate
end
Instance Method Details
#translate ⇒ Object
# File 'lib/raix/multimodal_content_adapter.rb', line 28
def translate
  return @content unless translatable?

  parts = @content.map(&:with_indifferent_access)
  # NOTE: `attachments` is a stand-in name; the identifier is missing from this listing.
  attachments = parts.select { |part| part[:type].to_s == "image_url" }
                     .filter_map { |part| (part.dig(:image_url, :url)) }
  return @content if attachments.empty?

  text = parts.select { |part| part[:type].to_s == "text" }.filter_map { |part| part[:text] }.join("\n")
  RubyLLM::Content.new(text.empty? ? nil : text, attachments)
end
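To see what the method body does to a content array, the following gem-free sketch reproduces the same select/filter/join steps. It is an approximation under stated assumptions: `split_parts` is a hypothetical name, string-key normalisation stands in for ActiveSupport's `with_indifferent_access`, and the result is a plain `[text, urls]` pair rather than a RubyLLM::Content.

```ruby
# Hypothetical stand-in for the steps in #translate (not the library's code).
# Returns [text_or_nil, image_urls] so it can run without RubyLLM.
def split_parts(content)
  # Normalise top-level keys to strings (stand-in for with_indifferent_access).
  parts = content.map { |part| part.transform_keys(&:to_s) }

  # Collect the URL of every image_url part, skipping parts without one.
  urls = parts.select { |part| part["type"].to_s == "image_url" }
              .filter_map { |part| (part["image_url"] || {}).transform_keys(&:to_s)["url"] }

  # Join all text parts with newlines; an empty result becomes nil.
  text = parts.select { |part| part["type"].to_s == "text" }
              .filter_map { |part| part["text"] }
              .join("\n")

  [text.empty? ? nil : text, urls]
end

split_parts([
  { type: "text", text: "a" },
  { type: "text", text: "b" },
  { type: "image_url", image_url: { url: "https://example.com/cat.png" } }
])
# => ["a\nb", ["https://example.com/cat.png"]]
```

Two details carry over from the listing: multiple text parts are joined with a newline, and a message made only of images yields `nil` text rather than an empty string.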