Class: SwarmSDK::V3::Tools::DocumentConverters::DocxConverter
- Defined in:
- lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb
Overview
DOCX document converter
Converts DOCX files to text and extracts embedded images. Requires the docx gem, which depends on rubyzip.
Class Method Summary
- .extensions ⇒ Array<String>
  Supported extensions.
- .format_name ⇒ String
  Format name.
- .gem_name ⇒ String
  Gem name.
Instance Method Summary
- #convert(file_path) ⇒ String, RubyLLM::Content
  Convert DOCX to text with optional image attachments.
Methods inherited from Base
Class Method Details
.extensions ⇒ Array<String>
Returns supported extensions.
    # File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 24
    def extensions
      [".docx"]
    end
.format_name ⇒ String
Returns format name.
    # File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 19
    def format_name
      "DOCX"
    end
.gem_name ⇒ String
Returns gem name.
    # File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 14
    def gem_name
      "docx"
    end
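The three class methods read like metadata that a caller could use to pick a converter for a given file. The following is a minimal sketch of that idea, not SwarmSDK's actual dispatch code: `SketchDocxConverter` is a stand-in that only mirrors the documented return values, and `converter_for` is a hypothetical helper introduced here for illustration.

```ruby
# Stand-in that mirrors the documented class methods; NOT the real SwarmSDK class.
class SketchDocxConverter
  class << self
    def gem_name
      "docx"
    end

    def format_name
      "DOCX"
    end

    def extensions
      [".docx"]
    end
  end
end

# Hypothetical helper: return the first converter class whose extensions
# include the file's extension (compared case-insensitively), or nil.
def converter_for(path, converters)
  ext = File.extname(path).downcase
  converters.find { |klass| klass.extensions.include?(ext) }
end

converter = converter_for("report.docx", [SketchDocxConverter])
puts "#{converter.format_name} (gem: #{converter.gem_name})" if converter
```

Lowercasing the extension before the lookup is a choice made in this sketch so that `Report.DOCX` also matches; the documented `extensions` array itself contains only the lowercase `".docx"`.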
Instance Method Details
#convert(file_path) ⇒ String, RubyLLM::Content
Convert DOCX to text with optional image attachments.

    # File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 33
    def convert(file_path)
      return unless self.class.available?
      return error("Legacy .doc format not supported") if file_path.end_with?(".doc")

      require "docx"

      doc = Docx::Document.open(file_path)

      # Extract text content
      output = build_text_output(doc, file_path)

      # Extract images (inline - no separate class)
      image_paths = extract_images(file_path)

      if image_paths.any?
        content = RubyLLM::Content.new(output)
        image_paths.each { |path| content.(path) }
        content
      else
        output
      end
    rescue StandardError => e
      error("DOCX conversion failed: #{e.message}")
    end
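Going by the source above, a caller of `#convert` may see three shapes of result: a plain String (text only, or whatever `error` returns), a content object carrying text plus image attachments, or `nil` when the bare `return` fires because the converter is unavailable. The sketch below shows one way a caller might branch on those cases. It assumes the content object exposes `text` and `attachments` readers; `SketchContent` is a stand-in for `RubyLLM::Content`, and `describe_result` is a hypothetical helper, neither taken from SwarmSDK.

```ruby
# Stand-in for RubyLLM::Content, assumed here to expose #text and #attachments.
SketchContent = Struct.new(:text, :attachments)

# Hypothetical helper: summarize whatever #convert handed back.
def describe_result(result)
  case result
  when nil
    # Bare `return` in #convert: the converter reported itself unavailable.
    "converter unavailable"
  when String
    # Text-only conversion, or an error message string.
    "text only (#{result.length} chars)"
  else
    # Anything else is treated as a content object with attachments.
    "text plus #{result.attachments.size} attachment(s)"
  end
end

puts describe_result(nil)
puts describe_result("Quarterly report")
puts describe_result(SketchContent.new("Quarterly report", ["chart.png"]))
```

Checking for `nil` first matters because the documented return type lists only String and RubyLLM::Content, so the unavailable case is easy to overlook.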