Class: SwarmSDK::V3::Tools::DocumentConverters::DocxConverter

Inherits:
Base
  • Object
Defined in:
lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb

Overview

DOCX document converter

Converts DOCX files to text and extracts their embedded images. Requires the docx gem (which depends on rubyzip).

Class Method Summary

Instance Method Summary

Methods inherited from Base

available?
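
`available?` is inherited from Base and its body is not shown on this page. Since the converter declares its dependency through `gem_name`, one plausible way such a check could work is to attempt the require and treat `LoadError` as "not installed". The sketch below is only an illustration under that assumption; the helper name `gem_available?` is hypothetical and is not part of SwarmSDK.

```ruby
# Hypothetical helper, NOT the SDK's implementation of Base.available?.
# Returns true when the named library can be required, false otherwise.
def gem_available?(name)
  require name
  true
rescue LoadError
  false
end

puts gem_available?("json")                  # a default gem, normally present
puts gem_available?("no_such_gem_for_demo")  # a name that should not resolve
```

A check of this kind lets a converter report a missing dependency as a message instead of raising at load time, which matches how `#convert` returns early when `available?` is false.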

Class Method Details

.extensions ⇒ Array<String>

Returns supported extensions.

Returns:

  • (Array<String>)

    supported extensions



# File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 24

def extensions
  [".docx"]
end

.format_name ⇒ String

Returns format name.

Returns:

  • (String)

    format name



# File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 19

def format_name
  "DOCX"
end

.gem_name ⇒ String

Returns gem name.

Returns:

  • (String)

    gem name



# File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 14

def gem_name
  "docx"
end
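
The three class methods above are plain metadata, so a caller could use them to choose a converter for a given file. The following is a minimal sketch of that idea, not the SDK's actual dispatch logic: `DocxConverterSketch` is a hypothetical stand-in that only mirrors the documented return values, and `converter_for` is a hypothetical helper.

```ruby
# Hypothetical stand-in that mirrors the three class methods documented above.
# It is not the real SwarmSDK class.
class DocxConverterSketch
  class << self
    def gem_name
      "docx"
    end

    def format_name
      "DOCX"
    end

    def extensions
      [".docx"]
    end
  end
end

# Hypothetical lookup: pick the first converter whose extensions include
# the file's extension (compared case-insensitively).
def converter_for(path, converters)
  ext = File.extname(path).downcase
  converters.find { |converter| converter.extensions.include?(ext) }
end

converters = [DocxConverterSketch]
match = converter_for("reports/Q3.DOCX", converters)
puts match ? "#{match.format_name} via the #{match.gem_name} gem" : "no converter"
```

Lowercasing the extension before the lookup is a choice made for this sketch; the page does not say whether the SDK matches extensions case-insensitively.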

Instance Method Details

#convert(file_path) ⇒ String, RubyLLM::Content

Converts a DOCX file to text, attaching any extracted images.

Parameters:

  • file_path (String)

    path to DOCX file

Returns:

  • (String, RubyLLM::Content)

    plain text when no images are extracted; otherwise a RubyLLM::Content holding the text with each image attached
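
Because the method can return either of two types, a caller generally has to branch on the result. The sketch below shows one way to normalise both shapes. It is an illustration only: `ContentSketch` is a hypothetical stand-in for RubyLLM::Content, its `text` and `attachments` readers are an assumption made for this example, and `unpack_conversion` is a hypothetical helper.

```ruby
# Hypothetical stand-in for RubyLLM::Content, exposing only what this sketch needs.
ContentSketch = Struct.new(:text, :attachments)

# Hypothetical helper: normalise either return shape to [text, image_paths].
def unpack_conversion(result)
  case result
  when String then [result, []]
  else [result.text, Array(result.attachments)]
  end
end

text, images = unpack_conversion("Plain body text")
puts "#{text.length} chars, #{images.size} image(s)"

text, images = unpack_conversion(ContentSketch.new("Body", ["/tmp/image1.png"]))
puts "#{text.length} chars, #{images.size} image(s)"
```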



# File 'lib/swarm_sdk/v3/tools/document_converters/docx_converter.rb', line 33

def convert(file_path)
  return unsupported_format_message unless self.class.available?
  return error("Legacy .doc format not supported") if file_path.end_with?(".doc")

  require "docx"
  doc = Docx::Document.open(file_path)

  # Extract text content
  output = build_text_output(doc, file_path)

  # Extract images (inline - no separate class)
  image_paths = extract_images(file_path)

  if image_paths.any?
    content = RubyLLM::Content.new(output)
    image_paths.each { |path| content.add_attachment(path) }
    content
  else
    output
  end
rescue StandardError => e
  error("DOCX conversion failed: #{e.message}")
end
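
The legacy-format guard in `#convert` relies on `String#end_with?`, which is a literal, case-sensitive suffix check. The short sketch below isolates that guard to show which paths it catches; `legacy_doc?` is a hypothetical name used only for this illustration.

```ruby
# Mirrors the guard in #convert: String#end_with? is a literal,
# case-sensitive suffix check.
def legacy_doc?(file_path)
  file_path.end_with?(".doc")
end

puts legacy_doc?("notes.doc")   # true
puts legacy_doc?("notes.docx")  # false
puts legacy_doc?("NOTES.DOC")   # false
```

A path ending in an uppercase `.DOC` therefore passes the guard and proceeds to `Docx::Document.open`; any StandardError raised there is caught by the method's rescue clause and reported as a conversion failure.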