Class: SwarmSDK::V3::Tools::DocumentConverters::PdfConverter

Inherits: Base

Inheritance chain: Object, Base, SwarmSDK::V3::Tools::DocumentConverters::PdfConverter

Defined in: lib/swarm_sdk/v3/tools/document_converters/pdf_converter.rb

Overview

PDF document converter

Converts PDF files to text and extracts JPEG images. Requires the pdf-reader gem.

Class Method Summary

.extensions ⇒ Array<String>

Supported extensions.
.format_name ⇒ String

Format name.
.gem_name ⇒ String

Gem name.

Instance Method Summary

#convert(file_path) ⇒ String, RubyLLM::Content

Convert PDF to text with optional image attachments.

Methods inherited from Base

available?

Class Method Details

.extensions ⇒ `Array<String>`

Returns supported extensions.

Returns:

(Array<String>) —

supported extensions



24
25
26

# File 'lib/swarm_sdk/v3/tools/document_converters/pdf_converter.rb', line 24

def extensions
  [".pdf"]
end
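
As a rough illustration of how `.extensions` might be consumed, the sketch below picks a converter class by file extension. The stub classes and the `converter_for` helper are hypothetical and not part of SwarmSDK; only the `.extensions` class method mirrors the documented interface.

```ruby
# Hypothetical stand-ins; only the .extensions class method mirrors the documented API.
class StubPdfConverter
  def self.extensions
    [".pdf"]
  end
end

class StubTextConverter
  def self.extensions
    [".txt", ".md"]
  end
end

# Hypothetical helper: return the first converter whose extensions include the
# file's downcased extension, or nil when none matches.
def converter_for(path, converters)
  ext = File.extname(path).downcase
  converters.find { |klass| klass.extensions.include?(ext) }
end

converters = [StubPdfConverter, StubTextConverter]
puts converter_for("report.PDF", converters).inspect
puts converter_for("notes.csv", converters).inspect
```

Downcasing the extension before the lookup is a design choice of this sketch; whether the real dispatch is case-insensitive is not stated in the source.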

.format_name ⇒ `String`

Returns format name.

Returns:

(String) —

format name



19
20
21

# File 'lib/swarm_sdk/v3/tools/document_converters/pdf_converter.rb', line 19

def format_name
  "PDF"
end

.gem_name ⇒ `String`

Returns gem name.

Returns:

(String) —

gem name



14
15
16

# File 'lib/swarm_sdk/v3/tools/document_converters/pdf_converter.rb', line 14

def gem_name
  "pdf-reader"
end
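
The inherited `available?` method is not shown on this page, so how it uses `.gem_name` is unknown. As one plausible approach, a gem name can be checked against installed gems with RubyGems. The `gem_installed?` helper below is hypothetical and is only a sketch of that idea, not the SDK's implementation.

```ruby
require "rubygems"

# Hypothetical helper: true when a gem with the given name is installed.
def gem_installed?(name)
  Gem::Specification.find_by_name(name)
  true
rescue Gem::LoadError
  # Gem::MissingSpecError (a Gem::LoadError subclass) signals that no such gem exists.
  false
end

puts gem_installed?("pdf-reader")
```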

Instance Method Details

#convert(file_path) ⇒ `String`, `RubyLLM::Content`

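Converts a PDF to text with optional image attachments. Returns a plain String when no JPEG images are extracted, and a RubyLLM::Content wrapping the text plus one attachment per image otherwise. If the converter is unavailable or parsing raises, the method returns the result of a helper (`unsupported_format_message` or `error`) rather than letting the exception propagate.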

Parameters:

file_path (String) —

path to PDF file

Returns:

(String, RubyLLM::Content) —

text or content with images

# File 'lib/swarm_sdk/v3/tools/document_converters/pdf_converter.rb', line 33

def convert(file_path)
  return unsupported_format_message unless self.class.available?

  require "pdf-reader"
  reader = PDF::Reader.new(file_path)

  # Extract text from all pages
  output = build_text_output(reader, file_path)

  # Extract JPEG images (inline - no separate class)
  image_paths = extract_jpeg_images(reader)

  # Return with images if any extracted
  if image_paths.any?
    content = RubyLLM::Content.new(output)
    image_paths.each { |path| content.add_attachment(path) }
    content
  else
    output
  end
rescue PDF::Reader::MalformedPDFError => e
  error("Malformed PDF: #{e.message}")
rescue StandardError => e
  error("PDF conversion failed: #{e.message}")
end
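
Because `#convert` can return either a String or a RubyLLM::Content, callers may want to normalize the result. The sketch below shows one way to do that. `FakeContent` is a stand-in so the example runs without the ruby_llm gem, and it assumes the real class exposes `text` and `attachments` readers. `summarize_conversion` is a hypothetical helper, not part of SwarmSDK.

```ruby
# Stand-in for RubyLLM::Content (assumption: the real class has `text` and
# `attachments` readers); used so this sketch runs without the gem.
FakeContent = Struct.new(:text, :attachments)

# Hypothetical helper: reduce either return shape to [text, attachment_count].
def summarize_conversion(result)
  if result.is_a?(String)
    # Text-only result: no images were extracted (or a message string was returned).
    [result, 0]
  else
    # Content-like result: text plus image attachments.
    [result.text, result.attachments.size]
  end
end

text_only = summarize_conversion("Page 1: hello")
with_images = summarize_conversion(FakeContent.new("Page 1: hello", ["/tmp/img1.jpg"]))

puts text_only.inspect
puts with_images.inspect
```

With the real converter, the argument would be the value returned by `PdfConverter.new.convert(path)`, assuming the class can be instantiated without arguments.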
