Class: Uniword::FormatConverter

Inherits:
Object
Defined in:
lib/uniword/format_converter.rb

Overview

Explicit format converter with a declarative API.

Responsibility: coordinate format conversion through model transformation. The class has a single responsibility: it orchestrates the conversion and delegates reading, transformation, and writing to collaborators.

Architecture follows separation of concerns:

  1. Reading (DocumentFactory) - separate from transformation

  2. Transformation (Transformer) - separate from I/O

  3. Writing (DocumentWriter) - separate from transformation

Provides an explicit, declarative API: the caller always names the source and target formats, and nothing is detected automatically.
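The three-stage separation above can be sketched with stand-in collaborators. Everything named `Sketch...` below is hypothetical and exists only for illustration; it is not the Uniword implementation of DocumentFactory, Transformer, or DocumentWriter.

```ruby
require "stringio"

# Hypothetical model type used only in this sketch.
SketchDocument = Struct.new(:format, :paragraphs)

class SketchReader
  # Reading: turn raw input into a model. No transformation happens here.
  def read(text, format)
    SketchDocument.new(format, text.split("\n"))
  end
end

class SketchTransformer
  # Transformation: model in, model out. No I/O happens here.
  def transform(source:, source_format:, target_format:)
    raise ArgumentError, "format mismatch" unless source.format == source_format

    SketchDocument.new(target_format, source.paragraphs.dup)
  end
end

class SketchWriter
  # Writing: serialize a model to an IO. No transformation happens here.
  def write(document, io)
    io.write(document.paragraphs.join("\n"))
  end
end

class SketchConverter
  def initialize(reader:, transformer:, writer:)
    @reader = reader
    @transformer = transformer
    @writer = writer
  end

  # Orchestrates the three stages and delegates all real work.
  def convert(text:, source_format:, io:, target_format:)
    model = @reader.read(text, source_format)
    converted = @transformer.transform(
      source: model,
      source_format: source_format,
      target_format: target_format
    )
    @writer.write(converted, io)
    converted
  end
end

out = StringIO.new
sketch = SketchConverter.new(
  reader: SketchReader.new,
  transformer: SketchTransformer.new,
  writer: SketchWriter.new
)
sketch.convert(text: "one\ntwo", source_format: :docx, io: out, target_format: :mhtml)
```

Because each stage is injected, any one of them can be replaced in a test without touching the other two, which is the practical benefit of keeping I/O and transformation apart.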

Examples:

Explicit DOCX to MHTML conversion

converter = Uniword::FormatConverter.new
result = converter.convert(
  source: "input.docx",
  source_format: :docx,
  target: "output.mhtml",
  target_format: :mhtml
)

Named conversion method

result = converter.docx_to_mhtml(
  source: "document.docx",
  target: "document.mhtml"
)

Defined Under Namespace

Classes: ConversionResult

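Instance Method Summary

  • #batch_convert(sources:, source_format:, target_format:, target_dir:) ⇒ Array<ConversionResult>

    Batch convert multiple files explicitly.

  • #convert(source:, source_format:, target:, target_format:, **_options) ⇒ ConversionResult

    Convert between formats with explicit specification.

  • #docx_to_mhtml(source:, target:, **) ⇒ ConversionResult

    Convert DOCX to MHTML with explicit declaration.

  • #initialize(options = {}) ⇒ FormatConverter (constructor)

    Initialize a new format converter.

  • #mhtml_to_docx(source:, target:, **) ⇒ ConversionResult

    Convert MHTML to DOCX with explicit declaration.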

Constructor Details

#initialize(options = {}) ⇒ FormatConverter

Initialize a new format converter

Parameters:

  • options (Hash) (defaults to: {})

    Converter options

Options Hash (options):

  • :logger (Logger) — default: nil

    Logger for conversion progress

  • :preserve_metadata (Boolean) — default: true

    Preserve styles/theme

  • :transformer (Transformation::Transformer) — default: nil

    Custom transformer



# File 'lib/uniword/format_converter.rb', line 40

def initialize(options = {})
  @logger = options[:logger]
  @preserve_metadata = options.fetch(:preserve_metadata, true)
  @transformer = options[:transformer] || Transformation::Transformer.new
end
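The constructor reads `:preserve_metadata` with `fetch` rather than `||`, which matters when a caller passes an explicit `false`. The following sketch mirrors only that option handling; `OptionsSketch` and the `:default_transformer` placeholder are hypothetical and stand in for the real class and for `Transformation::Transformer.new`.

```ruby
# Hypothetical class that mirrors only the option handling shown above.
class OptionsSketch
  attr_reader :logger, :preserve_metadata, :transformer

  def initialize(options = {})
    # nil when the key is absent
    @logger = options[:logger]
    # fetch falls back to true only when the key is missing,
    # so an explicit false is kept
    @preserve_metadata = options.fetch(:preserve_metadata, true)
    # placeholder default, standing in for a real transformer instance
    @transformer = options[:transformer] || :default_transformer
  end
end

defaults = OptionsSketch.new
disabled = OptionsSketch.new(preserve_metadata: false)

puts defaults.preserve_metadata  # true
puts disabled.preserve_metadata  # false

# With `||`, an explicit false would be replaced by the default:
puts({ preserve_metadata: false }[:preserve_metadata] || true)  # true
```

Using `||` for `:logger` and `:transformer` is harmless because `false` is not a meaningful value for either option.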

Instance Method Details

#batch_convert(sources:, source_format:, target_format:, target_dir:) ⇒ Array<ConversionResult>

Batch convert multiple files explicitly.

Examples:

Batch conversion

results = converter.batch_convert(
  sources: Dir.glob("*.mhtml"),
  source_format: :mhtml,
  target_format: :docx,
  target_dir: "converted/"
)
results.each { |r| puts r }

Parameters:

  • sources (Array<String>)

    Array of source file paths

  • source_format (Symbol)

    Source format for all files

  • target_format (Symbol)

    Target format for all files

  • target_dir (String)

    Directory for output files

Returns:

  • (Array<ConversionResult>)

    One result per source, in the order given



# File 'lib/uniword/format_converter.rb', line 175

def batch_convert(sources:, source_format:, target_format:, target_dir:)
  # Ensure target directory exists
  FileUtils.mkdir_p(target_dir)

  sources.map do |source|
    basename = File.basename(source, File.extname(source))
    target = File.join(target_dir, "#{basename}.#{target_format}")

    convert(
      source: source,
      source_format: source_format,
      target: target,
      target_format: target_format
    )
  end
end
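The target path for each source is derived from the source's basename and the target format symbol. The sketch below isolates that path logic in two hypothetical helpers (`sketch_target_path` and `sketch_collisions` are not part of Uniword) and shows one consequence worth checking before a batch run: sources in different directories that share a basename map to the same target path.

```ruby
# Hypothetical helper reproducing the path derivation in batch_convert.
def sketch_target_path(source, target_dir, target_format)
  # Strip only the last extension: "notes.final.mhtml" becomes "notes.final"
  basename = File.basename(source, File.extname(source))
  File.join(target_dir, "#{basename}.#{target_format}")
end

# Hypothetical helper: target paths that more than one source maps to.
def sketch_collisions(sources, target_dir, target_format)
  targets = sources.map { |s| sketch_target_path(s, target_dir, target_format) }
  targets.tally.select { |_path, count| count > 1 }.keys
end

sources = ["reports/q1.mhtml", "archive/q1.mhtml", "notes.final.mhtml"]

sources.each { |s| puts sketch_target_path(s, "converted/", :docx) }
# converted/q1.docx
# converted/q1.docx
# converted/notes.final.docx

puts sketch_collisions(sources, "converted/", :docx).inspect
# ["converted/q1.docx"]
```

A caller who expects duplicate basenames could run a check like `sketch_collisions` first and split the batch across several target directories.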

#convert(source:, source_format:, target:, target_format:, **_options) ⇒ ConversionResult

Convert between formats with explicit specification.

All parameters must be explicitly specified - no automatic detection.

Examples:

Explicit conversion

result = converter.convert(
  source: "input.mhtml",
  source_format: :mhtml,
  target: "output.docx",
  target_format: :docx
)
puts result  # "Conversion: mhtml → docx (SUCCESS)"

Parameters:

  • source (String, IO)

    Source file path or stream

  • source_format (Symbol)

    Source format (:docx or :mhtml)

  • target (String, IO)

    Target file path or stream

  • target_format (Symbol)

    Target format (:docx or :mhtml)

  • options (Hash)

    Additional conversion options. They are accepted as **_options and are not used by the implementation shown below.

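Returns:

  • (ConversionResult)

    Result describing the conversion. Errors raised while reading, transforming, or writing are captured as a failure result rather than raised.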

Raises:

  • (ArgumentError)

    if parameters are invalid



# File 'lib/uniword/format_converter.rb', line 68

def convert(source:, source_format:, target:, target_format:, **_options)
  validate_conversion_params(source, source_format, target, target_format)

  log_conversion_start(source, source_format, target, target_format)

  begin
    # Step 1: Read source file to model (deserialization)
    source_document = read_document(source, source_format)

    # Step 2: Transform model (model-to-model transformation)
    target_document = @transformer.transform(
      source: source_document,
      source_format: source_format,
      target_format: target_format
    )

    # Step 3: Write target model to file (serialization)
    write_document(target_document, target, target_format)

    # Create success result (compute the statistics once)
    stats = document_stats(target_document)
    ConversionResult.new(
      source: source,
      source_format: source_format,
      target: target,
      target_format: target_format,
      success: true,
      paragraphs_count: stats[:paragraphs],
      tables_count: stats[:tables],
      images_count: stats[:images]
    )
  rescue StandardError => e
    # Create failure result
    ConversionResult.new(
      source: source,
      source_format: source_format,
      target: target,
      target_format: target_format,
      success: false,
      error: e.message
    )
  end
end
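The control flow above has one subtlety: validate_conversion_params runs before the `begin` block, so an exception it raises reaches the caller, while any StandardError raised during reading, transformation, or writing is turned into a failure result. The sketch below reproduces only that shape. `SketchResult`, `SKETCH_FORMATS`, and `sketch_convert` are hypothetical, and the format check is an assumption about what validation might do, since the real validation rules are not shown on this page.

```ruby
# Hypothetical result type; the real ConversionResult's readers are not shown here.
SketchResult = Struct.new(:success, :error, keyword_init: true)

# Assumed set of supported formats, for illustration only.
SKETCH_FORMATS = %i[docx mhtml].freeze

def sketch_convert(source_format:, target_format:, &work)
  # Validation sits outside `begin`, so ArgumentError propagates to the caller.
  unless SKETCH_FORMATS.include?(source_format) && SKETCH_FORMATS.include?(target_format)
    raise ArgumentError, "unsupported format"
  end

  begin
    work.call # stands in for read, transform, write
    SketchResult.new(success: true)
  rescue StandardError => e
    # Failures during the conversion itself are reported, not raised.
    SketchResult.new(success: false, error: e.message)
  end
end

failed = sketch_convert(source_format: :docx, target_format: :mhtml) do
  raise IOError, "disk full"
end
puts failed.success  # false
puts failed.error    # disk full

begin
  sketch_convert(source_format: :pdf, target_format: :docx) { :unused }
rescue ArgumentError => e
  puts e.message     # unsupported format
end
```

Callers therefore need both mechanisms: a `rescue ArgumentError` for invalid parameters and a check of the returned result for runtime failures.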

#docx_to_mhtml(source:, target:, **) ⇒ ConversionResult

Convert DOCX to MHTML with explicit declaration.

Convenience wrapper around #convert with source_format: :docx and target_format: :mhtml fixed, so the method name states the conversion being performed.

Examples:

Declarative DOCX to MHTML

result = converter.docx_to_mhtml(
  source: "document.docx",
  target: "document.mhtml"
)

Parameters:

  • source (String, IO)

    DOCX source file or stream

  • target (String)

    MHTML target file path

  • options (Hash)

    Additional keyword arguments, forwarded unchanged to #convert

Returns:

  • (ConversionResult)



# File 'lib/uniword/format_converter.rb', line 149

def docx_to_mhtml(source:, target:, **)
  convert(
    source: source,
    source_format: :docx,
    target: target,
    target_format: :mhtml,
    **
  )
end

#mhtml_to_docx(source:, target:, **) ⇒ ConversionResult

Convert MHTML to DOCX with explicit declaration.

Convenience wrapper around #convert with source_format: :mhtml and target_format: :docx fixed, so the method name states the conversion being performed.

Examples:

Declarative MHTML to DOCX

result = converter.mhtml_to_docx(
  source: "document.mhtml",
  target: "document.docx"
)

Parameters:

  • source (String, IO)

    MHTML source file or stream

  • target (String)

    DOCX target file path

  • options (Hash)

    Additional keyword arguments, forwarded unchanged to #convert

Returns:

  • (ConversionResult)



# File 'lib/uniword/format_converter.rb', line 125

def mhtml_to_docx(source:, target:, **)
  convert(
    source: source,
    source_format: :mhtml,
    target: target,
    target_format: :docx,
    **
  )
end
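Both named methods are thin wrappers that fix the two format arguments and forward everything else to #convert. The anonymous `**` forwarding used above requires Ruby 3.2 or later; the sketch below uses a named `**options` so it also runs on older versions. `ForwardingSketch` is hypothetical, and its `convert` simply returns the arguments it received so the forwarding is visible.

```ruby
# Hypothetical class showing only the keyword-forwarding pattern.
class ForwardingSketch
  # Returns its arguments as a Hash so the forwarding can be inspected.
  def convert(source:, source_format:, target:, target_format:, **options)
    {
      source: source,
      source_format: source_format,
      target: target,
      target_format: target_format,
      options: options
    }
  end

  # Fixes the formats and forwards any extra keywords.
  def docx_to_mhtml(source:, target:, **options)
    convert(source: source, source_format: :docx,
            target: target, target_format: :mhtml, **options)
  end

  def mhtml_to_docx(source:, target:, **options)
    convert(source: source, source_format: :mhtml,
            target: target, target_format: :docx, **options)
  end
end

call = ForwardingSketch.new.docx_to_mhtml(
  source: "document.docx",
  target: "document.mhtml",
  verbose: true # hypothetical option, used only to show forwarding
)
puts call[:source_format]    # docx
puts call[:target_format]    # mhtml
puts call[:options].inspect
```

In the FormatConverter source shown above, #convert declares its extra keywords as `**_options` and does not read them, so forwarded options currently have no effect there.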