Class: Uniword::Transformation::Transformer

Inherits:
Object
Defined in:
lib/uniword/transformation/transformer.rb

Overview

Transformer for converting between format-specific model representations.

Responsibility: transform Document models from one format’s conventions to another’s. The class follows the Single Responsibility Principle: it performs transformation only.

The same Uniword model classes (Document, Paragraph, Run, Table) are used for both DOCX and MHTML, but their properties and structure may differ based on format conventions. The Transformer explicitly converts between these conventions using declarative transformation rules.

Architecture:

  • Uses TransformationRuleRegistry (Open/Closed Principle)

  • Delegates to specific rules for each element type (Single Responsibility)

  • Rules are MECE (Mutually Exclusive, Collectively Exhaustive)

  • Clean separation from serialization/deserialization
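
The registry-and-rules architecture listed above can be sketched as follows. This is a minimal illustration, not the Uniword implementation: `SketchRule`, `SketchRegistry`, `SketchParagraph`, and `SketchTable` are hypothetical stand-ins, and the MECE check (exactly one rule per element and format pair) is an assumption about how such a registry could enforce the property.

```ruby
# Hypothetical stand-ins for model elements; NOT the Uniword classes.
SketchParagraph = Struct.new(:text)
SketchTable = Struct.new(:row_count)

# A rule declares which element type and format pair it handles.
class SketchRule
  def initialize(element_class:, source_format:, target_format:, &converter)
    @element_class = element_class
    @source_format = source_format
    @target_format = target_format
    @converter = converter
  end

  def matches?(element, source_format, target_format)
    element.is_a?(@element_class) &&
      source_format == @source_format &&
      target_format == @target_format
  end

  def apply(element)
    @converter.call(element)
  end
end

# New behavior is added by registering rules, not by editing the registry.
class SketchRegistry
  def initialize
    @rules = []
  end

  def register(rule)
    @rules << rule
    self
  end

  # Mutually exclusive: more than one match is an error.
  # Collectively exhaustive: zero matches is an error.
  def find!(element, source_format, target_format)
    matches = @rules.select { |rule| rule.matches?(element, source_format, target_format) }
    raise ArgumentError, "no rule for #{element.class}" if matches.empty?
    raise ArgumentError, "ambiguous rules for #{element.class}" if matches.size > 1

    matches.first
  end
end

registry = SketchRegistry.new
registry.register(
  SketchRule.new(element_class: SketchParagraph, source_format: :docx, target_format: :mhtml) do |para|
    "<p>#{para.text}</p>"
  end
)

paragraph = SketchParagraph.new("Hello")
puts registry.find!(paragraph, :docx, :mhtml).apply(paragraph) # prints "<p>Hello</p>"
```

Looking up a `SketchTable` in this registry raises `ArgumentError`, because no table rule has been registered.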

Examples:

Transform DOCX model to MHTML model

transformer = Uniword::Transformation::Transformer.new
mhtml_document = transformer.transform(
  source: docx_document,
  source_format: :docx,
  target_format: :mhtml
)

Named transformation methods

mhtml_doc = transformer.docx_to_mhtml(docx_doc)
docx_doc = transformer.mhtml_to_docx(mhtml_doc)

Constructor Details

#initialize ⇒ Transformer

Initialize transformer with transformation rules



# File 'lib/uniword/transformation/transformer.rb', line 34

def initialize
  @rule_registry = TransformationRuleRegistry.new
  register_default_rules
end

Instance Method Details

#docx_package_to_mhtml(docx_package, document_name = nil) ⇒ Uniword::Mhtml::Document

Transform DOCX Package to MHTML model (preserves correct core_properties)

Use this when you have a DocxPackage to ensure core_properties are preserved.

Examples:

Transform DOCX Package to MHTML

mhtml_doc = transformer.docx_package_to_mhtml(docx_pkg, 'blank')

Parameters:

  • docx_package (Uniword::Docx::Package)

    DOCX package with document and core_properties

  • document_name (String) (defaults to: nil)

    Optional document name for Content-Location (e.g., 'blank' for blank.docx). If not provided, the name is extracted from the relationships.

Returns:

  • (Uniword::Mhtml::Document)

    MHTML document model


# File 'lib/uniword/transformation/transformer.rb', line 104

def docx_package_to_mhtml(docx_package, document_name = nil)
  # Prefer document_rels (contains hyperlinks) over package_rels
  rels = docx_package.document_rels || docx_package.package_rels
  OoxmlToMhtmlConverter.document_to_mht(
    docx_package.document,
    docx_package.core_properties,
    rels,
    document_name,
  )
end
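
The `||` fallback in the method above is worth a closer look. The sketch below uses a hypothetical `SketchPackage` struct (not `Uniword::Docx::Package`) to show how the relationship set is chosen. Whether `document_rels` can ever be an empty collection in Uniword is not stated on this page; the last case only illustrates standard Ruby truthiness.

```ruby
# Hypothetical stand-in exposing only the two readers used by the method above.
SketchPackage = Struct.new(:document_rels, :package_rels, keyword_init: true)

# Mirrors the fallback: document-level relationships win when present.
def pick_rels(package)
  package.document_rels || package.package_rels
end

with_doc_rels = SketchPackage.new(document_rels: [:hyperlink], package_rels: [:office_document])
without_doc_rels = SketchPackage.new(document_rels: nil, package_rels: [:office_document])
empty_doc_rels = SketchPackage.new(document_rels: [], package_rels: [:office_document])

p pick_rels(with_doc_rels)    # prints [:hyperlink]
p pick_rels(without_doc_rels) # prints [:office_document]
# An empty array is truthy in Ruby, so `||` does not fall back here.
p pick_rels(empty_doc_rels)   # prints []
```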

#docx_to_mhtml(docx_document) ⇒ Document

Transform DOCX model to MHTML model

Explicitly named method - clear intent, no magic.

Examples:

Transform DOCX to MHTML

mhtml_doc = transformer.docx_to_mhtml(docx_doc)

Parameters:

  • docx_document (Document)

    DOCX document model

Returns:

  • (Document)

    MHTML document model



# File 'lib/uniword/transformation/transformer.rb', line 85

def docx_to_mhtml(docx_document)
  transform(
    source: docx_document,
    source_format: :docx,
    target_format: :mhtml,
  )
end

#mhtml_to_docx(mhtml_document) ⇒ Document

Transform MHTML model to DOCX model

Explicitly named method - clear intent, no magic.

Examples:

Transform MHTML to DOCX

docx_doc = transformer.mhtml_to_docx(mhtml_doc)

Parameters:

  • mhtml_document (Document)

    MHTML document model

Returns:

  • (Document)

    DOCX document model



# File 'lib/uniword/transformation/transformer.rb', line 124

def mhtml_to_docx(mhtml_document)
  transform(
    source: mhtml_document,
    source_format: :mhtml,
    target_format: :docx,
  )
end

#register_rule(rule) ⇒ self

Register a custom transformation rule

Allows extension without modification (Open/Closed Principle)

Examples:

Register custom rule

transformer.register_rule(CustomTransformationRule.new(
  source_format: :docx,
  target_format: :mhtml
))

Parameters:

  • rule

    Transformation rule to register

Returns:

  • (self)

    Returns self for method chaining



# File 'lib/uniword/transformation/transformer.rb', line 144

def register_rule(rule)
  @rule_registry.register(rule)
  self
end
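
Because `register_rule` returns `self`, several registrations can be chained. The sketch below shows the pattern with a hypothetical `ChainingTransformer`, which stores rules in a plain array instead of a `TransformationRuleRegistry`. The rule values are placeholder symbols, not real rule objects.

```ruby
# Hypothetical stand-in; only the "store the rule, return self" shape matches.
class ChainingTransformer
  attr_reader :rules

  def initialize
    @rules = []
  end

  def register_rule(rule)
    @rules << rule
    self # returning self is what makes the chain below possible
  end
end

transformer = ChainingTransformer.new
  .register_rule(:paragraph_rule)
  .register_rule(:table_rule)

p transformer.rules # prints [:paragraph_rule, :table_rule]
```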

#transform(source:, source_format:, target_format:) ⇒ Wordprocessingml::DocumentRoot

Transform a document from one format to another

Explicitly declares source and target formats - no automatic detection.

Examples:

Explicit transformation

target_doc = transformer.transform(
  source: source_doc,
  source_format: :docx,
  target_format: :mhtml
)

Parameters:

  • source (Wordprocessingml::DocumentRoot)

    Source document model

  • source_format (Symbol)

    Source format (:docx or :mhtml)

  • target_format (Symbol)

    Target format (:docx or :mhtml)

Returns:

  • (Wordprocessingml::DocumentRoot)

    Transformed document model in the target format

Raises:

  • (ArgumentError)

    if parameters are invalid



# File 'lib/uniword/transformation/transformer.rb', line 55

def transform(source:, source_format:, target_format:)
  validate_transformation(source, source_format, target_format)

  # When targeting MHTML, produce an Mhtml::Document with HTML content
  return transform_to_mhtml(source) if target_format == :mhtml

  # When source is MHTML, transform to OOXML document
  return transform_from_mhtml(source) if source_format == :mhtml

  # Create new target document
  target = Wordprocessingml::DocumentRoot.new

  # Transform document-level metadata
  (source, target, source_format, target_format)

  # Transform body elements (paragraphs and tables)
  transform_body_elements(source, target, source_format, target_format)

  target
end
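
The order of the guard clauses in the method above decides which branch handles a request. The sketch below reproduces only that routing with hypothetical helpers (`sketch_validate`, `sketch_route`, `SKETCH_FORMATS`). The real `validate_transformation` is not shown on this page, so the checks here are assumptions based on the documented `:docx` and `:mhtml` values and the documented `ArgumentError`.

```ruby
SKETCH_FORMATS = %i[docx mhtml].freeze

# Assumed validation: a source must be given and both formats must be known.
def sketch_validate(source, source_format, target_format)
  raise ArgumentError, "source is required" if source.nil?

  [source_format, target_format].each do |format|
    next if SKETCH_FORMATS.include?(format)

    raise ArgumentError, "unsupported format: #{format.inspect}"
  end
end

# Returns the name of the branch taken, in the same order as the guard clauses.
def sketch_route(source:, source_format:, target_format:)
  sketch_validate(source, source_format, target_format)

  return :to_mhtml if target_format == :mhtml
  return :from_mhtml if source_format == :mhtml

  :generic
end

p sketch_route(source: :doc, source_format: :docx, target_format: :mhtml)  # prints :to_mhtml
p sketch_route(source: :doc, source_format: :mhtml, target_format: :docx)  # prints :from_mhtml
p sketch_route(source: :doc, source_format: :docx, target_format: :docx)   # prints :generic
# The target check runs first, so MHTML to MHTML takes the to_mhtml branch.
p sketch_route(source: :doc, source_format: :mhtml, target_format: :mhtml) # prints :to_mhtml
```

With only two formats, the generic path in this sketch is reached solely for DOCX to DOCX; whether the real validation permits a same-format transformation is not stated here.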