Class: Uniword::Transformation::OoxmlToHtmlConverter
- Inherits: Object
- Defined in: lib/uniword/transformation/ooxml_to_html_converter.rb
Overview
Service for converting OOXML elements to HTML.
All methods are pure functions: they hold no state and have no side effects. Used by Transformer when target_format is :mhtml.
Class Method Summary
- .document_to_html(document) ⇒ String
  Convert OOXML Document to HTML string.
- .element_to_html(element) ⇒ String
  Convert a single OOXML element to HTML.
- .escape_html(text) ⇒ String
  Escape HTML special characters.
- .font_size_to_html(size_value) ⇒ String
  Convert OOXML font size (half-points) to HTML font size.
- .paragraph_style(paragraph) ⇒ String
  Extract paragraph style attribute.
- .paragraph_to_html(paragraph) ⇒ String
  Convert OOXML Paragraph to HTML.
- .run_to_html(run) ⇒ String
  Convert OOXML Run to HTML.
- .table_cell_to_html(cell) ⇒ String
  Convert OOXML TableCell to HTML.
- .table_row_to_html(row) ⇒ String
  Convert OOXML TableRow to HTML.
- .table_to_html(table) ⇒ String
  Convert OOXML Table to HTML.
- .wrap_html(body_html, document) ⇒ String
  Wrap HTML content in full HTML document.
Class Method Details
.document_to_html(document) ⇒ String
Convert OOXML Document to HTML string
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 19
    def self.document_to_html(document)
      body = document.body
      return "" unless body

      elements_html = body.elements.map { |e| element_to_html(e) }.join("\n")
      wrap_html(elements_html, document)
    end
.element_to_html(element) ⇒ String
Convert a single OOXML element to HTML
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 31
    def self.element_to_html(element)
      case element
      when Uniword::Wordprocessingml::Paragraph
        paragraph_to_html(element)
      when Uniword::Wordprocessingml::Table
        table_to_html(element)
      else
        ""
      end
    end
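The dispatch above relies on `case`/`when`, which matches an object against a class through `Module#===`. The following is a minimal sketch of that behavior, not the library's code: `FakeParagraph`, `FakeTable`, and `dispatch_element` are hypothetical stand-ins for the real `Uniword::Wordprocessingml` types and for `element_to_html`.

```ruby
# Hypothetical stand-ins for the real Uniword element classes.
FakeParagraph = Struct.new(:text)
FakeTable = Struct.new(:rows)

# Mirrors the dispatch shape of element_to_html: known classes are
# rendered, anything else becomes an empty string.
def dispatch_element(element)
  case element
  when FakeParagraph then "<p>#{element.text}</p>"
  when FakeTable then "<table></table>"
  else ""
  end
end

elements = [FakeParagraph.new("Hello"), FakeTable.new([]), :section_break]
html = elements.map { |e| dispatch_element(e) }.join("\n")
puts html
```

Because unsupported elements map to `""` and the results are joined with `"\n"`, each skipped element still contributes an empty line to the joined output.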
.escape_html(text) ⇒ String
Escape HTML special characters
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 162
    def self.escape_html(text)
      text.to_s
          .gsub("&", "&amp;")
          .gsub("<", "&lt;")
          .gsub(">", "&gt;")
          .gsub('"', "&quot;")
          .gsub("'", "&#39;")
    end
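The chained `gsub` calls in `escape_html` have to handle `&` first; otherwise the ampersands introduced by the later replacements would be escaped a second time. As an illustration only, here is a single-pass variant that sidesteps the ordering concern by using `String#gsub` with a hash. The names `HTML_ESCAPES` and `escape_html_single_pass` are hypothetical, and the choice of `&#39;` for the apostrophe is an assumption.

```ruby
# Lookup table for the five characters handled by escape_html.
# The "&#39;" entity for the apostrophe is an assumption.
HTML_ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  '"' => "&quot;",
  "'" => "&#39;"
}.freeze

# Single-pass alternative: each matched character is replaced exactly
# once, so replacement order cannot cause double escaping.
def escape_html_single_pass(text)
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

puts escape_html_single_pass(%(<a href="x">Tom & Jerry's</a>))
# => &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```

Input that is already escaped is escaped again (`&lt;` becomes `&amp;lt;`), which is the expected behavior for a function that treats its argument as plain text.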
.font_size_to_html(size_value) ⇒ String
Convert OOXML font size (half-points) to HTML font size
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 150
    def self.font_size_to_html(size_value)
      return nil unless size_value

      # Convert half-points to points
      size_pts = size_value.to_f / 2
      "#{size_pts}pt"
    end
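A short worked example of the half-point arithmetic may help. The function below is a standalone copy of the logic shown above (the name `font_size_sketch` is hypothetical), so it can be run without the library.

```ruby
# Standalone copy of the conversion: OOXML sizes are in half-points,
# so dividing by two yields points.
def font_size_sketch(size_value)
  return nil unless size_value

  "#{size_value.to_f / 2}pt"
end

p font_size_sketch(24)   # => "12.0pt"
p font_size_sketch("21") # => "10.5pt"
p font_size_sketch(nil)  # => nil
```

Two details follow from the code as written: whole sizes keep a trailing `.0` because of `to_f`, and a missing size yields `nil` rather than a String.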
.paragraph_style(paragraph) ⇒ String
Extract paragraph style attribute
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 137
    def self.paragraph_style(paragraph)
      return "" unless paragraph.properties

      style = paragraph.properties.style
      return "" unless style

      " class=\"#{escape_html(style)}\""
    end
.paragraph_to_html(paragraph) ⇒ String
Convert OOXML Paragraph to HTML
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 46
    def self.paragraph_to_html(paragraph)
      runs_html = paragraph.runs.map { |r| run_to_html(r) }.join
      style = paragraph_style(paragraph)
      "<p#{style}>#{runs_html}</p>"
    end
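To make the interplay between `paragraph_style` and `paragraph_to_html` concrete, here is a simplified sketch. It assumes hypothetical stand-ins (`FakeParaProps`, `FakePara`, `style_attr`, `para_html`), models runs as plain strings, and omits escaping for brevity, so it is not the library's implementation.

```ruby
# Hypothetical stand-ins; the real objects come from Uniword::Wordprocessingml.
FakeParaProps = Struct.new(:style)
FakePara = Struct.new(:properties, :texts)

# Same shape as paragraph_style: a leading space plus a class attribute,
# or an empty string when there is no style to emit.
def style_attr(paragraph)
  style = paragraph.properties&.style
  style ? %( class="#{style}") : ""
end

# Same shape as paragraph_to_html: the attribute string is spliced
# directly after "<p", which is why it carries its own leading space.
def para_html(paragraph)
  "<p#{style_attr(paragraph)}>#{paragraph.texts.join}</p>"
end

puts para_html(FakePara.new(FakeParaProps.new("Heading1"), ["Title"]))
# => <p class="Heading1">Title</p>
puts para_html(FakePara.new(nil, ["Body ", "text"]))
# => <p>Body text</p>
```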
.run_to_html(run) ⇒ String
Convert OOXML Run to HTML
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 56
    def self.run_to_html(run)
      text = escape_html(run.text || "")
      return text if text.empty?

      props = run.properties
      return text unless props

      # Apply inline formatting
      text = "<strong>#{text}</strong>" if props.bold
      text = "<em>#{text}</em>" if props.italic
      text = "<u>#{text}</u>" if props.underline&.value
      text = "<span style=\"color:#{props.color&.value}\">#{text}</span>" if props.color&.value
      text = "<span style=\"font-size:#{font_size_to_html(props.size&.value)}\">#{text}</span>" if props.size&.value
      text = "<span style=\"font-family:'#{props.font&.ascii}'\">#{text}</span>" if props.font&.ascii

      text
    end
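Because each formatting step rewraps the accumulated `text`, the wrapper applied last ends up outermost. The sketch below illustrates that nesting order with a reduced set of flags; `FakeRunProps` and `wrap_inline` are hypothetical names, and the boolean `underline` is a simplification of the `underline&.value` check above.

```ruby
# Hypothetical stand-in for run properties (booleans only, for brevity).
FakeRunProps = Struct.new(:bold, :italic, :underline, keyword_init: true)

# Applies wrappers in the same order as run_to_html, so the tag applied
# last ends up outermost.
def wrap_inline(text, props)
  text = "<strong>#{text}</strong>" if props.bold
  text = "<em>#{text}</em>" if props.italic
  text = "<u>#{text}</u>" if props.underline
  text
end

puts wrap_inline("Hi", FakeRunProps.new(bold: true, underline: true))
# => <u><strong>Hi</strong></u>
puts wrap_inline("Hi", FakeRunProps.new)
# => Hi
```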
.table_cell_to_html(cell) ⇒ String
Convert OOXML TableCell to HTML
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 98
    def self.table_cell_to_html(cell)
      paragraphs_html = cell.paragraphs.map do |p|
        paragraph_to_html(p)
      end.join("\n")
      "<td>\n#{paragraphs_html}\n</td>"
    end
.table_row_to_html(row) ⇒ String
Convert OOXML TableRow to HTML
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 87
    def self.table_row_to_html(row)
      cells_html = row.cells.map do |cell|
        table_cell_to_html(cell)
      end.join("\n")
      "<tr>\n#{cells_html}\n</tr>"
    end
.table_to_html(table) ⇒ String
Convert OOXML Table to HTML
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 78
    def self.table_to_html(table)
      rows_html = table.rows.map { |row| table_row_to_html(row) }.join("\n")
      "<table>\n#{rows_html}\n</table>"
    end
.wrap_html(body_html, document) ⇒ String
Wrap HTML content in full HTML document
    # File 'lib/uniword/transformation/ooxml_to_html_converter.rb', line 110
    # The local variable names `author` and `meta_tags` are assumed; the
    # extracted listing omits them.
    def self.wrap_html(body_html, document)
      title = document.title ? escape_html(document.title) : "Document"
      core_props = document.core_properties
      author = core_props.respond_to?(:creator) ? core_props.creator : nil

      meta_tags = []
      meta_tags << "<meta name=\"author\" content=\"#{escape_html(author)}\">" if author

      <<~HTML
        <!DOCTYPE html>
        <html>
        <head>
        <meta charset="utf-8">
        <title>#{title}</title>
        #{meta_tags.join("\n")}
        </head>
        <body>
        #{body_html}
        </body>
        </html>
      HTML
    end
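The wrapping step can be tried in isolation with stand-in objects. This is a simplified sketch rather than the library's code: `FakeCore`, `FakeDoc`, `wrap_sketch`, and the local variable names are hypothetical, and escaping is left out to keep the example short. It shows the default title, the optional author tag, and the squiggly heredoc (`<<~`), which strips the common leading indentation.

```ruby
# Hypothetical stand-ins for a document and its core properties.
FakeCore = Struct.new(:creator)
FakeDoc = Struct.new(:title, :core_properties)

# Simplified sketch of the wrapping step; escaping is omitted here and
# the local names are illustrative.
def wrap_sketch(body_html, doc)
  title = doc.title || "Document"
  core = doc.core_properties
  # nil.respond_to?(:creator) is false, so a missing core_properties is safe.
  author = core.respond_to?(:creator) ? core.creator : nil
  meta = author ? %(<meta name="author" content="#{author}">) : ""

  <<~HTML
    <!DOCTYPE html>
    <html>
    <head>
    <meta charset="utf-8">
    <title>#{title}</title>
    #{meta}
    </head>
    <body>
    #{body_html}
    </body>
    </html>
  HTML
end

puts wrap_sketch("<p>Hello</p>", FakeDoc.new("Report", FakeCore.new("Ada")))
```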