Class: Uniword::Transformation::HtmlToOoxmlConverter
- Inherits: Object
  - Object
  - Uniword::Transformation::HtmlToOoxmlConverter
- Defined in: lib/uniword/transformation/html_to_ooxml_converter.rb
Overview
Service for converting HTML to OOXML elements.
Acts as the public API coordinator: it delegates OOXML construction to HtmlElementBuilder and CSS/style handling to HtmlFormattingMapper.
All methods are pure functions with no state and no side effects. Transformer uses this class when source_format is :mhtml.
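The coordinator role described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the library's code: `SketchBuilder`, `SketchMapper`, and `SketchConverter` are hypothetical stand-ins for HtmlElementBuilder, HtmlFormattingMapper, and HtmlToOoxmlConverter, the hashes stand in for real OOXML model objects, and the class-to-style mapping is invented for the example.

```ruby
# Hypothetical stand-in for HtmlElementBuilder: builds "OOXML-like" values.
module SketchBuilder
  def self.create_run(text)
    { type: :run, text: text }
  end
end

# Hypothetical stand-in for HtmlFormattingMapper: CSS/style lookups.
module SketchMapper
  # Illustrative mapping only; the real class-to-style table is not shown here.
  STYLES = { "MsoTitle" => "Title" }.freeze

  def self.map_css_class_to_style(css_class)
    STYLES[css_class] # nil when the class is unknown
  end
end

# Stateless facade: every public method forwards to one collaborator,
# so callers depend on a single entry point.
module SketchConverter
  def self.create_run(text)
    SketchBuilder.create_run(text)
  end

  def self.map_css_class_to_style(css_class)
    SketchMapper.map_css_class_to_style(css_class)
  end
end
```

Because the facade holds no instance state, calling a method twice with the same input yields equal results, which is what makes the methods safe to call from anywhere in a conversion pipeline.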
Class Method Summary
- .create_run(text) ⇒ Uniword::Wordprocessingml::Run
  Create a simple run without properties.
- .create_run_from_element(element) ⇒ Uniword::Wordprocessingml::Run?
  Create a run from HTML element with inline formatting.
- .decode_html_entities(text) ⇒ String
  Decode HTML entities.
- .extract_body(html) ⇒ String
  Extract body content from HTML document.
- .html_cell_to_cell(html_cell) ⇒ Uniword::Wordprocessingml::TableCell?
  Convert a single HTML cell to OOXML table cell.
- .html_element_to_paragraph(element) ⇒ Uniword::Wordprocessingml::Paragraph?
  Convert a single HTML element to OOXML paragraph.
- .html_row_to_row(html_row) ⇒ Uniword::Wordprocessingml::TableRow?
  Convert a single HTML row to OOXML table row.
- .html_table_to_table(html_table) ⇒ Uniword::Wordprocessingml::Table?
  Convert a single HTML table to OOXML table.
- .html_to_paragraphs(html_content) ⇒ Array<Uniword::Wordprocessingml::Paragraph>
  Convert HTML content to OOXML paragraphs.
- .html_to_tables(html_content) ⇒ Array<Uniword::Wordprocessingml::Table>
  Convert HTML content to OOXML tables.
- .map_css_class_to_style(css_class) ⇒ String?
  Map MHT CSS class to OOXML style name.
Class Method Details
.create_run(text) ⇒ Uniword::Wordprocessingml::Run
Create a simple run without properties.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 111
    def self.create_run(text)
      HtmlElementBuilder.create_run(text)
    end
.create_run_from_element(element) ⇒ Uniword::Wordprocessingml::Run?
Create a run from HTML element with inline formatting.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 103
    def self.create_run_from_element(element)
      HtmlElementBuilder.create_run_from_element(element)
    end
.decode_html_entities(text) ⇒ String
Decode HTML entities.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 119
    def self.decode_html_entities(text)
      HtmlFormattingMapper.decode_entities(text)
    end
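The body of HtmlFormattingMapper.decode_entities is not shown in this document, so the following is only a sketch of what entity decoding typically involves, under stated assumptions: `sketch_decode_entities` and `NAMED_ENTITIES` are hypothetical names, and the entity table is a small illustrative subset rather than the set the library supports.

```ruby
# Illustrative subset of named entities; a real decoder covers many more.
NAMED_ENTITIES = {
  "amp" => "&", "lt" => "<", "gt" => ">",
  "quot" => '"', "apos" => "'", "nbsp" => "\u00A0"
}.freeze

# Decode named (&amp;), decimal (&#233;) and hexadecimal (&#xE9;) entities
# in a single pass. Unknown named entities are left untouched.
def sketch_decode_entities(text)
  text.gsub(/&(#x[0-9a-fA-F]+|#\d+|\w+);/) do |match|
    ref = Regexp.last_match(1)
    if ref.start_with?("#x")
      [ref[2..].to_i(16)].pack("U")   # hex code point to UTF-8 character
    elsif ref.start_with?("#")
      [ref[1..].to_i].pack("U")       # decimal code point to UTF-8 character
    else
      NAMED_ENTITIES.fetch(ref, match)
    end
  end
end
```

Decoding in a single pass matters: input such as `&amp;lt;` becomes the literal text `&lt;` rather than being decoded a second time into `<`.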
.extract_body(html) ⇒ String
Extract body content from HTML document.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 127
    def self.extract_body(html)
      HtmlFormattingMapper.extract_body(html)
    end
.html_cell_to_cell(html_cell) ⇒ Uniword::Wordprocessingml::TableCell?
Convert a single HTML cell to OOXML table cell.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 79
    def self.html_cell_to_cell(html_cell)
      HtmlElementBuilder.build_cell(html_cell)
    end
.html_element_to_paragraph(element) ⇒ Uniword::Wordprocessingml::Paragraph?
Convert a single HTML element to OOXML paragraph.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 87
    def self.html_element_to_paragraph(element)
      HtmlElementBuilder.build_paragraph(element)
    end
.html_row_to_row(html_row) ⇒ Uniword::Wordprocessingml::TableRow?
Convert a single HTML row to OOXML table row.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 71
    def self.html_row_to_row(html_row)
      HtmlElementBuilder.build_row(html_row)
    end
.html_table_to_table(html_table) ⇒ Uniword::Wordprocessingml::Table?
Convert a single HTML table to OOXML table.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 63
    def self.html_table_to_table(html_table)
      HtmlElementBuilder.build_table(html_table)
    end
.html_to_paragraphs(html_content) ⇒ Array<Uniword::Wordprocessingml::Paragraph>
Convert HTML content to OOXML paragraphs.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 17
    def self.html_to_paragraphs(html_content)
      return [] if html_content.nil? || html_content.empty?

      body = HtmlFormattingMapper.extract_body(html_content)
      doc = Nokogiri::HTML(body)
      paragraphs = []

      doc.css("p, h1, h2, h3, h4, h5, h6, li, div, tr").each do |element|
        next if element.ancestors("td, th").any?
        next if %w[tr td].include?(element.name)

        if %w[div li].include?(element.name) &&
           element.css("p, h1, h2, h3, h4, h5, h6, li, div, tr").any?
          next
        end

        paragraph = HtmlElementBuilder.build_paragraph(element)
        paragraphs << paragraph if paragraph
      end

      paragraphs
    end
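The selection rules in html_to_paragraphs (skip anything inside a table cell, skip rows, and skip div/li containers that wrap other block elements) can be exercised without Nokogiri. The sketch below is an illustration under assumptions, not the library's code: `Node`, `node`, `BLOCK_TAGS`, and `paragraph_candidates` are hypothetical names, and a small Struct tree stands in for the parsed document.

```ruby
# Minimal stand-in for a parsed HTML node (hypothetical; the real code uses Nokogiri).
Node = Struct.new(:name, :children, :parent) do
  def ancestors
    parent ? [parent] + parent.ancestors : []
  end

  # All nested nodes in document (pre-order) order.
  def descendants
    children.flat_map { |c| [c] + c.descendants }
  end
end

# Build a node and wire up parent links for its children.
def node(name, *children)
  n = Node.new(name, children, nil)
  children.each { |c| c.parent = n }
  n
end

BLOCK_TAGS = %w[p h1 h2 h3 h4 h5 h6 li div tr].freeze

# Returns the block nodes that would each become one paragraph.
def paragraph_candidates(root)
  root.descendants.select do |el|
    next false unless BLOCK_TAGS.include?(el.name)
    next false if el.ancestors.any? { |a| %w[td th].include?(a.name) } # table content
    next false if el.name == "tr"                                      # rows belong to tables
    # A div or li that wraps other block elements is a container, not a paragraph.
    next false if %w[div li].include?(el.name) &&
                  el.descendants.any? { |d| BLOCK_TAGS.include?(d.name) }
    true
  end
end

tree = node("body",
            node("h1"),
            node("div", node("p")),  # container div: only the inner p is kept
            node("div"),             # leaf div: kept
            node("ul", node("li")),
            node("table", node("tr", node("td", node("p")))))

paragraph_candidates(tree).map(&:name) # => ["h1", "p", "div", "li"]
```

The container check is what prevents the same text from being emitted twice, once for the wrapping div and once for the paragraph inside it.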
.html_to_tables(html_content) ⇒ Array<Uniword::Wordprocessingml::Table>
Convert HTML content to OOXML tables.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 44
    def self.html_to_tables(html_content)
      return [] if html_content.nil? || html_content.empty?

      body = HtmlFormattingMapper.extract_body(html_content)
      doc = Nokogiri::HTML(body)
      tables = []

      doc.css("table").each do |html_table|
        table = HtmlElementBuilder.build_table(html_table)
        tables << table if table
      end

      tables
    end
.map_css_class_to_style(css_class) ⇒ String?
Map MHT CSS class to OOXML style name.
    # File 'lib/uniword/transformation/html_to_ooxml_converter.rb', line 95
    def self.map_css_class_to_style(css_class)
      HtmlFormattingMapper.map_css_class_to_style(css_class)
    end