Class: Jekyll::L10n::HtmlTranslator
- Inherits: Object
  - Object
    - Jekyll::L10n::HtmlTranslator
- Defined in: lib/jekyll-l10n/translation/html_translator.rb
Overview
Applies translations from PO files to HTML text nodes and DOM attributes.
HtmlTranslator walks the DOM tree of a parsed HTML document and applies translations to text content and to configurable HTML attributes (title, alt, aria-label, etc.). It supports three fallback modes for missing translations: using the original text (english), marking untranslated content (marker), or leaving the text blank (empty). It also applies block-level translations to elements that have a complete translation, and rewrites relative URLs so that they carry the locale prefix.
Key responsibilities:
- Parse full HTML documents while preserving DOCTYPE and structure
- Translate text nodes using normalized text for matching
- Translate HTML attributes (title, alt, aria-label, placeholder, aria-description)
- Apply fallback modes when translations are missing (english/marker/empty)
- Handle block-level translations for content elements
- Transform relative URLs to locale-prefixed URLs
- Remove auto-inserted meta charset tags from serialized HTML
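The normalized matching and fallback behaviour listed above can be illustrated with a minimal sketch. It is not the gem's implementation: `normalize_text` and `translate_text` are hypothetical helpers, the translations are modelled as a plain Hash, passing the mode as a string is an assumption, and the `[UNTRANSLATED]` marker format is invented for illustration.

```ruby
# Hypothetical helpers for illustration; not part of the jekyll-l10n API.

# Collapse runs of whitespace and trim both ends, so that text wrapped
# across several lines in the HTML source can match a single-line key.
def normalize_text(text)
  text.gsub(/\s+/, ' ').strip
end

# Look up a translation by normalized text and apply a fallback mode
# when no usable translation is found. The mode names come from the
# overview (english/marker/empty); the marker format is an assumption.
def translate_text(text, translations, fallback_mode)
  translated = translations[normalize_text(text)]
  return translated unless translated.nil? || translated.empty?

  case fallback_mode.to_s
  when 'english' then text                      # keep the original text
  when 'marker'  then "[UNTRANSLATED] #{text}"  # flag the missing entry
  when 'empty'   then ''                        # leave the node blank
  else raise ArgumentError, "unknown fallback mode: #{fallback_mode.inspect}"
  end
end

translations = { 'Hello world' => 'Bonjour le monde' }

puts translate_text("Hello\n   world", translations, 'english')
puts translate_text('Goodbye', translations, 'marker')
```

The first call matches despite the line break because the lookup key is normalized; the second has no entry and falls through to the chosen fallback mode.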
Instance Attribute Summary
- #debug_logging ⇒ Object (readonly)
  Returns the value of attribute debug_logging.
- #fallback_mode ⇒ Object (readonly)
  Returns the value of attribute fallback_mode.
- #translatable_attrs ⇒ Object (readonly)
  Returns the value of attribute translatable_attrs.
Instance Method Summary
- #initialize(fallback_mode, translatable_attrs, debug_logging: false) ⇒ HtmlTranslator (constructor)
  Initialize a new HtmlTranslator.
- #translate(html, translations, locale = 'en', baseurl = '') ⇒ String
  Translate an HTML document to a specific locale.
Constructor Details
#initialize(fallback_mode, translatable_attrs, debug_logging: false) ⇒ HtmlTranslator
Initialize a new HtmlTranslator.
# File 'lib/jekyll-l10n/translation/html_translator.rb', line 51

def initialize(fallback_mode, translatable_attrs, debug_logging: false)
  @fallback_mode = fallback_mode
  @translatable_attrs = translatable_attrs
  @debug_logging = debug_logging
end
Instance Attribute Details
#debug_logging ⇒ Object (readonly)
Returns the value of attribute debug_logging.
# File 'lib/jekyll-l10n/translation/html_translator.rb', line 39

def debug_logging
  @debug_logging
end
#fallback_mode ⇒ Object (readonly)
Returns the value of attribute fallback_mode.
# File 'lib/jekyll-l10n/translation/html_translator.rb', line 39

def fallback_mode
  @fallback_mode
end
#translatable_attrs ⇒ Object (readonly)
Returns the value of attribute translatable_attrs.
# File 'lib/jekyll-l10n/translation/html_translator.rb', line 39

def translatable_attrs
  @translatable_attrs
end
Instance Method Details
#translate(html, translations, locale = 'en', baseurl = '') ⇒ String
Translate an HTML document to a specific locale.
Parses the HTML document, applies translations to text nodes and attributes, transforms URLs to be locale-aware, and returns the translated HTML with proper structure preserved.
69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 |
# File 'lib/jekyll-l10n/translation/html_translator.rb', line 69 def translate(html, translations, locale = 'en', baseurl = '') # Use HtmlParser to properly parse full HTML documents while preserving # DOCTYPE, html tag, and document structure. Any auto-inserted meta tags are # removed by HtmlParser.remove_meta_charset after serialization. # See: spec/regression/nokogiri_meta_tag_spec.rb for regression tests doc = HtmlParser.parse_document(html) translate_node(doc, translations) # Transform URLs on the document object before serialization to avoid double-parsing # and preserve the correct DOCTYPE and HTML structure. This prevents Nokogiri from # downgrading to HTML 4.0 DOCTYPE when parsing the serialized HTML again. # See: spec/jekyll-l10n/utils/url_transformer_spec.rb for tests UrlTransformer.transform_document(doc, locale, baseurl) result = doc.to_html # Remove the auto-inserted meta tag by libxml2 during HTML serialization # Matches: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> HtmlParser.(result) end |