Class: Jekyll::L10n::UrlTransformer

Inherits:
Object
Defined in:
lib/jekyll-l10n/utils/url_transformer.rb

Overview

Transforms relative URLs in HTML to include locale prefixes.

UrlTransformer modifies the href attributes of links so that they carry the target locale as a path prefix (e.g., /docs/page.html becomes /es/docs/page.html). It leaves external links, anchors, mailto and tel links, and links that already contain a locale prefix unchanged. No transformation is applied for the English locale, which is the default language.

Key responsibilities:

  • Identify relative links to transform

  • Add locale prefix to href values

  • Preserve external links and special URLs

  • Skip already-localized URLs

  • Handle baseurl paths correctly

  • Remove auto-inserted meta tags
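
The preservation rules listed above can be sketched as a single predicate. This is a minimal illustration, not the gem's code: `transformable_href?` is a hypothetical name, the real `should_transform_href?` is not shown on this page and may differ, and the sketch assumes that only root-relative paths (those starting with `/`) are candidates for a prefix.

```ruby
# Hypothetical predicate illustrating the documented skip rules.
# Assumption: only root-relative hrefs are candidates for prefixing.
SKIPPED_SCHEMES = %w[mailto: tel:].freeze

def transformable_href?(href, locale, baseurl = '')
  return false if href.nil? || href.empty?
  return false if href.start_with?('#')               # in-page anchor
  return false if href.start_with?(*SKIPPED_SCHEMES)  # mailto and tel links
  # External links: "scheme://host" or protocol-relative "//host"
  return false if href.match?(%r{\A(?:[a-z][a-z0-9+.-]*:)?//}i)
  return false unless href.start_with?('/')

  # Look past the baseurl (if present) before checking for a locale segment.
  base = baseurl.to_s.chomp('/')
  path = href.start_with?("#{base}/") ? href.delete_prefix(base) : href
  path != "/#{locale}" && !path.start_with?("/#{locale}/")
end

puts transformable_href?('/docs/page.html', 'es')
puts transformable_href?('/es/docs/page.html', 'es')
```

Checking for the locale segment only after removing the baseurl keeps a link such as /baseurl/es/docs/ from being prefixed a second time.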

Examples:

html = '<a href="/docs/page.html">Link</a>'
transformed = UrlTransformer.transform(html, 'es', '/baseurl')
# The result contains '<a href="/baseurl/es/docs/page.html">Link</a>'
# (Nokogiri::HTML serializes the fragment as a full document).

Class Method Details

.transform(html, locale, baseurl) ⇒ String

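Transforms all relative URLs in the HTML to include the locale prefix.

Parses the HTML document, adds the locale prefix to each relative link, strips the meta tag that libxml2 inserts during serialization, and returns the transformed HTML. When the locale is one that should be skipped, the input is returned unchanged.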

Parameters:

  • html (String)

    HTML content with URLs to transform

  • locale (String)

    Target locale code (e.g., 'es', 'fr')

  • baseurl (String)

    Base URL for the site (e.g., '/baseurl')

Returns:

  • (String)

    HTML with locale-prefixed URLs



# File 'lib/jekyll-l10n/utils/url_transformer.rb', line 37

def transform(html, locale, baseurl)
  return html if should_skip_transform?(locale)

  # Use Nokogiri::HTML to properly parse full HTML documents while preserving
  # DOCTYPE, html tag, and document structure. Auto-inserted meta tags are
  # removed via regex post-processing (same approach as HtmlTranslator).
  # See: spec/regression/nokogiri_meta_tag_spec.rb for regression tests
  doc = Nokogiri::HTML(html)

  transform_document(doc, locale, baseurl)

  result = doc.to_html

  # Remove the meta tag that libxml2 auto-inserts during HTML serialization
  # Matches: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  pattern = %r{<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n?}
  result.gsub(pattern, '')
end
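
The post-processing step can be tried in isolation. The sketch below reuses the pattern from the listing above; the sample markup and the constant name `META_PATTERN` are made up for illustration, and no Nokogiri call is involved.

```ruby
# Same pattern as in .transform; META_PATTERN is a name chosen for this sketch.
META_PATTERN = %r{<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n?}

# Made-up sample resembling serialized output that carries the inserted tag.
serialized = <<~HTML
  <html>
  <head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <title>Docs</title>
  </head>
  </html>
HTML

cleaned = serialized.gsub(META_PATTERN, '')
puts cleaned

# A differently written meta tag does not match and is left alone.
other = '<meta charset="utf-8">'
puts other.gsub(META_PATTERN, '')
```

Because the match is literal, an author-written tag that happens to be byte-for-byte identical to the inserted one would be removed as well.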

.transform_document(doc, locale, baseurl) ⇒ void

This method returns an undefined value.

Transform URLs in a parsed HTML document.

Modifies the href attributes of links in the parsed document in place. Use it when the document is already parsed, to avoid parsing it a second time. Links that belong to the language dropdown are left unchanged.

Parameters:

  • doc (Nokogiri::HTML::Document)

    Parsed HTML document

  • locale (String)

    Target locale code

  • baseurl (String)

    Base URL for site



# File 'lib/jekyll-l10n/utils/url_transformer.rb', line 65

def transform_document(doc, locale, baseurl)
  return if should_skip_transform?(locale)

  doc.css('a[href]').each do |link|
    href = link['href']
    next unless should_transform_href?(href, locale, baseurl)

    next if language_dropdown_link?(link)

    link['href'] = add_locale_prefix(href, locale, baseurl)
  end
end
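
The loop above delegates the rewrite to add_locale_prefix, whose source is not shown on this page. The following is a minimal sketch of what such a helper might do, consistent with the example in the overview; `prefix_with_locale` is a hypothetical name, and stripping an existing baseurl before re-adding it is an assumption rather than documented behavior.

```ruby
# Hypothetical helper; the gem's add_locale_prefix may handle more cases.
def prefix_with_locale(href, locale, baseurl = '')
  base = baseurl.to_s.chomp('/')
  # Assumption: if the href already starts with the baseurl, remove it first
  # so the baseurl is not doubled in the result.
  already_based = !base.empty? && href.start_with?("#{base}/")
  path = already_based ? href.delete_prefix(base) : href
  "#{base}/#{locale}#{path}"
end

puts prefix_with_locale('/docs/page.html', 'es')
puts prefix_with_locale('/docs/page.html', 'es', '/baseurl')
```

Placing the locale after the baseurl keeps every localized page under the site's deployment path.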