Module: Jekyll::L10n::DomAttributeExtractor

Defined in:
lib/jekyll-l10n/extraction/dom_attribute_extractor.rb

Overview

Extracts HTML attribute values from elements for translation.

DomAttributeExtractor identifies and extracts values from a configurable set of HTML attributes (for example title, alt, aria-label, placeholder, aria-description) on DOM elements. It validates each extracted value and generates a file location reference that includes the attribute name, so every entry can be traced back to the exact attribute it came from.

Key responsibilities:

  • Extract attribute values from DOM elements

  • Filter extractable attributes by whitelist

  • Validate attribute values (minimum length, non-numeric)

  • Generate attribute-specific file location references

  • Return entries ready for PO file format

Examples:

entries = DomAttributeExtractor.extract(node, 'docs/index.html', '_site',
                                        ['title', 'alt', 'aria-label'])
# Returns array of extraction entries for all valid attribute values

Class Method Details

.extract(node, file_path, dest, translatable_attrs) ⇒ Array<Hash>

Extract attribute values from an HTML element.

Returns an empty array if node is not an element node. For element nodes, it finds every listed translatable attribute that has a non-empty, valid value and returns one extraction entry per value, with the attribute name included in the reference.

Parameters:

  • node (Nokogiri::XML::Element)

    DOM element to extract from

  • file_path (String)

    Source file path (for file location reference)

  • dest (String)

    Destination directory (for file location reference)

  • translatable_attrs (Array<String>)

    Attribute names to extract (e.g., ['title', 'alt', 'aria-label', 'placeholder', 'aria-description'])

Returns:

  • (Array<Hash>)

    Array of extraction entries, each containing:

    • :msgid [String] The attribute value to translate

    • :msgstr [String] Empty string (to be filled by translator)

    • :reference [String] File location reference including attribute name



# File 'lib/jekyll-l10n/extraction/dom_attribute_extractor.rb', line 44

def extract(node, file_path, dest, translatable_attrs)
  return [] unless node.element?

  attrs = extract_node_attributes(node, translatable_attrs)
  attrs.map do |attr_text, attr_name|
    reference = XPathReferenceGenerator.generate(node, file_path, dest, attr_name)
    { msgid: attr_text, msgstr: '', reference: reference }
  end
end
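
The flow above can be tried without Nokogiri or the rest of the gem. What follows is a minimal sketch under stated assumptions, not the gem's implementation: `FakeNode` and `ExtractSketch` are hypothetical stand-ins, the validation rule (at least two characters, not purely digits) is a guess at what "minimum length, non-numeric" means, and the reference string format is invented for illustration.

```ruby
# Hypothetical stand-in for a Nokogiri element; only #name, #element? and #[] are modeled.
class FakeNode
  attr_reader :name

  def initialize(name, attributes)
    @name = name
    @attributes = attributes
  end

  def element?
    true
  end

  def [](attr_name)
    @attributes[attr_name]
  end
end

module ExtractSketch
  module_function

  # Assumed rule; the real TextValidator is not shown in this page.
  def valid?(text)
    text.length >= 2 && !text.match?(/\A\d+\z/)
  end

  # Invented format; the real XPathReferenceGenerator output may differ.
  def reference_for(node, file_path, attr_name)
    "#{file_path}:#{node.name}[@#{attr_name}]"
  end

  # Mirrors the documented flow: collect valid values, then build one entry each.
  def extract(node, file_path, translatable_attrs)
    return [] unless node.element?

    attrs = {}
    translatable_attrs.each do |attr_name|
      value = node[attr_name]
      next if value.nil? || value.empty?

      value = value.strip
      attrs[value] = attr_name if valid?(value)
    end

    attrs.map do |text, attr_name|
      { msgid: text, msgstr: '', reference: reference_for(node, file_path, attr_name) }
    end
  end
end

img = FakeNode.new('img', { 'alt' => ' Company logo ', 'title' => '42', 'width' => '300' })
entries = ExtractSketch.extract(img, 'docs/index.html', %w[title alt aria-label])
```

In this sketch only `alt` produces an entry: `title` is rejected by the assumed non-numeric rule, `aria-label` is absent, and `width` is not in the whitelist.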

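.extract_node_attributes(node, translatable_attrs) ⇒ Hash

Collects the translatable attributes of node that have a non-empty value passing TextValidator. Returns a Hash that maps each stripped attribute value to the name of the attribute it came from.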



# File 'lib/jekyll-l10n/extraction/dom_attribute_extractor.rb', line 54

def extract_node_attributes(node, translatable_attrs)
  attrs = {}

  translatable_attrs.each do |attr_name|
    value = node[attr_name]
    next if value.nil? || value.empty?

    value = value.strip
    attrs[value] = attr_name if TextValidator.valid?(value)
  end

  attrs
end
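
One consequence of keying the hash by attribute value: if two translatable attributes on the same element carry identical text, the later one overwrites the earlier one and only a single entry results. The sketch below illustrates this with a plain Hash standing in for the node. `KeyingSketch.collect_attrs` is a hypothetical reduction of the method above with validation omitted for brevity.

```ruby
module KeyingSketch
  module_function

  # Same loop shape as extract_node_attributes, minus TextValidator (assumption).
  def collect_attrs(node, translatable_attrs)
    attrs = {}
    translatable_attrs.each do |attr_name|
      value = node[attr_name]
      next if value.nil? || value.empty?

      # Keyed by text, so attributes with equal text share one slot.
      attrs[value.strip] = attr_name
    end
    attrs
  end
end

# A plain Hash plays the role of the DOM node here.
same_text = { 'title' => 'Search', 'aria-label' => 'Search' }
collapsed = KeyingSketch.collect_attrs(same_text, %w[title aria-label])

distinct_text = { 'title' => 'Search', 'aria-label' => 'Search the docs' }
separate = KeyingSketch.collect_attrs(distinct_text, %w[title aria-label])
```

Here `collapsed` holds one pair whose attribute name is `aria-label`, the last attribute visited, while `separate` keeps both. For PO extraction this is often harmless, since identical msgids would share a translation anyway, but the reference then points at only one of the attributes.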