Module: Goodmail::Plaintext

Extended by:
Plaintext
Included in:
Plaintext
Defined in:
lib/goodmail/plaintext.rb

Overview

Plain-text generator for the `text/plain` part of every Goodmail multipart message. It is the single source of truth shared by `Goodmail::Email.render` and `Goodmail::Mailer#compose_message`. Before this consolidation, both code paths had their own copy of the same gsub cleanup pipeline and the same Premailer call, which made it easy for plaintext quality to drift between them.

Why we DO NOT just hand the raw HTML to Premailer:

The Goodmail layout is built for HTML email clients that respect display:none, conditional comments, and inline alt attributes, none of which Premailer's `to_plain_text` honors. If we feed it the layout HTML directly, the plaintext part has three artifacts that look like rendering bugs to a recipient using a text-only client:

1. Preheader leak — the layout's hidden inbox-preview
   `<span style="display:none">` is rendered as a phantom
   first line of the message body.
2. Button text duplication — the `button` DSL helper emits
   both a `<v:roundrect>` (Outlook VML, wrapped in
   `<!--[if mso]>...<![endif]-->`) AND a regular `<a>`.
   Premailer ignores the conditional comment and extracts
   text from BOTH, so the button label appears twice in
   plaintext.
3. Image alt-text leak — `image` / `inline_image` calls
   with no explicit alt fall back to `config.company_name`.
   That alt is fine in HTML (screen readers read it) but
   shows up as a stray "CompanyName" line in plaintext
   since Premailer extracts alt attributes verbatim.

We pre-process the HTML to neutralize each of these BEFORE plaintext extraction, then apply a small post-extraction cleanup pass for the residual artifacts (logo alt line, blank-line compaction).
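The post-extraction cleanup described above can be sketched roughly as follows. This is a minimal illustration only: the helper names `drop_preheader_line` and `squeeze_blank_lines`, their bodies, and the sample text are hypothetical and are not taken from lib/goodmail/plaintext.rb.

```ruby
# Hypothetical helpers; names and bodies are illustrative, not Goodmail's own.

# Drop every line that consists solely of the preheader text.
def drop_preheader_line(text, preheader)
  return text if preheader.nil? || preheader.strip.empty?

  target = preheader.strip
  text.lines.reject { |line| line.strip == target }.join
end

# Collapse runs of three or more newlines into a single blank line.
def squeeze_blank_lines(text)
  text.gsub(/\n{3,}/, "\n\n")
end

# Made-up stand-in for what a plaintext extractor might return when the
# hidden preheader leaks through as a phantom first line.
extracted = "Your receipt is ready\n\n\n\nHi Ana,\n\nThanks for your order.\n"

cleaned = squeeze_blank_lines(
  drop_preheader_line(extracted, "Your receipt is ready")
).strip

puts cleaned
```

Matching on the exact preheader string, rather than on position or a generic pattern, keeps the pass from touching unrelated lines. The trade-off in this sketch is that a body line identical to the preheader would also be removed.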

Sources:

- Premailer to_plain_text:
  https://github.com/premailer/premailer/blob/master/lib/premailer/premailer.rb
- MSO conditional comments syntax:
  https://www.litmus.com/blog/a-guide-to-rendering-differences-in-microsoft-outlook-clients

Constant Summary

MSO_CONDITIONAL_BLOCK =

Matches the `<!--[if mso]>...<![endif]-->` blocks that only Outlook reads. The `m` flag lets `.*?` span newlines (these blocks are usually multi-line). The non-greedy quantifier ensures we don't eat past the first matching `<![endif]-->`.

/<!--\[if mso\]>.*?<!\[endif\]-->/m
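To illustrate why the non-greedy quantifier matters, here is a small sketch that applies the same pattern to a made-up HTML fragment. The constant names `MSO_BLOCK` and `GREEDY_MSO_BLOCK` and the sample markup are assumptions for this example; only the first regex is the one documented above.

```ruby
# Same pattern as documented for MSO_CONDITIONAL_BLOCK.
MSO_BLOCK = /<!--\[if mso\]>.*?<!\[endif\]-->/m

# Greedy variant, shown only for contrast (not part of Goodmail).
GREEDY_MSO_BLOCK = /<!--\[if mso\]>.*<!\[endif\]-->/m

# Illustrative fragment: a VML button for Outlook, the regular <a> fallback,
# and a second Outlook-only block further down.
html = <<~HTML
  <!--[if mso]>
  <v:roundrect href="https://example.com"><center>Confirm</center></v:roundrect>
  <![endif]-->
  <a href="https://example.com">Confirm</a>
  <!--[if mso]><i>Outlook only</i><![endif]-->
HTML

stripped = html.gsub(MSO_BLOCK, "")
over_stripped = html.gsub(GREEDY_MSO_BLOCK, "")

puts stripped.scan("Confirm").length      # label occurrences left after stripping
puts over_stripped.include?("<a href")    # whether the real link survived
```

With the non-greedy pattern each conditional block is removed on its own, so the regular `<a>` between them survives and the label appears once. The greedy variant matches from the first `<!--[if mso]>` to the last `<![endif]-->` and takes the link with it.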

Instance Method Summary

Instance Method Details

#generate(raw_html, preheader: nil) ⇒ String

Generates the plaintext part for a multipart message.

Parameters:

  • raw_html (String)

    The full layout-rendered HTML body.

  • preheader (String, nil) (defaults to: nil)

    The preheader text (the value we wrote into the hidden inbox-preview span). Passed in so we can strip it specifically from plaintext rather than guessing at a generic heuristic.

Returns:

  • (String)

    The cleaned plaintext, ready for the text/plain part of the outgoing message.



# File 'lib/goodmail/plaintext.rb', line 65

def generate(raw_html, preheader: nil)
  premailer_html = strip_mso_only_markup(raw_html)
  premailer = Premailer.new(
    premailer_html,
    with_html_string: true,
    adapter: :nokogiri,
    preserve_styles: false,
    remove_ids: true,
    remove_comments: false,
    # Goodmail outputs UTF-8 end-to-end. Without this, Premailer's
    # libxml2 backend defaults to Latin-1 when no `<meta charset>`
    # tag is present in the source, double-encoding every accented
    # character ("Duración" → "DuraciÃ³n", "€" → "â¬"). The shipped
    # layout DOES include the meta tag, but custom `layout_path:`
    # callers might not — pinning here makes us robust to either.
    input_encoding: "UTF-8"
  )
  text = premailer.to_plain_text

  text = strip_preheader_line(text, preheader)
  text = strip_logo_alt_line(text)
  text = strip_company_name_alt_line(text)
  text = compact_blank_lines(text)
  text.strip
end
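The double-encoding failure that the `input_encoding: "UTF-8"` option guards against can be reproduced in plain Ruby, without Premailer. The sketch below only simulates the misinterpretation (UTF-8 bytes read as Latin-1); it is not a trace of what libxml2 does internally.

```ruby
utf8 = "Duración"

# Reinterpret the UTF-8 bytes as ISO-8859-1 (Latin-1), then transcode the
# result to UTF-8. The two bytes of "ó" (0xC3 0xB3) become two separate
# characters, which is the classic mojibake pattern.
garbled = utf8.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

# The damage is reversible as long as no bytes were dropped along the way.
restored = garbled.encode(Encoding::ISO_8859_1).force_encoding(Encoding::UTF_8)

puts garbled
puts restored
```

Pinning the input encoding avoids the problem at the source, so no after-the-fact repair like `restored` is needed.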