jekyll-email-munge

Five-layer email address munging for Jekyll. Drop in a Liquid tag and get a mailto: link that looks normal to humans and like ciphertext to scrapers.

{% munge_email "you@example.com" %}

renders as something like:

<a href="#" class="liame"
   data-liame="dDNiRy9OZGNYS3p1OWp0WnFM3VkIVJYNHNicw=="
   rel="nofollow">Reach<span class="liame-decoy" aria-hidden="true">no-reply@spam.invalid</span> out</a>
<noscript> (or use the address shown here:
  <svg ...><text>you@example.com</text></svg>)
</noscript>

Click the link in a real browser → it opens mailto:you@example.com. View the HTML source as a scraper → you see ciphertext, decoy noise, and an SVG.

Two namespaces, by design. The Liquid tag (munge_email) and the config key (email_munge) are clear, discoverable names — they only ever appear in your source code. The rendered HTML uses liame (email reversed) for every class and attribute a scraper might regex for. Source code stays readable; rendered output stays scraper-hostile.


What is "address munging"?

Address munging is the canonical term — going back to early Usenet — for defending an email address on a public surface against bulk harvesters, by any combination of encoding, scrambling, encryption, decoy, or visual substitution. It deliberately covers the whole spectrum from name AT domain DOT com to "encrypted blob revealed on click," and that's exactly the spectrum this gem operates on.

We use the term over alternatives like obfuscate (undersells the encryption) or encrypt (oversells it — the key is public on purpose) because munging is the only word that's both technically and historically correct.


Why

Email addresses on public sites are harvested by bots within hours of being indexed. Most "munging" approaches (HTML entities, percent-encoding, JavaScript concatenation, image rendering) defeat some harvesters but not others. This gem stacks five independently-effective techniques from Spencer Mortensen's email obfuscation study — each blocked 100% of tested harvesters on its own:

  1. AES-128-GCM encryption. The HTML carries ciphertext only. Even if a scraper extracts the data-liame attribute, recovering the address means finding the key and running AES-GCM decryption, which in practice takes a JavaScript runtime or purpose-built code.
  2. JS conversion. The decryption logic runs in the browser via Web Crypto. Scrapers without a JS engine see only the encrypted blob.
  3. User-interaction trigger. Decryption only fires on click. Headless bots that do execute JS but don't simulate user input never see the address.
  4. CSS-hidden decoy. A fake address (no-reply@spam.invalid by default) sits inside the link with display: none, visible to scrapers that strip tags but hidden from sighted users.
  5. SVG <noscript> fallback. For visitors without JavaScript, the address appears as text inside an inline <svg><text> element. Most regex-based harvesters don't OCR or parse SVG.

The goal is defeating scrapers, not hiding the address from a determined human. The encryption key is published in plain sight (it has to be — the browser needs it to decrypt). The win is that automated harvesters give up long before they reach a working address.


Installation

Add to your site's Gemfile:

group :jekyll_plugins do
  gem "jekyll-email-munge"
end

…then bundle install.

If your site doesn't use Bundler, install the gem (gem install jekyll-email-munge) and add it to _config.yml:

plugins:
  - jekyll-email-munge

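Hosted on GitHub Pages? Its built-in Jekyll build only runs an allowlisted set of plugins, so a custom plugin like this one won't load there. Build the site in CI and publish the output to gh-pages, or deploy via Cloudflare Pages or Netlify instead.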

Configuration

Add an email_munge block to _config.yml:

email_munge:
  # 32 hex chars = 16 bytes = AES-128 key. Generate one with:
  #   ruby -ropenssl -e 'puts OpenSSL::Random.random_bytes(16).unpack1("H*")'
  key_hex: "84afaaa6886a7b0d195454cc559795cb"

  # Optional. Color of the noscript SVG fallback text.
  svg_color: "#9c3f1d"

  # Optional. The fake address shown inside the (CSS-hidden) decoy span.
  decoy: "no-reply@spam.invalid"
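
A malformed key_hex is an easy setup mistake to make. As a minimal sketch (valid_key_hex? is a hypothetical helper, not part of the gem), a check like this confirms the value is exactly 32 hex characters before you build:

```ruby
require "securerandom"

# Hypothetical helper, not part of the gem: true when the value is
# exactly 32 hex characters (16 bytes, the AES-128 key size).
def valid_key_hex?(value)
  value.is_a?(String) && value.match?(/\A\h{32}\z/)
end

# SecureRandom.hex(16) is another way to generate a fresh 16-byte key.
new_key = SecureRandom.hex(16)

puts valid_key_hex?("84afaaa6886a7b0d195454cc559795cb") # true
puts valid_key_hex?("84afaaa6")                         # false (too short)
puts valid_key_hex?(new_key)                            # true
```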

Then add the decoder script once per page — typically just before </body> in your default layout:

<!-- _layouts/default.html -->
{% munge_email_script %}
</body>

Finally, add a tiny CSS rule to hide the decoy span and align the SVG fallback. Drop these into your stylesheet:

.liame-decoy { display: none; }
.liame-svg   { vertical-align: middle; pointer-events: none; }

That's the entire setup.


Usage

Drop the tag wherever you'd normally write a <a href="mailto:…">:

{% munge_email "support@yourdomain.com" %}

Default visible link text is Reach out. To change it, pass a second argument with | marking where the decoy span gets injected:

{% munge_email "support@yourdomain.com" "Get|in touch" %}
{% munge_email "press@yourdomain.com" "Contact|the press team" %}
{% munge_email "hello@yourdomain.com" "Say|hello" %}

Renders as:

Get<decoy> in touch
Contact<decoy> the press team
Say<decoy> hello

Without the |, the entire string is used as visible text and an empty decoy follows — works fine, but splitting the text gives the decoy more cover.

Why split the text?

The decoy span sits inside the visible link and is hidden with display: none. A scraper that reads the rendered DOM (or strips CSS) sees:

Get no-reply@spam.invalid in touch

Splitting the visible text means the decoy is interleaved with real text rather than appended at the end — slightly harder for a scraper to recognize and strip.
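
To make the effect concrete, here is a rough Ruby sketch of how a label such as "Get|in touch" could be split, and what a tag-stripping scraper would extract from the result. The helper names (split_label, render_label) are hypothetical, and the exact whitespace the gem emits around the span is an assumption based on the sample output at the top of this README.

```ruby
DECOY = "no-reply@spam.invalid"

# Split "Get|in touch" into the text before and after the decoy.
# Without a "|", the whole string goes before the decoy.
def split_label(label)
  before, after = label.split("|", 2)
  [before.to_s, after.to_s]
end

# Assemble the visible text with the decoy span injected at the split point.
def render_label(label, decoy = DECOY)
  before, after = split_label(label)
  span = %(<span class="liame-decoy" aria-hidden="true">#{decoy}</span>)
  after.empty? ? "#{before}#{span}" : "#{before}#{span} #{after}"
end

html = render_label("Get|in touch")
puts html
# What a naive scraper is left with after stripping tags and ignoring CSS:
puts html.gsub(/<[^>]+>/, "")
```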


How it works (under the hood)

1. Build time (Ruby)

When Jekyll renders {% munge_email "user@example.com" %}:

  1. The plugin reads email_munge.key_hex from _config.yml.
  2. It derives a deterministic 12-byte (96-bit) IV from the email + key (SHA-256("liame-iv:<key_hex>:<email>")[0..11]).
  3. It encrypts the email with AES-128-GCM, packs iv | ciphertext | auth_tag, and base64-encodes the result.
  4. It emits the <a> element with the base64 payload in data-liame, the decoy <span>, and an inline <svg> fallback.

The deterministic IV means rebuilds produce identical output, so your git diffs stay quiet and your CDN cache isn't invalidated on every build. Reusing an IV with the same plaintext under the same key is harmless under AES-GCM: it reproduces the same ciphertext and reveals only that two payloads hold the same address. Reusing an IV with different plaintexts is what's catastrophic, and that doesn't happen here because each email gets its own derived IV.
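
The build-time steps can be sketched with Ruby's standard OpenSSL bindings. This is an illustration under stated assumptions, not the gem's actual source: derive_iv and encrypt_address are hypothetical names, the IV is assumed to be the first 12 bytes of the raw SHA-256 digest, and no additional authenticated data is assumed.

```ruby
require "openssl"
require "digest"

# Assumption: the IV is the first 12 bytes of the raw SHA-256 digest.
def derive_iv(key_hex, email)
  Digest::SHA256.digest("liame-iv:#{key_hex}:#{email}")[0, 12]
end

# Returns base64(iv | ciphertext | auth_tag) for one address.
def encrypt_address(key_hex, email)
  iv = derive_iv(key_hex, email)
  cipher = OpenSSL::Cipher.new("aes-128-gcm")
  cipher.encrypt
  cipher.key = [key_hex].pack("H*") # 32 hex chars -> 16 raw bytes
  cipher.iv = iv
  cipher.auth_data = ""             # no additional authenticated data
  ciphertext = cipher.update(email) + cipher.final
  [iv + ciphertext + cipher.auth_tag].pack("m0") # strict base64, no newlines
end

key_hex = "84afaaa6886a7b0d195454cc559795cb"
first  = encrypt_address(key_hex, "you@example.com")
second = encrypt_address(key_hex, "you@example.com")
puts first
puts first == second # true: same key and address give the same payload
```

Because the IV is derived rather than random, running this twice prints the same payload, which is the property that keeps rebuilds byte-identical.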

2. Page load (JavaScript)

{% munge_email_script %} emits an inline <script> that:

  1. Reads KEY_HEX (baked into the script at build time).
  2. Attaches a click listener to every [data-liame] element.
  3. On click: base64-decodes, splits into IV / ciphertext / tag, calls crypto.subtle.decrypt (Web Crypto API), and navigates to mailto:<decrypted>.

The script is ~700 bytes minified and inlined — no extra request, no defer race. It only does anything if the user actually clicks a munged link.
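
The shipped decoder is JavaScript, but the same steps are easy to mirror in Ruby, for example to check payloads in a test suite. The sketch below is an illustration rather than the gem's code: decode_payload is a hypothetical name, and the payload layout (12-byte IV, ciphertext, 16-byte tag) is assumed from the build-time description. One difference worth knowing: Web Crypto's decrypt expects the tag appended to the ciphertext, whereas Ruby's OpenSSL takes it as a separate value.

```ruby
require "openssl"

# Hypothetical helper mirroring the browser-side steps:
# base64-decode, split iv | ciphertext | tag, then AES-128-GCM decrypt.
def decode_payload(key_hex, payload_b64)
  raw = payload_b64.unpack1("m0")
  iv, body, tag = raw[0, 12], raw[12...-16], raw[-16, 16]
  decipher = OpenSSL::Cipher.new("aes-128-gcm")
  decipher.decrypt
  decipher.key = [key_hex].pack("H*")
  decipher.iv = iv
  decipher.auth_tag = tag
  decipher.auth_data = ""
  address = decipher.update(body) + decipher.final # raises CipherError if tampered
  "mailto:#{address.force_encoding("UTF-8")}"
end

# Build a throwaway payload so the example runs on its own.
# (Any 12-byte IV works for this demo; the gem derives its IV instead.)
key_hex = "84afaaa6886a7b0d195454cc559795cb"
iv = OpenSSL::Random.random_bytes(12)
enc = OpenSSL::Cipher.new("aes-128-gcm")
enc.encrypt
enc.key = [key_hex].pack("H*")
enc.iv = iv
enc.auth_data = ""
sealed = enc.update("you@example.com") + enc.final
payload = [iv + sealed + enc.auth_tag].pack("m0")

puts decode_payload(key_hex, payload) # mailto:you@example.com
```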

3. The liame naming convention (in rendered output)

This is the central trick. Anything a scraper can see in the rendered HTML uses liame (email reversed) instead of email / mail / mailto / contact, so a regex grep for those tokens finds nothing in the output. The plugin guarantees this by construction:

| Element | Class / attribute name |
| --- | --- |
| Anchor | class="liame" |
| Encrypted payload | data-liame="..." |
| Hidden decoy span | class="liame-decoy" |
| Fallback SVG | class="liame-svg" |
| aria-label on the SVG | contact |

The visible link text in the default output (Reach out) deliberately avoids the words "email" or "mail" — keep that property when you customize the text.
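
One way to keep that property is a small check over your custom link text. The sketch below is hypothetical (nothing here says the gem ships such a check), and the pattern is an assumption drawn from the words named above.

```ruby
# Assumed pattern: matches "mail", "email", "e-mail", and "mailto".
RISKY_TOKENS = /mail/i

# Hypothetical check: true when the link text avoids the risky tokens.
def scraper_quiet?(link_text)
  !link_text.match?(RISKY_TOKENS)
end

["Reach|out", "Get|in touch", "Email|us"].each do |text|
  status = scraper_quiet?(text) ? "ok" : "contains a risky token"
  puts "#{text.inspect}: #{status}"
end
```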

Source code is exempt from this convention because scrapers don't read your Liquid templates or _config.yml. The tag (munge_email) and config key (email_munge) are written for human readability:

| Source-code thing | Reads as | Visible to scrapers? |
| --- | --- | --- |
| Liquid tag | {% munge_email %} | No (build-time only) |
| Config key | email_munge: | No |
| Decoder script tag | {% munge_email_script %} | No |

4. Where the key lives

There is one key, set in _config.yml. The plugin uses it to encrypt at build time, and {% munge_email_script %} interpolates it into the inline JS so the browser can decrypt. The key is therefore visible to anyone viewing the page source — that's intentional. The encryption is a scraper-defeating layer, not a secrecy mechanism.

If you want to rotate the key, generate a new one and update key_hex in _config.yml. All existing payloads automatically re-encrypt on the next build because the plugin re-runs.


Customization

Custom link text

Already covered under Usage:

{% munge_email "user@example.com" "Custom|visible text" %}

Different decoy text

In _config.yml:

email_munge:
  decoy: "abuse@spam-trap.invalid"

Or pick something amusing — it's only seen by scrapers.

SVG fallback color

email_munge:
  svg_color: "#3dff9a"   # match your accent

Multiple sites, one key

If you run several sites and want a single key for all of them, use the same key_hex value in each site's _config.yml. Encrypted payloads are interchangeable; the same munge_email_script decoder works on all of them.

If you rotate the key on one site, rotate it on all of them to keep payloads and the decoder interchangeable. Ciphertext produced under the old key won't decrypt with the new one, and the derived IVs change too because they depend on the key.


Browser support

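The decoder uses Web Crypto (crypto.subtle), which has been available in all major browsers since 2016. Browsers only expose crypto.subtle in secure contexts (HTTPS or localhost), so serve the site over HTTPS. On legacy browsers without Web Crypto, clicking the munged link does nothing. The inline SVG fallback in the <noscript> block is shown only when JavaScript is disabled, so it does not cover a browser that runs JavaScript but lacks Web Crypto; in practice, that combination is rare.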

If you need to support pre-2016 browsers, this gem isn't the right tool.


Security notes

This gem provides scraper resistance, not encryption-grade secrecy. In particular:

  • The key is published in JS. Anyone who reads the rendered HTML can decrypt every payload on the page. That's intentional and required for the design.
  • AES-128 is used because the goal is to require any AES decryption, not because the threat model needs 256-bit keys. If you ever need to actually protect the addresses, this gem is the wrong tool — don't put the addresses on the public site at all.
  • The deterministic IV is safe specifically because each email gets its own derived IV. Don't modify the IV derivation to use a constant — that would be a real GCM misuse.
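
The difference between a derived IV and a constant one can be seen directly. In this sketch the helper names are illustrative, and the derivation string follows the build-time description, assuming the raw digest is truncated to 12 bytes. Two addresses get different IVs, while a constant IV makes both messages share a keystream, so XORing the two ciphertexts yields the XOR of the two plaintexts.

```ruby
require "openssl"
require "digest"

KEY_HEX = "84afaaa6886a7b0d195454cc559795cb"

# Assumed derivation: first 12 bytes of the raw SHA-256 digest.
def derive_iv(email)
  Digest::SHA256.digest("liame-iv:#{KEY_HEX}:#{email}")[0, 12]
end

# AES-128-GCM ciphertext only (the tag is not needed for this demo).
def gcm_ciphertext(iv, plaintext)
  c = OpenSSL::Cipher.new("aes-128-gcm")
  c.encrypt
  c.key = [KEY_HEX].pack("H*")
  c.iv = iv
  c.auth_data = ""
  c.update(plaintext) + c.final
end

# Byte-wise XOR of two equal-length strings, as an array of integers.
def xor(a, b)
  a.bytes.zip(b.bytes).map { |x, y| x ^ y }
end

a, b = "ann@example.com", "bob@example.com"

# Derived IVs: one per address, so no IV is shared between plaintexts.
puts derive_iv(a) != derive_iv(b)

# Constant IV (the misuse): the XOR of the ciphertexts equals the XOR
# of the plaintexts, which leaks the relationship between the addresses.
fixed_iv = "\x00".b * 12
leak = xor(gcm_ciphertext(fixed_iv, a), gcm_ciphertext(fixed_iv, b))
puts leak == xor(a, b)
```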

Comparison with alternatives

| Approach | Layers | Maintenance / notes |
| --- | --- | --- |
| Plain mailto: link | 0 | Easiest, harvested instantly |
| HTML entities / hex encoding | 1 | Defeats only naïve regex |
| jekyll-email-protect (percent-encoded mailto) | 1 | One filter, low ceremony |
| Concatenated JS string | 2 | Harvested by JS-aware bots |
| jekyll-email-munge (this gem) | 5 | One Liquid tag, build-time encryption |
| Pure image (e.g., a screenshot of the email) | 1 | Defeated by OCR, terrible UX |

Development

git clone https://github.com/framallo/jekyll-email-munge.git
cd jekyll-email-munge
bundle install
rake build         # produces pkg/jekyll-email-munge-X.Y.Z.gem

To install your local checkout into a Jekyll site for testing:

# Gemfile in the consumer site
gem "jekyll-email-munge", path: "../jekyll-email-munge"

Releasing

  1. Bump Jekyll::EmailMunge::VERSION in lib/jekyll-email-munge/version.rb.
  2. Add a section to CHANGELOG.md.
  3. Commit and tag: git commit -am "release vX.Y.Z" && git tag vX.Y.Z
  4. rake release (Bundler task — builds, tags, pushes to RubyGems).

You'll need a RubyGems.org account with MFA enabled before pushing.


License

MIT © Federico Ramallo