email_data

Tests Code Climate Gem version Gem downloads NPM version NPM downloads

This project is a compilation of datasets related to emails.

  • Disposable emails
  • Disposable domains
  • Free email services
  • Roles (e.g. info, spam, etc)

The data is compiled from several different sources, so thank you all for making this data available.

Ruby

Installation

gem install email_data

Or add the following line to your project's Gemfile:

gem "email_data"

Usage

require "email_data"

# <Pathname /> instance pointing to the data directory.
EmailData.data_dir

# List of disposable domains. Punycode is expanded into ASCII domains.
EmailData.disposable_domains
EmailData.disposable_domains_with_mx
EmailData.disposable_domains_without_mx

# List of disposable emails. Some services use free email like Gmail to create
# disposable emails.
EmailData.disposable_emails

# List of free email services.
EmailData.free_email_domains

# List of roles. Can be used to filter out emails like info@ or all@.
EmailData.roles

# List of private relays like Apple's Hide My Email.
EmailData.private_relays

# List of country tlds.
EmailData.country_tlds

# List of official tlds.
EmailData.tlds

# List of second-level domains.
EmailData.slds

# A list of DNSBL providers.
EmailData.dnsbls

Data sources

By default, Ruby will load data from filesystem. You may want to load this data from the database instead. email-data has support for ActiveRecord out of the box. To use the ActiveRecord adapter, you must load email_data/source/active_record.rb. You can easily do it so with Bundler's require key.

gem "email_data", require: "email_data/source/active_record"

Then, you need to assign the new data source.

EmailData.source = EmailData::Source::ActiveRecord

If you need to configure a different database connection than the one defined by ActiveRecord::Base, use EmailData::Source::ActiveRecord::ApplicationRecord for that.

Creating the tables

To create the tables, use the migration code below (tweak it accordingly if you use something different than PostgreSQL, or don't want to use citext).

class SetupEmailData < ActiveRecord::Migration[6.1]
  def change
    enable_extension "citext"

    create_table :tlds do |t|
      t.citext :name, null: false
    end

    add_index :tlds, :name, unique: true

    create_table :slds do |t|
      t.citext :name, null: false
    end

    add_index :slds, :name, unique: true

    create_table :country_tlds do |t|
      t.citext :name, null: false
    end

    add_index :country_tlds, :name, unique: true

    create_table :disposable_emails do |t|
      t.citext :name, null: false
    end

    add_index :disposable_emails, :name, unique: true

    create_table :disposable_domains do |t|
      t.citext :name, null: false
    end

    add_index :disposable_domains, :name, unique: true

    create_table :disposable_domains_with_mx do |t|
      t.citext :name, null: false
    end

    add_index :disposable_domains_with_mx, :name, unique: true

    create_table :disposable_domains_without_mx do |t|
      t.citext :name, null: false
    end

    add_index :disposable_domains_without_mx, :name, unique: true

    create_table :free_email_domains do |t|
      t.citext :name, null: false
    end

    add_index :free_email_domains, :name, unique: true

    create_table :roles do |t|
      t.citext :name, null: false
    end

    add_index :roles, :name, unique: true

    create_table :private_relays do |t|
      t.citext :name, null: false
    end

    add_index :private_relays, :name, unique: true

    create_table :dnsbls do |t|
      t.citext :name, null: false
    end

    add_index :dnsbls, :name, unique: true
  end
end
Loading the data

With PostgreSQL, you load the data using the COPY command. First, you'll need to discover where your gems are being installed. Use gem list for that.

$ gem list email_data -d

*** LOCAL GEMS ***

email_data (1601479967, 1601260789)
    Author: Nando Vieira
    Homepage: https://github.com/fnando/email_data
    License: MIT
    Installed at (1601479967): /usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0
    This project is a compilation of datasets related to emails.
    Includes disposable emails, disposable domains, and free email
    services.

The you can load each dataset using COPY:

COPY tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/tlds.txt';
COPY slds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/slds.txt';
COPY country_tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/country_tlds.txt';
COPY disposable_emails (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_emails.txt';
COPY disposable_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains.txt';
COPY disposable_domains_with_mx (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains_with_mx.txt';
COPY disposable_domains_without_mx (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains_without_mx.txt';
COPY free_email_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/free_email_domains.txt';
COPY roles (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/roles.txt';
COPY private_relays (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/private_relays.txt';
COPY dnsbls (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/dnsbls.txt';

Alternatively, you could create a migration that executes that same command; given that you'd be running Ruby code, you can replace the steps to find the gem path with EmailData.data_dir.

class LoadEmailData < ActiveRecord::Migration[6.1]
  def change
    copy = lambda do |table_name|
      connection = ActiveRecord::Base.connection
      data_path = EmailData.data_dir

      connection.execute <<~PG
        COPY #{table_name} (name)
        FROM '#{data_path.join(table_name)}.txt'
        (FORMAT CSV)
      PG
    end

    copy.call(:tlds)
    copy.call(:slds)
    copy.call(:country_tlds)
    copy.call(:disposable_emails)
    copy.call(:disposable_domains)
    copy.call(:disposable_domains_with_mx)
    copy.call(:disposable_domains_without_mx)
    copy.call(:free_email_domains)
    copy.call(:roles)
    copy.call(:private_relays)
    copy.call(:dnsbls)
  end
end

Node.js

Installation

$ yarn add @fnando/email_data

or

$ npm install @fnando/email_data

Usage

const disposableEmails = require("@fnando/email_data/data/json/disposable_emails.json");
const disposableDomains = require("@fnando/email_data/data/json/disposable_domains.json");
const disposableDomainsWithMx = require("@fnando/email_data/data/json/disposable_domains_with_mx.json");
const disposableDomainsWithoutMx = require("@fnando/email_data/data/json/disposable_domains_without_mx.json");
const freeEmailDomains = require("@fnando/email_data/data/json/free_email_domains.json");
const roles = require("@fnando/email_data/data/json/roles.json");
const privateRelays = require("@fnando/email_data/data/json/private_relays.json");
const tlds = require("@fnando/email_data/data/json/tlds.json");
const slds = require("@fnando/email_data/data/json/slds.json");
const cctlds = require("@fnando/email_data/data/json/country_tlds.json");
const dnsbls = require("@fnando/email_data/data/json/dnsbls.json");

Dataset

The dataset is updated automatically. If you have any manual entries you would like to add, please make a pull request against the files data/manual/*.txt.

  • data/manual/disposable_domains.txt: only domains from disposable servers must go here.
  • data/manual/disposable_emails.txt: only normalized email addresses that use free email services must go here. E.g. d.i.s.p.o.s.a.b.l.e+1234@gmail.com must be added as disposable@gmail.com.
  • data/manual/free_email_domains.txt: only free email services must go here.
  • data/manual/roles.txt: list of role-based user names like info or no-reply.
  • data/manual/private_relays.txt: list of private relay services, like Apple's Hide My Email.

Maintainer

Contributors

Contributing

For more details about how to contribute, please read https://github.com/fnando/email_data/blob/main/CONTRIBUTING.md.

License

The gem is available as open source under the terms of the MIT License. A copy of the license can be found at https://github.com/fnando/email_data/blob/main/LICENSE.md.

Code of Conduct

Everyone interacting in the email_data project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.