email_data
This project is a compilation of datasets related to emails.
- Disposable emails
- Disposable domains
- Free email services
- Roles (e.g. info, spam, etc)
The data is compiled from several different sources, so thank you all for making this data available.
Ruby
Installation
gem install email_data
Or add the following line to your project's Gemfile:
gem "email_data"
Usage
require "email_data"
# <Pathname /> instance pointing to the data directory.
EmailData.data_dir
# List of disposable domains. Punycode is expanded into ASCII domains.
EmailData.disposable_domains
EmailData.disposable_domains_with_mx
EmailData.disposable_domains_without_mx
# List of disposable emails. Some services use free email like Gmail to create
# disposable emails.
EmailData.disposable_emails
# List of free email services.
EmailData.free_email_domains
# List of roles. Can be used to filter out emails like info@ or all@.
EmailData.roles
# List of private relays like Apple's Hide My Email.
EmailData.private_relays
# List of country tlds.
EmailData.country_tlds
# List of official tlds.
EmailData.tlds
# List of second-level domains.
EmailData.slds
# A list of DNSBL providers.
EmailData.dnsbls
Data sources
By default, Ruby will load data from filesystem. You may want to load this data
from the database instead. email-data
has support for ActiveRecord out of the
box. To use the ActiveRecord adapter, you must load
email_data/source/active_record.rb
. You can easily do it so with Bundler's
require
key.
gem "email_data", require: "email_data/source/active_record"
Then, you need to assign the new data source.
EmailData.source = EmailData::Source::ActiveRecord
If you need to configure a different database connection than the one defined by
ActiveRecord::Base
, use EmailData::Source::ActiveRecord::ApplicationRecord
for that.
Creating the tables
To create the tables, use the migration code below (tweak it accordingly if you
use something different than PostgreSQL, or don't want to use citext
).
class SetupEmailData < ActiveRecord::Migration[6.1]
def change
enable_extension "citext"
create_table :tlds do |t|
t.citext :name, null: false
end
add_index :tlds, :name, unique: true
create_table :slds do |t|
t.citext :name, null: false
end
add_index :slds, :name, unique: true
create_table :country_tlds do |t|
t.citext :name, null: false
end
add_index :country_tlds, :name, unique: true
create_table :disposable_emails do |t|
t.citext :name, null: false
end
add_index :disposable_emails, :name, unique: true
create_table :disposable_domains do |t|
t.citext :name, null: false
end
add_index :disposable_domains, :name, unique: true
create_table :disposable_domains_with_mx do |t|
t.citext :name, null: false
end
add_index :disposable_domains_with_mx, :name, unique: true
create_table :disposable_domains_without_mx do |t|
t.citext :name, null: false
end
add_index :disposable_domains_without_mx, :name, unique: true
create_table :free_email_domains do |t|
t.citext :name, null: false
end
add_index :free_email_domains, :name, unique: true
create_table :roles do |t|
t.citext :name, null: false
end
add_index :roles, :name, unique: true
create_table :private_relays do |t|
t.citext :name, null: false
end
add_index :private_relays, :name, unique: true
create_table :dnsbls do |t|
t.citext :name, null: false
end
add_index :dnsbls, :name, unique: true
end
end
Loading the data
With PostgreSQL, you load the data using the COPY
command. First, you'll need
to discover where your gems are being installed. Use gem list
for that.
$ gem list email_data -d
*** LOCAL GEMS ***
email_data (1601479967, 1601260789)
Author: Nando Vieira
Homepage: https://github.com/fnando/email_data
License: MIT
Installed at (1601479967): /usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0
This project is a compilation of datasets related to emails.
Includes disposable emails, disposable domains, and free email
services.
The you can load each dataset using COPY
:
COPY tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/tlds.txt';
COPY slds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/slds.txt';
COPY country_tlds (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/country_tlds.txt';
COPY disposable_emails (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_emails.txt';
COPY disposable_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains.txt';
COPY disposable_domains_with_mx (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains_with_mx.txt';
COPY disposable_domains_without_mx (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/disposable_domains_without_mx.txt';
COPY free_email_domains (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/free_email_domains.txt';
COPY roles (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/roles.txt';
COPY private_relays (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/private_relays.txt';
COPY dnsbls (name) FROM '/usr/local/ruby/2.7.1/lib/ruby/gems/2.7.0/gems/email_data-1601479967/data/dnsbls.txt';
Alternatively, you could create a migration that executes that same command;
given that you'd be running Ruby code, you can replace the steps to find the gem
path with EmailData.data_dir
.
class LoadEmailData < ActiveRecord::Migration[6.1]
def change
copy = lambda do |table_name|
connection = ActiveRecord::Base.connection
data_path = EmailData.data_dir
connection.execute <<~PG
COPY #{table_name} (name)
FROM '#{data_path.join(table_name)}.txt'
(FORMAT CSV)
PG
end
copy.call(:tlds)
copy.call(:slds)
copy.call(:country_tlds)
copy.call(:disposable_emails)
copy.call(:disposable_domains)
copy.call(:disposable_domains_with_mx)
copy.call(:disposable_domains_without_mx)
copy.call(:free_email_domains)
copy.call(:roles)
copy.call(:private_relays)
copy.call(:dnsbls)
end
end
Node.js
Installation
$ yarn add @fnando/email_data
or
$ npm install @fnando/email_data
Usage
const disposableEmails = require("@fnando/email_data/data/json/disposable_emails.json");
const disposableDomains = require("@fnando/email_data/data/json/disposable_domains.json");
const disposableDomainsWithMx = require("@fnando/email_data/data/json/disposable_domains_with_mx.json");
const disposableDomainsWithoutMx = require("@fnando/email_data/data/json/disposable_domains_without_mx.json");
const freeEmailDomains = require("@fnando/email_data/data/json/free_email_domains.json");
const roles = require("@fnando/email_data/data/json/roles.json");
const privateRelays = require("@fnando/email_data/data/json/private_relays.json");
const tlds = require("@fnando/email_data/data/json/tlds.json");
const slds = require("@fnando/email_data/data/json/slds.json");
const cctlds = require("@fnando/email_data/data/json/country_tlds.json");
const dnsbls = require("@fnando/email_data/data/json/dnsbls.json");
Dataset
The dataset is updated automatically. If you have any manual entries you would
like to add, please make a pull request against the files data/manual/*.txt
.
data/manual/disposable_domains.txt
: only domains from disposable servers must go here.data/manual/disposable_emails.txt
: only normalized email addresses that use free email services must go here. E.g.d.i.s.p.o.s.a.b.l.e+1234@gmail.com
must be added asdisposable@gmail.com
.data/manual/free_email_domains.txt
: only free email services must go here.data/manual/roles.txt
: list of role-based user names likeinfo
orno-reply
.data/manual/private_relays.txt
: list of private relay services, like Apple's Hide My Email.
Maintainer
Contributors
Contributing
For more details about how to contribute, please read https://github.com/fnando/email_data/blob/main/CONTRIBUTING.md.
License
The gem is available as open source under the terms of the MIT License. A copy of the license can be found at https://github.com/fnando/email_data/blob/main/LICENSE.md.
Code of Conduct
Everyone interacting in the email_data project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.