philiprehberger-mask

Tests Gem Version Last updated

Data masking library with auto-detect PII redaction for strings and nested structures

Requirements

  • Ruby >= 3.1

Installation

Add to your Gemfile:

gem "philiprehberger-mask"

Or install directly:

gem install philiprehberger-mask

Usage

require "philiprehberger/mask"

Philiprehberger::Mask.scrub("Contact us at user@example.com")
# => "Contact us at u***@e******.com"

Hash Scrubbing

Philiprehberger::Mask.scrub_hash({
  name: "Alice",
  email: "alice@example.com",
  password: "secret123",
  nested: { ssn: "123-45-6789" }
})
# => { name: "Alice", email: "a***@e******.com", password: "[FILTERED]", nested: { ssn: "***-**-6789" } }

Partial Masking

Show partial information like last 4 digits or first initial:

Philiprehberger::Mask.scrub("Card: 4111-1111-1111-1111", mode: :partial)
# => "Card: ****1111"

Philiprehberger::Mask.scrub("Email: user@example.com", mode: :partial)
# => "Email: u***@example.com"

Format-Preserving Masking

Replace characters while keeping separators and format intact:

Philiprehberger::Mask.scrub("SSN: 123-45-6789", mode: :format_preserving)
# => "SSN: 000-00-0000"

Philiprehberger::Mask.scrub("Email: user@example.com", mode: :format_preserving)
# => "Email: XXXX@XXXXXXX.XXX"

Tokenization

Replace PII with reversible tokens:

result = Philiprehberger::Mask.tokenize("Contact user@example.com")
# => { masked: "Contact <TOKEN_EMAIL_1>", tokens: { "<TOKEN_EMAIL_1>" => "user@example.com" } }

Philiprehberger::Mask.detokenize(result[:masked], tokens: result[:tokens])
# => "Contact user@example.com"

Audit Trail

Track what was masked and where:

result = Philiprehberger::Mask.scrub_with_audit("SSN: 123-45-6789")
# => { result: "SSN: ***-**-6789", audit: [{ detector: :ssn, original: "123-45-6789", masked: "***-**-6789", position: 5 }] }

Hash Scrubbing with Modes

Apply partial or format-preserving masking to nested structures:

Philiprehberger::Mask.scrub_hash({ card: '4111-1111-1111-1111' }, mode: :partial)
# => { card: "****1111" }

Philiprehberger::Mask.scrub_hash({ ssn: '123-45-6789' }, mode: :format_preserving)
# => { ssn: "000-00-0000" }

Hash Audit Trail

Track what was masked in structured data with path information:

result = Philiprehberger::Mask.scrub_hash_with_audit({ user: { email: 'alice@example.com', password: 'secret' } })
# => { result: { user: { email: "a***@e******.com", password: "[FILTERED]" } },
#      audit: [{ detector: :email, path: [:user, :email], ... }, { detector: :sensitive_key, key: "password", path: [:user, :password], ... }] }

Batch Processing

Process multiple strings efficiently with shared compiled patterns:

results = Philiprehberger::Mask.batch_scrub([
  "Contact user@example.com",
  "SSN: 123-45-6789",
  "Call 555-123-4567"
])
# => ["Contact u***@e******.com", "SSN: ***-**-6789", "Call ***-***-4567"]

Detector Priority

Control which detector wins when patterns overlap:

Philiprehberger::Mask.configure_priority(%i[ssn phone email credit_card ip_address jwt passport iban drivers_license mrn])
# SSN detector now evaluates first

Locale Patterns

Register locale-specific detection patterns:

Philiprehberger::Mask.add_locale(:de, { phone: /\b0\d{3}[- ]?\d{7,8}\b/ })

# Use locale patterns with scrub_io or batch_scrub
Philiprehberger::Mask.batch_scrub(["Call 0301-1234567"], locale: :de)

IO Streaming

Scrub IO sources line by line:

io = StringIO.new("user@example.com\nSSN: 123-45-6789\n")
results = Philiprehberger::Mask.scrub_io(io)
# => ["u***@e******.com\n", "SSN: ***-**-6789\n"]

File.open("sensitive.log") do |f|
  scrubbed_lines = Philiprehberger::Mask.scrub_io(f, mode: :partial)
end

Custom Patterns

Philiprehberger::Mask.configure do |c|
  c.add_pattern(:order_id, /ORD-\d{10}/, replacement: "ORD-XXXXXXXXXX")
end

Custom Sensitive Keys

Register additional key names to redact in hash scrubbing:

Philiprehberger::Mask.configure do |c|
  c.add_sensitive_key(:ssn_field)
  c.add_sensitive_key(:credit_card_number)
end

Custom Detector DSL

Register detectors with block-based replacers:

Philiprehberger::Mask.configure do |c|
  c.detect(:employee_id, /EMP\d{6}/) { |match| "[EMPLOYEE_ID]" }
end

Built-in Detectors

Detector Pattern Masking
Email user@example.com u***@e******.com
Credit Card 4111-1111-1111-1111 ****-****-****-1111
SSN 123-45-6789 ***-**-6789
Phone 555-123-4567 ***-***-4567
IP Address 192.168.1.1 ***.***.***.***
JWT eyJ... [REDACTED_JWT]
Passport C12345678 [REDACTED_PASSPORT]
IBAN GB29NWBK60161331926819 [REDACTED_IBAN]
Driver's License D1234567 [REDACTED_DL]
MRN MRN12345678 [REDACTED_MRN]

API

Method Description
Mask.scrub(string, mode: :full) Detect and redact PII in a string
Mask.scrub_hash(hash, keys: nil, mode: :full) Deep-walk and redact hash values
Mask.scrub_hash_with_audit(hash, keys: nil) Deep-walk, redact, and return audit trail with paths
Mask.scrub_with_audit(string) Scrub and return audit trail of detections
Mask.batch_scrub(strings, **opts) Process array of strings with shared compiled patterns
Mask.scrub_io(io, **opts) Read IO line by line and scrub each line
Mask.tokenize(string) Replace PII with reversible tokens
Mask.detokenize(string, tokens:) Restore original values from tokens
Mask.configure_priority(detector_order) Set detector evaluation order
Mask.add_locale(locale, patterns) Register locale-specific detection patterns
`Mask.configure { \ c\
Mask.reset_configuration! Reset to default patterns and sensitive keys

Development

bundle install
bundle exec rspec
bundle exec rubocop

Support

If you find this project useful:

Star the repo

🐛 Report issues

💡 Suggest features

❤️ Sponsor development

🌐 All Open Source Projects

💻 GitHub Profile

🔗 LinkedIn Profile

License

MIT