Module: Mailmate::DuplicateScanner

Defined in:
lib/mailmate/duplicate_scanner.rb

Overview

Detects duplicate Message-ID copies across MailMate’s tree. The same RFC 5322 Message-ID can appear in multiple `.eml` files: Gmail’s label-creates-a-copy semantics produce this for any message that hits a labeled mailbox, and self-to-self messages can end up in both the Sent and INBOX folders.

Why this matters: MailMate’s `mid:` URL resolves to a single message non-deterministically, so an action keyed by Message-ID can land on a different `.eml` file than the one the user intended. `mailmate-modify` warns when this is the case.

The implementation uses MailMate’s own `message-id` index: a single O(n) pass over the decoded index records instead of a recursive `grep -rli` over the whole IMAP tree.


Class Method Summary

Class Method Details

.duplicate?(message_id) ⇒ Boolean

Convenience: is there more than one copy of this Message-ID in the tree?

Returns:

  • (Boolean)


# File 'lib/mailmate/duplicate_scanner.rb', line 37

def self.duplicate?(message_id)
  eml_ids_for(message_id).size > 1
end
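
A caller that acts on a Message-ID can check `duplicate?` first and warn, in the spirit of what the overview says about `mailmate-modify`. The sketch below is only an illustration under stated assumptions: `StubScanner` and `warn_if_ambiguous` are hypothetical names, not part of this module, and the stub holds made-up data so the example runs without a MailMate index.

```ruby
# Hypothetical stand-in for Mailmate::DuplicateScanner so the sketch runs
# without a MailMate index. Maps a Message-ID to the eml-ids that carry it.
class StubScanner
  def initialize(index)
    @index = index
  end

  def eml_ids_for(message_id)
    @index.fetch(message_id, [])
  end

  # Same rule as the documented method: more than one copy is a duplicate.
  def duplicate?(message_id)
    eml_ids_for(message_id).size > 1
  end
end

# Hypothetical caller-side helper: returns a warning string, or nil when the
# Message-ID maps to at most one .eml file.
def warn_if_ambiguous(scanner, message_id)
  return nil unless scanner.duplicate?(message_id)

  ids = scanner.eml_ids_for(message_id)
  "warning: #{message_id} has #{ids.size} copies (eml-ids: #{ids.join(', ')})"
end

scanner = StubScanner.new({
  "a@example.com" => [101, 205], # made-up: one copy per labeled mailbox
  "b@example.com" => [300]
})

puts warn_if_ambiguous(scanner, "a@example.com")
puts warn_if_ambiguous(scanner, "b@example.com").inspect
```

Returning the warning as a string rather than printing it inside the helper leaves the decision about where to report it (stderr, a log, a dialog) to the caller.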

.duplicates ⇒ Object

Builds a Hash of normalized Message-ID => Array<eml_id> for every duplicated Message-ID in the index. Keys are bracket-stripped and downcased. One full pass; useful as a session-cached lookup when many messages will be processed in a batch.



# File 'lib/mailmate/duplicate_scanner.rb', line 44

def self.duplicates
  reader = Mailmate::IndexReader.for("message-id")

  groups = Hash.new { |h, k| h[k] = [] }
  iterate(reader) do |eml_id, cached|
    next if cached.nil? || cached.empty?
    groups[strip_brackets(cached).downcase] << eml_id
  end
  groups.select { |_, ids| ids.size > 1 }
end
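
The session-cached use described above can be sketched as follows. This is a minimal illustration, not the module's own code path: `FakeReader`, `normalize`, and `duplicate_groups` are hypothetical names, the records are made up, and the only assumption about the real reader is that `each_record` yields `(eml_id, cached)` pairs as the block in `.duplicates` expects.

```ruby
# Hypothetical stand-in for Mailmate::IndexReader: yields (eml_id, cached)
# pairs the way the scanner's block expects. The data is made up.
class FakeReader
  def initialize(records)
    @records = records
  end

  def each_record(&block)
    @records.each { |eml_id, cached| block.call(eml_id, cached) }
  end
end

# Same normalization the scanner applies before comparing Message-IDs.
def normalize(message_id)
  message_id.to_s.sub(/\A</, "").sub(/>\z/, "").strip.downcase
end

# Mirrors the grouping in .duplicates, but takes the reader as an argument.
def duplicate_groups(reader)
  groups = Hash.new { |h, k| h[k] = [] }
  reader.each_record do |eml_id, cached|
    next if cached.nil? || cached.empty?
    groups[normalize(cached)] << eml_id
  end
  groups.select { |_, ids| ids.size > 1 }
end

reader = FakeReader.new([
  [1, "<A@example.com>"],
  [2, "<b@example.com>"],
  [3, "<a@example.com>"],
  [4, nil],
  [5, ""]
])

# One full pass up front, then each message in the batch is a Hash lookup.
dupes = duplicate_groups(reader)
batch = ["<a@example.com>", "<b@example.com>"]
batch.each do |mid|
  # fetch with an explicit fallback, so the result never depends on a default.
  copies = dupes.fetch(normalize(mid), nil)
  puts(copies ? "#{mid}: duplicated in eml-ids #{copies.inspect}" : "#{mid}: unique")
end
```

Lookups must normalize the key the same way the index pass does; otherwise a bracketed or differently cased Message-ID would miss its group.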

.eml_ids_for(message_id) ⇒ Object

Returns Array<Integer> of eml-ids that share `message_id`. Matching ignores surrounding angle brackets and letter case. The array is ordered as `#message-id` records them; callers don’t depend on the order.



# File 'lib/mailmate/duplicate_scanner.rb', line 22

def self.eml_ids_for(message_id)
  return [] if message_id.nil? || message_id.empty?

  reader = Mailmate::IndexReader.for("message-id")
  target = strip_brackets(message_id).downcase

  ids = []
  iterate(reader) do |eml_id, cached|
    next if cached.nil?
    ids << eml_id if strip_brackets(cached).downcase == target
  end
  ids
end
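
To make the matching rule concrete, here is a small sketch that runs the same comparison against an in-memory list instead of the real index. `matching_eml_ids` and the sample records are hypothetical; only the normalization and the nil/empty guards are taken from the method above.

```ruby
def strip_brackets(s)
  s.to_s.sub(/\A</, "").sub(/>\z/, "").strip
end

# Same comparison as .eml_ids_for, run against an Array of
# (eml_id, cached Message-ID) pairs instead of an index reader.
def matching_eml_ids(message_id, records)
  return [] if message_id.nil? || message_id.empty?

  target = strip_brackets(message_id).downcase
  records.each_with_object([]) do |(eml_id, cached), ids|
    next if cached.nil?
    ids << eml_id if strip_brackets(cached).downcase == target
  end
end

# Made-up index contents.
records = [
  [10, "<Report.42@Example.com>"],
  [11, "<other@example.com>"],
  [12, "report.42@example.com"],
  [13, nil]
]

p matching_eml_ids("<report.42@example.com>", records) # bracketed, lower case
p matching_eml_ids("REPORT.42@EXAMPLE.COM", records)   # bare, upper case
p matching_eml_ids("", records)                        # empty input short-circuits
```

Both of the first two calls select eml-ids 10 and 12, since bracket and case differences are normalized away before comparison.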

.iterate(reader, &block) ⇒ Object

Internal: iterates the IndexReader’s records. Delegates to the reader’s public `each_record` API.



# File 'lib/mailmate/duplicate_scanner.rb', line 57

def self.iterate(reader, &block)
  reader.each_record(&block)
end

.strip_brackets(s) ⇒ Object

Internal: removes one leading `<` and one trailing `>` from a Message-ID, then trims surrounding whitespace. `nil` becomes an empty string.



# File 'lib/mailmate/duplicate_scanner.rb', line 61

def self.strip_brackets(s)
  s.to_s.sub(/\A</, "").sub(/>\z/, "").strip
end
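
A few sample inputs show what this normalization does and does not handle. The method body is copied from the listing above so the snippet runs on its own; the inputs are made up. Note the last case: because `strip` runs after the two `sub` calls, brackets wrapped in outer whitespace are not anchored at the string boundaries and therefore survive.

```ruby
def strip_brackets(s)
  s.to_s.sub(/\A</, "").sub(/>\z/, "").strip
end

p strip_brackets("<abc@example.com>")   # brackets removed
p strip_brackets("abc@example.com")     # already bare, unchanged
p strip_brackets(nil)                   # nil becomes ""
p strip_brackets(" <abc@example.com> ") # brackets kept: strip runs last
```

Callers that may receive whitespace-padded input could trim it before passing it in, if bracket removal is required in that case.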