Module: Mailmate::DuplicateScanner
Defined in: lib/mailmate/duplicate_scanner.rb
Overview
Detects duplicate copies of a Message-ID across MailMate’s tree. The same RFC Message-ID can appear in multiple `.eml` files: Gmail’s label-creates-a-copy semantics produce this for any message that hits a labeled mailbox, and self-to-self messages can end up in both the Sent and INBOX folders.
Why this matters: MailMate’s `mid:` URL resolves to a single message non-deterministically, so an action keyed by Message-ID can land on a different `.eml` file than the one the user intended. `mailmate-modify` warns when this is the case.
The implementation uses MailMate’s own `message-id` index: O(n) over the decoded index instead of a recursive `grep -rli` over the whole IMAP tree.
- [Mailmate scripts speed] §1.
Class Method Summary
- .duplicate?(message_id) ⇒ Boolean
  Convenience: is there more than one copy of this Message-ID in the tree?
- .duplicates ⇒ Object
  Build a Hash of message_id => Array<eml_id> for every duplicated Message-ID in the index.
- .eml_ids_for(message_id) ⇒ Object
  Returns Array<Integer> of eml-ids that share `message_id`.
- .iterate(reader, &block) ⇒ Object
  Internal: iterate the IndexReader’s records.
- .strip_brackets(s) ⇒ Object
Class Method Details
.duplicate?(message_id) ⇒ Boolean
Convenience: is there more than one copy of this Message-ID in the tree?
    # File 'lib/mailmate/duplicate_scanner.rb', line 37
    def self.duplicate?(message_id)
      eml_ids_for(message_id).size > 1
    end
.duplicates ⇒ Object
Build a Hash of message_id => Array<eml_id> for every duplicated Message-ID in the index. Keys are the bracket-stripped, lowercased Message-IDs. One full pass over the index; useful as a session-cached lookup when many messages will be processed in a batch.
    # File 'lib/mailmate/duplicate_scanner.rb', line 44
    def self.duplicates
      reader = Mailmate::IndexReader.for("message-id")
      groups = Hash.new { |h, k| h[k] = [] }
      iterate(reader) do |eml_id, cached|
        next if cached.nil? || cached.empty?
        groups[strip_brackets(cached).downcase] << eml_id
      end
      groups.select { |_, ids| ids.size > 1 }
    end
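The session-cache idea can be sketched as follows. This is a minimal illustration, not the module’s code: `RECORDS`, `duplicate_groups`, `DUPLICATE_CACHE`, and `cached_duplicate?` are hypothetical names, and the sample pairs stand in for whatever `Mailmate::IndexReader` would actually yield.

```ruby
# Hypothetical in-memory stand-in for the decoded message-id index:
# pairs of [eml_id, cached Message-ID value].
RECORDS = [
  [101, "<a@example.com>"],
  [102, "<A@Example.com>"],
  [103, "<b@example.com>"],
  [104, nil],
  [105, ""]
].freeze

# Same normalization as .strip_brackets in the source.
def strip_brackets(s)
  s.to_s.sub(/\A</, "").sub(/>\z/, "").strip
end

# Same grouping logic as .duplicates, applied to the sample records.
def duplicate_groups(records)
  groups = Hash.new { |h, k| h[k] = [] }
  records.each do |eml_id, cached|
    next if cached.nil? || cached.empty?
    groups[strip_brackets(cached).downcase] << eml_id
  end
  groups.select { |_, ids| ids.size > 1 }
end

# Build once per session, then answer per-message questions from the Hash
# instead of rescanning the index for every message.
DUPLICATE_CACHE = duplicate_groups(RECORDS)

def cached_duplicate?(message_id)
  DUPLICATE_CACHE.key?(strip_brackets(message_id).downcase)
end

p DUPLICATE_CACHE                          # {"a@example.com"=>[101, 102]}
p cached_duplicate?("<A@EXAMPLE.COM>")     # true
p cached_duplicate?("b@example.com")       # false
```

With the Hash built up front, each lookup is a single key check, which is the trade-off the batch case is after: one full pass instead of one pass per message.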
.eml_ids_for(message_id) ⇒ Object
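Returns Array<Integer> of eml-ids that share `message_id`. Matching ignores case and surrounding angle brackets; a nil or empty `message_id` yields an empty array. The array is ordered as `#message-id` records them; callers don’t depend on the order.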
    # File 'lib/mailmate/duplicate_scanner.rb', line 22
    def self.eml_ids_for(message_id)
      return [] if message_id.nil? || message_id.empty?
      reader = Mailmate::IndexReader.for("message-id")
      target = strip_brackets(message_id).downcase
      ids = []
      iterate(reader) do |eml_id, cached|
        next if cached.nil?
        ids << eml_id if strip_brackets(cached).downcase == target
      end
      ids
    end
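To see the matching rules in isolation, the lookup can be exercised against a stub reader. The sketch below assumes only that a reader responds to `each_record` and yields `(eml_id, cached)` pairs, as the source code implies; `FakeReader`, `READER`, and the two-argument `eml_ids_for` are hypothetical and exist only for this example.

```ruby
# Hypothetical stand-in for Mailmate::IndexReader; only each_record is assumed.
class FakeReader
  def initialize(records)
    @records = records
  end

  # Yields each [eml_id, cached] pair to the block.
  def each_record(&block)
    @records.each(&block)
  end
end

# Same normalization as .strip_brackets in the source.
def strip_brackets(s)
  s.to_s.sub(/\A</, "").sub(/>\z/, "").strip
end

# Mirrors .eml_ids_for, but takes the reader as an argument so it can be stubbed.
def eml_ids_for(reader, message_id)
  return [] if message_id.nil? || message_id.empty?
  target = strip_brackets(message_id).downcase
  ids = []
  reader.each_record do |eml_id, cached|
    next if cached.nil?
    ids << eml_id if strip_brackets(cached).downcase == target
  end
  ids
end

READER = FakeReader.new([
  [7, "<x@example.org>"],
  [8, nil],
  [9, "<X@EXAMPLE.ORG>"],
  [10, "<y@example.org>"]
])

p eml_ids_for(READER, "x@example.org")     # [7, 9]
p eml_ids_for(READER, "<y@example.org>")   # [10]
p eml_ids_for(READER, "")                  # []
```

The query may be given with or without angle brackets and in any case; records with a nil cached value are skipped rather than treated as a match.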
.iterate(reader, &block) ⇒ Object
Internal: iterate the IndexReader’s records. Delegates to the reader’s public `each_record` API.
    # File 'lib/mailmate/duplicate_scanner.rb', line 57
    def self.iterate(reader, &block)
      reader.each_record(&block)
    end
.strip_brackets(s) ⇒ Object
    # File 'lib/mailmate/duplicate_scanner.rb', line 61
    def self.strip_brackets(s)
      s.to_s.sub(/\A</, "").sub(/>\z/, "").strip
    end
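Since this method has no description, a few sample inputs may help show what it normalizes. The sketch copies the one-liner above and compares it with a hypothetical variant, `strip_brackets_trim_first`, that trims whitespace before removing brackets; that variant is not part of the module, and whether the index ever holds whitespace-padded values is not stated in the source.

```ruby
# Copy of the documented one-liner.
def strip_brackets(s)
  s.to_s.sub(/\A</, "").sub(/>\z/, "").strip
end

# Hypothetical variant (not in the module): trim whitespace first,
# so brackets are removed even when the value is padded.
def strip_brackets_trim_first(s)
  s.to_s.strip.sub(/\A</, "").sub(/>\z/, "")
end

p strip_brackets("<a@example.com>")               # "a@example.com"
p strip_brackets("a@example.com")                 # "a@example.com"
p strip_brackets(nil)                             # ""
# Padding hides the brackets from the anchored patterns, so they survive:
p strip_brackets(" <a@example.com> ")             # "<a@example.com>"
p strip_brackets_trim_first(" <a@example.com> ")  # "a@example.com"
```

The anchors `\A` and `\z` only match at the very start and end of the string, which is why the order of `sub` and `strip` matters for padded input.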