Class: Mailmate::IndexReader

Inherits:
Object
Defined in:
lib/mailmate/index_reader.rb

Constant Summary

RECORD_SIZE = 12

FRESHNESS_INTERVAL = 1.0

Minimum number of seconds between re-stats of the underlying files, per reader. Short-lived CLI processes never reach the recheck; the persistent MCP server picks up MailMate’s continuous index rewrites within this window instead of serving the snapshot from its first request forever.


Constructor Details

#initialize(name) ⇒ IndexReader

Returns a new instance of IndexReader.

Raises:

  • (ArgumentError) if either the .cache or the .offsets file for name is missing


# File 'lib/mailmate/index_reader.rb', line 90

def initialize(name)
  @name = name
  @base = "#{Mailmate.config.db_headers}/#{name}"
  raise ArgumentError, "Index not found: #{name} (looked at #{@base}.{cache,offsets})" \
    unless File.exist?("#{@base}.cache") && File.exist?("#{@base}.offsets")

  @cache_bytes   = File.binread("#{@base}.cache")
  @offsets_bytes = File.binread("#{@base}.offsets")
  @cache_sig     = file_sig("#{@base}.cache")
  @offsets_sig   = file_sig("#{@base}.offsets")
  @checked_at    = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # The id→ranges hash builds lazily (see index): ids_matching-only
  # consumers (the inverted body search) never need it, and skipping it
  # saves ~250 ms of construction on the big body indexes.
  @index = nil
end

Instance Attribute Details

#name ⇒ Object (readonly)



# File 'lib/mailmate/index_reader.rb', line 88

def name
  @name
end

Class Method Details

.for(name) ⇒ Object

Per-process cache of readers keyed by [name, db_headers]. Including db_headers means a Mailmate.config swap (e.g. a test pointing at a different tmpdir) doesn’t return stale readers built from the old path. Cached readers are re-validated against the on-disk files’ mtime+size (throttled; see FRESHNESS_INTERVAL) so long-lived processes don’t serve stale data after MailMate rewrites an index.



# File 'lib/mailmate/index_reader.rb', line 61

def for(name)
  @cache ||= {}
  key = cache_key(name)
  @cache.delete(key) if @cache[key]&.stale?
  @cache[key] ||= new(name)
end

.reset!(name = nil) ⇒ Object

Invalidate cached readers. With no argument, drops the entire cache (useful for tests or when MailMate’s database swaps out). With a name, invalidates only entries for that name across all db_headers — the common case (cache-bust after a write) doesn’t need to thread config through.



# File 'lib/mailmate/index_reader.rb', line 73

def reset!(name = nil)
  if name.nil?
    @cache = nil
  elsif @cache
    @cache.delete_if { |(n, _dir), _reader| n == name }
  end
end
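
The name-only invalidation relies on the block destructuring each [name, db_headers] key. A minimal sketch of that behaviour on a toy Hash follows; the paths and the symbols standing in for reader objects are made up for illustration.

```ruby
# Toy cache keyed by [name, db_headers]; symbols stand in for reader objects.
cache = {
  ["subject", "/tmp/headers_a"] => :reader_1,
  ["subject", "/tmp/headers_b"] => :reader_2,
  ["#flags",  "/tmp/headers_a"] => :reader_3
}

# The first block parameter destructures the key array, so the directory
# half of the key can be ignored when matching on the index name alone.
cache.delete_if { |(n, _dir), _reader| n == "subject" }

remaining = cache.keys
```

Both "subject" entries are dropped regardless of directory, which is why a caller busting the cache after a write does not need to know the current db_headers.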

Instance Method Details

#each_eml_id(&block) ⇒ Object

Iterate every recorded eml-id. Yields just the id; callers that also want the value should pair this with `value_for`. Exists so other gem modules don’t have to reach into `@index` directly.



# File 'lib/mailmate/index_reader.rb', line 195

def each_eml_id(&block)
  return enum_for(:each_eml_id) unless block
  index.each_key(&block)
end

#each_record ⇒ Object

Iterate every (eml_id, raw_value) pair, once per on-disk record. Multi-record ids yield multiple times. The value comes back as the bare cache substring; callers that need parsed form (e.g. flag tokens) should massage it themselves.



# File 'lib/mailmate/index_reader.rb', line 204

def each_record
  return enum_for(:each_record) unless block_given?
  index.each do |eml_id, packs|
    packs.each { |v| yield eml_id, @cache_bytes[(v >> 32)...(v & 0xFFFFFFFF)] }
  end
end
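
The slice expression above implies that each record is held as one Integer, with the start offset in the high 32 bits and the end offset in the low 32 bits. A minimal sketch of that round trip follows; `pack_range` and `unpack_range` are hypothetical helper names that do not appear in the source, and the cache contents are made up.

```ruby
# Hypothetical helpers (not part of IndexReader) illustrating the packing
# that the (v >> 32)...(v & 0xFFFFFFFF) slice implies.
def pack_range(start, stop)
  (start << 32) | stop
end

def unpack_range(packed)
  [packed >> 32, packed & 0xFFFFFFFF]
end

cache_bytes = "Seen Flagged".b      # stand-in for the .cache file contents
packed      = pack_range(5, 12)     # half-open byte range [5, 12)

start, stop = unpack_range(packed)
value       = cache_bytes[start...stop]
```

Keeping both offsets in a single Integer avoids allocating a two-element Array per record, which is one plausible reason for the layout.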

#flags_for(eml_id) ⇒ Object

`#flags.flag` semantics: the cache stores a space-separated list of IMAP keywords. Split into individual flag tokens.



# File 'lib/mailmate/index_reader.rb', line 145

def flags_for(eml_id)
  v = value_for(eml_id)
  return [] if v.nil? || v.empty?
  v.split(/\s+/).reject(&:empty?)
end
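
As an illustration of the split, here is the same logic applied to a raw string directly. `split_flags` is a hypothetical standalone version of the method body, and the sample flag values are invented.

```ruby
# Hypothetical standalone version of the flags_for body.
def split_flags(raw)
  return [] if raw.nil? || raw.empty?
  # A leading run of whitespace makes split emit an empty first token;
  # reject(&:empty?) drops it.
  raw.split(/\s+/).reject(&:empty?)
end

flags   = split_flags('  \Seen  $Forwarded ')
nothing = split_flags(nil)
```

The `reject` matters only for values with leading whitespace; trailing whitespace is already discarded by `split`.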

#ids_matching(needle) ⇒ Object

Inverted substring search: returns a Hash whose keys are every id with at least one record containing `needle` (byte-wise; pass pre-downcased bytes when querying an #lc index). It runs one memchr-fast String#index scan over the whole cache instead of one substring test per record; for a 77 MB body cache that’s ~75 ms versus seconds of per-message lookups.

A raw cache hit can span two adjacent records’ ranges; interval stabbing keeps only hits that fall entirely inside a single record (per-segment semantics, matching MailMate’s own body search). Records sharing a byte range (deduped values) all report their ids.



# File 'lib/mailmate/index_reader.rb', line 167

def ids_matching(needle)
  needle = needle.b
  found = {}
  return found if needle.empty? || @cache_bytes.empty?
  ensure_stab_table!
  nlen = needle.bytesize
  pos = 0
  while (pos = @cache_bytes.index(needle, pos))
    stab(pos, pos + nlen) { |id| found[id] = true }
    pos += 1
  end
  found
end
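
The containment rule can be sketched without the real stab table, which this page does not show. The version below is an assumption-laden stand-in: it takes records as plain [start, stop, id] triples and checks each hit with a linear pass, so it shows the semantics only and not the performance of the actual implementation. `ids_matching_sketch` is a hypothetical name.

```ruby
# Hypothetical stand-in for ids_matching: records are [start, stop, id]
# triples over half-open byte ranges, checked linearly per hit.
def ids_matching_sketch(cache, records, needle)
  needle = needle.b
  found = {}
  return found if needle.empty? || cache.empty?
  nlen = needle.bytesize
  pos = 0
  while (pos = cache.index(needle, pos))
    hit_end = pos + nlen
    records.each do |start, stop, id|
      # Keep the hit only if it lies entirely inside this record.
      found[id] = true if start <= pos && hit_end <= stop
    end
    pos += 1 # advance one byte so overlapping hits are still found
  end
  found
end

cache   = "foobarbaz".b
records = [[0, 6, 101], [6, 9, 202], [0, 6, 303]] # 101 and 303 share a range

inside   = ids_matching_sketch(cache, records, "bar").keys
straddle = ids_matching_sketch(cache, records, "rb")
```

Here "bar" sits wholly inside the shared range, so both ids sharing it are reported, while "rb" crosses the boundary between two records and matches nothing.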

#key?(eml_id) ⇒ Boolean

True when the index has at least one record for this id. Cheaper than values_for(id).empty? — no substring slicing.

Returns:

  • (Boolean)


# File 'lib/mailmate/index_reader.rb', line 153

def key?(eml_id)
  index.key?(eml_id.to_i)
end

#record_count ⇒ Object

Total number of on-disk records (sum across all ids). Diagnostics.



# File 'lib/mailmate/index_reader.rb', line 188

def record_count
  index.values.sum(&:size)
end

#size ⇒ Object

Number of distinct ids in the index. For multi-record indexes this is smaller than the on-disk record count (use record_count for that).



# File 'lib/mailmate/index_reader.rb', line 183

def size
  index.size
end

#stale? ⇒ Boolean

True when the on-disk files no longer match what this reader was built from. Throttled to one stat-pair per FRESHNESS_INTERVAL; a vanished file (mid-swap while MailMate rewrites) counts as not-stale so we keep serving the last good snapshot rather than racing the writer.

Returns:

  • (Boolean)


# File 'lib/mailmate/index_reader.rb', line 111

def stale?
  now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  return false if now - @checked_at < FRESHNESS_INTERVAL
  @checked_at = now
  cache_sig   = file_sig("#{@base}.cache")
  offsets_sig = file_sig("#{@base}.offsets")
  return false if cache_sig.nil? || offsets_sig.nil?
  cache_sig != @cache_sig || offsets_sig != @offsets_sig
end
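
The `file_sig` helper is not shown on this page. Since the `.for` description mentions mtime+size, one plausible shape is sketched below; `file_sig_sketch` is a hypothetical name, and returning nil for a missing file is an assumption based on the nil checks in `stale?`.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for file_sig: [mtime, size], or nil when the file
# has vanished (assumed from the nil checks in stale?).
def file_sig_sketch(path)
  st = File.stat(path)
  [st.mtime, st.size]
rescue Errno::ENOENT
  nil
end

dir  = Dir.mktmpdir
path = File.join(dir, "demo.cache")

File.binwrite(path, "abc")
before = file_sig_sketch(path)

File.binwrite(path, "abcdef")       # simulate a rewrite of the index file
after = file_sig_sketch(path)

changed = before != after           # size differs even if mtime does not
missing = file_sig_sketch(File.join(dir, "gone.cache"))

FileUtils.remove_entry(dir)
```

Comparing size alongside mtime catches a rewrite that lands within the filesystem's timestamp granularity, provided the length changed.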

#value_for(eml_id) ⇒ Object

Returns the raw cached value for a given .eml body-part ID, or nil if the id isn’t in this index. Returns the LAST record for the id: for accumulator-style header indexes (`#flags`, `#source`, `subject`, etc.) that’s the latest state, and the older records are stale versions. For body indexes (`#unquoted#lc`, `#quoted#lc`) the last record alone is meaningless; use values_for to read every segment.



# File 'lib/mailmate/index_reader.rb', line 127

def value_for(eml_id)
  packs = index[eml_id.to_i]
  return nil if packs.nil? || packs.empty?
  v = packs[-1]
  @cache_bytes[(v >> 32)...(v & 0xFFFFFFFF)]
end
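
To make the last-record rule concrete, here is a toy index with two records for one id. The cache contents, the id, and the local names are invented for illustration; only the slice expression is taken from the method above.

```ruby
# Toy data: two records for id 7, stored in offsets-file order.
cache_bytes = "draftfinal".b
index = { 7 => [(0 << 32) | 5, (5 << 32) | 10] }

# Same slice expression as value_for, wrapped in a lambda for reuse.
slice = ->(v) { cache_bytes[(v >> 32)...(v & 0xFFFFFFFF)] }

latest  = slice.call(index[7][-1])   # what value_for would return
every   = index[7].map(&slice)       # what values_for would return
unknown = index[99]                  # absent id: value_for returns nil
```

For an accumulator-style index only `latest` is meaningful; for a body index each element of `every` would be a separate text segment.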

#values_for(eml_id) ⇒ Object

Returns every recorded value for an id, in offsets-file order. Returns [] if the id isn’t in the index. Use this for body indexes (#unquoted#lc, #quoted#lc), which store one record per text segment.



# File 'lib/mailmate/index_reader.rb', line 137

def values_for(eml_id)
  packs = index[eml_id.to_i]
  return [] if packs.nil?
  packs.map { |v| @cache_bytes[(v >> 32)...(v & 0xFFFFFFFF)] }
end