Class: Legion::Extensions::MicrosoftTeams::LocalCache::Extractor

Inherits:
Object
Defined in:
lib/legion/extensions/microsoft_teams/local_cache/extractor.rb

Overview

Extracts Teams messages from the local Chromium IndexedDB LevelDB cache. Works offline: it reads the cache files from the local file system, so no Graph API access is needed.

Two record types contain messages:

1. Conversation records: metadata + lastMessage (one per conversation)
2. MessageMap records: replyChainId + messageMap with multiple messages

Defined Under Namespace

Classes: Message

Constant Summary

DEFAULT_PATH =
File.expand_path(
  '~/Library/Containers/com.microsoft.teams2/Data/Library/Application Support/' \
  'Microsoft/MSTeams/EBWebView/WV2Profile_tfw/IndexedDB/' \
  'https_teams.microsoft.com_0.indexeddb.leveldb'
).freeze
SKIP_MESSAGE_TYPES =
%w[
  ThreadActivity/AddMember
  ThreadActivity/DeleteMember
  ThreadActivity/TopicUpdate
  Event/Call
  RichText/Media_CallRecording
].freeze

Instance Method Summary

Constructor Details

#initialize(db_path: DEFAULT_PATH) ⇒ Extractor

Returns a new instance of Extractor.



# File 'lib/legion/extensions/microsoft_teams/local_cache/extractor.rb', line 47

def initialize(db_path: DEFAULT_PATH)
  @db_path = db_path
end

Instance Method Details

#available? ⇒ Boolean

Returns true if the Teams LevelDB directory exists.

Returns:

  • (Boolean)


# File 'lib/legion/extensions/microsoft_teams/local_cache/extractor.rb', line 52

def available?
  Dir.exist?(@db_path)
end
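
The check above is a plain directory test, so it can be tried without a Teams installation. The following is a minimal sketch: `CacheProbe` is a hypothetical stand-in that mirrors only the documented constructor and `#available?`; it is not the real Extractor class.

```ruby
require 'tmpdir'

# Hypothetical stand-in mirroring the documented #initialize and #available?.
class CacheProbe
  def initialize(db_path:)
    @db_path = db_path
  end

  # True only when the path exists and is a directory.
  def available?
    Dir.exist?(@db_path)
  end
end

Dir.mktmpdir('leveldb') do |dir|
  puts CacheProbe.new(db_path: dir).available?                        # true
  puts CacheProbe.new(db_path: File.join(dir, 'missing')).available?  # false
end
```

Callers would typically use this as a guard before `#extract`, which raises when the directory is absent.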

#extract(since: nil, channels: nil, senders: nil, skip_bots: true) ⇒ Object

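Extracts all messages and returns an array of Message structs sorted by compose_time. Raises if the cache directory is missing; a file that fails to read is skipped with a warning. Options: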

since:      Time - only messages after this time
channels:   Array<String> - filter by thread topic/name
senders:    Array<String> - filter by sender display name
skip_bots:  Boolean - skip integration/bot messages (default: true)
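
To make the option semantics concrete, here is an illustrative sketch of how such filters could combine. The private `apply_filters` helper is not shown on this page, so everything here is an assumption: the `SketchMessage` fields (`sender`, `topic`, `bot`), exact-match comparison of names, and ISO 8601 strings for `compose_time` are all hypothetical.

```ruby
require 'time'

# Hypothetical message shape; the real Message struct's fields are not shown here.
SketchMessage = Struct.new(:sender, :topic, :compose_time, :bot, keyword_init: true)

# Illustrative filter. Assumes exact-match names and ISO 8601 compose_time strings.
# A nil option means "do not filter on this criterion".
def sketch_filter(messages, since: nil, channels: nil, senders: nil, skip_bots: true)
  messages.select do |m|
    next false if skip_bots && m.bot
    next false if channels && !channels.include?(m.topic)
    next false if senders && !senders.include?(m.sender)
    next false if since && (m.compose_time.nil? || Time.iso8601(m.compose_time) <= since)

    true
  end
end

msgs = [
  SketchMessage.new(sender: 'Ada', topic: 'Release', compose_time: '2024-05-01T10:00:00Z', bot: false),
  SketchMessage.new(sender: 'Build Bot', topic: 'Release', compose_time: '2024-05-02T10:00:00Z', bot: true),
  SketchMessage.new(sender: 'Lin', topic: 'Design', compose_time: '2024-05-03T10:00:00Z', bot: false)
]

recent = sketch_filter(msgs, since: Time.utc(2024, 5, 2))
puts recent.map(&:sender).inspect # ["Lin"]
```

In this sketch the criteria are combined with AND, so a message must pass every filter that is set.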


# File 'lib/legion/extensions/microsoft_teams/local_cache/extractor.rb', line 62

def extract(since: nil, channels: nil, senders: nil, skip_bots: true)
  raise "Teams cache not found at #{@db_path}" unless available?

  messages = []
  seen_hashes = Set.new

  each_ldb_file do |path|
    reader = SSTableReader.new(path)
    reader.each_entry do |_key, value|
      extract_from_record(value, messages, seen_hashes)
    end
  rescue StandardError => e
    warn "LocalCache: error reading #{File.basename(path)}: #{e.message}"
  end

  messages = apply_filters(messages, since: since, channels: channels,
                                     senders: senders, skip_bots: skip_bots)
  messages.sort_by { |m| m.compose_time || '' }
end
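
Three behaviours of the method above are easy to miss: duplicates are dropped through a shared Set, a file that raises is skipped rather than aborting the run, and messages without a compose_time sort first because of the `|| ''` fallback (which also suggests compose_time is a string). The sketch below reproduces the pattern with stand-ins. `CachedMsg`, its `id` dedup key, and the lambdas standing in for SSTableReader are hypothetical.

```ruby
require 'set'

# Hypothetical stand-in for a parsed message; only compose_time comes from the source.
CachedMsg = Struct.new(:id, :compose_time)

# `files` maps a file name to a callable returning an array of CachedMsg.
def collect_sorted(files)
  messages = []
  seen = Set.new

  files.each do |name, read|
    read.call.each do |msg|
      messages << msg if seen.add?(msg.id) # Set#add? returns nil for a duplicate
    end
  rescue StandardError => e
    # One unreadable file is reported and skipped; the loop continues.
    warn "sketch: error reading #{name}: #{e.message}"
  end

  # A nil compose_time falls back to '' and therefore sorts first.
  messages.sort_by { |m| m.compose_time || '' }
end

files = {
  'a.ldb' => -> { [CachedMsg.new(1, '2024-05-02T09:00:00Z'), CachedMsg.new(2, nil)] },
  'b.ldb' => -> { raise IOError, 'truncated block' },
  'c.ldb' => -> { [CachedMsg.new(1, '2024-05-02T09:00:00Z'), CachedMsg.new(3, '2024-05-01T08:00:00Z')] }
}

puts collect_sorted(files).map(&:id).inspect # [2, 3, 1]
```

Note that entries collected from a file before it fails are kept, which matches the structure of the rescue in the listing above.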

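#stats ⇒ Object

Returns a Hash of summary stats (path, file count, total size in bytes and in MB) without extracting messages, or nil if the cache directory does not exist.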



# File 'lib/legion/extensions/microsoft_teams/local_cache/extractor.rb', line 83

def stats
  return nil unless available?

  file_count = 0
  total_bytes = 0

  each_ldb_file do |path|
    file_count += 1
    total_bytes += File.size(path)
  end

  {
    path:        @db_path,
    ldb_files:   file_count,
    total_bytes: total_bytes,
    total_mb:    (total_bytes / 1_048_576.0).round(1)
  }
end
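
The size arithmetic can be checked with a throwaway directory. This is a sketch under one stated assumption: the private `each_ldb_file` helper is not shown on this page, so `sketch_stats` (a hypothetical name) assumes it visits the `*.ldb` files directly inside the directory.

```ruby
require 'tmpdir'

# Illustrative re-creation of the summary above.
# Assumption: each_ldb_file yields the *.ldb files in db_path.
def sketch_stats(db_path)
  return nil unless Dir.exist?(db_path)

  sizes = Dir.glob(File.join(db_path, '*.ldb')).map { |path| File.size(path) }
  total = sizes.sum

  {
    path:        db_path,
    ldb_files:   sizes.size,
    total_bytes: total,
    total_mb:    (total / 1_048_576.0).round(1) # 1 MB here means 1,048,576 bytes
  }
end

Dir.mktmpdir do |dir|
  File.binwrite(File.join(dir, '000001.ldb'), 'x' * 1_572_864) # 1.5 MB
  File.binwrite(File.join(dir, '000002.ldb'), 'x' * 524_288)   # 0.5 MB
  File.write(File.join(dir, 'LOG'), 'ignored')                 # not an .ldb file
  p sketch_stats(dir).values_at(:ldb_files, :total_bytes, :total_mb) # [2, 2097152, 2.0]
end
```

Dividing by the float `1_048_576.0` avoids integer division, so sizes under 1 MB still round to a fractional value instead of 0.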