Class: Woods::Temporal::SnapshotStore

Inherits:
Object
Defined in:
lib/woods/temporal/snapshot_store.rb

Overview

SnapshotStore captures and queries temporal snapshots of extraction runs.

Each snapshot is anchored to a git commit SHA and stores per-unit content hashes for efficient diff computation. Full source is not duplicated; only hashes of source, metadata, and dependencies are stored per snapshot.
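
As a rough illustration of the hash-only storage described above, the sketch below builds one per-unit entry from raw content. The helper name `sketch_unit_hash`, the choice of SHA-256, and the JSON serialization are assumptions made for this example. Only the key names are taken from the columns that #unit_history reads.

```ruby
require 'digest'
require 'json'

# Hypothetical helper, not part of SnapshotStore: builds one unit_hashes entry.
# Assumptions: SHA-256 digests, JSON-serialized metadata, sorted dependencies.
def sketch_unit_hash(identifier:, unit_type:, source:, metadata:, dependencies:)
  {
    identifier: identifier,
    unit_type: unit_type,
    source_hash: Digest::SHA256.hexdigest(source),
    metadata_hash: Digest::SHA256.hexdigest(JSON.generate(metadata)),
    # Sorting first makes the hash independent of dependency order.
    dependencies_hash: Digest::SHA256.hexdigest(JSON.generate(dependencies.sort))
  }
end

entry = sketch_unit_hash(
  identifier: 'User',
  unit_type: 'model',
  source: "class User; end\n",
  metadata: { 'table' => 'users' },
  dependencies: %w[Role Account]
)
puts entry[:source_hash].length # hex digest length
```

Because only digests are kept, two snapshots of the same unit can be compared by string equality without loading any source text.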

Examples:

Capturing a snapshot

store = SnapshotStore.new(connection: db)
store.capture(manifest, unit_hashes)

Comparing snapshots

diff = store.diff("abc123", "def456")
diff[:added]    # => [{ identifier: "NewModel", ... }]
diff[:modified] # => [{ identifier: "User", ... }]
diff[:deleted]  # => [{ identifier: "OldService", ... }]

Constant Summary

REQUIRED_TABLES =
%w[woods_snapshots woods_snapshot_units].freeze

Constructor Details

#initialize(connection:, validate_schema: true) ⇒ SnapshotStore

Returns a new instance of SnapshotStore.

Parameters:

  • connection (Object)

    Database connection supporting #execute and #get_first_row

  • validate_schema (Boolean) (defaults to: true)

    If true (default), probe both required tables at construction time and raise a descriptive error pointing at migrations 004+005 when they are missing. Set false in tests that construct the store with a bare mock.



# File 'lib/woods/temporal/snapshot_store.rb', line 30

def initialize(connection:, validate_schema: true)
  @db = connection
  validate_schema! if validate_schema
end

Instance Method Details

#capture(manifest, unit_hashes) ⇒ Hash?

Capture a snapshot after extraction completes.

Stores the manifest metadata and per-unit content hashes. Computes diff stats vs. the most recent previous snapshot.

Parameters:

  • manifest (Hash)

    The manifest data (string or symbol keys)

  • unit_hashes (Array<Hash>)

    Per-unit content hashes

Returns:

  • (Hash, nil)

    Snapshot record with diff stats, or nil if the manifest has no git_sha



# File 'lib/woods/temporal/snapshot_store.rb', line 86

def capture(manifest, unit_hashes)
  git_sha = mget(manifest, 'git_sha')
  return nil unless git_sha

  previous = find_latest
  upsert_snapshot(manifest, git_sha, unit_hashes.size)

  snapshot_id = fetch_snapshot_id(git_sha)
  @db.execute('DELETE FROM woods_snapshot_units WHERE snapshot_id = ?', [snapshot_id])
  insert_unit_hashes(snapshot_id, unit_hashes)

  update_diff_stats(snapshot_id, previous)
  find(git_sha)
end
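
The manifest may use string or symbol keys, and #capture reads `git_sha` through the private `mget` helper, whose body is not shown here. The sketch below is one way such a lookup could behave; `sketch_mget` is a hypothetical name and the string-first lookup order is an assumption.

```ruby
# Hypothetical stand-in for the private mget helper (not the library's code).
# Looks the key up as a String first, then falls back to the Symbol form.
def sketch_mget(manifest, key)
  manifest.fetch(key.to_s) { manifest[key.to_sym] }
end

puts sketch_mget({ 'git_sha' => 'abc123' }, 'git_sha')
puts sketch_mget({ git_sha: 'def456' }, 'git_sha')
p sketch_mget({}, 'git_sha') # nil, the case in which #capture returns early
```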

#diff(sha_a, sha_b) ⇒ Hash

Compute diff between two snapshots.

Parameters:

  • sha_a (String)

    Before snapshot git SHA

  • sha_b (String)

    After snapshot git SHA

Returns:

  • (Hash)

    { added: […], modified: […], deleted: […] }; all three arrays are empty if either SHA has no snapshot



# File 'lib/woods/temporal/snapshot_store.rb', line 138

def diff(sha_a, sha_b)
  id_a = fetch_snapshot_id(sha_a)
  id_b = fetch_snapshot_id(sha_b)

  return { added: [], modified: [], deleted: [] } unless id_a && id_b

  units_a = load_snapshot_units(id_a)
  units_b = load_snapshot_units(id_b)

  compute_diff(units_a, units_b)
end
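
The private `compute_diff` helper is not shown on this page. As a minimal sketch of how a hash-based comparison could work, the example below assumes each side is a map of identifier to its three stored hashes. The name `sketch_compute_diff`, the input shape, and the reduced result entries (identifier only) are all assumptions for illustration.

```ruby
# Hash columns compared per unit (names taken from the #unit_history query).
SKETCH_DIFF_KEYS = %i[source_hash metadata_hash dependencies_hash].freeze

# Hypothetical stand-in for compute_diff. Inputs: identifier => hash columns.
def sketch_compute_diff(units_a, units_b)
  added = units_b.keys - units_a.keys
  deleted = units_a.keys - units_b.keys
  # A unit present on both sides is modified when any stored hash differs.
  modified = (units_a.keys & units_b.keys).reject do |id|
    units_a[id].values_at(*SKETCH_DIFF_KEYS) == units_b[id].values_at(*SKETCH_DIFF_KEYS)
  end

  {
    added: added.map { |id| { identifier: id } },
    modified: modified.map { |id| { identifier: id } },
    deleted: deleted.map { |id| { identifier: id } }
  }
end

before = {
  'User' => { source_hash: 's1', metadata_hash: 'm1', dependencies_hash: 'd1' },
  'OldService' => { source_hash: 's2', metadata_hash: 'm2', dependencies_hash: 'd2' }
}
after = {
  'User' => { source_hash: 's9', metadata_hash: 'm1', dependencies_hash: 'd1' },
  'NewModel' => { source_hash: 's3', metadata_hash: 'm3', dependencies_hash: 'd3' }
}
p sketch_compute_diff(before, after)
```

Set operations on the identifier keys keep the comparison linear in the number of units, which is the benefit of storing hashes rather than full source.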

#find(git_sha) ⇒ Hash?

Find a specific snapshot by git SHA.

Parameters:

  • git_sha (String)

Returns:

  • (Hash, nil)

    Snapshot metadata or nil if not found



# File 'lib/woods/temporal/snapshot_store.rb', line 126

def find(git_sha)
  row = @db.get_first_row('SELECT * FROM woods_snapshots WHERE git_sha = ?', [git_sha])
  return nil unless row

  row_to_hash(row)
end

#list(limit: 20, branch: nil) ⇒ Array<Hash>

List snapshots, optionally filtered by branch.

Parameters:

  • limit (Integer) (defaults to: 20)

    Max results (default 20)

  • branch (String, nil) (defaults to: nil)

    Filter by branch name

Returns:

  • (Array<Hash>)

    Snapshot summaries sorted by extracted_at descending



# File 'lib/woods/temporal/snapshot_store.rb', line 106

def list(limit: 20, branch: nil)
  rows = if branch
           @db.execute(
             'SELECT * FROM woods_snapshots WHERE git_branch = ? ORDER BY extracted_at DESC LIMIT ?',
             [branch, limit]
           )
         else
           @db.execute(
             'SELECT * FROM woods_snapshots ORDER BY extracted_at DESC LIMIT ?',
             [limit]
           )
         end

  rows.map { |row| row_to_hash(row) }
end

#unit_history(identifier, limit: 20) ⇒ Array<Hash>

History of a single unit across snapshots.

Parameters:

  • identifier (String)

    Unit identifier

  • limit (Integer) (defaults to: 20)

    Max snapshots to return (default 20)

Returns:

  • (Array<Hash>)

    Entries with git_sha, extracted_at, source_hash, changed flag



# File 'lib/woods/temporal/snapshot_store.rb', line 155

def unit_history(identifier, limit: 20)
  rows = @db.execute(<<~SQL, [identifier, limit])
    SELECT su.source_hash, su.metadata_hash, su.dependencies_hash, su.unit_type,
           s.git_sha, s.extracted_at, s.git_branch
    FROM woods_snapshot_units su
    JOIN woods_snapshots s ON s.id = su.snapshot_id
    WHERE su.identifier = ?
    ORDER BY s.extracted_at DESC
    LIMIT ?
  SQL

  entries = rows.map { |row| history_entry_from_row(row) }
  mark_changed_entries(entries)
end
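
How the changed flag is derived is not shown on this page, since `mark_changed_entries` is private. One plausible reading is sketched below: entries arrive newest first, and each is flagged when its source hash differs from the next older entry. The name `sketch_mark_changed`, the use of `source_hash` alone, and leaving the oldest entry unflagged are assumptions.

```ruby
# Hypothetical stand-in for mark_changed_entries. Entries are newest first.
def sketch_mark_changed(entries)
  entries.each_with_index.map do |entry, index|
    older = entries[index + 1]
    # Assumption: the oldest entry has nothing to compare against, so it is
    # reported as unchanged.
    changed = !older.nil? && older[:source_hash] != entry[:source_hash]
    entry.merge(changed: changed)
  end
end

history = [
  { git_sha: 'ccc', source_hash: 'h2' },
  { git_sha: 'bbb', source_hash: 'h2' },
  { git_sha: 'aaa', source_hash: 'h1' }
]
sketch_mark_changed(history).each { |e| puts "#{e[:git_sha]} changed=#{e[:changed]}" }
```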

#validate_schema! ⇒ Object

Probe that `woods_snapshots` and `woods_snapshot_units` exist. If they don't, raise with guidance to run migrations 004 + 005. Without this, the first call to #capture/#find raises a generic adapter error that doesn't tell operators why.

When the connection responds to `#columns` (ActiveRecord-shaped) or `#table_exists?`, use that: these are hard to spoof from a test mock, so a partial mock can no longer silently pass. Falls back to the `SELECT 1 FROM t LIMIT 1` probe for minimal connections.

Raises:

  • (Woods::Error)

    When a required table is missing or cannot be probed



# File 'lib/woods/temporal/snapshot_store.rb', line 48

def validate_schema!
  REQUIRED_TABLES.each { |t| probe_table!(t) }
rescue Woods::Error
  raise
rescue StandardError => e
  raise Woods::Error, schema_error_message(e)
end
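
The probe order described above can be sketched as follows. This is an illustration only: `SketchSchemaError` stands in for Woods::Error, `sketch_probe_table!` stands in for the private `probe_table!`, the `#columns` branch is omitted for brevity, and the error wording is invented for the example.

```ruby
# Stand-in for Woods::Error so the sketch runs on its own.
class SketchSchemaError < StandardError; end

# Hypothetical probe: prefer #table_exists?, otherwise fall back to a SELECT.
def sketch_probe_table!(db, table)
  if db.respond_to?(:table_exists?)
    raise SketchSchemaError, "missing table #{table}" unless db.table_exists?(table)
  else
    # Minimal connections: a missing table makes the adapter raise here.
    db.execute("SELECT 1 FROM #{table} LIMIT 1")
  end
end

# Mirrors the rescue structure above: pass domain errors through unchanged,
# wrap anything else with migration guidance.
def sketch_validate_schema!(db, tables)
  tables.each { |t| sketch_probe_table!(db, t) }
rescue SketchSchemaError
  raise
rescue StandardError => e
  raise SketchSchemaError, "snapshot tables unavailable (#{e.message}); run migrations 004 and 005"
end

# Bare connection that raises like an adapter would for unknown tables.
class SketchBareConnection
  def initialize(tables)
    @tables = tables
  end

  def execute(sql, _params = [])
    table = sql[/FROM (\w+)/, 1]
    raise "no such table: #{table}" unless @tables.include?(table)

    []
  end
end

begin
  sketch_validate_schema!(SketchBareConnection.new(%w[woods_snapshots]),
                          %w[woods_snapshots woods_snapshot_units])
rescue SketchSchemaError => e
  puts e.message
end
```

Re-raising the domain error first keeps an already descriptive message from being wrapped a second time.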