Module: Woods::Storage::Snapshotter::Metadata

Defined in:
lib/woods/storage/snapshotter/metadata.rb

Overview

Reads and writes the metadata store snapshot (metadata.msgpack).

MessagePack is chosen over pack("e*") because metadata is heterogeneous, hash-shaped data, so type tags matter. The vector format uses packed float32 for dense numeric data; metadata uses MessagePack for everything else.
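
As a rough illustration of that trade-off, the sketch below uses only Ruby's core Array#pack and String#unpack. The sample values are hypothetical. It shows that "e*" round-trips a flat list of floats at 4 bytes each, but has no way to encode a string-keyed record.

```ruby
# Dense numeric data: "e*" packs every element as a little-endian float32.
vector = [0.5, -1.25, 3.0] # hypothetical sample values
packed = vector.pack('e*')
packed_size = packed.bytesize # 4 bytes per element
round_trip = packed.unpack('e*')

# Heterogeneous data: the same directive cannot encode strings or hashes,
# so Array#pack raises instead of producing bytes.
pack_error = begin
  ['PostsController', { 'type' => 'controller' }].pack('e*')
  nil
rescue TypeError, ArgumentError => e
  e
end
```

The packed form carries no type information at all, which is acceptable for a vector of known width but not for records whose values may be strings, integers, arrays, or nested hashes.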

On-disk format

A stream of MessagePack-packed objects in a single file:

  1. Header hash (one MessagePack object):

    { "magic" => "WMD1", "schema_version" => 1, "record_count" => N,
      "gem_version" => "1.2.0", "created_at" => "2026-04-23T03:42:17Z" }
    
  2. One hash per record, streamed directly after the header:

    { "id" => "PostsController", "metadata" => { ... } }
    

Stream-written via MessagePack::Packer to avoid loading all records into memory at once. Stream-read via MessagePack::Unpacker on load. Written atomically via Tempfile + File.rename.
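
The Tempfile + File.rename pattern can be sketched with the standard library alone. The helper name write_atomically and the payload are hypothetical, and the module's private write_atomic may differ in its details. The temp file is created in the target's own directory because a rename is only atomic within a single filesystem.

```ruby
require 'tempfile'
require 'tmpdir'

# Hypothetical helper: yields a temp file next to the target, then renames it
# into place so readers never observe a partially written file.
def write_atomically(target_path)
  temp = Tempfile.create(['snapshot', '.tmp'], File.dirname(target_path))
  begin
    temp.binmode
    yield temp
    temp.flush
    temp.fsync
    temp.close
    File.rename(temp.path, target_path)
  rescue StandardError
    # On failure, clean up the temp file and leave any existing target intact.
    temp.close unless temp.closed?
    File.unlink(temp.path) if File.exist?(temp.path)
    raise
  end
end

contents, leftover = Dir.mktmpdir do |dir|
  target = File.join(dir, 'metadata.msgpack')
  write_atomically(target) { |io| io.write('payload') }
  [File.binread(target), Dir.children(dir)]
end
```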

Constant Summary

MAGIC =

Magic string identifying a valid Woods Metadata Dump file.

'WMD1'
SCHEMA_VERSION =

Current schema version written by this implementation.

1
MAX_SUPPORTED_SCHEMA_VERSION =

Maximum schema version this code can read. A dump with a higher version raises MCP::UnsupportedArtifact rather than silently misreading data.

1
FILENAME =

Filename written inside the dump directory.

'metadata.msgpack'
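
How a reader might enforce these constants can be sketched as follows. check_header is a hypothetical helper, not the module's private validate_header!, and ArgumentError stands in for MCP::UnsupportedArtifact, whose definition is not shown on this page.

```ruby
# Hypothetical header check; the defaults mirror MAGIC and
# MAX_SUPPORTED_SCHEMA_VERSION. ArgumentError is a stand-in error class.
def check_header(header, magic: 'WMD1', max_version: 1)
  raise ArgumentError, 'header is not a hash' unless header.is_a?(Hash)
  raise ArgumentError, "bad magic: #{header['magic'].inspect}" unless header['magic'] == magic

  version = header['schema_version']
  unless version.is_a?(Integer) && version <= max_version
    raise ArgumentError, "unsupported schema_version: #{version.inspect}"
  end

  header
end

ok = check_header({ 'magic' => 'WMD1', 'schema_version' => 1, 'record_count' => 0 })
newer = begin
  check_header({ 'magic' => 'WMD1', 'schema_version' => 2 })
rescue ArgumentError => e
  e.message
end
```

Rejecting a newer schema version up front is what keeps an older reader from silently misinterpreting fields it does not know about.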

Class Method Details

.dump(store, artifact, dump_dir, resolved_config: nil) ⇒ void

This method returns an undefined value.

Write the metadata store to dump_dir/metadata.msgpack atomically.

Streams the header, then one packed hash per record, so the record set is never fully copied in memory. Uses Tempfile + File.rename for atomicity.

Parameters:

  • store (#each_entry, #bulk_load)

    an in-memory MetadataStore adapter

  • artifact (Woods::IndexArtifact)

    the artifact layout object

  • dump_dir (Pathname, String)

    target directory; must be under artifact.dumps_root

  • resolved_config (Object, nil) (defaults to: nil)

    reserved for future use

Raises:



# File 'lib/woods/storage/snapshotter/metadata.rb', line 90

def self.dump(store, artifact, dump_dir, resolved_config: nil) # rubocop:disable Lint/UnusedMethodArgument
  validate_store!(store)
  validate_dump_dir!(artifact, dump_dir)
  target = Pathname.new(dump_dir.to_s).join(FILENAME)
  target.dirname.mkpath
  write_atomic(target, store)
  nil
end
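
The body of validate_dump_dir! is not shown here. As a hedged approximation of the "must be under artifact.dumps_root" requirement, a containment check could look like the sketch below. The helper name under_root? and the sample paths are hypothetical, and the sketch treats the root itself as contained.

```ruby
require 'pathname'

# Hypothetical containment check: true when candidate is root or lies below it.
# expand_path normalizes ".." segments so they cannot escape the root.
def under_root?(root, candidate)
  root_path = Pathname.new(root.to_s).expand_path
  candidate_path = Pathname.new(candidate.to_s).expand_path
  candidate_path.ascend.any? { |ancestor| ancestor == root_path }
end

inside = under_root?('/tmp/dumps', '/tmp/dumps/2026-04-23')
escaped = under_root?('/tmp/dumps', '/tmp/dumps/../etc')
sibling = under_root?('/tmp/dumps', '/tmp/dumps-old/2026-04-23')
```

Comparing whole path components, rather than string prefixes, is what makes the sibling directory /tmp/dumps-old fail the check. This sketch does not resolve symlinks.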

.load_or_empty(artifact, resolved_config: nil) ⇒ Woods::Storage::MetadataStore::InMemory

Load a metadata store from the latest dump in artifact, or return an empty store if no dump exists yet.

Never raises for a missing dump: callers that need an empty store on first run get one without special-casing.

Parameters:

  • artifact (Woods::IndexArtifact)

    the artifact layout object

  • resolved_config (Object, nil) (defaults to: nil)

    reserved for future validation

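Returns:

  • (Woods::Storage::MetadataStore::InMemory)

    the store loaded from the latest dump, or an empty store if no dump exists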

Raises:



# File 'lib/woods/storage/snapshotter/metadata.rb', line 61

def self.load_or_empty(artifact, resolved_config: nil) # rubocop:disable Lint/UnusedMethodArgument
  dump_path = dump_file_path(artifact)
  return MetadataStore::InMemory.new unless dump_path&.exist?

  store = MetadataStore::InMemory.new
  File.open(dump_path.to_s, 'rb') do |io|
    unpacker = MessagePack::Unpacker.new(io)
    header = unpacker.read
    validate_header!(header, dump_path)
    header['record_count'].times do
      record = unpacker.read
      store.store(record['id'], record['metadata'])
    end
  end
  store
end