Module: Woods::MCP::Bootstrapper

Defined in:
lib/woods/mcp/bootstrapper.rb

Overview

Shared setup logic for MCP server executables.

Validates the index directory, checks for a manifest, and builds an optional retriever for semantic search. This is the logic that would otherwise be duplicated between the stdio and HTTP server entry points.

Class Method Details

.build_retriever(index_dir: nil) ⇒ Array(Woods::Retriever, Woods::MCP::BootstrapState)

Build a retriever for MCP semantic search.

Flow:

1. Wrap output_dir in an IndexArtifact (owns path semantics).
2. If woods.json is present, resolve config from it; otherwise fall
   back to env-var auto-detect by default (pattern/structural mode when
   nothing is found). Set WOODS_REQUIRE_INDEX=1 to fail closed instead
   (raise MissingArtifact). See #138.
3. Build provider + stores from config (no mutation of
   Woods.configuration — the host's initializer stays intact).
4. Hydrate in-memory stores from dumps (stubs in PR 2; real in PR 3).
5. Probe the provider. If reachable, state :hydrated. If unreachable,
   state :degraded — retriever is still returned, queries will
   retry on first use.

Config-invalid failures raise typed BootstrapError subclasses; the top level of exe/woods-mcp rescues them and prints a one-line operator message. Dependency-unreachable failures do not raise: the server starts degraded and the condition surfaces via woods_status.
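
Steps 2 and 5 of the flow can be sketched in isolation. This is a minimal illustration under stated assumptions, not the module's implementation: choose_config_source and state_after_probe are hypothetical helpers, the error classes are local stand-ins (the real hierarchy is not shown on this page), and env is injected only so the sketch can run without touching the process environment.

```ruby
# Local stand-ins for the typed errors; the real classes live in Woods::MCP.
class BootstrapError < StandardError; end
class MissingArtifact < BootstrapError; end

# Step 2 (hypothetical helper): decide where the config comes from.
def choose_config_source(woods_json_present:, env: ENV)
  return :artifact if woods_json_present
  # Fail closed only when the operator opted in.
  raise MissingArtifact, 'woods.json not found' if env['WOODS_REQUIRE_INDEX'] == '1'

  :env_autodetect
end

# Step 5 (hypothetical helper): map the provider probe result to a state.
def state_after_probe(reachable)
  reachable ? :hydrated : :degraded
end

puts choose_config_source(woods_json_present: true, env: {})
puts choose_config_source(woods_json_present: false, env: {})
puts state_after_probe(false)
```

Note that in this sketch only the missing-artifact case raises; an unreachable provider simply yields :degraded, which matches the split between config-invalid and dependency-unreachable failures described above.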

Parameters:

  • index_dir (String, nil) (defaults to: nil)

    Path to the extraction output directory. When nil, uses Woods.configuration.output_dir.

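Returns:

  • (Array(Woods::Retriever, Woods::MCP::BootstrapState))

    The retriever and the bootstrap state. The retriever is nil when no embedding provider is configured.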

Raises:

  • (Woods::MCP::BootstrapError)

    on config-invalid (missing credentials, dimension mismatch, unsupported artifact, or a missing artifact under WOODS_REQUIRE_INDEX=1).



# File 'lib/woods/mcp/bootstrapper.rb', line 109

def self.build_retriever(index_dir: nil)
  state = BootstrapState.new
  state.mark(:hydrating)

  artifact = build_artifact(index_dir)
  config, _source = ConfigResolver.resolve(Woods.configuration,
                                           artifact: artifact,
                                           ollama_probe: method(:ollama_reachable?))
  return [nil, state] unless config.embedding_provider

  # Build the provider once so {ResolvedConfig.from_configuration} can
  # probe +provider.dimensions+ — without this, Ollama's runtime-only
  # dimension never makes it into +resolved+ and the downstream
  # Snapshotter.load_or_empty validation compares stored-vs-0.
  #
  # The probe is tolerant: if the provider is unreachable we still
  # need a non-nil +resolved+ so the MCP server can start degraded
  # (see the "provider unreachable" branch below). Snapshotter then
  # surfaces a DimensionMismatch only if there's actually a stored
  # artifact to validate against.
  resolved = build_resolved_config(config)
  state.resolved_config = resolved
  retriever = build_retriever_from_config(config, resolved, artifact)
  probe_and_mark_state(config, state)
  warn "[woods-mcp] semantic search: #{state.status} (#{config.embedding_provider})"

  [retriever, state]
end

.build_retriever_compat(index_dir: nil) ⇒ Object

Backwards-compatible wrapper — existing callers (exe/woods-mcp and exe/woods-mcp-http) just want the retriever. They rescue typed BootstrapError at their own top level; we do not catch here.



# File 'lib/woods/mcp/bootstrapper.rb', line 141

def self.build_retriever_compat(index_dir: nil)
  retriever, _state = build_retriever(index_dir: index_dir)
  retriever
end
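
How an entry point might rescue the typed error at its top level can be sketched as follows. Everything here is illustrative: start_server is a hypothetical function, the build argument stands in for a call to build_retriever_compat, BootstrapError is a local stand-in class, and the log wording is invented. The sketch returns an exit status rather than calling exit so that it stays testable.

```ruby
require 'stringio'

# Local stand-in; the real error class is Woods::MCP::BootstrapError.
class BootstrapError < StandardError; end

# Hypothetical entry point. `build` is any callable returning a retriever or nil.
def start_server(build, err: $stderr)
  retriever = build.call
  # A nil retriever means no embedding provider, so semantic search is off.
  mode = retriever ? 'semantic' : 'pattern/structural'
  err.puts "[woods-mcp] starting in #{mode} mode"
  0
rescue BootstrapError => e
  # One-line operator message, no backtrace.
  err.puts "[woods-mcp] #{e.message}"
  1
end

log = StringIO.new
start_server(-> { nil }, err: log)
start_server(-> { raise BootstrapError, 'dimension mismatch' }, err: log)
puts log.string
```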

.build_snapshot_store(index_dir) ⇒ Woods::Temporal::SnapshotStore, ...

Build a snapshot store for temporal tracking.

Auto-enables when a SQLite database (woods.sqlite3) already exists in the index directory, when WOODS_SNAPSHOTS=true is set, or when Woods.configuration.enable_snapshots is truthy. The database is created and migrated automatically. Falls back to the JSON file store when the sqlite3 gem is unavailable or the SQLite store raises an error.

Parameters:

  • index_dir (String)

    Path to extraction output directory

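Returns:

  • (Woods::Temporal::SnapshotStore, Woods::Temporal::JsonSnapshotStore, nil)

    The SQLite-backed store, the JSON fallback store, or nil when snapshots are not enabled.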



# File 'lib/woods/mcp/bootstrapper.rb', line 54

def self.build_snapshot_store(index_dir)
  db_path = File.join(index_dir, 'woods.sqlite3')
  enabled = ENV['WOODS_SNAPSHOTS'] == 'true' ||
            Woods.configuration.enable_snapshots ||
            File.exist?(db_path)

  return nil unless enabled

  begin
    require 'sqlite3'
    require_relative '../db/migrator'
    require_relative '../temporal/snapshot_store'

    db = SQLite3::Database.new(db_path)
    db.results_as_hash = true

    Woods::Db::Migrator.new(connection: db).migrate!
    Woods::Temporal::SnapshotStore.new(connection: db)
  rescue LoadError
    warn 'Note: sqlite3 gem not available, using JSON file-based snapshot store.'
    require_relative '../temporal/json_snapshot_store'
    Woods::Temporal::JsonSnapshotStore.new(dir: index_dir)
  rescue StandardError => e
    warn "Note: SQLite snapshot store failed (#{e.class}: #{e.message}), using JSON fallback."
    require_relative '../temporal/json_snapshot_store'
    Woods::Temporal::JsonSnapshotStore.new(dir: index_dir)
  end
end
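
The enable decision at the top of the method can be exercised on its own. The sketch below assumes a hypothetical helper, snapshots_enabled?, that takes the environment and the configuration flag as arguments instead of reading ENV and Woods.configuration directly; the three conditions themselves are copied from the listing above.

```ruby
require 'tmpdir'

# Hypothetical helper: same three conditions as build_snapshot_store,
# with env and the config flag injected for testability.
def snapshots_enabled?(index_dir, env: ENV, config_flag: false)
  env['WOODS_SNAPSHOTS'] == 'true' ||
    config_flag ||
    File.exist?(File.join(index_dir, 'woods.sqlite3'))
end

Dir.mktmpdir do |dir|
  puts snapshots_enabled?(dir, env: {})                             # prints false
  puts snapshots_enabled?(dir, env: { 'WOODS_SNAPSHOTS' => 'true' }) # prints true
  # An existing database file enables snapshots with no flag set.
  File.write(File.join(dir, 'woods.sqlite3'), '')
  puts snapshots_enabled?(dir, env: {})                             # prints true
end
```

The environment check is an exact string comparison, so values such as 1 or TRUE do not enable snapshots.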

.ollama_reachable? ⇒ Boolean

Check whether Ollama is reachable at the configured base URL.

Kept for backwards compatibility with existing specs. Delegates to ConfigResolver and is passed as the ollama_probe: callable in build_retriever so that specs stubbing this method continue to intercept Ollama checks in the autodetect path.

New code should use ProviderProbe.reachable! via the ResolvedConfig flow.

Returns:

  • (Boolean)


# File 'lib/woods/mcp/bootstrapper.rb', line 241

def self.ollama_reachable?
  ConfigResolver.send(:ollama_reachable?)
end
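
Why passing the method as a callable keeps spec stubs working can be shown with a small stand-alone sketch. FakeResolver and FakeBootstrapper are hypothetical modules, not the real ConfigResolver or Bootstrapper, and define_singleton_method plays the role a stubbing library would play in a spec.

```ruby
# Hypothetical resolver: it only sees the callable it is handed.
module FakeResolver
  def self.resolve(ollama_probe:)
    ollama_probe.call ? :ollama : :none
  end
end

# Hypothetical bootstrapper: looks up its own method at call time.
module FakeBootstrapper
  def self.ollama_reachable?
    false
  end

  def self.provider
    # method(...) is evaluated on every call, so a replacement
    # installed beforehand is the one that gets passed along.
    FakeResolver.resolve(ollama_probe: method(:ollama_reachable?))
  end
end

puts FakeBootstrapper.provider # prints none

# Simulate a spec stub by replacing the singleton method.
FakeBootstrapper.define_singleton_method(:ollama_reachable?) { true }
puts FakeBootstrapper.provider # prints ollama
```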

.reload_stores!(retriever, index_dir:) ⇒ Hash

Refresh a live retriever’s in-memory stores from the latest dumps on disk. Used by the MCP reload tool so agents can pick up a fresh embed run without restarting the process. The retriever instance is preserved (tool closures kept their reference) — only the stores are mutated.

No-op when:

- retriever is nil (no embedding provider configured)
- stores are durable (pgvector / Qdrant auto-refresh externally)
- woods.json is absent (Shape-1 deployments don't use Snapshotter)

Parameters:

Returns:

  • (Hash)

    Stats — { vectors:, metadata:, graph: } record counts

Raises:



# File 'lib/woods/mcp/bootstrapper.rb', line 160

def self.reload_stores!(retriever, index_dir:)
  return { vectors: 0, metadata: 0, graph: 0 } unless retriever

  artifact = build_artifact(index_dir)
  config, _source = ConfigResolver.resolve(Woods.configuration,
                                           artifact: artifact,
                                           ollama_probe: method(:ollama_reachable?))
  resolved = build_resolved_config(config)

  vectors_count = refill_in_memory_vector_store(retriever, config, resolved, artifact)
  metadata_count = refill_in_memory_metadata_store(retriever, config, resolved, artifact)
  graph_count = refill_in_memory_graph_store(retriever, config, artifact)

  # Context-cache entries from the previous embed run no longer agree
  # with the refreshed stores. Drop them so the next codebase_retrieve
  # call goes through the full pipeline with the new data. Embedding
  # caches (query → vector) survive — that mapping is deterministic
  # for a given provider+model.
  retriever.invalidate_context_cache! if retriever.respond_to?(:invalidate_context_cache!)

  { vectors: vectors_count, metadata: metadata_count, graph: graph_count }
end
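
The point about preserving the retriever instance can be illustrated with a toy model. TinyStore, TinyRetriever, and reload_sketch are hypothetical names invented for this sketch; the real stores and their refill helpers are not shown on this page. The idea is only that mutating a store in place lets a closure created before the reload observe the new data.

```ruby
# Hypothetical in-memory store.
class TinyStore
  def initialize
    @rows = []
  end

  def size
    @rows.size
  end

  # Mutate in place so existing references observe the new contents.
  def refill(rows)
    @rows.replace(rows)
    size
  end
end

TinyRetriever = Struct.new(:vector_store)

# Hypothetical reload: no-op with zero counts when there is no retriever.
def reload_sketch(retriever, fresh_rows)
  return { vectors: 0 } unless retriever

  { vectors: retriever.vector_store.refill(fresh_rows) }
end

retriever = TinyRetriever.new(TinyStore.new)
tool = -> { retriever.vector_store.size } # closure captured before the reload
reload_sketch(retriever, [[0.1, 0.2], [0.3, 0.4]])
puts tool.call # prints 2
```

Had the reload built a new retriever instead, the tool closure would still point at the old, empty store.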

.resolve_index_dir(argv) ⇒ String

Resolve and validate the index directory from CLI args or environment.

Parameters:

  • argv (Array<String>)

    Command-line arguments

Returns:

  • (String)

    Validated index directory path



# File 'lib/woods/mcp/bootstrapper.rb', line 28

def self.resolve_index_dir(argv)
  dir = argv[0] || ENV['WOODS_DIR'] || Dir.pwd

  unless Dir.exist?(dir)
    warn "Error: Index directory does not exist: #{dir}"
    exit 1
  end

  unless File.exist?(File.join(dir, 'manifest.json'))
    warn "Error: No manifest.json found in: #{dir}"
    warn 'Run `bundle exec rake woods:extract` in your Rails app first.'
    exit 1
  end

  dir
end
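
The lookup order in the first line of the method (CLI argument, then WOODS_DIR, then the working directory) can be checked without touching the filesystem. pick_index_dir is a hypothetical helper that mirrors only the precedence and performs none of the validation; env and cwd are injected for the sake of the example.

```ruby
# Hypothetical helper mirroring the precedence only; it does not validate.
def pick_index_dir(argv, env: ENV, cwd: Dir.pwd)
  argv[0] || env['WOODS_DIR'] || cwd
end

puts pick_index_dir(['/srv/index'], env: { 'WOODS_DIR' => '/env/index' }, cwd: '/work')
puts pick_index_dir([], env: { 'WOODS_DIR' => '/env/index' }, cwd: '/work')
puts pick_index_dir([], env: {}, cwd: '/work')
```

Because an empty string is truthy in Ruby, an empty WOODS_DIR would be selected rather than skipped. The real method also calls exit 1 on validation failure, so a spec exercising those branches needs to expect SystemExit.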