Module: Woods::MCP::Bootstrapper

Defined in:
lib/woods/mcp/bootstrapper.rb

Overview

Shared setup logic for MCP server executables.

Validates the index directory, checks for a manifest, and builds an optional retriever for semantic search. This logic would otherwise be duplicated between the stdio and HTTP server entry points.

Class Method Details

.build_retriever(index_dir: nil) ⇒ Array(Woods::Retriever, Woods::MCP::BootstrapState)

Build a retriever for MCP semantic search.

Flow:

1. Wrap the index directory (index_dir, or output_dir when nil) in an IndexArtifact, which owns path semantics.
2. If woods.json is present, resolve config from it; otherwise
   either raise MissingArtifact or, if WOODS_ALLOW_AUTODETECT=1,
   fall back to env-var auto-detect (deprecated path).
3. Build provider + stores from config (no mutation of
   Woods.configuration — the host's initializer stays intact).
4. Hydrate in-memory stores from dumps (stubs in PR 2; real in PR 3).
5. Probe the provider. If reachable, state :hydrated. If unreachable,
   state :degraded — retriever is still returned, queries will
   retry on first use.

Config-invalid failures raise typed BootstrapError subclasses; the top-level rescue in exe/woods-mcp catches them and prints a one-line operator message. Dependency-unreachable failures do not raise: the server starts degraded and the condition surfaces via woods_status.
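The two failure modes described above can be sketched with stand-in classes. Everything prefixed with Sketch or sketch_ is hypothetical and is not part of Woods; the sketch only models steps 2 and 5 (missing artifact raises, unreachable provider degrades).

```ruby
# Hypothetical stand-in for Woods::MCP::BootstrapState; only mark and
# status are modeled.
class SketchState
  attr_reader :status

  def initialize
    @status = :init
  end

  def mark(status)
    @status = status
  end
end

# Hypothetical stand-ins for the typed config-invalid errors.
class SketchBootstrapError < StandardError; end
class SketchMissingArtifact < SketchBootstrapError; end

# artifact_present: whether woods.json exists (step 2).
# probe: callable returning true when the provider is reachable (step 5).
def sketch_build_retriever(artifact_present:, probe:, allow_autodetect: false)
  state = SketchState.new
  state.mark(:hydrating)

  unless artifact_present || allow_autodetect
    raise SketchMissingArtifact, 'woods.json not found and autodetect is off'
  end

  retriever = Object.new # placeholder for the real retriever
  # An unreachable provider does not raise; the retriever is still returned.
  state.mark(probe.call ? :hydrated : :degraded)
  [retriever, state]
end

_retriever, state = sketch_build_retriever(artifact_present: true, probe: -> { false })
puts state.status # prints "degraded"
```

The point of the split is that only configuration problems stop startup; a provider that is merely down leaves the caller with a usable retriever and a state it can report.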

Parameters:

  • index_dir (String, nil) (defaults to: nil)

    Path to the extraction output directory. When nil, uses Woods.configuration.output_dir.

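Returns:

  • (Array(Woods::Retriever, Woods::MCP::BootstrapState))

    The retriever and the bootstrap state. The retriever is nil when no embedding provider is configured.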

Raises:

  • (Woods::MCP::BootstrapError)

    on config-invalid (missing credentials, dimension mismatch, unsupported artifact, missing artifact with autodetect off).



# File 'lib/woods/mcp/bootstrapper.rb', line 108

def self.build_retriever(index_dir: nil)
  state = BootstrapState.new
  state.mark(:hydrating)

  artifact = build_artifact(index_dir)
  config, _source = ConfigResolver.resolve(Woods.configuration,
                                           artifact: artifact,
                                           ollama_probe: method(:ollama_reachable?))
  return [nil, state] unless config.embedding_provider

  # Build the provider once so {ResolvedConfig.from_configuration} can
  # probe +provider.dimensions+ — without this, Ollama's runtime-only
  # dimension never makes it into +resolved+ and the downstream
  # Snapshotter.load_or_empty validation compares stored-vs-0.
  #
  # The probe is tolerant: if the provider is unreachable we still
  # need a non-nil +resolved+ so the MCP server can start degraded
  # (see the "provider unreachable" branch below). Snapshotter then
  # surfaces a DimensionMismatch only if there's actually a stored
  # artifact to validate against.
  resolved = build_resolved_config(config)
  state.resolved_config = resolved
  retriever = build_retriever_from_config(config, resolved, artifact)
  probe_and_mark_state(config, state)
  warn "[woods-mcp] semantic search: #{state.status} (#{config.embedding_provider})"

  [retriever, state]
end

.build_retriever_compat(index_dir: nil) ⇒ Object

Backwards-compatible wrapper — existing callers (exe/woods-mcp and exe/woods-mcp-http) just want the retriever. They rescue typed BootstrapError at their own top level; we do not catch here.
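A minimal sketch of that division of labor, using hypothetical stand-ins (the Sketch classes, sketch_main, and the message format are assumptions, not the real executables): the wrapper drops the state and lets errors propagate, and the entry point rescues the base error class.

```ruby
# Hypothetical stand-ins for the typed bootstrap errors.
class SketchBootstrapError < StandardError; end
class SketchDimensionMismatch < SketchBootstrapError; end

# Stand-in for build_retriever: returns [retriever, state] or raises.
def sketch_build_retriever(fail_with: nil)
  raise fail_with if fail_with

  %i[retriever state]
end

# Mirrors the compat wrapper: discard the state, rescue nothing.
def sketch_build_retriever_compat(**opts)
  retriever, _state = sketch_build_retriever(**opts)
  retriever
end

# What an entry point might do at its top level.
def sketch_main(**opts)
  sketch_build_retriever_compat(**opts)
rescue SketchBootstrapError => e
  warn "[sketch] #{e.class}: #{e.message}" # one-line operator message
  nil
end

sketch_main(fail_with: SketchDimensionMismatch.new('stored 768, provider 1024'))
```

Rescuing the base class at the outermost level keeps the wrapper trivial and means a new error subclass needs no changes in either executable.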



# File 'lib/woods/mcp/bootstrapper.rb', line 140

def self.build_retriever_compat(index_dir: nil)
  retriever, _state = build_retriever(index_dir: index_dir)
  retriever
end

.build_snapshot_store(index_dir) ⇒ Woods::Temporal::SnapshotStore, ...

Build a snapshot store for temporal tracking.

Enabled when a SQLite database already exists in the index directory, when WOODS_SNAPSHOTS=true is set, or when Woods.configuration.enable_snapshots is truthy; otherwise returns nil. The database is created and migrated automatically. Falls back to the JSON file store when the sqlite3 gem is unavailable or SQLite raises an error.
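The enablement check can be isolated as a small predicate. This is a sketch rather than the library's code: sketch_snapshots_enabled? and its env: and config_flag: parameters are hypothetical, introduced so the three conditions can be exercised without touching the real environment or configuration.

```ruby
require 'tmpdir'
require 'fileutils'

# config_flag stands in for Woods.configuration.enable_snapshots.
def sketch_snapshots_enabled?(index_dir, env: ENV, config_flag: false)
  db_path = File.join(index_dir, 'woods.sqlite3')
  env['WOODS_SNAPSHOTS'] == 'true' || config_flag || File.exist?(db_path)
end

Dir.mktmpdir do |dir|
  puts sketch_snapshots_enabled?(dir, env: {})                              # false
  puts sketch_snapshots_enabled?(dir, env: { 'WOODS_SNAPSHOTS' => 'true' }) # true

  # Once a database file exists, snapshots stay on without any flag.
  FileUtils.touch(File.join(dir, 'woods.sqlite3'))
  puts sketch_snapshots_enabled?(dir, env: {})                              # true
end
```

Note that the environment variable is compared against the exact string 'true', so values such as '1' do not enable snapshots.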

Parameters:

  • index_dir (String)

    Path to extraction output directory

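Returns:

  • (Woods::Temporal::SnapshotStore, Woods::Temporal::JsonSnapshotStore, nil)

    The SQLite-backed store, the JSON fallback store, or nil when snapshots are not enabled.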



# File 'lib/woods/mcp/bootstrapper.rb', line 54

def self.build_snapshot_store(index_dir)
  db_path = File.join(index_dir, 'woods.sqlite3')
  enabled = ENV['WOODS_SNAPSHOTS'] == 'true' ||
            Woods.configuration.enable_snapshots ||
            File.exist?(db_path)

  return nil unless enabled

  begin
    require 'sqlite3'
    require_relative '../db/migrator'
    require_relative '../temporal/snapshot_store'

    db = SQLite3::Database.new(db_path)
    db.results_as_hash = true

    Woods::Db::Migrator.new(connection: db).migrate!
    Woods::Temporal::SnapshotStore.new(connection: db)
  rescue LoadError
    warn 'Note: sqlite3 gem not available, using JSON file-based snapshot store.'
    require_relative '../temporal/json_snapshot_store'
    Woods::Temporal::JsonSnapshotStore.new(dir: index_dir)
  rescue StandardError => e
    warn "Note: SQLite snapshot store failed (#{e.class}: #{e.message}), using JSON fallback."
    require_relative '../temporal/json_snapshot_store'
    Woods::Temporal::JsonSnapshotStore.new(dir: index_dir)
  end
end

.ollama_reachable? ⇒ Boolean

Check whether Ollama is reachable at the configured base URL.

Kept for backwards compatibility with existing specs. Delegates to ConfigResolver and is passed as the ollama_probe: callable in build_retriever so that specs stubbing this method continue to intercept Ollama checks in the autodetect path.

New code should use ProviderProbe.reachable! via the ResolvedConfig flow.
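Why passing the probe as a Method object keeps stubs effective can be shown with a toy module. SketchBootstrapper and its methods are hypothetical; the only real API used is Ruby's Object#method and define_singleton_method.

```ruby
module SketchBootstrapper
  def self.reachable?
    false # pretend the real network check failed
  end

  # Stand-in for the resolver: it only ever sees the callable it is given.
  def self.resolve(probe:)
    probe.call ? :ollama : :none
  end

  def self.build
    # method(:reachable?) is looked up when build runs, so a replacement
    # installed beforehand (for example by a spec) is what gets passed on.
    resolve(probe: method(:reachable?))
  end
end

puts SketchBootstrapper.build # prints "none"

# Roughly what a spec stub does: replace the singleton method.
SketchBootstrapper.define_singleton_method(:reachable?) { true }
puts SketchBootstrapper.build # prints "ollama"
```

If the resolver called its own private check directly, a stub on the bootstrapper would never be consulted; injecting the callable is what keeps the older specs working.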

Returns:

  • (Boolean)


# File 'lib/woods/mcp/bootstrapper.rb', line 240

def self.ollama_reachable?
  ConfigResolver.send(:ollama_reachable?)
end

.reload_stores!(retriever, index_dir:) ⇒ Hash

Refresh a live retriever’s in-memory stores from the latest dumps on disk. Used by the MCP reload tool so agents can pick up a fresh embed run without restarting the process. The retriever instance is preserved (tool closures kept their reference) — only the stores are mutated.

No-op when:

- +retriever+ is nil (no embedding provider configured)
- stores are durable (pgvector / Qdrant auto-refresh externally)
- +woods.json+ is absent (Shape-1 deployments don't use Snapshotter)
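The "same instance, refreshed stores" behavior can be sketched as follows. SketchStore, SketchRetriever, and sketch_reload_stores! are hypothetical stand-ins and only model the nil-retriever no-op and the in-place refill; the durable-store and missing-artifact cases are left out.

```ruby
class SketchStore
  def initialize
    @rows = {}
  end

  # Swap contents in place so existing references to the store stay valid.
  def replace_all(rows)
    @rows.replace(rows)
    @rows.size
  end

  def size
    @rows.size
  end
end

class SketchRetriever
  attr_reader :stores

  def initialize
    @stores = { vectors: SketchStore.new, metadata: SketchStore.new, graph: SketchStore.new }
  end
end

# dumps: hash of store name => rows, standing in for the on-disk dumps.
def sketch_reload_stores!(retriever, dumps)
  return { vectors: 0, metadata: 0, graph: 0 } unless retriever

  retriever.stores.to_h do |name, store|
    [name, store.replace_all(dumps.fetch(name, {}))]
  end
end

retriever = SketchRetriever.new
vector_count = -> { retriever.stores[:vectors].size } # captured before the reload

stats = sketch_reload_stores!(retriever, { vectors: { 'a' => [0.1], 'b' => [0.2] } })
puts stats[:vectors]   # 2
puts vector_count.call # 2, the closure sees the refreshed store
```

Mutating the stores rather than building a new retriever is what lets closures created at startup observe the new data without being re-registered.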

Parameters:

Returns:

  • (Hash)

    Stats — { vectors:, metadata:, graph: } record counts

Raises:



# File 'lib/woods/mcp/bootstrapper.rb', line 159

def self.reload_stores!(retriever, index_dir:)
  return { vectors: 0, metadata: 0, graph: 0 } unless retriever

  artifact = build_artifact(index_dir)
  config, _source = ConfigResolver.resolve(Woods.configuration,
                                           artifact: artifact,
                                           ollama_probe: method(:ollama_reachable?))
  resolved = build_resolved_config(config)

  vectors_count = refill_in_memory_vector_store(retriever, config, resolved, artifact)
  # NOTE: the helper name on the next line is reconstructed by analogy with
  # the vector and graph helpers; the original identifier was lost in
  # extraction and should be checked against the source file.
  metadata_count = refill_in_memory_metadata_store(retriever, config, resolved, artifact)
  graph_count = refill_in_memory_graph_store(retriever, config, artifact)

  # Context-cache entries from the previous embed run no longer agree
  # with the refreshed stores. Drop them so the next codebase_retrieve
  # call goes through the full pipeline with the new data. Embedding
  # caches (query → vector) survive, since that mapping is deterministic
  # for a given provider+model.
  retriever.invalidate_context_cache! if retriever.respond_to?(:invalidate_context_cache!)

  { vectors: vectors_count, metadata: metadata_count, graph: graph_count }
end

.resolve_index_dir(argv) ⇒ String

Resolve and validate the index directory from CLI args or environment.

Parameters:

  • argv (Array<String>)

    Command-line arguments

Returns:

  • (String)

    Validated index directory path
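The lookup order (first CLI argument, then WOODS_DIR, then the current directory) and the two validation checks can be tried out with a variant that returns an error message instead of exiting. sketch_resolve_index_dir and its env: and cwd: parameters are hypothetical, added only to make the logic testable.

```ruby
require 'tmpdir'
require 'fileutils'

# Returns [dir, nil] on success or [nil, message] on failure.
def sketch_resolve_index_dir(argv, env: ENV, cwd: Dir.pwd)
  dir = argv[0] || env['WOODS_DIR'] || cwd

  return [nil, "Index directory does not exist: #{dir}"] unless Dir.exist?(dir)

  unless File.exist?(File.join(dir, 'manifest.json'))
    return [nil, "No manifest.json found in: #{dir}"]
  end

  [dir, nil]
end

Dir.mktmpdir do |dir|
  _, error = sketch_resolve_index_dir([dir], env: {})
  puts error # the directory exists but has no manifest.json yet

  FileUtils.touch(File.join(dir, 'manifest.json'))
  resolved, = sketch_resolve_index_dir([], env: { 'WOODS_DIR' => dir })
  puts resolved == dir # true
end
```

Because the argument wins over the environment variable, a bad path given on the command line fails even when WOODS_DIR points at a valid index.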



# File 'lib/woods/mcp/bootstrapper.rb', line 28

def self.resolve_index_dir(argv)
  dir = argv[0] || ENV['WOODS_DIR'] || Dir.pwd

  unless Dir.exist?(dir)
    warn "Error: Index directory does not exist: #{dir}"
    exit 1
  end

  unless File.exist?(File.join(dir, 'manifest.json'))
    warn "Error: No manifest.json found in: #{dir}"
    warn 'Run `bundle exec rake woods:extract` in your Rails app first.'
    exit 1
  end

  dir
end