Class: Woods::Storage::VectorStore::Qdrant

Inherits:
Object
  • Object
Includes:
Interface
Defined in:
lib/woods/storage/qdrant.rb

Overview

Qdrant adapter for vector storage and similarity search via HTTP API.

Communicates with a Qdrant instance over HTTP. Supports optional API key authentication for managed/cloud deployments.

Examples:

# localhost is blocked by default, so the example opts in explicitly.
store = Qdrant.new(url: "http://localhost:6333", collection: "codebase", allow_private_hosts: true)
store.ensure_collection!(dimensions: 768)
store.store("User", [0.1, 0.2, ...], { type: "model" })
results = store.search([0.1, 0.2, ...], limit: 5)

Constant Summary

ALLOWED_SCHEMES =

URL schemes allowed for the Qdrant endpoint. Accepting `file://`, `gopher://`, or any other scheme would let a misconfigured or attacker-controlled config value turn the adapter into an SSRF vector against the host running extraction.

%w[http https].freeze
PRIVATE_IP_RANGES =

IP ranges that fall in loopback, link-local, private, or CGNAT space and should never be contacted as a vector store unless the operator explicitly opts in via `allow_private_hosts: true`.

Covers:

  • IPv4 “this network” / wildcard (0.0.0.0/8)

  • IPv4 loopback (127/8) and RFC1918 (10/8, 172.16/12, 192.168/16)

  • IPv4 link-local 169.254/16 (AWS / Azure / GCP IMDS)

  • IPv4 CGNAT 100.64/10 (common in managed clouds behind NAT)

  • IPv6 loopback (::1) and unspecified (::)

  • IPv6 ULA fc00::/7 (private IPv6 equivalent of RFC1918)

  • IPv6 link-local fe80::/10

NOTE: IPv4-mapped IPv6 (`::ffff:169.254.169.254`) is handled separately in private_host? by detecting the `::ffff:` prefix and extracting the embedded IPv4 portion before range comparison. A blanket `::ffff:0:0/96` range here would (on some Ruby versions, including 3.0) match every IPv4 address due to IPAddr’s cross-family auto-mapping in `#include?`.

[
  '0.0.0.0/8',
  '10.0.0.0/8',
  '127.0.0.0/8',
  '169.254.0.0/16',
  '172.16.0.0/12',
  '192.168.0.0/16',
  '100.64.0.0/10',
  '::/128',
  '::1/128',
  'fc00::/7',
  'fe80::/10'
].map { |cidr| IPAddr.new(cidr) }.freeze
PRIVATE_HOSTNAMES =

Hostnames that always map to loopback regardless of DNS.

%w[localhost localhost. ip6-localhost ip6-loopback].freeze
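
The way these constants could combine into a host check can be sketched as follows. This is a minimal illustration, not the adapter's actual private_host? implementation, which is not shown on this page: the names sketch_private_host?, SKETCH_RANGES, and SKETCH_HOSTNAMES are hypothetical, the sketch only inspects hostnames and IP literals (it performs no DNS resolution), and it uses IPAddr#ipv4_mapped? and IPAddr#native in place of manual `::ffff:` prefix parsing.

```ruby
require 'ipaddr'

# Hypothetical copies of the documented constants, kept local to the sketch.
SKETCH_RANGES = %w[
  0.0.0.0/8 10.0.0.0/8 127.0.0.0/8 169.254.0.0/16 172.16.0.0/12
  192.168.0.0/16 100.64.0.0/10 ::/128 ::1/128 fc00::/7 fe80::/10
].map { |cidr| IPAddr.new(cidr) }.freeze

SKETCH_HOSTNAMES = %w[localhost localhost. ip6-localhost ip6-loopback].freeze

# Returns true when the host is a known loopback name or an IP literal
# inside one of the private ranges. Assumption: no DNS lookup is attempted.
def sketch_private_host?(host)
  name = host.to_s.downcase
  return true if SKETCH_HOSTNAMES.include?(name)

  # URI#host keeps the brackets around IPv6 literals, so strip them.
  literal = name.delete_prefix('[').delete_suffix(']')
  ip = IPAddr.new(literal)

  # Unwrap ::ffff:a.b.c.d so the embedded IPv4 address is what gets compared.
  ip = ip.native if ip.ipv4_mapped?

  # Compare only within the same address family to sidestep cross-family
  # auto-mapping in IPAddr#include? on older Ruby versions.
  SKETCH_RANGES.any? { |range| range.family == ip.family && range.include?(ip) }
rescue IPAddr::Error
  # Not an IP literal (an ordinary DNS name): treated as public in this sketch.
  false
end
```

Under these assumptions, sketch_private_host?('::ffff:169.254.169.254') is caught through the IPv4 link-local range even though no `::ffff:0:0/96` entry exists in the list.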

Methods included from Interface

#bulk_load, #each_entry

Constructor Details

#initialize(url:, collection:, api_key: nil, dimensions: nil, allow_private_hosts: false) ⇒ Qdrant

Returns a new instance of Qdrant.

Parameters:

  • url (String)

    Qdrant server URL

  • collection (String)

    Collection name

  • api_key (String, nil) (defaults to: nil)

    Optional API key for authentication

  • dimensions (Integer, nil) (defaults to: nil)

    Expected vector dimension. When set, #store_batch/#store pre-validate every vector’s length before sending the HTTP request — Qdrant returns a 400 on mismatch, but detecting it client-side avoids wasted network round-trips and keeps error shape consistent with the pgvector adapter.

  • allow_private_hosts (Boolean) (defaults to: false)

    Explicitly permit a URL whose host resolves inside loopback, link-local, or RFC1918 space. Off by default to block the common SSRF footgun. Set to true when the operator intentionally runs Qdrant on `localhost:6333` or inside a private network.



# File 'lib/woods/storage/qdrant.rb', line 83

def initialize(url:, collection:, api_key: nil, dimensions: nil, allow_private_hosts: false)
  @uri = self.class.validate_url!(url, allow_private_hosts: allow_private_hosts)
  @url = url
  @collection = collection
  @api_key = api_key
  @dimensions = dimensions
end

Class Method Details

.validate_url!(url, allow_private_hosts: false) ⇒ Object

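Validate a Qdrant endpoint URL: the scheme must be in ALLOWED_SCHEMES and, unless allow_private_hosts is true, the host must fall outside loopback, link-local, and RFC1918 space. Returns the parsed URI. A malformed URL raises ArgumentError. Public so callers can pre-check configuration before constructing an instance.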



# File 'lib/woods/storage/qdrant.rb', line 94

def self.validate_url!(url, allow_private_hosts: false)
  uri = URI(url)
  validate_scheme!(uri)
  validate_host_present!(uri, url)
  validate_host_visibility!(uri.host.to_s, allow_private_hosts: allow_private_hosts)
  uri
rescue URI::InvalidURIError => e
  raise ArgumentError, "Qdrant URL is not a valid URI: #{e.message}"
end
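
The private helpers validate_scheme! and validate_host_present! are not shown on this page. A minimal sketch of what such checks might look like, assuming they raise ArgumentError like the rescue clause above, is given below. The method name sketch_validate_url!, the constant SKETCH_ALLOWED_SCHEMES, and the error messages are hypothetical, and the host-visibility step is left out.

```ruby
require 'uri'

# Hypothetical local copy of the documented ALLOWED_SCHEMES constant.
SKETCH_ALLOWED_SCHEMES = %w[http https].freeze

# Parses the URL, then rejects disallowed schemes and missing hosts.
# Assumption: every failure surfaces as ArgumentError.
def sketch_validate_url!(url)
  uri = URI(url)

  unless SKETCH_ALLOWED_SCHEMES.include?(uri.scheme.to_s.downcase)
    raise ArgumentError, "Qdrant URL scheme must be http or https, got #{uri.scheme.inspect}"
  end

  raise ArgumentError, "Qdrant URL has no host: #{url.inspect}" if uri.host.to_s.empty?

  uri
rescue URI::InvalidURIError => e
  # Same translation as the documented method: callers see one error class.
  raise ArgumentError, "Qdrant URL is not a valid URI: #{e.message}"
end

# Usage: pre-check a configuration value before constructing the adapter.
begin
  sketch_validate_url!('file:///etc/passwd')
rescue ArgumentError => e
  warn "rejected: #{e.message}"
end
```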

Instance Method Details

#count ⇒ Object

# File 'lib/woods/storage/qdrant.rb', line 286

def count
  response = request(:post, "/collections/#{@collection}/points/count", { exact: true })
  response['result']['count']
end

#delete(id) ⇒ Object

# File 'lib/woods/storage/qdrant.rb', line 274

def delete(id)
  body = { points: [id] }
  request(:post, "/collections/#{@collection}/points/delete", body)
end

#delete_by_filter(filters) ⇒ Object



# File 'lib/woods/storage/qdrant.rb', line 280

def delete_by_filter(filters)
  body = { filter: build_filter(filters) }
  request(:post, "/collections/#{@collection}/points/delete", body)
end

#ensure_collection!(dimensions:) ⇒ Object

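Create the collection if it doesn’t exist. If no dimensions value was passed to the constructor, the value given here is also remembered and used for client-side vector length validation in #store and #store_batch.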

Parameters:

  • dimensions (Integer)

    Vector dimensionality



# File 'lib/woods/storage/qdrant.rb', line 187

def ensure_collection!(dimensions:)
  @dimensions ||= dimensions
  body = {
    vectors: {
      size: dimensions,
      distance: 'Cosine'
    }
  }
  request(:put, "/collections/#{@collection}", body)
end
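
The adapter's private request helper is not shown on this page. As a rough sketch of the HTTP call that ensure_collection! corresponds to, the following builds (without sending) a PUT request using Ruby's standard Net::HTTP. The helper name sketch_create_collection_request is hypothetical. The `api-key` header is the one Qdrant uses for API key authentication, but whether the adapter sets it in exactly this way is an assumption.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds the create-collection request but does not send it.
def sketch_create_collection_request(base_url, collection, dimensions, api_key: nil)
  uri = URI("#{base_url.chomp('/')}/collections/#{collection}")

  req = Net::HTTP::Put.new(uri)
  req['Content-Type'] = 'application/json'
  # Only attach the key when one is configured (managed/cloud deployments).
  req['api-key'] = api_key if api_key
  # Same body shape as the method above: vector size plus cosine distance.
  req.body = JSON.generate({ vectors: { size: dimensions, distance: 'Cosine' } })
  req
end

req = sketch_create_collection_request(
  'https://qdrant.example.com:6333', 'codebase', 768, api_key: 'secret'
)
puts "#{req.method} #{req.path}"
puts req.body
```

Sending the request would then be a matter of passing it to Net::HTTP.start(host, port, use_ssl: ...) { |http| http.request(req) }.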

#search(query_vector, limit: 10, filters: {}) ⇒ Array<SearchResult>

Search for similar vectors.

Parameters:

  • query_vector (Array<Float>)

    The query embedding

  • limit (Integer) (defaults to: 10)

    Maximum results to return

  • filters (Hash) (defaults to: {})

    Metadata key-value filters

Returns:

  • (Array<SearchResult>)

    Results sorted by descending similarity

# File 'lib/woods/storage/qdrant.rb', line 253

def search(query_vector, limit: 10, filters: {})
  body = {
    vector: query_vector,
    limit: limit,
    with_payload: true
  }
  body[:filter] = build_filter(filters) unless filters.empty?

  response = request(:post, "/collections/#{@collection}/points/search", body)
  results = response['result'] || []

  results.map do |hit|
    SearchResult.new(
      id: hit['id'],
      score: hit['score'],
      metadata: hit['payload']
    )
  end
end
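
The private build_filter helper is not documented here. One plausible translation of key-value filters into Qdrant's filter syntax, where every pair becomes an exact-match condition under `must`, is sketched below. The name sketch_build_filter is hypothetical and the adapter's real mapping may differ, for example in how it handles arrays or nested keys.

```ruby
# Hypothetical stand-in for the adapter's private build_filter helper.
# Assumption: every key-value pair becomes an exact-match "must" condition.
def sketch_build_filter(filters)
  conditions = filters.map do |key, value|
    { key: key.to_s, match: { value: value } }
  end
  { must: conditions }
end

# Assemble a search body the same way #search does.
filters = { type: 'model' }
body = { vector: [0.1, 0.2, 0.3], limit: 5, with_payload: true }
body[:filter] = sketch_build_filter(filters) unless filters.empty?

p body[:filter]
```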

#store(id, vector, metadata = {}) ⇒ Object

Store or update a vector with metadata payload.

Parameters:

  • id (String)

    Unique identifier

  • vector (Array<Float>)

    The embedding vector

  • metadata (Hash) (defaults to: {})

    Optional payload metadata

# File 'lib/woods/storage/qdrant.rb', line 204

def store(id, vector, metadata = {})
  validate_dimensions!(vector) if @dimensions
  body = {
    points: [
      {
        id: id,
        vector: vector,
        payload: metadata
      }
    ]
  }
  request(:put, "/collections/#{@collection}/points", body)
end
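
The validate_dimensions! helper called above is private and not shown. A minimal sketch of a client-side length check of this kind follows. The names sketch_validate_dimensions! and SketchDimensionError are hypothetical stand-ins (the documentation for #store_batch states that the adapter raises Woods::Error), and the message format is an assumption.

```ruby
# Hypothetical error class standing in for Woods::Error.
class SketchDimensionError < StandardError; end

# Raises when the vector length differs from the expected dimension.
# The optional index identifies the offending entry in a batch.
def sketch_validate_dimensions!(vector, expected, index: nil)
  return if vector.length == expected

  where = index ? " at entry #{index}" : ''
  raise SketchDimensionError,
        "vector#{where} has #{vector.length} dimensions, expected #{expected}"
end

# Usage: the mismatch is caught before any HTTP request would be built.
begin
  sketch_validate_dimensions!([0.1, 0.2], 3, index: 7)
rescue SketchDimensionError => e
  warn e.message
end
```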

#store_batch(entries) ⇒ Object

Store multiple vectors in a single batch upsert request.

Sends the entire entries array in one HTTP call. Callers are responsible for chunking into reasonable batch sizes (e.g., 100–500 points) before calling this method; the embedding Indexer’s batch_size config controls the upstream chunk size.

Parameters:

  • entries (Array<Hash>)

    Each entry has :id, :vector, :metadata keys

Raises:

  • (Woods::Error)

    if any entry’s vector doesn’t match the configured dimension. Validation runs before the HTTP request so partial-batch state is impossible on dimension mismatch.



# File 'lib/woods/storage/qdrant.rb', line 229

def store_batch(entries)
  return if entries.empty?

  if @dimensions
    entries.each_with_index do |entry, idx|
      validate_dimensions!(entry[:vector], index: idx)
    end
  end

  body = {
    points: entries.map do |entry|
      { id: entry[:id], vector: entry[:vector], payload: entry[:metadata] || {} }
    end
  }
  request(:put, "/collections/#{@collection}/points", body)
end
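
Because store_batch sends everything in one request, callers that do not go through the Indexer need to chunk on their own. A minimal sketch using Enumerable#each_slice is shown below. SketchStore is a hypothetical stand-in that only records batch sizes so the example runs without a Qdrant server, and the chunk size of 200 is an arbitrary pick from the 100–500 range suggested above.

```ruby
# Hypothetical stand-in for the adapter: records batch sizes instead of
# issuing HTTP requests.
class SketchStore
  attr_reader :calls

  def initialize
    @calls = []
  end

  def store_batch(entries)
    return if entries.empty?

    @calls << entries.size
  end
end

# Splits entries into fixed-size chunks and upserts each chunk separately.
def sketch_store_in_chunks(store, entries, chunk_size: 200)
  entries.each_slice(chunk_size) { |chunk| store.store_batch(chunk) }
end

entries = Array.new(450) do |i|
  { id: "unit-#{i}", vector: [0.0, 0.0, 0.0], metadata: { type: 'model' } }
end

store = SketchStore.new
sketch_store_in_chunks(store, entries)
p store.calls
```

With a real adapter instance in place of SketchStore, each chunk becomes its own upsert request, so a failure affects at most one chunk rather than the whole load.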