Class: Woods::Storage::VectorStore::Qdrant
- Inherits: Object
  - Object
  - Woods::Storage::VectorStore::Qdrant
- Includes: Interface
- Defined in: lib/woods/storage/qdrant.rb
Overview
Qdrant adapter for vector storage and similarity search via HTTP API.
Communicates with a Qdrant instance over HTTP. Supports optional API key authentication for managed/cloud deployments.
Constant Summary
- ALLOWED_SCHEMES =
URL schemes allowed for the Qdrant endpoint. Accepting `file://`, `gopher://`, or any other scheme would let a misconfigured or attacker-controlled config value turn the adapter into an SSRF vector against the host running extraction.
%w[http https].freeze
- PRIVATE_IP_RANGES =
IP ranges that always resolve to loopback, link-local, private, or CGNAT space and should never be contacted as a vector store unless the operator explicitly opts in via `allow_private_hosts: true`.
Covers:
- IPv4 “this network” / wildcard (0.0.0.0/8)
- IPv4 loopback, RFC1918 (10/8, 172.16/12, 192.168/16)
- IPv4 link-local 169.254/16 (AWS / Azure / GCP IMDS)
- IPv4 CGNAT 100.64/10 (common in managed clouds behind NAT)
- IPv6 loopback (::1) and unspecified (::)
- IPv6 ULA fc00::/7 (private IPv6 equivalent of RFC1918)
- IPv6 link-local fe80::/10
NOTE: IPv4-mapped IPv6 (`::ffff:169.254.169.254`) is handled separately in private_host? by detecting the `::ffff:` prefix and extracting the embedded IPv4 portion before range comparison. A blanket `::ffff:0:0/96` range here would (on some Ruby versions, including 3.0) match every IPv4 address due to IPAddr’s cross-family auto-mapping in `#include?`.
    [
      '0.0.0.0/8', '10.0.0.0/8', '127.0.0.0/8', '169.254.0.0/16',
      '172.16.0.0/12', '192.168.0.0/16', '100.64.0.0/10',
      '::/128', '::1/128', 'fc00::/7', 'fe80::/10'
    ].map { |cidr| IPAddr.new(cidr) }.freeze
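The range comparison described in the note can be sketched roughly as follows. This is an illustration, not the adapter's `private_host?`: the method name `private_ip?` and the shortened `SKETCH_RANGES` list are hypothetical, and the sketch unwraps IPv4-mapped addresses with `IPAddr#ipv4_mapped?` and `IPAddr#native` instead of the string-prefix detection the note mentions.

```ruby
require 'ipaddr'

# Hypothetical, shortened copy of the range list, for illustration only.
SKETCH_RANGES = %w[
  10.0.0.0/8 127.0.0.0/8 169.254.0.0/16 ::1/128 fc00::/7
].map { |cidr| IPAddr.new(cidr) }.freeze

# Returns true when +host+ is an IP literal inside one of SKETCH_RANGES.
# Strings that are not IP literals (ordinary hostnames) return false here.
def private_ip?(host)
  ip = IPAddr.new(host)
  # Unwrap ::ffff:a.b.c.d so the embedded IPv4 address is what gets compared.
  ip = ip.native if ip.ipv4_mapped?
  # Compare only against ranges of the same address family, so the result
  # does not depend on how a given IPAddr version treats mixed families.
  SKETCH_RANGES.any? { |range| range.family == ip.family && range.include?(ip) }
rescue IPAddr::Error
  false
end
```

With this sketch, `private_ip?('::ffff:169.254.169.254')` is treated the same way as `private_ip?('169.254.169.254')`, which is the property the note is concerned with.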
- PRIVATE_HOSTNAMES =
Hostnames that always map to loopback regardless of DNS.
%w[localhost localhost. ip6-localhost ip6-loopback].freeze
Class Method Summary
- .validate_url!(url, allow_private_hosts: false) ⇒ Object
  Validate a Qdrant endpoint URL: the scheme must be in ALLOWED_SCHEMES and, unless `allow_private_hosts: true` is passed, the host must fall outside loopback, link-local, and RFC1918 space.
Instance Method Summary
- #count ⇒ Object
- #delete(id) ⇒ Object
- #delete_by_filter(filters) ⇒ Object
- #ensure_collection!(dimensions:) ⇒ Object
  Create the collection if it doesn’t exist.
- #initialize(url:, collection:, api_key: nil, dimensions: nil, allow_private_hosts: false) ⇒ Qdrant (constructor)
  A new instance of Qdrant.
- #search(query_vector, limit: 10, filters: {}) ⇒ Array<SearchResult>
  Search for similar vectors.
- #store(id, vector, metadata = {}) ⇒ Object
  Store or update a vector with metadata payload.
- #store_batch(entries) ⇒ Object
  Store multiple vectors in a single batch upsert request.
Methods included from Interface
Constructor Details
#initialize(url:, collection:, api_key: nil, dimensions: nil, allow_private_hosts: false) ⇒ Qdrant
Returns a new instance of Qdrant.
    # File 'lib/woods/storage/qdrant.rb', line 83
    def initialize(url:, collection:, api_key: nil, dimensions: nil, allow_private_hosts: false)
      @uri = self.class.validate_url!(url, allow_private_hosts: allow_private_hosts)
      @url = url
      @collection = collection
      @api_key = api_key
      @dimensions = dimensions
    end
Class Method Details
.validate_url!(url, allow_private_hosts: false) ⇒ Object
Validate a Qdrant endpoint URL: the scheme must be in ALLOWED_SCHEMES and, unless `allow_private_hosts: true` is passed, the host must fall outside loopback, link-local, and RFC1918 space. Returns the parsed URI. Public so callers can pre-check configuration before constructing an instance.
    # File 'lib/woods/storage/qdrant.rb', line 94
    def self.validate_url!(url, allow_private_hosts: false)
      uri = URI(url)
      validate_scheme!(uri)
      validate_host_present!(uri, url)
      validate_host_visibility!(uri.host.to_s, allow_private_hosts: allow_private_hosts)
      uri
    rescue URI::InvalidURIError => e
      raise ArgumentError, "Qdrant URL is not a valid URI: #{e.message}"
    end
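The scheme and host-presence steps might look something like the sketch below. The private helpers `validate_scheme!` and `validate_host_present!` are not shown on this page, so `check_endpoint!`, `SKETCH_SCHEMES`, and the error messages are hypothetical stand-ins, and the host-visibility step is left out entirely.

```ruby
require 'uri'

# Hypothetical copy of the allowed-scheme list, for illustration only.
SKETCH_SCHEMES = %w[http https].freeze

# Hypothetical stand-in for the scheme and host-presence checks.
# Returns the parsed URI, or raises ArgumentError describing the problem.
def check_endpoint!(url)
  uri = URI(url)
  unless SKETCH_SCHEMES.include?(uri.scheme)
    raise ArgumentError, "unsupported scheme: #{uri.scheme.inspect}"
  end
  raise ArgumentError, "URL has no host: #{url}" if uri.host.to_s.empty?

  uri
rescue URI::InvalidURIError => e
  # Unparseable input surfaces as the same error class as the other checks.
  raise ArgumentError, "not a valid URI: #{e.message}"
end
```

Raising a single error class for every failure lets a caller pre-check configuration with one `rescue ArgumentError`.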
Instance Method Details
#count ⇒ Object
    # File 'lib/woods/storage/qdrant.rb', line 286
    def count
      response = request(:post, "/collections/#{@collection}/points/count", { exact: true })
      response['result']['count']
    end
#delete(id) ⇒ Object
    # File 'lib/woods/storage/qdrant.rb', line 274
    def delete(id)
      body = { points: [id] }
      request(:post, "/collections/#{@collection}/points/delete", body)
    end
#delete_by_filter(filters) ⇒ Object
    # File 'lib/woods/storage/qdrant.rb', line 280
    def delete_by_filter(filters)
      body = { filter: build_filter(filters) }
      request(:post, "/collections/#{@collection}/points/delete", body)
    end
#ensure_collection!(dimensions:) ⇒ Object
Create the collection if it doesn’t exist.
    # File 'lib/woods/storage/qdrant.rb', line 187
    def ensure_collection!(dimensions:)
      @dimensions ||= dimensions
      body = {
        vectors: {
          size: dimensions,
          distance: 'Cosine'
        }
      }
      request(:put, "/collections/#{@collection}", body)
    end
#search(query_vector, limit: 10, filters: {}) ⇒ Array<SearchResult>
Search for similar vectors.
    # File 'lib/woods/storage/qdrant.rb', line 253
    def search(query_vector, limit: 10, filters: {})
      body = {
        vector: query_vector,
        limit: limit,
        with_payload: true
      }
      body[:filter] = build_filter(filters) unless filters.empty?

      response = request(:post, "/collections/#{@collection}/points/search", body)
      results = response['result'] || []

      results.map do |hit|
        SearchResult.new(
          id: hit['id'],
          score: hit['score'],
          metadata: hit['payload']
        )
      end
    end
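To make the result mapping concrete, here is a small standalone sketch of the last step of `search`. `SketchResult` is a hypothetical stand-in for the real `SearchResult` class (assumed here to accept `id:`, `score:`, and `metadata:` keywords, as the listing suggests), and the JSON string is made-up sample data, not output captured from Qdrant.

```ruby
require 'json'

# Hypothetical stand-in for the adapter's SearchResult value object.
SketchResult = Struct.new(:id, :score, :metadata, keyword_init: true)

# Made-up response body in the shape the mapping code reads:
# a top-level 'result' array whose hits carry 'id', 'score', and 'payload'.
raw = '{"result":[{"id":1,"score":0.92,"payload":{"type":"model"}}]}'

# Mirrors the mapping step: a missing 'result' key yields an empty array.
def map_hits(response)
  (response['result'] || []).map do |hit|
    SketchResult.new(id: hit['id'], score: hit['score'], metadata: hit['payload'])
  end
end

results = map_hits(JSON.parse(raw))
```

Each hit's `payload` becomes the result's `metadata`, so whatever was passed as metadata to `store` is what a caller gets back from `search`.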
#store(id, vector, metadata = {}) ⇒ Object
Store or update a vector with metadata payload.
    # File 'lib/woods/storage/qdrant.rb', line 204
    def store(id, vector, metadata = {})
      validate_dimensions!(vector) if @dimensions
      body = {
        points: [
          {
            id: id,
            vector: vector,
            payload: metadata
          }
        ]
      }
      request(:put, "/collections/#{@collection}/points", body)
    end
#store_batch(entries) ⇒ Object
Store multiple vectors in a single batch upsert request.
Sends the entire entries array in one HTTP call. Callers are responsible for chunking into reasonable batch sizes (e.g., 100–500 points) before calling this method; the embedding Indexer’s batch_size config controls the upstream chunk size.
    # File 'lib/woods/storage/qdrant.rb', line 229
    def store_batch(entries)
      return if entries.empty?

      if @dimensions
        entries.each_with_index do |entry, idx|
          validate_dimensions!(entry[:vector], index: idx)
        end
      end

      body = {
        points: entries.map do |entry|
          { id: entry[:id], vector: entry[:vector], payload: entry[:metadata] || {} }
        end
      }
      request(:put, "/collections/#{@collection}/points", body)
    end
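Since chunking is left to the caller, one way to do it is with `Enumerable#each_slice`. The sketch below is an assumption about how a caller might wrap `store_batch`: `store_in_chunks` is a hypothetical helper, and `RecordingStore` is a stand-in that only records the batches it receives, so the example runs without a Qdrant instance.

```ruby
# Hypothetical stand-in for the adapter: records each batch instead of
# sending an HTTP request.
class RecordingStore
  attr_reader :batches

  def initialize
    @batches = []
  end

  def store_batch(entries)
    return if entries.empty?

    @batches << entries
  end
end

# Hypothetical caller-side helper: one store_batch call per chunk.
def store_in_chunks(store, entries, batch_size: 100)
  entries.each_slice(batch_size) { |chunk| store.store_batch(chunk) }
end

# 250 entries of the shape store_batch reads (:id, :vector, :metadata).
entries = Array.new(250) { |i| { id: i, vector: [0.0, 1.0], metadata: { n: i } } }
store = RecordingStore.new
store_in_chunks(store, entries, batch_size: 100)
```

With 250 entries and a batch size of 100, the stand-in receives three batches of 100, 100, and 50 entries.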