Class: Parse::Embeddings::LocalHTTP
Defined in: lib/parse/embeddings/local_http.rb
Overview
Generic OpenAI-compatible local embedding provider. Talks to any
server that exposes POST <base_url>/embeddings with the OpenAI
request/response shape — covers Ollama (/v1), LM Studio (/v1),
vLLM, llama.cpp's server, and any reverse-proxy that translates
to a local model runner.
== SSRF gate
The base_url is resolved at construction time and the resolved
addresses are checked against File::BLOCKED_CIDRS
(loopback, RFC1918, link-local, cloud-metadata, CGNAT, IPv6 ULA,
…). When ANY resolved address falls in a private/internal range,
the constructor refuses unless the caller opts in via
allow_private_endpoint: true.
The opt-in is a deliberate, audit-able gate — Parse::Embeddings
registration is configuration code, not user input, so opting in
to "yes, this base_url really is my Ollama on localhost" is a
one-line decision by the operator at boot time. A Kernel#warn
fires when the opt-in is taken so the choice shows up in operator
logs / bundle exec rake about output.
http:// base URLs are accepted with allow_private_endpoint: true
(the typical local-runner deployment), and refused otherwise unless
the caller also passes allow_insecure_base_url: true (escape
hatch for self-signed internal HTTPS proxies fronted by http://).
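The address check described above can be approximated with Ruby's standard `ipaddr` library. This is a minimal sketch, not the library's implementation: `BLOCKED_CIDRS_EXAMPLE` and `private_address?` are hypothetical names, and the CIDR list is an illustrative subset rather than the actual contents of File::BLOCKED_CIDRS. DNS resolution is left out, so the helper takes already-resolved address strings.

```ruby
require "ipaddr"

# Illustrative subset only (assumption); the real list is File::BLOCKED_CIDRS.
BLOCKED_CIDRS_EXAMPLE = %w[
  127.0.0.0/8
  10.0.0.0/8
  172.16.0.0/12
  192.168.0.0/16
  169.254.0.0/16
  100.64.0.0/10
  ::1/128
  fc00::/7
  fe80::/10
].map { |cidr| IPAddr.new(cidr) }.freeze

# Hypothetical helper: true when ANY resolved address falls in a blocked range.
def private_address?(addresses)
  addresses.any? do |addr|
    ip = IPAddr.new(addr)
    BLOCKED_CIDRS_EXAMPLE.any? { |block| block.include?(ip) }
  end
end

private_address?(["93.184.216.34"])              # => false
private_address?(["93.184.216.34", "127.0.0.1"]) # => true, one loopback hit is enough
```

Checking every resolved address, rather than only the first, matters because a hostname can resolve to a mix of public and internal addresses.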
== Why no fixed model whitelist
Ollama, LM Studio, and vLLM all serve operator-chosen models —
we cannot enumerate "supported" models the way OpenAI can. The
constructor instead takes the dimensions: explicitly, and the
provider's Provider#validate_response! (inherited) enforces that every
returned vector matches that width. Mis-specified dimensions
surface as InvalidResponseError on the first embed call.
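To illustrate the width check described above, here is a minimal sketch of what such a validation could look like. It is not the inherited Provider#validate_response!; `check_width!` and `ExampleInvalidResponseError` are hypothetical names used only for this example.

```ruby
# Hypothetical stand-in for the library's InvalidResponseError.
class ExampleInvalidResponseError < StandardError; end

# Hypothetical helper: raise unless every vector has exactly `dimensions` entries.
def check_width!(vectors, dimensions)
  vectors.each_with_index do |vec, i|
    next if vec.is_a?(Array) && vec.length == dimensions

    got = vec.is_a?(Array) ? vec.length : vec.class
    raise ExampleInvalidResponseError,
          "vector #{i} has width #{got}, expected #{dimensions}"
  end
  vectors
end

check_width!([[0.1, 0.2, 0.3]], 3) # passes and returns the vectors
```

A provider configured with the wrong dimensions: would fail a check like this on its first response, which is the behavior the paragraph above describes.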
== Security
Defined Under Namespace
Classes: AuthenticationError, BadRequestError, RateLimitError, TransientError
Constant Summary
- DEFAULT_TIMEOUT = 30
- DEFAULT_OPEN_TIMEOUT = 5
- DEFAULT_MAX_RETRIES = 3
- DEFAULT_BATCH_SIZE = 32
- MAX_RESPONSE_BYTES = 16 * 1024 * 1024
Constants inherited from Provider
Provider::AS_NOTIFICATION_NAME
Instance Method Summary
- #backoff_seconds(attempt) ⇒ Object (protected)
- #build_connection ⇒ Object (protected)
- #dimensions ⇒ Object
- #embed_batch_size ⇒ Object
- #embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
  Vectors aligned 1:1 with strings.
- #extract_vectors!(payload, input_count) ⇒ Object (protected)
  Accept the OpenAI-compatible shape.
- #initialize(base_url:, model:, dimensions:, api_key: nil, normalize: false, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_private_endpoint: false, allow_insecure_base_url: false, allow_faraday_proxy: false, connection: nil) ⇒ LocalHTTP (constructor)
  A new instance of LocalHTTP.
- #inspect_attrs ⇒ Object
- #model_name ⇒ Object
- #normalize? ⇒ Boolean
- #parse_json_body!(body) ⇒ Object (protected)
- #post_embeddings(body) ⇒ Object (protected)
- #retry_after_seconds(response) ⇒ Object (protected)
- #supports_input_type? ⇒ Boolean
Methods inherited from Provider
#embed_image, #embed_text_batched, #inspect, #instrument_embed, #max_input_tokens, #modalities, #validate_response!
Constructor Details
#initialize(base_url:, model:, dimensions:, api_key: nil, normalize: false, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_private_endpoint: false, allow_insecure_base_url: false, allow_faraday_proxy: false, connection: nil) ⇒ LocalHTTP
Returns a new instance of LocalHTTP.
    # File 'lib/parse/embeddings/local_http.rb', line 114

    def initialize(
      base_url:,
      model:,
      dimensions:,
      api_key: nil,
      normalize: false,
      timeout: DEFAULT_TIMEOUT,
      open_timeout: DEFAULT_OPEN_TIMEOUT,
      max_retries: DEFAULT_MAX_RETRIES,
      embed_batch_size: DEFAULT_BATCH_SIZE,
      allow_private_endpoint: false,
      allow_insecure_base_url: false,
      allow_faraday_proxy: false,
      connection: nil
    )
      validate_model!(model)
      validate_dimensions!(dimensions)
      validate_optional_api_key!(api_key)
      unless [true, false].include?(normalize)
        raise ArgumentError,
              "Parse::Embeddings::LocalHTTP: normalize must be true or false (got #{normalize.inspect})."
      end
      validate_positive_integer!(:timeout, timeout)
      validate_positive_integer!(:open_timeout, open_timeout)
      validate_non_negative_integer!(:max_retries, max_retries)
      validate_positive_integer!(:embed_batch_size, embed_batch_size)

      sanitized_base_url, resolved_addrs, is_private =
        validate_base_url_and_gate_ssrf!(base_url,
                                         allow_private_endpoint: allow_private_endpoint,
                                         allow_insecure_base_url: allow_insecure_base_url)

      if is_private
        # Audit log. Emits once per instance — Kernel#warn so it lands
        # on stderr and any logger that captures it. Operators running
        # a hardened environment can grep this to confirm every
        # private-endpoint opt-in was intentional.
        warn "Parse::Embeddings::LocalHTTP: allow_private_endpoint=true for #{sanitized_base_url} — " \
             "resolved to private address(es) #{resolved_addrs.map(&:to_s).inspect}."
      end

      @base_url = sanitized_base_url
      @model = model
      @dimensions = dimensions
      @api_key = api_key
      @normalize = normalize
      @timeout = timeout
      @open_timeout = open_timeout
      @max_retries = max_retries
      @embed_batch_size = embed_batch_size
      @allow_faraday_proxy = allow_faraday_proxy
      @connection = connection || build_connection
    end
Instance Method Details
#backoff_seconds(attempt) ⇒ Object (protected)
    # File 'lib/parse/embeddings/local_http.rb', line 355

    def backoff_seconds(attempt)
      [0.5 * (2**(attempt - 1)), 30.0].min
    end
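To make the resulting delay schedule concrete, the snippet below copies the formula into a standalone method and evaluates it for the first eight attempts. Nothing here is new library API; the top-level method is only a copy for illustration.

```ruby
# Same formula as backoff_seconds, copied so the snippet runs standalone.
def backoff_seconds(attempt)
  [0.5 * (2**(attempt - 1)), 30.0].min
end

schedule = (1..8).map { |attempt| backoff_seconds(attempt) }
# => [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

With the default DEFAULT_MAX_RETRIES of 3, the retry loop in #post_embeddings sleeps 0.5, 1.0, and 2.0 seconds (3.5 seconds in total) before the fourth failed attempt raises. The 30-second cap only comes into play from the seventh attempt onward.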
#build_connection ⇒ Object (protected)
    # File 'lib/parse/embeddings/local_http.rb', line 234

    def build_connection
      headers = {
        "Content-Type" => "application/json",
        "Accept" => "application/json",
        "User-Agent" => "parse-stack-embeddings/#{user_agent_version}",
      }
      headers["Authorization"] = "Bearer #{@api_key}" if @api_key
      faraday_opts = { url: @base_url, headers: headers }
      faraday_opts[:proxy] = nil unless @allow_faraday_proxy
      conn = Faraday.new(**faraday_opts) do |f|
        f.options.timeout = @timeout
        f.options.open_timeout = @open_timeout
        f.adapter Faraday.default_adapter
      end
      conn.proxy = nil if !@allow_faraday_proxy && conn.respond_to?(:proxy=)
      conn
    end
#dimensions ⇒ Object
    # File 'lib/parse/embeddings/local_http.rb', line 167

    def dimensions
      @dimensions
    end
#embed_batch_size ⇒ Object
    # File 'lib/parse/embeddings/local_http.rb', line 175

    def embed_batch_size
      @embed_batch_size
    end
#embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
Returns vectors aligned 1:1 with strings.
    # File 'lib/parse/embeddings/local_http.rb', line 196

    def embed_text(strings, input_type: :search_document)
      unless strings.is_a?(Array)
        raise ArgumentError,
              "Parse::Embeddings::LocalHTTP#embed_text expects Array<String> (got #{strings.class})."
      end
      return [] if strings.empty?
      strings.each_with_index do |s, i|
        unless s.is_a?(String)
          raise ArgumentError,
                "Parse::Embeddings::LocalHTTP#embed_text strings[#{i}] is not a String (#{s.class})."
        end
        if s.empty?
          raise ArgumentError,
                "Parse::Embeddings::LocalHTTP#embed_text strings[#{i}] is empty; local runners typically reject empty inputs."
        end
      end

      body = { input: strings, model: @model }

      instrument_embed(strings.length, input_type) do |emit_payload|
        payload = post_embeddings(body)
        # Local runners may or may not include `usage`. When present,
        # forward total_tokens to the AS::N payload.
        if payload.is_a?(Hash) && payload["usage"].is_a?(Hash)
          tt = payload["usage"]["total_tokens"]
          emit_payload[:total_tokens] = tt if tt.is_a?(Integer) && tt >= 0
        end
        vectors = extract_vectors!(payload, strings.length)
        validate_response!(strings.length, vectors)
      end
    end
#extract_vectors!(payload, input_count) ⇒ Object (protected)
Accept the OpenAI-compatible shape. Some local runners omit
index and return data in request order; tolerate both forms by
reordering on index only when every entry carries an integer
index, and falling back to positional alignment otherwise.
    # File 'lib/parse/embeddings/local_http.rb', line 315

    def extract_vectors!(payload, input_count)
      unless payload.is_a?(Hash)
        raise InvalidResponseError,
              "Parse::Embeddings::LocalHTTP: response body is not a JSON object."
      end
      data = payload["data"]
      unless data.is_a?(Array)
        raise InvalidResponseError,
              "Parse::Embeddings::LocalHTTP: response.data is not an Array."
      end
      if data.length != input_count
        raise InvalidResponseError,
              "Parse::Embeddings::LocalHTTP: response.data.length #{data.length} != input count #{input_count}."
      end

      all_have_index = data.all? { |e| e.is_a?(Hash) && e["index"].is_a?(Integer) }
      if all_have_index
        sorted = data.map do |entry|
          idx = entry["index"]
          unless idx >= 0 && idx < input_count
            raise InvalidResponseError,
                  "Parse::Embeddings::LocalHTTP: response.data entry index #{idx} out of range."
          end
          [idx, entry["embedding"]]
        end
        if sorted.map(&:first).uniq.length != sorted.length
          raise InvalidResponseError,
                "Parse::Embeddings::LocalHTTP: duplicate index in response.data."
        end
        sorted.sort_by(&:first).map(&:last)
      else
        data.each_with_index.map do |entry, i|
          unless entry.is_a?(Hash)
            raise InvalidResponseError,
                  "Parse::Embeddings::LocalHTTP: response.data[#{i}] is not a JSON object."
          end
          entry["embedding"]
        end
      end
    end
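The realignment step can be tried in isolation. The sketch below uses a made-up response payload (the vectors and their order are assumptions for illustration) and reproduces only the index-or-positional decision, leaving out the range and duplicate checks that the method performs.

```ruby
require "json"

# Hypothetical response: entries arrive out of order but carry `index`.
payload = JSON.parse(<<~JSON)
  {"data": [
    {"index": 1, "embedding": [0.3, 0.4]},
    {"index": 0, "embedding": [0.1, 0.2]}
  ]}
JSON

data = payload["data"]
all_have_index = data.all? { |e| e.is_a?(Hash) && e["index"].is_a?(Integer) }

vectors =
  if all_have_index
    # Reorder by the server-reported index.
    data.sort_by { |e| e["index"] }.map { |e| e["embedding"] }
  else
    # Positional fallback: trust request order.
    data.map { |e| e["embedding"] }
  end
# vectors => [[0.1, 0.2], [0.3, 0.4]]
```

If any single entry lacked an integer `index`, the whole response would be read positionally, so a partially indexed response is never half-sorted.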
#inspect_attrs ⇒ Object
    # File 'lib/parse/embeddings/local_http.rb', line 228

    def inspect_attrs
      super.merge(base: safe_base_host, retries: @max_retries)
    end
#model_name ⇒ Object
    # File 'lib/parse/embeddings/local_http.rb', line 171

    def model_name
      @model
    end
#normalize? ⇒ Boolean
    # File 'lib/parse/embeddings/local_http.rb', line 179

    def normalize?
      @normalize
    end
#parse_json_body!(body) ⇒ Object (protected)
    # File 'lib/parse/embeddings/local_http.rb', line 298

    def parse_json_body!(body)
      s = body.to_s
      if s.bytesize > MAX_RESPONSE_BYTES
        raise InvalidResponseError,
              "Parse::Embeddings::LocalHTTP: response body exceeds #{MAX_RESPONSE_BYTES} bytes " \
              "(#{s.bytesize}). Refusing to parse."
      end
      JSON.parse(s, max_nesting: 32)
    rescue JSON::ParserError => e
      raise InvalidResponseError,
            "Parse::Embeddings::LocalHTTP: response is not valid JSON (#{e.message})."
    end
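The `max_nesting: 32` option bounds how deeply nested a response may be. The standalone sketch below feeds the standard `json` library a synthetic 40-level array (an input invented for this example) and records which exception class it raises.

```ruby
require "json"

deep = "[" * 40 + "]" * 40 # 40 levels of array nesting
shallow = "[" * 5 + "]" * 5

begin
  JSON.parse(deep, max_nesting: 32)
  result = :parsed
rescue JSON::ParserError => e
  # JSON::NestingError is a subclass of JSON::ParserError.
  result = e.class
end

shallow_result = JSON.parse(shallow, max_nesting: 32)
```

Because the nesting failure is a JSON::ParserError subclass, the single rescue clause in the method covers both malformed and excessively nested bodies.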
#post_embeddings(body) ⇒ Object (protected)
    # File 'lib/parse/embeddings/local_http.rb', line 254

    def post_embeddings(body)
      attempts = 0
      loop do
        attempts += 1
        begin
          response = @connection.post("embeddings") do |req|
            req.body = body.to_json
          end
        rescue Faraday::TimeoutError, Faraday::ConnectionFailed => e
          if attempts > @max_retries
            raise TransientError,
                  "Parse::Embeddings::LocalHTTP: #{e.class} after #{attempts} attempt(s)."
          end
          sleep(backoff_seconds(attempts))
          next
        end

        status = response.status
        return parse_json_body!(response.body) if status >= 200 && status < 300

        if status == 401
          raise AuthenticationError,
                "Parse::Embeddings::LocalHTTP: 401 Unauthorized — check api_key."
        end
        if status == 429
          if attempts > @max_retries
            raise RateLimitError,
                  "Parse::Embeddings::LocalHTTP: 429 rate limited after #{attempts} attempt(s)."
          end
          sleep(retry_after_seconds(response) || backoff_seconds(attempts))
          next
        end
        if status >= 500
          if attempts > @max_retries
            raise TransientError,
                  "Parse::Embeddings::LocalHTTP: #{status} after #{attempts} attempt(s)."
          end
          sleep(backoff_seconds(attempts))
          next
        end
        raise BadRequestError,
              "Parse::Embeddings::LocalHTTP: #{status} from POST /embeddings."
      end
    end
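The status handling above reduces to a small decision table. The helper below is a hypothetical summary written for this page (`classify_status` and its return symbols are not part of the library); it mirrors the order of the branches without performing any I/O or retries.

```ruby
# Hypothetical helper: maps an HTTP status to the branch post_embeddings takes.
def classify_status(status)
  if status >= 200 && status < 300
    :success
  elsif status == 401
    :authentication_error
  elsif status == 429
    :retry_then_rate_limit_error
  elsif status >= 500
    :retry_then_transient_error
  else
    :bad_request_error
  end
end

classify_status(503) # => :retry_then_transient_error
classify_status(403) # => :bad_request_error
```

Only 401 maps to AuthenticationError; other non-retryable statuses, including 403 and 404, surface as BadRequestError without a retry.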
#retry_after_seconds(response) ⇒ Object (protected)
    # File 'lib/parse/embeddings/local_http.rb', line 359

    def retry_after_seconds(response)
      ra = response.respond_to?(:headers) ? response.headers["retry-after"] || response.headers["Retry-After"] : nil
      return nil unless ra
      v = ra.to_f
      v.positive? ? [v, 60.0].min : nil
    end
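The clamping behavior is easiest to see on raw header values. The sketch below extracts the numeric part of the method into a hypothetical helper, `clamp_retry_after`, that takes the header string directly instead of a response object.

```ruby
# Same clamping logic as retry_after_seconds, applied to a raw header value.
# `clamp_retry_after` is a hypothetical name used only in this example.
def clamp_retry_after(raw)
  return nil unless raw

  v = raw.to_f
  v.positive? ? [v, 60.0].min : nil
end

clamp_retry_after("5")    # => 5.0
clamp_retry_after("120")  # => 60.0 (capped)
clamp_retry_after("0")    # => nil
clamp_retry_after("Wed, 21 Oct 2015 07:28:00 GMT") # => nil, String#to_f yields 0.0
```

A nil result makes #post_embeddings fall back to backoff_seconds, so the HTTP-date form of Retry-After is effectively ignored rather than parsed.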
#supports_input_type? ⇒ Boolean
    # File 'lib/parse/embeddings/local_http.rb', line 183

    def supports_input_type?
      # The OpenAI-compatible local runners do not asymmetrize. Some
      # models (bge-*) have a documented query prefix, but the local
      # server itself doesn't expose `input_type:` — callers wrap the
      # query text instead. We accept the kwarg for cache-key stability
      # but drop it at the wire level.
      false
    end
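Since input_type: is dropped at the wire level, any query/document asymmetry has to be applied to the text before it is embedded. The sketch below shows one way a caller might do that. `QUERY_PREFIX`, `wrap_for_embedding`, and the `:search_query` symbol are assumptions made for this example, and the prefix wording should be taken from the model card of whichever model is actually deployed.

```ruby
# Hypothetical prefix; check the model card for the exact wording your model expects.
QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

# Hypothetical helper: prefix queries, leave documents untouched.
def wrap_for_embedding(text, input_type)
  input_type == :search_query ? "#{QUERY_PREFIX}#{text}" : text
end

wrap_for_embedding("how do I reset my password", :search_query)
wrap_for_embedding("Password resets are handled in Settings.", :search_document)
```

The wrapped strings would then be passed to #embed_text as ordinary inputs.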