Class: Parse::Embeddings::LocalHTTP
- Defined in:
- lib/parse/embeddings/local_http.rb
Overview
Generic OpenAI-compatible local embedding provider. Talks to any server that exposes `POST <base_url>/embeddings` with the OpenAI request/response shape. This covers Ollama (`/v1`), LM Studio (`/v1`), vLLM, llama.cpp's `server`, and any reverse proxy that translates to a local model runner.
SSRF gate
The `base_url` is resolved at construction time and the resolved addresses are checked against File::BLOCKED_CIDRS (loopback, RFC1918, link-local, cloud-metadata, CGNAT, IPv6 ULA, …). When ANY resolved address falls in a private/internal range, the constructor refuses unless the caller opts in via `allow_private_endpoint: true`.
The opt-in is a deliberate, auditable gate. Parse::Embeddings registration is configuration code, not user input, so opting in to “yes, this base_url really is my Ollama on localhost” is a one-line decision by the operator at boot time. A `Kernel#warn` fires when the opt-in is taken so the choice shows up in operator logs / `bundle exec rake about` output.
`http://` base URLs are accepted with `allow_private_endpoint: true` (the typical local-runner deployment), and refused otherwise unless the caller also passes `allow_insecure_base_url: true` (an escape hatch for self-signed internal HTTPS proxies fronted by `http://`).
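The address check described above can be sketched with Ruby's standard `IPAddr` class. This is a minimal illustration, not the library's implementation: `EXAMPLE_BLOCKED_CIDRS` is an assumed subset standing in for File::BLOCKED_CIDRS, and `private_address?` and `gate_addresses!` are hypothetical helper names. DNS resolution (for example via `Resolv.getaddresses`) is left out so the sketch runs offline.

```ruby
require "ipaddr"

# Assumed, illustrative subset of blocked ranges. The actual contents of
# File::BLOCKED_CIDRS are not shown in this document.
EXAMPLE_BLOCKED_CIDRS = %w[
  127.0.0.0/8
  10.0.0.0/8
  172.16.0.0/12
  192.168.0.0/16
  169.254.0.0/16
  100.64.0.0/10
  ::1/128
  fc00::/7
  fe80::/10
].map { |cidr| IPAddr.new(cidr) }.freeze

# Hypothetical helper: true when the address falls inside any blocked range.
def private_address?(addr)
  ip = addr.is_a?(IPAddr) ? addr : IPAddr.new(addr.to_s)
  EXAMPLE_BLOCKED_CIDRS.any? { |range| range.include?(ip) }
end

# Hypothetical gate: refuse when ANY resolved address is private unless the
# caller opted in. Returns true when a private address was allowed through,
# which is the case where an audit warning would be emitted.
def gate_addresses!(addresses, allow_private_endpoint: false)
  private_addrs = addresses.select { |a| private_address?(a) }
  if private_addrs.any? && !allow_private_endpoint
    raise ArgumentError, "refusing private address(es) #{private_addrs.inspect}"
  end
  !private_addrs.empty?
end
```

Checking every resolved address, not just the first, matters because a hostname can resolve to a mix of public and internal addresses.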
Why no fixed model whitelist
Ollama, LM Studio, and vLLM all serve operator-chosen models, so we cannot enumerate “supported” models the way OpenAI can. The constructor instead takes `dimensions:` explicitly, and the inherited Provider#validate_response! enforces that every returned vector matches that width. Mis-specified dimensions surface as InvalidResponseError on the first embed call.
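A width check of this kind might look like the following sketch. It is not the inherited Provider#validate_response! itself, whose body is not shown here: `ExampleInvalidResponseError` and `check_vector_widths!` are hypothetical names, and the vector-count check is an assumption suggested by the two-argument call in `#embed_text`.

```ruby
# Hypothetical stand-in for the library's InvalidResponseError.
class ExampleInvalidResponseError < StandardError; end

# Verifies that the provider returned one vector per input and that every
# vector has exactly the configured width.
def check_vector_widths!(vectors, expected_count:, dimensions:)
  unless vectors.is_a?(Array) && vectors.length == expected_count
    raise ExampleInvalidResponseError, "expected #{expected_count} vectors"
  end

  vectors.each_with_index do |vec, i|
    next if vec.is_a?(Array) && vec.length == dimensions

    raise ExampleInvalidResponseError,
          "vector #{i} does not have the configured width #{dimensions}"
  end

  vectors
end
```

With a check like this, a provider configured with the wrong `dimensions:` fails on the first response instead of writing wrong-width vectors downstream.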
Security
Defined Under Namespace
Classes: AuthenticationError, BadRequestError, RateLimitError, TransientError
Constant Summary
- DEFAULT_TIMEOUT = 30
- DEFAULT_OPEN_TIMEOUT = 5
- DEFAULT_MAX_RETRIES = 3
- DEFAULT_BATCH_SIZE = 32
- MAX_RESPONSE_BYTES = 16 * 1024 * 1024
Constants inherited from Provider
Provider::AS_NOTIFICATION_NAME
Instance Method Summary
- #dimensions ⇒ Object
- #embed_batch_size ⇒ Object
- #embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
  Vectors aligned 1:1 with `strings`.
- #initialize(base_url:, model:, dimensions:, api_key: nil, normalize: false, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_private_endpoint: false, allow_insecure_base_url: false, allow_faraday_proxy: false, connection: nil) ⇒ LocalHTTP (constructor)
  A new instance of LocalHTTP.
- #inspect_attrs ⇒ Object
- #model_name ⇒ Object
- #normalize? ⇒ Boolean
- #supports_input_type? ⇒ Boolean
Methods inherited from Provider
#embed_image, #embed_text_batched, #inspect, #instrument_embed, #max_input_tokens, #modalities, #validate_response!
Constructor Details
#initialize(base_url:, model:, dimensions:, api_key: nil, normalize: false, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_private_endpoint: false, allow_insecure_base_url: false, allow_faraday_proxy: false, connection: nil) ⇒ LocalHTTP
Returns a new instance of LocalHTTP.
# File 'lib/parse/embeddings/local_http.rb', line 114

def initialize(
  base_url:,
  model:,
  dimensions:,
  api_key: nil,
  normalize: false,
  timeout: DEFAULT_TIMEOUT,
  open_timeout: DEFAULT_OPEN_TIMEOUT,
  max_retries: DEFAULT_MAX_RETRIES,
  embed_batch_size: DEFAULT_BATCH_SIZE,
  allow_private_endpoint: false,
  allow_insecure_base_url: false,
  allow_faraday_proxy: false,
  connection: nil
)
  validate_model!(model)
  validate_dimensions!(dimensions)
  validate_optional_api_key!(api_key)
  unless [true, false].include?(normalize)
    raise ArgumentError,
          "Parse::Embeddings::LocalHTTP: normalize must be true or false (got #{normalize.inspect})."
  end
  validate_positive_integer!(:timeout, timeout)
  validate_positive_integer!(:open_timeout, open_timeout)
  validate_non_negative_integer!(:max_retries, max_retries)
  validate_positive_integer!(:embed_batch_size, embed_batch_size)

  sanitized_base_url, resolved_addrs, is_private =
    validate_base_url_and_gate_ssrf!(base_url,
                                     allow_private_endpoint: allow_private_endpoint,
                                     allow_insecure_base_url: allow_insecure_base_url)

  if is_private
    # Audit log. Emits once per instance — Kernel#warn so it lands
    # on stderr and any logger that captures it. Operators running
    # a hardened environment can grep this to confirm every
    # private-endpoint opt-in was intentional.
    warn "Parse::Embeddings::LocalHTTP: allow_private_endpoint=true for #{sanitized_base_url} — " \
         "resolved to private address(es) #{resolved_addrs.map(&:to_s).inspect}."
  end

  @base_url = sanitized_base_url
  @model = model
  @dimensions = dimensions
  @api_key = api_key
  @normalize = normalize
  @timeout = timeout
  @open_timeout = open_timeout
  @max_retries = max_retries
  @embed_batch_size = embed_batch_size
  @allow_faraday_proxy = allow_faraday_proxy
  @connection = connection || build_connection
end
Instance Method Details
#dimensions ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 167

def dimensions
  @dimensions
end
#embed_batch_size ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 175

def embed_batch_size
  @embed_batch_size
end
#embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
Returns vectors aligned 1:1 with `strings`.

# File 'lib/parse/embeddings/local_http.rb', line 196

def embed_text(strings, input_type: :search_document)
  unless strings.is_a?(Array)
    raise ArgumentError,
          "Parse::Embeddings::LocalHTTP#embed_text expects Array<String> (got #{strings.class})."
  end
  return [] if strings.empty?
  strings.each_with_index do |s, i|
    unless s.is_a?(String)
      raise ArgumentError,
            "Parse::Embeddings::LocalHTTP#embed_text strings[#{i}] is not a String (#{s.class})."
    end
    if s.empty?
      raise ArgumentError,
            "Parse::Embeddings::LocalHTTP#embed_text strings[#{i}] is empty; local runners typically reject empty inputs."
    end
  end

  body = { input: strings, model: @model }

  instrument_embed(strings.length, input_type) do |emit_payload|
    payload = (body)
    # Local runners may or may not include `usage`. When present,
    # forward total_tokens to the AS::N payload.
    if payload.is_a?(Hash) && payload["usage"].is_a?(Hash)
      tt = payload["usage"]["total_tokens"]
      emit_payload[:total_tokens] = tt if tt.is_a?(Integer) && tt >= 0
    end
    vectors = extract_vectors!(payload, strings.length)
    validate_response!(strings.length, vectors)
  end
end
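For orientation, the OpenAI-style wire shape that `#embed_text` sends and parses can be sketched as below. The private `extract_vectors!` helper is not shown in this document, so `example_extract_vectors` is a hypothetical stand-in, and sorting by `"index"` is an assumption about how 1:1 alignment could be preserved. The model name and numbers are made up.

```ruby
require "json"

# Hypothetical stand-in for the private extract_vectors! helper.
def example_extract_vectors(payload, expected_count)
  data = payload.is_a?(Hash) ? payload["data"] : nil
  unless data.is_a?(Array) && data.length == expected_count
    raise ArgumentError, "expected #{expected_count} embeddings in \"data\""
  end

  # Sort by "index" so the output order matches the input order even when
  # the server returns items out of order.
  data.sort_by { |item| item.fetch("index") }
      .map { |item| item.fetch("embedding") }
end

# Request body in the OpenAI embeddings shape (same keys as `body` above).
request_body = JSON.generate({ input: ["hello", "world"], model: "example-model" })

# Sample response in the OpenAI embeddings shape; values are illustrative.
sample_response = JSON.parse(<<~RESPONSE)
  {
    "object": "list",
    "data": [
      {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
      {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]}
    ],
    "model": "example-model",
    "usage": {"prompt_tokens": 2, "total_tokens": 2}
  }
RESPONSE

vectors = example_extract_vectors(sample_response, 2)
total_tokens = sample_response.dig("usage", "total_tokens")
```

Here `vectors` comes back in input order, and `total_tokens` is the optional value that `#embed_text` forwards to the notification payload when the runner reports it.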
#inspect_attrs ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 228

def inspect_attrs
  super.merge(base: safe_base_host, retries: @max_retries)
end
#model_name ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 171

def model_name
  @model
end
#normalize? ⇒ Boolean
# File 'lib/parse/embeddings/local_http.rb', line 179

def normalize?
  @normalize
end
#supports_input_type? ⇒ Boolean
# File 'lib/parse/embeddings/local_http.rb', line 183

def supports_input_type?
  # The OpenAI-compatible local runners do not asymmetrize. Some
  # models (bge-*) have a documented query prefix, but the local
  # server itself doesn't expose `input_type:` — callers wrap the
  # query text instead. We accept the kwarg for cache-key stability
  # but drop it at the wire level.
  false
end
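Because the kwarg is dropped at the wire level, a caller who needs asymmetric retrieval would wrap query text before embedding it. The sketch below shows one way to do that. `wrap_for_input_type` is a hypothetical helper, `:search_query` is an assumed counterpart to the `:search_document` default, and the prefix is the query instruction published for the English bge v1.5 models; check the model card of whatever model the runner actually serves.

```ruby
# Assumed prefix: the query instruction documented for English bge v1.5
# models. Other models may need a different prefix or none at all.
BGE_QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

# Hypothetical caller-side helper: prefix queries, leave documents untouched.
def wrap_for_input_type(text, input_type)
  input_type == :search_query ? "#{BGE_QUERY_PREFIX}#{text}" : text
end

query_text = wrap_for_input_type("how do I reset my password?", :search_query)
document_text = wrap_for_input_type("Password resets are under Settings.", :search_document)
```

The wrapped strings would then be passed to `#embed_text` as ordinary inputs, so the local server never needs to know about input types.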