Module: Apidepth::ModelNameExtractor

Defined in:
lib/apidepth/model_name_extractor.rb

Constant Summary

AI_VENDOR_HOSTS =
%w[
  api.openai.com
  api.anthropic.com
  generativelanguage.googleapis.com
  api.mistral.ai
  api.cohere.com
].to_set.freeze
MODEL_SCAN_MAX_BYTES =

Upper bound on how far into the body we scan for the model field. 256 KB comfortably covers realistic embeddings/batch responses (a few-input OpenAI embeddings body is ~23 KB) while bounding work on pathologically large bodies.

262_144
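
As a rough illustration of what the bound means in practice, the sketch below scans only a fixed-size byte prefix of a body. The names `SKETCH_SCAN_MAX`, `SKETCH_MODEL_RE`, `scan_model`, and the model name `m-1` are hypothetical, and the limit is deliberately tiny so the effect is visible: a model field that sits beyond the scanned prefix is simply not found.

```ruby
# Hypothetical stand-ins for the real constants; the limit is tiny on purpose.
SKETCH_SCAN_MAX = 32
SKETCH_MODEL_RE = /"model"\s*:\s*"([^"]+)"/

# Scan only the first `limit` bytes of the body for a model field.
def scan_model(body, limit)
  prefix = body.byteslice(0, limit).to_s
  match = SKETCH_MODEL_RE.match(prefix)
  match && match[1]
end

padding = "x" * 100
early = %({"model":"m-1","data":"#{padding}"})
late  = %({"data":"#{padding}","model":"m-1"})

p scan_model(early, SKETCH_SCAN_MAX) # => "m-1"
p scan_model(late, SKETCH_SCAN_MAX)  # => nil
```

Because `byteslice` counts bytes rather than characters, the amount of text handed to the regex stays bounded no matter how large the body is.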
MODEL_RE =

Matches a structural JSON "model": "<value>" pair. Escaped quotes inside string values appear as \" so this never matches a "model" mentioned inside another JSON string. First match wins (the top-level model field).

/"model"\s*:\s*"([^"]+)"/.freeze
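
The escaped-quote behavior described above can be checked with a small sketch. It assumes a body produced by a standard JSON serializer; `MODEL_PATTERN` is a local copy of the pattern, and the payload and model names (`decoy`, `real-model`) are made up for the example.

```ruby
require "json"

# Local copy of the pattern, for illustration only.
MODEL_PATTERN = /"model"\s*:\s*"([^"]+)"/

# The prompt text itself mentions a "model" pair before the real field.
payload = {
  "prompt" => 'reply with {"model": "decoy"}',
  "model"  => "real-model"
}

# Serialization escapes the inner quotes as \", so the decoy reads
# \"model\": \"decoy\" in the body and the pattern cannot match it.
body = JSON.generate(payload)

match = MODEL_PATTERN.match(body)
puts match[1] # prints real-model
```

The decoy is skipped because the pattern requires a bare `"` immediately after `model`, while the serialized string has a backslash in that position.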

Class Method Summary

Class Method Details

.extract(host, response) ⇒ Object



# File 'lib/apidepth/model_name_extractor.rb', line 48

def self.extract(host, response)
  return nil unless Apidepth.configuration.capture_model_names
  # Case-insensitive host match (RUBY-019): DNS hostnames are case-insensitive,
  # so a vendor declared with mixed case (e.g. via extra_vendors) still matches.
  return nil unless AI_VENDOR_HOSTS.include?(host.to_s.downcase)
  return nil unless response["content-type"]&.include?("application/json")

  body = response.body
  return nil if body.nil? || body.empty?

  scan = body.byteslice(0, MODEL_SCAN_MAX_BYTES).to_s.dup.force_encoding("UTF-8")
  match = MODEL_RE.match(scan)
  match && !match[1].empty? ? match[1] : nil
rescue StandardError
  # Covers malformed/invalid-encoding bodies and non-buffered streaming
  # bodies (e.g. Net::ReadAdapter, which has no #empty?). Returning nil keeps
  # the surrounding telemetry event intact rather than dropping it (RUBY-017).
  nil
end
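
To show how the guards behave together, here is a minimal self-contained sketch. It does not load the gem: `FakeResponse`, `extract_sketch`, the `SKETCH_*` constants, and the model name `example-model-1` are hypothetical stand-ins, and the configuration check is omitted. The guard order otherwise mirrors the method above.

```ruby
require "set"

# Hypothetical response double exposing #[] for headers and #body.
class FakeResponse
  attr_reader :body

  def initialize(content_type, body)
    @content_type = content_type
    @body = body
  end

  def [](name)
    name.downcase == "content-type" ? @content_type : nil
  end
end

SKETCH_HOSTS = %w[api.openai.com api.anthropic.com].to_set.freeze
SKETCH_RE = /"model"\s*:\s*"([^"]+)"/
SKETCH_MAX_BYTES = 262_144

# Mirrors the guard order of .extract, minus the configuration check.
def extract_sketch(host, response)
  return nil unless SKETCH_HOSTS.include?(host.to_s.downcase)
  return nil unless response["content-type"]&.include?("application/json")

  body = response.body
  return nil if body.nil? || body.empty?

  scan = body.byteslice(0, SKETCH_MAX_BYTES).to_s.dup.force_encoding("UTF-8")
  match = SKETCH_RE.match(scan)
  match ? match[1] : nil
rescue StandardError
  # A body without #empty? raises NoMethodError, which lands here.
  nil
end

json   = FakeResponse.new("application/json; charset=utf-8",
                          '{"id":"abc","model":"example-model-1"}')
html   = FakeResponse.new("text/html", "<html></html>")
stream = FakeResponse.new("application/json", Object.new)

p extract_sketch("API.OpenAI.com", json)    # => "example-model-1"
p extract_sketch("example.org", json)       # => nil (host not listed)
p extract_sketch("api.openai.com", html)    # => nil (not JSON)
p extract_sketch("api.openai.com", stream)  # => nil (rescued)
```

Every failure path returns nil rather than raising, so a caller can attach the result to a telemetry event without extra error handling.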