Class: Payloop::Wrappers::Groq

Inherits:
Object
Defined in:
lib/payloop/wrappers/groq.rb

Overview

Wrapper for the Groq Ruby client (`groq` gem, drnic/groq-ruby).

Groq is an inference provider that hosts third-party models (Meta Llama, OpenAI gpt-oss, Qwen, compound systems), so it is recorded as `conversation.client.provider = "groq"` rather than in `title`. `title` is derived per call from the model-ID prefix, e.g. `meta-llama/llama-4-scout-17b-16e-instruct` → `"meta-llama"`, `openai/gpt-oss-20b` → `"openai"`. Legacy un-prefixed IDs (`llama-3.1-8b-instant`, `allam-2-7b`) fall back to the first alphanumeric run (`"llama"`, `"allam"`); anything unparseable falls back to `GROQ_PROVIDER` (`"groq"`). `title` is never nil.

Unlike the JS / Python groq-sdk, the Ruby `groq` gem's `Client#chat` returns only the assistant message hash (`response.body.dig("choices", 0, "message")`), discarding `usage`, `model`, and the rest of the chat-completion envelope. To preserve the wire shape the backend extractor expects, we patch the lower-level `Client#post(path:, body:)` and filter by path: every chat call goes through `/openai/v1/chat/completions`, and `body` and `response.body` at that layer carry the full chat-completion request and response.
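The path-filtered patch described above can be sketched roughly as follows. This is a minimal illustration, not the wrapper's actual code: `FakeGroqClient`, `wrap_post_sketch`, the `captured` array, and the `/openai/v1/other` path are hypothetical stand-ins, and only the `post(path:, body:)` signature and the chat-completions path come from the text.

```ruby
# Hypothetical stand-in for the gem's client. Only the `post(path:, body:)`
# signature is taken from the documentation above; the rest is made up.
class FakeGroqClient
  Response = Struct.new(:body)

  def post(path:, body:)
    if path == "/openai/v1/chat/completions"
      Response.new({
        "model" => body[:model],
        "choices" => [{ "message" => { "role" => "assistant", "content" => "hi" } }],
        "usage" => { "total_tokens" => 7 }
      })
    else
      Response.new({ "ok" => true })
    end
  end
end

CHAT_PATH = "/openai/v1/chat/completions"

# Wrap `post` on this one client's singleton class. Chat calls are recorded
# with their full request and response bodies; other paths pass through.
def wrap_post_sketch(client, captured)
  client.singleton_class.class_eval do
    original = instance_method(:post)

    define_method(:post) do |path:, body:|
      response = original.bind(self).call(path: path, body: body)
      captured << { request: body, response: response.body } if path == CHAT_PATH
      response
    end
  end
  client
end

captured = []
client = wrap_post_sketch(FakeGroqClient.new, captured)
client.post(path: CHAT_PATH, body: { model: "openai/gpt-oss-20b", messages: [] })
client.post(path: "/openai/v1/other", body: {}) # illustrative non-chat path
```

After both calls, `captured` holds a single entry, and that entry still carries `usage` and `model`, which a patch at the `Client#chat` level would no longer see.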

Constant Summary

CHAT_COMPLETIONS_PATH =

HTTP path for Groq's chat-completion endpoint (Groq's API is OpenAI-compatible, namespaced under `/openai/v1/`). The `post` wrapper filters on this so non-chat requests pass through without analytics or sentinel.

"/openai/v1/chat/completions"

Class Method Summary

Instance Method Summary

Constructor Details

#initialize(config, collector, sentinel = nil) ⇒ Groq

Returns a new instance of Groq.



# File 'lib/payloop/wrappers/groq.rb', line 33

def initialize(config, collector, sentinel = nil)
  @config = config
  @collector = collector
  @sentinel = sentinel
end

Class Method Details

.build_error_response(streaming:, accumulated:, error:) ⇒ Object

Build the `response` hash sent on a failed call. For streaming requests that errored after one or more chunks were already merged, the partial `accumulated` response is preserved and the error info is folded in, so the backend can record what was generated before the failure. Non-streaming and pre-chunk failures get the original error-only shape.

Public for the same reason as `extract_model_title`: it is invoked from the wrapped-method closure inside `singleton_class.class_eval`.



# File 'lib/payloop/wrappers/groq.rb', line 99

def self.build_error_response(streaming:, accumulated:, error:)
  if streaming && accumulated.is_a?(Hash) && !accumulated.empty?
    accumulated.merge("error" => error.message, "error_class" => error.class.name)
  else
    { error: error.message, class: error.class.name }
  end
end
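To make the two response shapes concrete, here is a small sketch that copies the method body above into a hypothetical `ErrorResponseSketch` module and calls it both ways. The error message and the partial response contents are invented for illustration.

```ruby
# Hypothetical module; the method body is copied from the listing above.
module ErrorResponseSketch
  def self.build_error_response(streaming:, accumulated:, error:)
    if streaming && accumulated.is_a?(Hash) && !accumulated.empty?
      accumulated.merge("error" => error.message, "error_class" => error.class.name)
    else
      { error: error.message, class: error.class.name }
    end
  end
end

error = RuntimeError.new("connection reset")

# Invented partial response, as if two chunks had been merged before the failure.
partial = {
  "model" => "openai/gpt-oss-20b",
  "choices" => [{ "message" => { "role" => "assistant", "content" => "Hel" } }]
}

# Streaming failure after chunks arrived: partial content plus error fields.
streamed = ErrorResponseSketch.build_error_response(
  streaming: true, accumulated: partial, error: error
)

# Non-streaming failure: error-only shape.
plain = ErrorResponseSketch.build_error_response(
  streaming: false, accumulated: nil, error: error
)

# Streaming failure before any chunk arrived: also the error-only shape.
pre_chunk = ErrorResponseSketch.build_error_response(
  streaming: true, accumulated: {}, error: error
)
```

Note that the two shapes use different key types: the merged streaming response adds string keys (`"error"`, `"error_class"`), while the error-only hash uses symbol keys (`:error`, `:class`). Because `Hash#merge` returns a new hash, `partial` itself is left unchanged.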

.extract_model_title(model, fallback) ⇒ Object

Derive a telemetry `title` from a Groq model identifier. Mirrors `extractModelTitle` in the JS SDK (`javascript-sdk/src/utils.ts`).

Rules:

- String with `/`: return everything before the first `/`
  (`meta-llama/llama-4-...` → `"meta-llama"`, `openai/gpt-oss-20b`
  → `"openai"`).
- String without `/`: return the first run of alphanumeric characters
  (`llama-3.1-8b-instant` → `"llama"`, `allam-2-7b` → `"allam"`,
  `gpt-4o` → `"gpt"`).
- Anything else (non-string, empty, unrecognized shape): return
  `fallback`.

Telemetry invariant: `conversation.client.title` is never nil, because the caller always supplies a sensible string fallback (`GROQ_PROVIDER` for Groq calls).

Public because the wrapped-method closure (inside `singleton_class.class_eval`) needs to call it.



# File 'lib/payloop/wrappers/groq.rb', line 82

def self.extract_model_title(model, fallback)
  return fallback unless model.is_a?(String)

  slash = model.index("/")
  return model[0...slash] || fallback if slash&.positive?

  model[/\A[A-Za-z0-9]+/] || fallback
end
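The rules can be exercised with a short sketch. The method body is copied from the listing above into a hypothetical `TitleSketch` module so the example runs on its own; the fallback is passed as the literal `"groq"`, which the overview gives as the value of `GROQ_PROVIDER`.

```ruby
# Hypothetical module; the method body is copied from the listing above.
module TitleSketch
  def self.extract_model_title(model, fallback)
    return fallback unless model.is_a?(String)

    slash = model.index("/")
    return model[0...slash] || fallback if slash&.positive?

    model[/\A[A-Za-z0-9]+/] || fallback
  end
end

fallback = "groq"

# Prefixed IDs: everything before the first "/".
prefixed = TitleSketch.extract_model_title("meta-llama/llama-4-scout-17b-16e-instruct", fallback)
# → "meta-llama"

# Legacy un-prefixed IDs: the first alphanumeric run.
legacy = TitleSketch.extract_model_title("llama-3.1-8b-instant", fallback)
# → "llama"

# Non-strings, empty strings, and strings with no leading alphanumeric
# character (including a leading "/") all fall back.
from_nil     = TitleSketch.extract_model_title(nil, fallback)       # → "groq"
from_empty   = TitleSketch.extract_model_title("", fallback)        # → "groq"
from_slash   = TitleSketch.extract_model_title("/no-vendor", fallback) # → "groq"
```

The leading-slash case shows why the guard is `slash&.positive?` rather than a plain nil check: an index of `0` would otherwise produce an empty title.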

Instance Method Details

#register(client) ⇒ Object



# File 'lib/payloop/wrappers/groq.rb', line 39

def register(client)
  validate_client!(client)

  # Prevent double registration
  return client if client.instance_variable_defined?(:@payloop_registered)

  # Patch the gem's streaming JSON parser. Class-level patch on
  # ::Groq::Client, idempotent via @_payloop_stream_patched — runs once
  # per process. Placed after the per-client guard so repeat register
  # calls on the same client are a true no-op. See patch_stream_handler!
  # for the bug being worked around.
  patch_stream_handler!

  # Store references in client instance
  client.instance_variable_set(:@payloop_config, @config)
  client.instance_variable_set(:@payloop_collector, @collector)
  client.instance_variable_set(:@payloop_sentinel, @sentinel)
  client.instance_variable_set(:@payloop_registered, true)

  wrap_post_method(client)

  client
end
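The double-registration guard in `register` can be illustrated in isolation. This is a sketch only: `RegistrarSketch` and its `wrap_count` counter are hypothetical, and the counter stands in for `wrap_post_method(client)`, whose body is not shown here. Client validation and the stream-handler patch are omitted.

```ruby
# Hypothetical class showing only the idempotency guard from `register`.
class RegistrarSketch
  attr_reader :wrap_count

  def initialize
    @wrap_count = 0
  end

  def register(client)
    # Same guard as above: a marker ivar on the client makes repeat calls a no-op.
    return client if client.instance_variable_defined?(:@payloop_registered)

    client.instance_variable_set(:@payloop_registered, true)
    @wrap_count += 1 # stands in for wrap_post_method(client)
    client
  end
end

registrar = RegistrarSketch.new
client = Object.new # any object works for the sketch

first  = registrar.register(client)
second = registrar.register(client)
```

Both calls return the same `client` object, and the wrapping step runs once. The guard is per client, so a second, distinct client would still be wrapped.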