groq_ruby
Idiomatic Ruby client for the Groq API.
groq_ruby mirrors the surface of the official
groq-python SDK in a Ruby-native
shape: typed response objects, single-purpose resource classes, internal
dry-monads Result pipelines, and request validation via
dry-schema. Streaming chat completions are supported via Server-Sent
Events. A built-in MCP client lets you
wire one or more MCP servers — local stdio processes or remote HTTP
endpoints — into a Groq chat completion as tools.
This gem is not an official Groq product. The wire protocol it
implements and the API surface it mirrors come from the publicly
available groq-python SDK.
Installation
# Gemfile
gem "groq_ruby"
bundle install
Quick start
require "groq_ruby"
client = GroqRuby::Client.new # reads GROQ_API_KEY from the environment
response = client.chat.completions.create(
model: "llama-3.3-70b-versatile",
messages: [
{role: "system", content: "You are a helpful assistant."},
{role: "user", content: "Explain low-latency LLMs in one sentence."}
]
)
puts response.choices.first.message.content
puts "tokens: #{response.usage.total_tokens}"
Configuration
Configuration is per-client and immutable — no global state, no
GroqRuby.configure. Build one client per tenant or set of credentials.
client = GroqRuby::Client.new(
api_key: ENV["GROQ_API_KEY"], # default: ENV["GROQ_API_KEY"]
base_url: "https://api.groq.com", # default: ENV["GROQ_BASE_URL"] || "https://api.groq.com"
open_timeout: 10, # connect-phase timeout, seconds
read_timeout: 60, # socket-read timeout, seconds
user_agent: "myapp/1.0" # default: "groq_ruby/<version> (ruby; net-http)"
)
A ConfigurationError is raised at construction time if no API key can
be found.
Chat completions
Buffered
response = client.chat.completions.create(
model: "llama-3.3-70b-versatile",
messages: [{role: "user", content: "Hello"}],
temperature: 0.5,
max_completion_tokens: 256
)
response.choices.first.message.content
response.usage.prompt_tokens
Streaming
Pass stream: true. With a block, each chunk is yielded as it arrives.
Without a block, you get a lazy Enumerable you can iterate later.
client.chat.completions.create(
model: "llama-3.3-70b-versatile",
messages: [{role: "user", content: "Write a poem about latency."}],
stream: true
) do |chunk|
print chunk.choices.first.delta.content
end
# Or:
stream = client.chat.completions.create(model: "...", messages: [...], stream: true)
stream.each { |chunk| ... }
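Under the hood the stream is plain Server-Sent Events: each data: line carries one JSON chunk, and a literal [DONE] sentinel ends the stream. A rough sketch of that wire format, using a hypothetical payload (the gem does this parsing for you; this only illustrates what stream: true consumes):

```ruby
require "json"

# Hypothetical SSE body, shaped like Groq's streaming chat responses.
# groq_ruby parses this internally; the sketch shows the wire format.
sse_body = <<~SSE
  data: {"choices":[{"delta":{"content":"Low "}}]}

  data: {"choices":[{"delta":{"content":"latency."}}]}

  data: [DONE]
SSE

text = +""
sse_body.each_line do |line|
  payload = line[/\Adata: (.*)/, 1] or next # skip blank keep-alive lines
  break if payload == "[DONE]"              # end-of-stream sentinel
  chunk = JSON.parse(payload)
  text << chunk.dig("choices", 0, "delta", "content").to_s
end

text # => "Low latency."
```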
Validation rejects out-of-range values before any HTTP request fires:
client.chat.completions.create(
model: "...", messages: [{role: "user", content: "x"}], temperature: 5.0
)
# => GroqRuby::ParameterError: invalid parameters: {:temperature=>["must be less than or equal to 2.0"]}
Function (tool) calling
Pass tools: (and optionally tool_choice:) just like the OpenAI/Groq
schema — groq_ruby doesn't transform them. When the model decides to
call a tool, it comes back in response.choices.first.message.tool_calls.
tools = [
{
type: "function",
function: {
name: "get_weather",
description: "Look up the current weather for a city",
parameters: {
type: "object",
properties: {city: {type: "string"}},
required: ["city"]
}
}
}
]
messages = [{role: "user", content: "What's the weather in Berlin?"}]
response = client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: messages,
  tools: tools,
  tool_choice: "auto" # "auto" | "required" | "none" | {type: "function", function: {name: "get_weather"}}
)
call = response.choices.first.message.tool_calls&.first
if call
  args = JSON.parse(call["function"]["arguments"])
  result = get_weather(args["city"])
  # Feed the result back as a follow-up turn:
  followup = client.chat.completions.create(
    model: "llama-3.3-70b-versatile",
    messages: [
      *messages,
      {role: "assistant", content: nil, tool_calls: response.choices.first.message.tool_calls},
      {role: "tool", tool_call_id: call["id"], content: JSON.generate(result)}
    ]
  )
end
For an MCP-driven version of this loop (where the tools come from an MCP server instead of being hand-defined), see the MCP section below.
Embeddings
response = client.embeddings.create(
  model: "nomic-embed-text-v1_5",
  input: "Groq makes inference very fast."
)
vector = response.data.first.embedding
Audio
Text → speech
audio_bytes = client.audio.speech.create(
input: "Hello from Groq.",
model: "playai-tts",
voice: "Aaliyah-PlayAI",
response_format: "wav"
)
File.binwrite("speech.wav", audio_bytes)
Speech → text (transcription)
response = File.open("audio.mp3", "rb") do |file|
client.audio.transcriptions.create(
file: file,
filename: "audio.mp3",
model: "whisper-large-v3-turbo"
)
end
puts response.text
Speech → English text (translation)
response = File.open("audio.mp3", "rb") do |file|
client.audio.translations.create(
file: file,
filename: "audio.mp3",
model: "whisper-large-v3"
)
end
Models
client.models.list.data.each { |m| puts m.id }
client.models.retrieve("llama-3.3-70b-versatile")
client.models.delete("custom-model-id")
Files
uploaded = File.open("requests.jsonl", "rb") do |file|
client.files.create(file: file, filename: "requests.jsonl", purpose: "batch")
end
client.files.list.data
client.files.info(uploaded.id)
client.files.content(uploaded.id) # raw bytes
client.files.delete(uploaded.id)
Batches
batch = client.batches.create(
input_file_id: uploaded.id,
endpoint: "/v1/chat/completions",
completion_window: "24h"
)
client.batches.retrieve(batch.id)
client.batches.list
client.batches.cancel(batch.id)
MCP — Model Context Protocol
groq_ruby ships with stdio and HTTP Streamable MCP clients, so a Groq
agent can use tools exposed by any MCP-compatible server — local
processes (filesystem, custom tooling) or remote HTTPS endpoints.
Coverage matches what host applications like Claude Desktop surface:
| MCP capability | Coverage |
|---|---|
| Tools | Surfaced as Groq function tools, namespaced <server>__<tool> |
| Resources | Surfaced via a synthetic <server>__read_resource(uri) tool the LLM can call. bridge.resources returns the inventory if you want to advertise specific URIs in your system prompt |
| Prompts | Listed via bridge.prompts for your application to surface (e.g. as a picker in your UI) and rendered via bridge.get_prompt(name, args) |
| Sampling, notifications | Not supported in v1 |
Optional capabilities are probed gracefully — if a server doesn't
implement resources/list or prompts/list, those entries are simply
empty for that server.
Configuring servers
Two transports are available — pick by config class:
- ServerConfig → stdio, a child process launched locally.
- HttpServerConfig → HTTP Streamable, a single HTTPS endpoint (the transport defined for MCP 2025-03-26+ and required by 2025-11-25).
# (a) stdio — local process
fs = GroqRuby::MCP::ServerConfig.new(
name: "fs",
command: "npx",
args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)
# (b) HTTP — remote endpoint, headers carry auth
remote = GroqRuby::MCP::HttpServerConfig.new(
name: "spectrum-ferret",
url: "https://mcp-staging.spectrumferret.com/mcp",
headers: {"Authorization" => "Bearer #{ENV.fetch("SF_TOKEN")}"}
)
bridge = GroqRuby::MCP::Bridge.new([fs, remote]) # mix freely
Or load the same JSON shape Claude Desktop uses (mcpServers block) for
stdio servers — the adapter expands ${VAR} references against each
server's env block first, then the process's ENV, and raises on
unresolved references.
configs = GroqRuby::MCP::ClaudeDesktopConfig.load(
"~/Library/Application Support/Claude/claude_desktop_config.json"
)
bridge = GroqRuby::MCP::Bridge.new(configs)
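The expansion order (server env block first, then the process ENV, raise on anything unresolved) can be sketched as a small resolver. expand_vars here is a hypothetical stand-in, not the adapter's actual API:

```ruby
# Expand ${VAR} references the way the Claude Desktop adapter is
# described above: the server's own env block wins, then the process
# ENV; anything unresolved raises. Illustrative stand-in only.
def expand_vars(string, server_env, process_env = ENV)
  string.gsub(/\$\{([A-Z0-9_]+)\}/) do
    name = Regexp.last_match(1)
    server_env[name] || process_env[name] ||
      raise(KeyError, "unresolved ${#{name}} in MCP server config")
  end
end

expand_vars("--root=${DOCS_DIR}", {"DOCS_DIR" => "/Users/me/docs"})
# => "--root=/Users/me/docs"
```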
Picking a protocol version
Each Client carries a Protocol object that selects an MCP spec year.
Defaults are sensible for each transport:
| Transport | Default protocol |
|---|---|
| ServerConfig | Protocol::V2024_11_05 |
| HttpServerConfig | Protocol::V2025_11_25 |
Override per-client when you need to:
mcp = GroqRuby::MCP::Client.connect(
config,
protocol: GroqRuby::MCP::Protocol::V2024_11_05.new
)
mcp.protocol.version # => "2024-11-05"
GroqRuby::MCP::Protocol.for("2025-11-25") returns the matching class
(or nil for unknown versions); Protocol.default returns the
conservative default (V2024_11_05).
Direct usage
config = GroqRuby::MCP::ServerConfig.new(
name: "fs",
command: "npx",
args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)
mcp = GroqRuby::MCP::Client.connect(config)
mcp.tools_list # [Tool(name: "read_file", ...), ...]
mcp.tools_call(name: "read_file", arguments: {path: "/Users/me/docs/foo.txt"})
mcp.resources_list # [Resource(uri: "fs://...", ...)]
mcp.resources_read("fs://docs/foo.md")
mcp.prompts_list # [Prompt(name: "summarize", ...)]
mcp.prompts_get("summarize", {path: "foo.md"})
mcp.stop
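On the wire, each of these calls is a JSON-RPC 2.0 request over the chosen transport. A hypothetical envelope for tools_call (field values are examples; the client builds and correlates these for you):

```ruby
require "json"

# JSON-RPC 2.0 envelope for an MCP tools/call request, as carried over
# stdio (one JSON object per line) or HTTP. Illustrative only.
request = {
  jsonrpc: "2.0",
  id: 1, # request id; the matching response echoes it back
  method: "tools/call",
  params: {
    name: "read_file",
    arguments: {path: "/Users/me/docs/foo.txt"}
  }
}

line = JSON.generate(request)
parsed = JSON.parse(line)
parsed["method"] # => "tools/call"
```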
Bridge into chat.completions
Important: Groq's chat-completions API does not have a request parameter for "MCP server URL." MCP integration here is client-side orchestration: this gem talks to the MCP servers, exposes their tools as ordinary OpenAI/Groq function tools through the standard `tools:` parameter, then routes the model's `tool_calls` back to the right server. From Groq's perspective there is no MCP — just function tools.
Bridge does three things at construction:
- opens a transport for each config (stdio child process for ServerConfig, HTTPS connection for HttpServerConfig),
- runs the MCP initialize handshake and asks each server for its tool list,
- indexes those tools by namespaced name <server>__<tool> so collisions across servers are impossible.
bridge.tools then returns an array shaped exactly like Groq's
tools: parameter:
bridge.tools
# => [
# {
# type: "function",
# function: {
# name: "fs__read_file",
# description: "Read the contents of a file at the given path",
# parameters: { # JSON Schema, straight from the MCP server
# "type" => "object",
# "properties" => { "path" => { "type" => "string" } },
# "required" => ["path"]
# }
# }
# },
# ...
# ]
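The namespacing makes routing mechanical: splitting on the first __ recovers the owning server and the original tool name. A sketch of that dispatch (route is a hypothetical helper; the real routing lives inside Bridge#call):

```ruby
# Split a namespaced Groq tool name back into [server, tool].
# Hypothetical helper mirroring the <server>__<tool> scheme above.
def route(namespaced_name)
  server, tool = namespaced_name.split("__", 2)
  raise ArgumentError, "not a namespaced tool: #{namespaced_name}" if tool.nil?
  [server, tool]
end

route("fs__read_file")      # => ["fs", "read_file"]
route("fs__read__resource") # split(_, 2) keeps later underscores: ["fs", "read__resource"]
```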
A complete agent loop using a real MCP server (@modelcontextprotocol/server-filesystem):
require "groq_ruby"
require "json"
filesystem = GroqRuby::MCP::ServerConfig.new(
name: "fs",
command: "npx",
args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)
bridge = GroqRuby::MCP::Bridge.new([filesystem])
groq = GroqRuby::Client.new
messages = [
  {role: "system", content: "You can use filesystem tools. Prefer tools over speculation."},
  {role: "user", content: "Summarise README.md."}
]
begin
  loop do
    response = groq.chat.completions.create(
      model: "llama-3.3-70b-versatile",
      messages: messages,
      tools: bridge.tools # <-- MCP tools surfaced as Groq function tools
    )
    message = response.choices.first.message
    tool_calls = message.tool_calls
    if tool_calls.nil? || tool_calls.empty?
      puts message.content
      break
    end
    messages << {role: "assistant", content: message.content, tool_calls: tool_calls}
    tool_calls.each do |call|
      fn = call["function"]
      # bridge.call accepts either a Hash or the raw JSON string Groq
      # returns in `function.arguments` — it routes to the owning server.
      result = bridge.call(fn["name"], fn["arguments"])
      messages << {role: "tool", tool_call_id: call["id"], content: JSON.generate(result)}
    end
  end
ensure
  bridge.stop # kills child processes, closes stdio
end
Runnable variants in examples/:
- examples/mcp_agent.rb — minimal version of the loop above.
- examples/mcp_chat_with_tools.rb — same loop, heavily annotated step by step.
- examples/mcp_resources_and_prompts.rb — adds resources (catalogued in the system prompt and fetched via the synthetic read_resource tool) and prompts.
Error handling
Every API failure raises a subclass of GroqRuby::APIError. Rescue the
base class to handle anything; rescue specific subclasses to react to
particular conditions.
begin
client.chat.completions.create(...)
rescue GroqRuby::AuthenticationError => e
warn "auth failed: #{e.message}"
rescue GroqRuby::RateLimitError => e
warn "rate limited; retry after backoff"
rescue GroqRuby::APIStatusError => e
warn "status #{e.status}: #{e.message}"
rescue GroqRuby::APIConnectionError => e
warn "network: #{e.message}"
end
Hierarchy:
| Class | When |
|---|---|
| GroqRuby::Error | Base for everything in the gem |
| GroqRuby::ConfigurationError | Missing or invalid configuration |
| GroqRuby::ParameterError | Request params failed validation |
| GroqRuby::APIError | Base for any failure talking to the API |
| GroqRuby::APIConnectionError | Network failure before a response |
| GroqRuby::APITimeoutError | Connection or read timeout |
| GroqRuby::APIStatusError | 4xx/5xx response (carries status, headers, body) |
| GroqRuby::BadRequestError | 400 |
| GroqRuby::AuthenticationError | 401 |
| GroqRuby::PermissionDeniedError | 403 |
| GroqRuby::NotFoundError | 404 |
| GroqRuby::ConflictError | 409 |
| GroqRuby::UnprocessableEntityError | 422 |
| GroqRuby::RateLimitError | 429 |
| GroqRuby::InternalServerError | 5xx |
| GroqRuby::APIResponseError | API returned an unexpected payload |
| GroqRuby::MCP::Error | Base for any MCP-layer failure |
| GroqRuby::MCP::TransportError | Stdio pipe broke / HTTP request failed / non-2xx |
| GroqRuby::MCP::TimeoutError | MCP request timed out |
| GroqRuby::MCP::ProtocolError | Server sent malformed JSON-RPC |
| GroqRuby::MCP::JsonRpcError | Server returned a JSON-RPC error |
| GroqRuby::MCP::UnknownToolError | Bridge#call couldn't find the tool |
Examples
The examples/ directory has one runnable script per major
endpoint, plus the MCP agent loop. Each reads GROQ_API_KEY from the
environment. See examples/README.md for the full
list.
Compatibility
- Ruby 3.2+
- Net::HTTP (no Faraday/HTTParty dependency)
Not yet supported
The Python SDK has a few features that aren't in groq_ruby v1:
- Async client (everything here is synchronous; use threads/fibers if you need concurrency).
- with_raw_response / with_streaming_response accessors (responses are always parsed into typed models).
- Built-in retries / backoff (handle in your own caller).
- MCP sampling and notifications/list_changed (resource and prompt inventories are snapshotted at Bridge construction).
- MCP tasks API (tasks/get, tasks/result, tasks/list, tasks/cancel) and notifications/tasks/status from the 2025-11-25 spec — the transport is in place; the methods land in a follow-up.
- Long-lived GET SSE channel for server-initiated notifications without an in-flight request — request-correlated SSE responses on POST are supported.
- MCP version negotiation downgrade: the Protocol object selected at construction is the version sent on every request; the server's response protocolVersion is recorded in server_capabilities but isn't auto-applied.
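Since retries are left to the caller, a minimal exponential-backoff wrapper might look like the sketch below. In real code you would rescue GroqRuby::RateLimitError / GroqRuby::APIConnectionError from the table above; TransientError stands in so the snippet runs standalone:

```ruby
# Minimal exponential-backoff retry loop (a sketch, not part of the gem).
# TransientError is a stand-in for the gem's retryable error classes.
class TransientError < StandardError; end

def with_backoff(max_attempts: 3, base_delay: 0.01)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue TransientError
    raise if attempts >= max_attempts
    sleep(base_delay * (2**(attempts - 1))) # 0.01s, 0.02s, ...
    retry
  end
end

calls = 0
result = with_backoff do
  calls += 1
  raise TransientError if calls < 3 # fail twice, then succeed
  :ok
end
result # => :ok, on the third attempt
```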
Development
bin/setup # install deps
bundle exec rake # tests + lint + RBS validate
bundle exec rake test # tests only
bundle exec rake test TESTOPTS="--name=/pattern/" # subset
bin/console # IRB with the gem preloaded
Tests use Minitest + WebMock; no test makes real network calls.
Attribution
The API surface, parameter names, and resource layout follow the official Groq Python SDK at https://github.com/groq/groq-python, distributed under the Apache 2.0 licence. This gem is an independent Ruby implementation and is not affiliated with or endorsed by Groq.
License
MIT — see LICENSE.txt.