groq_ruby

Idiomatic Ruby client for the Groq API.

groq_ruby mirrors the surface of the official groq-python SDK in a Ruby-native shape: typed response objects, single-purpose resource classes, internal dry-monads Result pipelines, and request validation via dry-schema. Streaming chat completions are supported via Server-Sent Events. A built-in MCP client lets you wire one or more MCP servers into a Groq chat completion as tools.

This gem is not an official Groq product. The wire protocol it implements and the API surface it mirrors come from the publicly available groq-python SDK.

Installation

# Gemfile
gem "groq_ruby"

bundle install

Quick start

require "groq_ruby"

client = GroqRuby::Client.new # reads GROQ_API_KEY from the environment

response = client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [
    {role: "system", content: "You are a helpful assistant."},
    {role: "user", content: "Explain low-latency LLMs in one sentence."}
  ]
)

puts response.choices.first.message.content
puts "tokens: #{response.usage.total_tokens}"

Configuration

Configuration is per-client and immutable — no global state, no GroqRuby.configure. Build one client per tenant or set of credentials.

client = GroqRuby::Client.new(
  api_key: ENV["GROQ_API_KEY"],         # default: ENV["GROQ_API_KEY"]
  base_url: "https://api.groq.com",     # default: ENV["GROQ_BASE_URL"] || "https://api.groq.com"
  open_timeout: 10,                     # connect-phase timeout, seconds
  read_timeout: 60,                     # socket-read timeout, seconds
  user_agent: "myapp/1.0"               # default: "groq_ruby/<version> (ruby; net-http)"
)

A GroqRuby::ConfigurationError is raised at construction time if no API key can be found.

Chat completions

Buffered

response = client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [{role: "user", content: "Hello"}],
  temperature: 0.5,
  max_completion_tokens: 256
)
response.choices.first.message.content
response.usage.prompt_tokens

Streaming

Pass stream: true. With a block, each chunk is yielded as it arrives. Without a block, you get a lazy Enumerable you can iterate later.

client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [{role: "user", content: "Write a poem about latency."}],
  stream: true
) do |chunk|
  print chunk.choices.first.delta.content
end

# Or:
stream = client.chat.completions.create(model: "...", messages: [...], stream: true)
stream.each { |chunk| ... }
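Under the hood, streaming arrives as Server-Sent Events. The sketch below shows roughly what an SSE parser does with a buffered body, assuming the OpenAI-style `data:` line framing with a `[DONE]` sentinel; the helper name is hypothetical and the gem's real parser works incrementally on socket reads:

```ruby
require "json"

# Minimal SSE parsing sketch: turn a raw event-stream body into parsed
# chunk hashes. Hypothetical helper, not the gem's actual internals.
def parse_sse_chunks(body)
  body.each_line.filter_map do |line|
    line = line.strip
    next unless line.start_with?("data:")
    payload = line.delete_prefix("data:").strip
    next if payload == "[DONE]"   # stream terminator sentinel
    JSON.parse(payload)
  end
end

raw = <<~SSE
  data: {"choices":[{"delta":{"content":"Hel"}}]}

  data: {"choices":[{"delta":{"content":"lo"}}]}

  data: [DONE]
SSE

chunks = parse_sse_chunks(raw)
text = chunks.map { |c| c.dig("choices", 0, "delta", "content") }.join
# text == "Hello"
```

Concatenating each chunk's delta content, as above, is also how you reassemble the full assistant message from a stream.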

Validation rejects out-of-range values before any HTTP request fires:

client.chat.completions.create(
  model: "...", messages: [{role: "user", content: "x"}], temperature: 5.0
)
# => GroqRuby::ParameterError: invalid parameters: {:temperature=>["must be less than or equal to 2.0"]}
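For illustration, the upper-bound rule behind that error can be approximated in plain Ruby. This is a hypothetical stand-in for the gem's dry-schema contract, not its actual code:

```ruby
# Illustrative stand-in for the gem's dry-schema rule: temperature,
# when present, must not exceed 2.0. Returns an errors hash in the
# same spirit as ParameterError's message.
def validate_params(params)
  errors = {}
  t = params[:temperature]
  errors[:temperature] = ["must be less than or equal to 2.0"] if t && t > 2.0
  errors
end

validate_params(temperature: 5.0)  # => {temperature: ["must be less than or equal to 2.0"]}
validate_params(temperature: 0.5)  # => {}
```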

Function (tool) calling

Pass tools: (and optionally tool_choice:) just like the OpenAI/Groq schema — groq_ruby doesn't transform them. When the model decides to call a tool, the call comes back in response.choices.first.message.tool_calls.

tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Look up the current weather for a city",
      parameters: {
        type: "object",
        properties: {city: {type: "string"}},
        required: ["city"]
      }
    }
  }
]

response = client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [{role: "user", content: "What's the weather in Berlin?"}],
  tools: tools,
  tool_choice: "auto"   # "auto" | "required" | "none" | {type: "function", function: {name: "get_weather"}}
)

call = response.choices.first.message.tool_calls&.first
if call
  args = JSON.parse(call["function"]["arguments"])
  result = get_weather(args["city"])

  # Feed the result back as a follow-up turn:
  followup = client.chat.completions.create(
    model: "llama-3.3-70b-versatile",
    messages: [
      *original_messages,
      {role: "assistant", content: nil, tool_calls: response.choices.first.message.tool_calls},
      {role: "tool", tool_call_id: call["id"], content: JSON.generate(result)}
    ]
  )
end
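The dispatch step above generalises to a registry of local handlers keyed by tool name. A sketch — the `TOOLS` registry and `dispatch` helper are hypothetical conveniences, not part of the gem:

```ruby
require "json"

# Hypothetical local tool registry: map tool names to Ruby callables,
# so dispatching a model's tool call becomes a single lookup.
TOOLS = {
  "get_weather" => ->(args) { {city: args["city"], temp_c: 21} }  # stub handler
}

def dispatch(tool_call)
  fn = tool_call["function"]
  args = JSON.parse(fn["arguments"])          # Groq returns arguments as a JSON string
  handler = TOOLS.fetch(fn["name"]) { raise "unknown tool: #{fn["name"]}" }
  handler.call(args)
end

call = {"id" => "call_1", "function" => {"name" => "get_weather", "arguments" => %({"city":"Berlin"})}}
dispatch(call)  # => {city: "Berlin", temp_c: 21}
```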

For an MCP-driven version of this loop (where the tools come from an MCP server instead of being hand-defined), see the MCP section below.

Embeddings

response = client.embeddings.create(
  model: "nomic-embed-text-v1_5",
  input: "Groq makes inference very fast."
)
vector = response.data.first.embedding

Audio

Text → speech

audio_bytes = client.audio.speech.create(
  input: "Hello from Groq.",
  model: "playai-tts",
  voice: "Aaliyah-PlayAI",
  response_format: "wav"
)
File.binwrite("speech.wav", audio_bytes)

Speech → text (transcription)

response = File.open("audio.mp3", "rb") do |file|
  client.audio.transcriptions.create(
    file: file,
    filename: "audio.mp3",
    model: "whisper-large-v3-turbo"
  )
end
puts response.text

Speech → English text (translation)

response = File.open("audio.mp3", "rb") do |file|
  client.audio.translations.create(
    file: file,
    filename: "audio.mp3",
    model: "whisper-large-v3"
  )
end

Models

client.models.list.data.each { |m| puts m.id }
client.models.retrieve("llama-3.3-70b-versatile")
client.models.delete("custom-model-id")

Files

uploaded = File.open("requests.jsonl", "rb") do |file|
  client.files.create(file: file, filename: "requests.jsonl", purpose: "batch")
end

client.files.list.data
client.files.info(uploaded.id)
client.files.content(uploaded.id) # raw bytes
client.files.delete(uploaded.id)
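The requests.jsonl file above is newline-delimited JSON, one request per line. A sketch of generating it, assuming the OpenAI-compatible batch line shape (custom_id / method / url / body); check Groq's batch documentation for the authoritative format:

```ruby
require "json"

# Build one JSONL line per chat request. The line shape (custom_id,
# method, url, body) is assumed to follow the OpenAI-compatible batch
# format -- verify against Groq's batch docs before relying on it.
def batch_lines(prompts, model:)
  prompts.each_with_index.map do |prompt, i|
    JSON.generate(
      custom_id: "req-#{i}",
      method: "POST",
      url: "/v1/chat/completions",
      body: {model: model, messages: [{role: "user", content: prompt}]}
    )
  end
end

lines = batch_lines(["Hi", "Bye"], model: "llama-3.3-70b-versatile")
File.write("requests.jsonl", lines.join("\n") << "\n")
```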

Batches

batch = client.batches.create(
  input_file_id: uploaded.id,
  endpoint: "/v1/chat/completions",
  completion_window: "24h"
)

client.batches.retrieve(batch.id)
client.batches.list
client.batches.cancel(batch.id)

MCP — Model Context Protocol

groq_ruby ships with a stdio MCP client, so a Groq agent can use tools exposed by any MCP-compatible server (filesystem, web, custom tooling). Coverage matches what host applications like Claude Desktop surface:

  • Tools: surfaced as Groq function tools, namespaced <server>__<tool>.
  • Resources: surfaced via a synthetic <server>__read_resource(uri) tool the LLM can call; bridge.resources returns the inventory if you want to advertise specific URIs in your system prompt.
  • Prompts: listed via bridge.prompts for your application to surface (e.g. as a picker in your UI) and rendered via bridge.get_prompt(name, args).
  • Sampling and notifications: not supported in v1.

Optional capabilities are probed gracefully — if a server doesn't implement resources/list or prompts/list, those entries are simply empty for that server.

Configuring servers

Build ServerConfig directly, or load from the same JSON shape Claude Desktop uses (mcpServers block). The Claude Desktop adapter expands ${VAR} references against each server's env block first, then the process's ENV, and raises on unresolved references.

# (a) direct
config = GroqRuby::MCP::ServerConfig.new(
  name: "fs",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)

# (b) load Claude Desktop JSON
configs = GroqRuby::MCP::ClaudeDesktopConfig.load(
  "~/Library/Application Support/Claude/claude_desktop_config.json"
)
bridge = GroqRuby::MCP::Bridge.new(configs)

# (c) parse an in-memory hash (e.g. for a private server with a PAT)
configs = GroqRuby::MCP::ClaudeDesktopConfig.parse({
  "mcpServers" => {
    "spectrum-ferret-staging" => {
      "command" => "npx",
      "args" => [
        "-y", "mcp-remote@latest", "https://mcp-staging.spectrumferret.com",
        "--header", "Authorization: Bearer ${SF_PAT}"
      ],
      "env" => {"SF_PAT" => ENV.fetch("SF_PAT")}
    }
  }
})
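The ${VAR} expansion rule described above can be sketched in a few lines (hypothetical helper, not the adapter's actual code): resolve against the server's own env block first, then the process ENV, and raise on anything unresolved.

```ruby
# Sketch of the adapter's ${VAR} expansion: server env block first,
# then process ENV, raising loudly on unresolved references.
def expand_refs(value, env_block, process_env = ENV)
  value.gsub(/\$\{(\w+)\}/) do
    name = Regexp.last_match(1)
    env_block[name] || process_env[name] ||
      raise(KeyError, "unresolved reference ${#{name}}")
  end
end

expand_refs("Bearer ${SF_PAT}", {"SF_PAT" => "tok123"})
# => "Bearer tok123"
```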

Direct usage

config = GroqRuby::MCP::ServerConfig.new(
  name: "fs",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)

mcp = GroqRuby::MCP::Client.connect(config)
mcp.tools_list                                # [Tool(name: "read_file", ...), ...]
mcp.tools_call(name: "read_file", arguments: {path: "/Users/me/docs/foo.txt"})
mcp.resources_list                            # [Resource(uri: "fs://...", ...)]
mcp.resources_read("fs://docs/foo.md")
mcp.prompts_list                              # [Prompt(name: "summarize", ...)]
mcp.prompts_get("summarize", {path: "foo.md"})
mcp.stop
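On the wire, the stdio transport speaks JSON-RPC 2.0, one message per line. A minimal framing sketch with hypothetical helpers; the gem's real transport also matches replies to request ids and handles partial reads:

```ruby
require "json"

# Frame a JSON-RPC 2.0 request as a single newline-terminated line,
# the shape MCP's stdio transport expects.
def jsonrpc_request(id, method, params)
  JSON.generate(jsonrpc: "2.0", id: id, method: method, params: params) << "\n"
end

# Parse one reply line, surfacing malformed messages and server errors.
def jsonrpc_parse(line)
  msg = JSON.parse(line)
  raise "malformed JSON-RPC" unless msg["jsonrpc"] == "2.0"
  raise "server error: #{msg["error"]}" if msg["error"]
  msg
end

frame = jsonrpc_request(1, "tools/call", {name: "read_file", arguments: {path: "foo.txt"}})
reply = jsonrpc_parse(%({"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"hi"}]}}))
```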

Bridge into chat.completions

Important: Groq's chat-completions API does not have a request parameter for "MCP server URL." MCP integration here is client-side orchestration: this gem talks to the MCP servers, exposes their tools as ordinary OpenAI/Groq function tools through the standard tools: parameter, then routes the model's tool_calls back to the right server. From Groq's perspective there is no MCP — just function tools.

Bridge does three things at construction:

  1. spawns each ServerConfig's child process via stdio,
  2. runs the MCP initialize handshake and asks each server for its tool list,
  3. indexes those tools by namespaced name <server>__<tool> so collisions across servers are impossible.
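The namespacing in step 3 amounts to a flat index keyed by <server>__<tool>, which also makes routing a namespaced call back to its owning server a single lookup. A sketch (hypothetical helper, not Bridge's actual internals):

```ruby
# Index tools as "<server>__<tool>" => [server, tool], so names can
# never collide across servers and routing is one Hash#fetch.
def index_tools(servers)   # servers: {"fs" => ["read_file", ...], ...}
  servers.flat_map do |server, tools|
    tools.map { |tool| ["#{server}__#{tool}", [server, tool]] }
  end.to_h
end

index = index_tools("fs" => ["read_file"], "web" => ["fetch"])
server, tool = index.fetch("fs__read_file")
# server == "fs", tool == "read_file"
```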

bridge.tools then returns an array shaped exactly like Groq's tools: parameter:

bridge.tools
# => [
#   {
#     type: "function",
#     function: {
#       name: "fs__read_file",
#       description: "Read the contents of a file at the given path",
#       parameters: {           # JSON Schema, straight from the MCP server
#         "type" => "object",
#         "properties" => { "path" => { "type" => "string" } },
#         "required" => ["path"]
#       }
#     }
#   },
#   ...
# ]

A complete agent loop using a real MCP server (@modelcontextprotocol/server-filesystem):

require "groq_ruby"
require "json"

filesystem = GroqRuby::MCP::ServerConfig.new(
  name: "fs",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)

bridge = GroqRuby::MCP::Bridge.new([filesystem])
groq = GroqRuby::Client.new

messages = [
  {role: "system", content: "You can use filesystem tools. Prefer tools over speculation."},
  {role: "user", content: "Summarise README.md."}
]

begin
  loop do
    response = groq.chat.completions.create(
      model: "llama-3.3-70b-versatile",
      messages: messages,
      tools: bridge.tools          # <-- MCP tools surfaced as Groq function tools
    )

    message = response.choices.first.message
    tool_calls = message.tool_calls

    if tool_calls.nil? || tool_calls.empty?
      puts message.content
      break
    end

    messages << {role: "assistant", content: message.content, tool_calls: tool_calls}
    tool_calls.each do |call|
      fn = call["function"]
      # bridge.call accepts either a Hash or the raw JSON string Groq
      # returns in `function.arguments` — it routes to the owning server.
      result = bridge.call(fn["name"], fn["arguments"])
      messages << {role: "tool", tool_call_id: call["id"], content: JSON.generate(result)}
    end
  end
ensure
  bridge.stop                       # kills child processes, closes stdio
end

Runnable variants of this loop live in the examples/ directory.

Error handling

Every API failure raises a subclass of GroqRuby::APIError. Rescue the base class to handle anything; rescue specific subclasses to react to particular conditions.

begin
  client.chat.completions.create(...)
rescue GroqRuby::AuthenticationError => e
  warn "auth failed: #{e.message}"
rescue GroqRuby::RateLimitError => e
  warn "rate limited; retry after backoff"
rescue GroqRuby::APIStatusError => e
  warn "status #{e.status}: #{e.message}"
rescue GroqRuby::APIConnectionError => e
  warn "network: #{e.message}"
end

Hierarchy:

GroqRuby::Error: base for everything in the gem
GroqRuby::ConfigurationError: missing or invalid configuration
GroqRuby::ParameterError: request params failed validation
GroqRuby::APIError: base for any failure talking to the API
GroqRuby::APIConnectionError: network failure before a response
GroqRuby::APITimeoutError: connection or read timeout
GroqRuby::APIStatusError: 4xx/5xx response (carries status, headers, body)
GroqRuby::BadRequestError: 400
GroqRuby::AuthenticationError: 401
GroqRuby::PermissionDeniedError: 403
GroqRuby::NotFoundError: 404
GroqRuby::ConflictError: 409
GroqRuby::UnprocessableEntityError: 422
GroqRuby::RateLimitError: 429
GroqRuby::InternalServerError: 5xx
GroqRuby::APIResponseError: API returned an unexpected payload
GroqRuby::MCP::Error: base for any MCP-layer failure
GroqRuby::MCP::TransportError: stdio pipe broke or process exited
GroqRuby::MCP::TimeoutError: MCP request timed out
GroqRuby::MCP::ProtocolError: server sent malformed JSON-RPC
GroqRuby::MCP::JsonRpcError: server returned a JSON-RPC error
GroqRuby::MCP::UnknownToolError: Bridge#call couldn't find the tool

Examples

The examples/ directory has one runnable script per major endpoint, plus the MCP agent loop. Each reads GROQ_API_KEY from the environment. See examples/README.md for the full list.

Compatibility

  • Ruby 3.2+
  • Net::HTTP (no Faraday/HTTParty dependency)

Not yet supported

The Python SDK has a few features that aren't in groq_ruby v1:

  • Async client (everything here is synchronous; use threads/fibers if you need concurrency).
  • with_raw_response / with_streaming_response accessors (responses are always parsed into typed models).
  • Built-in retries / backoff (handle in your own caller).
  • MCP sampling and notifications/list_changed (resource and prompt inventories are snapshotted at Bridge construction).
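Since retries aren't built in, a caller-side wrapper with exponential backoff might look like the sketch below. The helper is hypothetical; in real use retry_on would be GroqRuby::RateLimitError or GroqRuby::APIConnectionError rather than a generic exception class.

```ruby
# Caller-side retry with exponential backoff. retry_on accepts one
# exception class or an array of them; anything else propagates.
def with_retries(max_attempts: 3, base_delay: 0.5, retry_on: StandardError)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue *Array(retry_on)
    raise if attempts >= max_attempts        # budget exhausted: re-raise
    sleep(base_delay * (2**(attempts - 1)))  # 0.5s, 1s, 2s, ...
    retry
  end
end

calls = 0
result = with_retries(base_delay: 0.0) do
  calls += 1
  raise "transient" if calls < 3   # fail twice, then succeed
  :ok
end
# result == :ok after two retried failures
```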

Development

bin/setup                              # install deps
bundle exec rake                       # tests + lint + RBS validate
bundle exec rake test                  # tests only
bundle exec rake test TESTOPTS="--name=/pattern/"   # subset
bin/console                            # IRB with the gem preloaded

Tests use Minitest + WebMock; no test makes real network calls.

Attribution

The API surface, parameter names, and resource layout follow the official Groq Python SDK at https://github.com/groq/groq-python, distributed under the Apache 2.0 licence. This gem is an independent Ruby implementation and is not affiliated with or endorsed by Groq.

License

MIT — see LICENSE.txt.