groq_ruby

Idiomatic Ruby client for the Groq API.

groq_ruby mirrors the surface of the official groq-python SDK in a Ruby-native shape: typed response objects, single-purpose resource classes, internal dry-monads Result pipelines, and request validation via dry-schema. Streaming chat completions are supported via Server-Sent Events. A built-in MCP client lets you wire one or more MCP servers into a Groq chat completion as tools.

This gem is not an official Groq product. The wire protocol it implements and the API surface it mirrors come from the publicly available groq-python SDK.

Installation

# Gemfile
gem "groq_ruby"

bundle install

Quick start

require "groq_ruby"

client = GroqRuby::Client.new # reads GROQ_API_KEY from the environment

response = client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [
    {role: "system", content: "You are a helpful assistant."},
    {role: "user", content: "Explain low-latency LLMs in one sentence."}
  ]
)

puts response.choices.first.message.content
puts "tokens: #{response.usage.total_tokens}"

Configuration

Configuration is per-client and immutable — no global state, no GroqRuby.configure. Build one client per tenant or set of credentials.

client = GroqRuby::Client.new(
  api_key: ENV["GROQ_API_KEY"],         # default: ENV["GROQ_API_KEY"]
  base_url: "https://api.groq.com",     # default: ENV["GROQ_BASE_URL"] || "https://api.groq.com"
  open_timeout: 10,                     # connect-phase timeout, seconds
  read_timeout: 60,                     # socket-read timeout, seconds
  user_agent: "myapp/1.0"               # default: "groq_ruby/<version> (ruby; net-http)"
)

A GroqRuby::ConfigurationError is raised at construction time if no API key can be found.

Chat completions

Buffered

response = client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [{role: "user", content: "Hello"}],
  temperature: 0.5,
  max_completion_tokens: 256
)
response.choices.first.message.content
response.usage.prompt_tokens

Streaming

Pass stream: true. With a block, each chunk is yielded as it arrives. Without a block, you get a lazy Enumerable you can iterate later.

client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [{role: "user", content: "Write a poem about latency."}],
  stream: true
) do |chunk|
  print chunk.choices.first.delta.content
end

# Or:
stream = client.chat.completions.create(model: "...", messages: [...], stream: true)
stream.each { |chunk| ... }
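Under the hood, streaming arrives as Server-Sent Events. The sketch below shows roughly what an SSE parser does with a buffered body, assuming the OpenAI-style `data:` line framing with a `[DONE]` sentinel; the helper name is hypothetical and the gem's real parser works incrementally on socket reads:

```ruby
require "json"

# Minimal SSE parsing sketch: turn a raw event-stream body into parsed
# chunk hashes. Hypothetical helper, not the gem's actual internals.
def parse_sse_chunks(body)
  body.each_line.filter_map do |line|
    line = line.strip
    next unless line.start_with?("data:")
    payload = line.delete_prefix("data:").strip
    next if payload == "[DONE]"   # stream terminator sentinel
    JSON.parse(payload)
  end
end

raw = <<~SSE
  data: {"choices":[{"delta":{"content":"Hel"}}]}

  data: {"choices":[{"delta":{"content":"lo"}}]}

  data: [DONE]
SSE

chunks = parse_sse_chunks(raw)
text = chunks.map { |c| c.dig("choices", 0, "delta", "content") }.join
# text == "Hello"
```

Concatenating each chunk's delta content, as above, is also how you reassemble the full assistant message from a stream.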

Validation rejects out-of-range values before any HTTP request fires:

client.chat.completions.create(
  model: "...", messages: [{role: "user", content: "x"}], temperature: 5.0
)
# => GroqRuby::ParameterError: invalid parameters: {:temperature=>["must be less than or equal to 2.0"]}
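For illustration, the upper-bound rule behind that error can be approximated in plain Ruby. This is a hypothetical stand-in for the gem's dry-schema contract, not its actual code:

```ruby
# Illustrative stand-in for the gem's dry-schema rule: temperature,
# when present, must not exceed 2.0. Returns an errors hash in the
# same spirit as ParameterError's message.
def validate_params(params)
  errors = {}
  t = params[:temperature]
  errors[:temperature] = ["must be less than or equal to 2.0"] if t && t > 2.0
  errors
end

validate_params(temperature: 5.0)  # => {temperature: ["must be less than or equal to 2.0"]}
validate_params(temperature: 0.5)  # => {}
```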

Function (tool) calling

Pass tools: (and optionally tool_choice:) just like the OpenAI/Groq schema — groq_ruby doesn't transform them. When the model decides to call a tool, the call comes back in response.choices.first.message.tool_calls.

tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Look up the current weather for a city",
      parameters: {
        type: "object",
        properties: {city: {type: "string"}},
        required: ["city"]
      }
    }
  }
]

response = client.chat.completions.create(
  model: "llama-3.3-70b-versatile",
  messages: [{role: "user", content: "What's the weather in Berlin?"}],
  tools: tools,
  tool_choice: "auto"   # "auto" | "required" | "none" | {type: "function", function: {name: "get_weather"}}
)

call = response.choices.first.message.tool_calls&.first
if call
  args = JSON.parse(call["function"]["arguments"])
  result = get_weather(args["city"])

  # Feed the result back as a follow-up turn:
  followup = client.chat.completions.create(
    model: "llama-3.3-70b-versatile",
    messages: [
      *original_messages,
      {role: "assistant", content: nil, tool_calls: response.choices.first.message.tool_calls},
      {role: "tool", tool_call_id: call["id"], content: JSON.generate(result)}
    ]
  )
end
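The dispatch step above generalises to a registry of local handlers keyed by tool name. A sketch — the `TOOLS` registry and `dispatch` helper are hypothetical conveniences, not part of the gem:

```ruby
require "json"

# Hypothetical local tool registry: map tool names to Ruby callables,
# so dispatching a model's tool call becomes a single lookup.
TOOLS = {
  "get_weather" => ->(args) { {city: args["city"], temp_c: 21} }  # stub handler
}

def dispatch(tool_call)
  fn = tool_call["function"]
  args = JSON.parse(fn["arguments"])          # Groq returns arguments as a JSON string
  handler = TOOLS.fetch(fn["name"]) { raise "unknown tool: #{fn["name"]}" }
  handler.call(args)
end

call = {"id" => "call_1", "function" => {"name" => "get_weather", "arguments" => %({"city":"Berlin"})}}
dispatch(call)  # => {city: "Berlin", temp_c: 21}
```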

For an MCP-driven version of this loop (where the tools come from an MCP server instead of being hand-defined), see the MCP section below.

Embeddings

response = client.embeddings.create(
  model: "nomic-embed-text-v1_5",
  input: "Groq makes inference very fast."
)
vector = response.data.first.embedding

Audio

Text → speech

audio_bytes = client.audio.speech.create(
  input: "Hello from Groq.",
  model: "playai-tts",
  voice: "Aaliyah-PlayAI",
  response_format: "wav"
)
File.binwrite("speech.wav", audio_bytes)

Speech → text (transcription)

response = File.open("audio.mp3", "rb") do |file|
  client.audio.transcriptions.create(
    file: file,
    filename: "audio.mp3",
    model: "whisper-large-v3-turbo"
  )
end
puts response.text

Speech → English text (translation)

response = File.open("audio.mp3", "rb") do |file|
  client.audio.translations.create(
    file: file,
    filename: "audio.mp3",
    model: "whisper-large-v3"
  )
end

Models

client.models.list.data.each { |m| puts m.id }
client.models.retrieve("llama-3.3-70b-versatile")
client.models.delete("custom-model-id")

Files

uploaded = File.open("requests.jsonl", "rb") do |file|
  client.files.create(file: file, filename: "requests.jsonl", purpose: "batch")
end

client.files.list.data
client.files.info(uploaded.id)
client.files.content(uploaded.id) # raw bytes
client.files.delete(uploaded.id)
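The requests.jsonl file above is newline-delimited JSON, one request per line. A sketch of generating it, assuming the OpenAI-compatible batch line shape (custom_id / method / url / body); check Groq's batch documentation for the authoritative format:

```ruby
require "json"

# Build one JSONL line per chat request. The line shape (custom_id,
# method, url, body) is assumed to follow the OpenAI-compatible batch
# format -- verify against Groq's batch docs before relying on it.
def batch_lines(prompts, model:)
  prompts.each_with_index.map do |prompt, i|
    JSON.generate(
      custom_id: "req-#{i}",
      method: "POST",
      url: "/v1/chat/completions",
      body: {model: model, messages: [{role: "user", content: prompt}]}
    )
  end
end

lines = batch_lines(["Hi", "Bye"], model: "llama-3.3-70b-versatile")
File.write("requests.jsonl", lines.join("\n") << "\n")
```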

Batches

batch = client.batches.create(
  input_file_id: uploaded.id,
  endpoint: "/v1/chat/completions",
  completion_window: "24h"
)

client.batches.retrieve(batch.id)
client.batches.list
client.batches.cancel(batch.id)

MCP — Model Context Protocol

groq_ruby ships with a stdio MCP client, so a Groq agent can use tools exposed by any MCP-compatible server (filesystem, web, custom tooling). Coverage matches what host applications like Claude Desktop surface:

  • Tools: surfaced as Groq function tools, namespaced <server>__<tool>.
  • Resources: surfaced via a synthetic <server>__read_resource(uri) tool the LLM can call; bridge.resources returns the inventory if you want to advertise specific URIs in your system prompt.
  • Prompts: listed via bridge.prompts for your application to surface (e.g. as a picker in your UI) and rendered via bridge.get_prompt(name, args).
  • Sampling and notifications: not supported in v1.

Optional capabilities are probed gracefully — if a server doesn't implement resources/list or prompts/list, those entries are simply empty for that server.

Configuring servers

Build ServerConfig directly, or load from the same JSON shape Claude Desktop uses (mcpServers block). The Claude Desktop adapter expands ${VAR} references against each server's env block first, then the process's ENV, and raises on unresolved references.

# (a) direct
config = GroqRuby::MCP::ServerConfig.new(
  name: "fs",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)

# (b) load Claude Desktop JSON
configs = GroqRuby::MCP::ClaudeDesktopConfig.load(
  "~/Library/Application Support/Claude/claude_desktop_config.json"
)
bridge = GroqRuby::MCP::Bridge.new(configs)

# (c) parse an in-memory hash (e.g. for a private server with a PAT)
configs = GroqRuby::MCP::ClaudeDesktopConfig.parse({
  "mcpServers" => {
    "spectrum-ferret-staging" => {
      "command" => "npx",
      "args" => [
        "-y", "mcp-remote@latest", "https://mcp-staging.spectrumferret.com",
        "--header", "Authorization: Bearer ${SF_PAT}"
      ],
      "env" => {"SF_PAT" => ENV.fetch("SF_PAT")}
    }
  }
})
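The ${VAR} expansion rule described above can be sketched in a few lines (hypothetical helper, not the adapter's actual code): resolve against the server's own env block first, then the process ENV, and raise on anything unresolved.

```ruby
# Sketch of the adapter's ${VAR} expansion: server env block first,
# then process ENV, raising loudly on unresolved references.
def expand_refs(value, env_block, process_env = ENV)
  value.gsub(/\$\{(\w+)\}/) do
    name = Regexp.last_match(1)
    env_block[name] || process_env[name] ||
      raise(KeyError, "unresolved reference ${#{name}}")
  end
end

expand_refs("Bearer ${SF_PAT}", {"SF_PAT" => "tok123"})
# => "Bearer tok123"
```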

Direct usage

config = GroqRuby::MCP::ServerConfig.new(
  name: "fs",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)

mcp = GroqRuby::MCP::Client.connect(config)
mcp.tools_list                                # [Tool(name: "read_file", ...), ...]
mcp.tools_call(name: "read_file", arguments: {path: "/Users/me/docs/foo.txt"})
mcp.resources_list                            # [Resource(uri: "fs://...", ...)]
mcp.resources_read("fs://docs/foo.md")
mcp.prompts_list                              # [Prompt(name: "summarize", ...)]
mcp.prompts_get("summarize", {path: "foo.md"})
mcp.stop
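On the wire, the stdio transport speaks JSON-RPC 2.0, one message per line. A minimal framing sketch with hypothetical helpers; the gem's real transport also matches replies to request ids and handles partial reads:

```ruby
require "json"

# Frame a JSON-RPC 2.0 request as a single newline-terminated line,
# the shape MCP's stdio transport expects.
def jsonrpc_request(id, method, params)
  JSON.generate(jsonrpc: "2.0", id: id, method: method, params: params) << "\n"
end

# Parse one reply line, surfacing malformed messages and server errors.
def jsonrpc_parse(line)
  msg = JSON.parse(line)
  raise "malformed JSON-RPC" unless msg["jsonrpc"] == "2.0"
  raise "server error: #{msg["error"]}" if msg["error"]
  msg
end

frame = jsonrpc_request(1, "tools/call", {name: "read_file", arguments: {path: "foo.txt"}})
reply = jsonrpc_parse(%({"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"hi"}]}}))
```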

Bridge into chat.completions

Important: Groq's chat-completions API does not have a request parameter for "MCP server URL." MCP integration here is client-side orchestration: this gem talks to the MCP servers, exposes their tools as ordinary OpenAI/Groq function tools through the standard tools: parameter, then routes the model's tool_calls back to the right server. From Groq's perspective there is no MCP — just function tools.

Bridge does three things at construction:

  1. spawns each ServerConfig's child process via stdio,
  2. runs the MCP initialize handshake and asks each server for its tool list,
  3. indexes those tools by namespaced name <server>__<tool> so collisions across servers are impossible.
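The namespacing in step 3 amounts to a flat index keyed by <server>__<tool>, which also makes routing a namespaced call back to its owning server a single lookup. A sketch (hypothetical helper, not Bridge's actual internals):

```ruby
# Index tools as "<server>__<tool>" => [server, tool], so names can
# never collide across servers and routing is one Hash#fetch.
def index_tools(servers)   # servers: {"fs" => ["read_file", ...], ...}
  servers.flat_map do |server, tools|
    tools.map { |tool| ["#{server}__#{tool}", [server, tool]] }
  end.to_h
end

index = index_tools("fs" => ["read_file"], "web" => ["fetch"])
server, tool = index.fetch("fs__read_file")
# server == "fs", tool == "read_file"
```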

bridge.tools then returns an array shaped exactly like Groq's tools: parameter:

bridge.tools
# => [
#   {
#     type: "function",
#     function: {
#       name: "fs__read_file",
#       description: "Read the contents of a file at the given path",
#       parameters: {           # JSON Schema, straight from the MCP server
#         "type" => "object",
#         "properties" => { "path" => { "type" => "string" } },
#         "required" => ["path"]
#       }
#     }
#   },
#   ...
# ]

A complete agent loop using a real MCP server (@modelcontextprotocol/server-filesystem):

require "groq_ruby"
require "json"

filesystem = GroqRuby::MCP::ServerConfig.new(
  name: "fs",
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/docs"]
)

bridge = GroqRuby::MCP::Bridge.new([filesystem])
groq = GroqRuby::Client.new

messages = [
  {role: "system", content: "You can use filesystem tools. Prefer tools over speculation."},
  {role: "user", content: "Summarise README.md."}
]

begin
  loop do
    response = groq.chat.completions.create(
      model: "llama-3.3-70b-versatile",
      messages: messages,
      tools: bridge.tools          # <-- MCP tools surfaced as Groq function tools
    )

    message = response.choices.first.message
    tool_calls = message.tool_calls

    if tool_calls.nil? || tool_calls.empty?
      puts message.content
      break
    end

    messages << {role: "assistant", content: message.content, tool_calls: tool_calls}
    tool_calls.each do |call|
      fn = call["function"]
      # bridge.call accepts either a Hash or the raw JSON string Groq
      # returns in `function.arguments` — it routes to the owning server.
      result = bridge.call(fn["name"], fn["arguments"])
      messages << {role: "tool", tool_call_id: call["id"], content: JSON.generate(result)}
    end
  end
ensure
  bridge.stop                       # kills child processes, closes stdio
end

Runnable variants of this loop live in the examples/ directory.

Error handling

Every API failure raises a subclass of GroqRuby::APIError. Rescue the base class to handle anything; rescue specific subclasses to react to particular conditions.

begin
  client.chat.completions.create(...)
rescue GroqRuby::AuthenticationError => e
  warn "auth failed: #{e.message}"
rescue GroqRuby::RateLimitError => e
  warn "rate limited; retry after backoff"
rescue GroqRuby::APIStatusError => e
  warn "status #{e.status}: #{e.message}"
rescue GroqRuby::APIConnectionError => e
  warn "network: #{e.message}"
end

Hierarchy:

GroqRuby::Error: base for everything in the gem
GroqRuby::ConfigurationError: missing or invalid configuration
GroqRuby::ParameterError: request params failed validation
GroqRuby::APIError: base for any failure talking to the API
GroqRuby::APIConnectionError: network failure before a response
GroqRuby::APITimeoutError: connection or read timeout
GroqRuby::APIStatusError: 4xx/5xx response (carries status, headers, body)
GroqRuby::BadRequestError: 400
GroqRuby::AuthenticationError: 401
GroqRuby::PermissionDeniedError: 403
GroqRuby::NotFoundError: 404
GroqRuby::ConflictError: 409
GroqRuby::UnprocessableEntityError: 422
GroqRuby::RateLimitError: 429
GroqRuby::InternalServerError: 5xx
GroqRuby::APIResponseError: API returned an unexpected payload
GroqRuby::MCP::Error: base for any MCP-layer failure
GroqRuby::MCP::TransportError: stdio pipe broke or process exited
GroqRuby::MCP::TimeoutError: MCP request timed out
GroqRuby::MCP::ProtocolError: server sent malformed JSON-RPC
GroqRuby::MCP::JsonRpcError: server returned a JSON-RPC error
GroqRuby::MCP::UnknownToolError: Bridge#call couldn't find the tool

Examples

The examples/ directory has one runnable script per major endpoint, plus the MCP agent loop. Each reads GROQ_API_KEY from the environment. See examples/README.md for the full list.

Compatibility

  • Ruby 3.2+
  • Net::HTTP (no Faraday/HTTParty dependency)

Not yet supported

The Python SDK has a few features that aren't in groq_ruby v1:

  • Async client (everything here is synchronous; use threads/fibers if you need concurrency).
  • with_raw_response / with_streaming_response accessors (responses are always parsed into typed models).
  • Built-in retries / backoff (handle in your own caller).
  • MCP sampling and notifications/list_changed (resource and prompt inventories are snapshotted at Bridge construction).
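Since retries aren't built in, a caller-side wrapper with exponential backoff might look like the sketch below. The helper is hypothetical; in real use retry_on would be GroqRuby::RateLimitError or GroqRuby::APIConnectionError rather than a generic exception class.

```ruby
# Caller-side retry with exponential backoff. retry_on accepts one
# exception class or an array of them; anything else propagates.
def with_retries(max_attempts: 3, base_delay: 0.5, retry_on: StandardError)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue *Array(retry_on)
    raise if attempts >= max_attempts        # budget exhausted: re-raise
    sleep(base_delay * (2**(attempts - 1)))  # 0.5s, 1s, 2s, ...
    retry
  end
end

calls = 0
result = with_retries(base_delay: 0.0) do
  calls += 1
  raise "transient" if calls < 3   # fail twice, then succeed
  :ok
end
# result == :ok after two retried failures
```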

Development

bin/setup                              # install deps
bundle exec rake                       # tests + lint + RBS validate
bundle exec rake test                  # tests only
bundle exec rake test TESTOPTS="--name=/pattern/"   # subset
bin/console                            # IRB with the gem preloaded

Tests use Minitest + WebMock; no test makes real network calls.

Attribution

The API surface, parameter names, and resource layout follow the official Groq Python SDK at https://github.com/groq/groq-python, distributed under the Apache 2.0 licence. This gem is an independent Ruby implementation and is not affiliated with or endorsed by Groq.

License

MIT — see LICENSE.txt.