Ruby

Universal LLM API client for Ruby. Access 143+ LLM providers through a single interface with an idiomatic Ruby API and native performance.

What This Package Provides

  • One provider surface — chat, streaming, embeddings, images, audio, search, OCR, tools, and structured output across the provider registry.
  • Provider/model routing — call models with the provider/model convention and keep provider-specific request code out of application paths.
  • Production controls — retries, fallback, rate limits, cache layers, budgets, health checks, OpenTelemetry spans, and redacted secrets.
  • Same core as every binding — Rust, Python, Node.js, Go, Java, PHP, Ruby, .NET, Elixir, WASM, Kotlin Android, Swift, Dart, Zig, and C FFI use the same Rust implementation.
  • Ruby package — native extension with idiomatic Ruby request and response objects.

Installation

Package Installation

Install via one of the supported package managers:

gem:

gem install liter_llm

Bundler (add to your Gemfile):

gem 'liter_llm'

System Requirements

  • Ruby 3.2+ required
  • API keys via environment variables (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY)
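
Since keys are read from the environment, it can help to fail early with a readable message when one is missing. The sketch below uses only Ruby's standard ENV API; the helper name require_api_key is hypothetical and not part of liter_llm.

```ruby
# frozen_string_literal: true

# Hypothetical helper (not part of liter_llm): read an API key from the
# environment and fail with a readable message when it is missing or blank.
# The env: keyword exists only so a plain Hash can stand in for ENV in tests.
def require_api_key(name, env: ENV)
  value = env[name]
  if value.nil? || value.strip.empty?
    raise KeyError, "#{name} is not set; export it before creating a client"
  end

  value
end
```

The returned string can then be passed to LiterLlm.create_client in place of the bare ENV.fetch call used in the Quick Start.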

Quick Start

Basic Chat

Send a message to any provider using the provider/model prefix:

# frozen_string_literal: true

require 'liter_llm'

# ENV.fetch raises KeyError if OPENAI_API_KEY is not set.
client = LiterLlm.create_client(ENV.fetch('OPENAI_API_KEY'))

result = client.chat_async(
  LiterLlm::ChatCompletionRequest.new(
    model: 'openai/gpt-4o-mini',
    messages: [{ 'role' => 'user', 'content' => 'Hello!' }]
  )
)

puts result.choices[0].message.content

Common Use Cases

Streaming Responses

Stream tokens in real time:

# frozen_string_literal: true

require 'liter_llm'

client = LiterLlm.create_client(ENV.fetch('OPENAI_API_KEY'))

client.chat_stream(
  LiterLlm::ChatCompletionRequest.new(
    model: 'openai/gpt-4o-mini',
    messages: [{ 'role' => 'user', 'content' => 'Count from 1 to 5.' }],
    stream: true
  )
) do |chunk|
  # Guard each step: the block skips chunks without choices or delta content.
  delta = chunk.choices && chunk.choices[0] && chunk.choices[0].delta
  print delta.content if delta && delta.content
end
puts
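
When the full reply is needed as one string rather than printed token by token, the same guard can feed a buffer. This is a minimal sketch in plain Ruby: the Delta, Choice, and Chunk structs are stand-ins that mimic only the fields read in the streaming example, not liter_llm's real classes, and collect_stream_text is a hypothetical helper.

```ruby
# frozen_string_literal: true

# Stand-in structs that mimic only the fields the streaming example reads
# (choices -> delta -> content). They are NOT liter_llm's real classes.
Delta = Struct.new(:content)
Choice = Struct.new(:delta)
Chunk = Struct.new(:choices)

# Hypothetical helper: append the text of each chunk to a buffer, skipping
# chunks that carry no choices or no content.
def collect_stream_text(chunks)
  chunks.each_with_object(+'') do |chunk, buffer|
    delta = chunk.choices && chunk.choices[0] && chunk.choices[0].delta
    buffer << delta.content if delta && delta.content
  end
end

chunks = [
  Chunk.new([Choice.new(Delta.new('1, 2'))]),
  Chunk.new(nil),                              # no choices at all
  Chunk.new([Choice.new(Delta.new(nil))]),     # delta without content
  Chunk.new([Choice.new(Delta.new(', 3'))])
]

puts collect_stream_text(chunks) # prints "1, 2, 3"
```

With a real client, the same append would go inside the chat_stream block in place of print.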

Features

Supported Providers (143+)

Route to any provider using the provider/model prefix convention:

Provider        Example models
OpenAI          openai/gpt-4o, openai/gpt-4o-mini
Anthropic       anthropic/claude-3-5-sonnet-20241022
Groq            groq/llama-3.1-70b-versatile
Mistral         mistral/mistral-large-latest
Cohere          cohere/command-r-plus
Together AI     together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo
Fireworks       fireworks/accounts/fireworks/models/llama-v3p1-70b-instruct
Google Vertex   vertexai/gemini-1.5-pro
Amazon Bedrock  bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0
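
Several of the model names above contain slashes themselves (Together AI, Fireworks), which suggests the provider is everything before the first slash. The sketch below illustrates that reading in plain Ruby. It is an assumption inferred from the examples, not a description of how liter_llm parses the string internally, and split_model_id is a hypothetical helper.

```ruby
# frozen_string_literal: true

# Hypothetical helper: split "provider/model" on the FIRST slash only, so
# model names that contain slashes stay intact. Raises when either part is
# missing.
def split_model_id(model_id)
  provider, model = model_id.split('/', 2)
  if provider.nil? || provider.empty? || model.nil? || model.empty?
    raise ArgumentError, "expected provider/model, got #{model_id.inspect}"
  end

  [provider, model]
end

p split_model_id('openai/gpt-4o-mini')
p split_model_id('together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo')
```

A check like this can be useful for validating user-supplied model strings before a request is built.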

See the complete provider list for every supported provider.

Key Capabilities

  • Provider Routing -- Single client for 143+ LLM providers via provider/model prefix
  • Local LLMs -- Connect to locally hosted models via Ollama, LM Studio, vLLM, llama.cpp, and other local inference servers
  • Unified API -- Consistent chat, chat_stream, embeddings, list_models interface
  • Streaming -- Real-time token streaming via chat_stream
  • Tool Calling -- Function calling and tool use across all providers that support it
  • Type Safe -- Schema-driven types compiled from JSON schemas
  • Secure -- API keys never logged or serialized, managed via environment variables
  • Observability -- Built-in OpenTelemetry with GenAI semantic conventions
  • Error Handling -- Structured errors with provider context and retry hints
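
Fallback, listed among the production controls, can be pictured as trying models in order until one succeeds. The sketch below is plain Ruby and is not liter_llm's built-in mechanism, whose configuration this README does not show; with_fallback and its block contract are hypothetical.

```ruby
# frozen_string_literal: true

# Hypothetical helper: yield each model id in order and return the first
# successful result. Re-raises the last error when every model fails.
# StandardError is rescued here for illustration; real code would rescue the
# library's own error class, which this README does not name.
def with_fallback(model_ids)
  raise ArgumentError, 'model_ids must not be empty' if model_ids.empty?

  last_error = nil
  model_ids.each do |model_id|
    return yield(model_id)
  rescue StandardError => e
    last_error = e
  end
  raise last_error
end

answer = with_fallback(%w[openai/gpt-4o-mini groq/llama-3.1-70b-versatile]) do |model_id|
  # Simulated outage for the first provider; a real block would call the client.
  raise 'simulated outage' if model_id.start_with?('openai/')

  "answered by #{model_id}"
end

puts answer # prints "answered by groq/llama-3.1-70b-versatile"
```

In an application, the block body would build a request for model_id and send it through the client.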

Performance

Built on a compiled Rust core for speed and safety:

  • Provider resolution at client construction -- no provider lookup on each request
  • Configurable timeouts and connection pooling
  • Zero-copy streaming with SSE and AWS EventStream support
  • API keys wrapped in secure memory, zeroed on drop

Provider Routing

Route to 143+ providers using the provider/model prefix convention:

openai/gpt-4o
anthropic/claude-3-5-sonnet-20241022
groq/llama-3.1-70b-versatile
mistral/mistral-large-latest

See the provider registry for the full list.

Proxy Server

liter-llm also ships as an OpenAI-compatible proxy server with Docker support:

docker run -p 4000:4000 -e LITER_LLM_MASTER_KEY=sk-your-key ghcr.io/kreuzberg-dev/liter-llm

See the proxy server documentation for configuration, CLI usage, and MCP integration.
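
Because the proxy is described as OpenAI-compatible, a standard HTTP client should be able to talk to it. The sketch below builds such a request with Ruby's standard library without sending it. The /v1/chat/completions path and the use of the master key as a Bearer token are assumptions based on the OpenAI convention, not details confirmed by this README, and build_proxy_chat_request is a hypothetical helper.

```ruby
# frozen_string_literal: true

require 'json'
require 'net/http'
require 'uri'

# Build (but do not send) a chat request for an OpenAI-compatible endpoint.
# ASSUMPTIONS: the /v1/chat/completions path and Bearer auth follow the OpenAI
# convention; the proxy's actual routes are not documented in this README.
def build_proxy_chat_request(base_url, api_key, model, prompt)
  uri = URI.join(base_url, '/v1/chat/completions')
  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = "Bearer #{api_key}"
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(
    model: model,
    messages: [{ role: 'user', content: prompt }]
  )
  request
end

request = build_proxy_chat_request(
  'http://localhost:4000', 'sk-your-key', 'openai/gpt-4o-mini', 'Hello!'
)
puts request.path
puts request.body
```

To send it against a running proxy, pass the request to Net::HTTP.start('localhost', 4000) { |http| http.request(request) } and parse the response body as JSON.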

Part of Kreuzberg.dev

  • Kreuzberg — document intelligence: text, tables, metadata from 90+ formats with optional OCR.
  • Kreuzberg Cloud — managed extraction API with SDKs, dashboards, and observability.
  • kreuzcrawl — web crawling and scraping with HTML→Markdown and headless-Chrome fallback.
  • html-to-markdown — fast, lossless HTML→Markdown engine.
  • tree-sitter-language-pack — tree-sitter grammars and code-intelligence primitives.
  • alef — the polyglot binding generator that produces this README and all per-language bindings.
  • Discord — community, roadmap, announcements.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Join our Discord community for questions and discussion.

License

MIT -- see LICENSE for details.