🌈 RainbowLLM

Sponsored by Coney.app and Rubycon.it


RainbowLLM is your AI resilience layer, a smart routing gem that gives you:

  • Automatic failover: when one provider fails, it instantly tries the next, so your application stays online even when providers go down
  • Cost optimization: maximize free-tier usage and avoid vendor lock-in
  • Easy integration: a simple Ruby interface built on RubyLLM
# If OpenAI fails, automatically tries Anthropic, then Ollama
response = RainbowLLM.chat(
  models: ["openai/gpt-5", "anthropic/claude-3.5-sonnet", "ollama/llama3.3"]
).ask(user_question)

🚀 Quick Start

bundle add rainbow_llm

Features

  • Automatic failover - Tries providers in your specified order, falling back instantly when one fails, with configurable retry logic and timeouts
  • Cost optimization - Route to the most cost-effective provider, maximize free-tier usage, and avoid rate-limit surprises
  • Flexible configuration - Supports OpenAI-compatible endpoints, Basic Auth and API key authentication, and custom endpoints and model mappings
  • Production ready - Built on the reliable ruby_llm foundation, with comprehensive error handling plus detailed logging and monitoring

📦 Installation

Option 1: Bundler

bundle add rainbow_llm

Option 2: Direct Install

gem install rainbow_llm

Requirements

  • Ruby 3.2+

⚙️ Configuration

Configure your providers once, use them everywhere:

# config/initializers/rainbow_llm.rb
RainbowLLM.configure do |config|
  # Local Ollama instance
  config.provider :ollama, {
    provider: "openai_basic",
    uri_base: "http://localhost:11434/v1",
    access_token: ENV['OLLAMA_API_KEY'],
    assume_model_exists: true
  }

  # Cerebras cloud service
  config.provider :cerebras, {
    provider: "openai",
    uri_base: "https://api.cerebras.ai/v1",
    access_token: ENV['CEREBRAS_API_KEY'],
    assume_model_exists: true
  }

  # Add as many providers as you need!
  config.provider :openai, {
    provider: "openai",
    access_token: ENV['OPENAI_API_KEY']
  }
end
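Each model string passed to a chat call combines a configured provider name with a model id, separated by a slash. As an illustration of that convention (a hypothetical helper, not the gem's internals), such a string could be split like this:

```ruby
# Split a "provider/model" string into its two parts.
# The provider half matches a name registered via config.provider.
def parse_model_string(str)
  provider, model = str.split("/", 2)
  { provider: provider.to_sym, model: model }
end

parse_model_string("cerebras/llama-3.3-70b")
# => { provider: :cerebras, model: "llama-3.3-70b" }
```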

💡 Usage Examples

Basic Chat Completion

response = RainbowLLM.chat(
  models: ["ollama/llama3.3", "cerebras/llama-3.3-70b"]
).ask("Explain quantum computing to a 5-year-old")

puts response.content
# => "Imagine you have a magical toy that can be in two places at once..."

Fluent API Options

Chain options together for fine-grained control:

# Set temperature and timeout
response = RainbowLLM.chat(models: ["ollama/llama3.3"])
  .with_temperature(0.8)
  .with_timeout(30)
  .ask("Write a creative story")

# Use JSON schema for structured output
response = RainbowLLM.chat(models: ["cerebras/llama-3.3-70b"])
  .with_schema(MySchema)
  .ask("Extract data from this text")

# Chain multiple options
response = RainbowLLM.chat(models: ["openai/gpt-5", "cerebras/llama-3.3-70b"])
  .with_temperature(0.7)
  .with_timeout(45)
  .with_schema(ResponseSchema)
  .ask("Analyze this document")
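The snippets above reference MySchema and ResponseSchema without defining them; depending on your ruby_llm setup, a schema may be a dedicated class or a plain JSON-schema hash. A hedged sketch of the hash form, with illustrative field names:

```ruby
# A plain JSON-schema hash describing the expected structured output.
# The fields (name, age) are examples only; shape yours to your data.
ResponseSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age:  { type: "integer" }
  },
  required: %w[name age]
}.freeze
```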

Response Object:

# Check which model succeeded
response.model
# => "cerebras/llama-3.3-70b" (or nil if all failed)

# Get the content
response.content
# => "The analysis results..." (or nil if all failed)

# Inspect detailed status for each model
response.details
# => {
#      "ollama/llama3.3" => { status: :failed, error: "Connection refused" },
#      "cerebras/llama-3.3-70b" => { status: :success, duration: 1.23 }
#    }
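Since details is a plain hash keyed by model string, it folds naturally into a single log line for monitoring; a minimal sketch using the shape shown above:

```ruby
details = {
  "ollama/llama3.3"        => { status: :failed,  error: "Connection refused" },
  "cerebras/llama-3.3-70b" => { status: :success, duration: 1.23 }
}

# One log line summarising every attempt, in the order they were tried.
log_line = details.map do |model, info|
  if info[:status] == :success
    "#{model} ok in #{info[:duration]}s"
  else
    "#{model} failed (#{info[:error]})"
  end
end.join("; ")

puts log_line
# => ollama/llama3.3 failed (Connection refused); cerebras/llama-3.3-70b ok in 1.23s
```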

Error Handling

RainbowLLM doesn't raise exceptions - it returns a Response with details:

response = RainbowLLM.chat(
  models: ["primary-model", "backup-model-1", "backup-model-2"]
).ask("Important business question")

if response.content
  puts "Success: #{response.content}"
  puts "Provided by: #{response.model}"
else
  # All providers failed - inspect details to understand why
  puts "All providers failed!"

  response.details.each do |model, info|
    puts "#{model}: #{info[:status]} - #{info[:error]}"
  end
  # => primary-model: failed - Rate limit exceeded
  # => backup-model-1: failed - Connection timeout
  # => backup-model-2: failed - Invalid API key
end
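Because ask returns nil content rather than raising, a retry layer is easy to bolt on top. The helper below is a generic sketch, not part of the gem: it re-runs a block until it yields a non-nil value.

```ruby
# Retry a block up to `attempts` times, sleeping a growing interval
# between tries; returns the first non-nil result, or nil if all fail.
def ask_with_retries(attempts: 3, wait: 2)
  attempts.times do |i|
    result = yield
    return result if result
    sleep(wait * (i + 1)) unless i == attempts - 1
  end
  nil
end

# Usage (RainbowLLM call shown for illustration):
# content = ask_with_retries do
#   RainbowLLM.chat(models: ["primary-model", "backup-model-1"])
#             .ask("Important business question")
#             .content
# end
```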

🎯 Advanced Patterns

Cost-Based Routing

# Route to cheapest available provider first
cheap_models = [
  "ollama/llama3.2-3b",   # Free (local)
  "cerebras/llama-3.3-70b", # Free tier
  "openai/gpt-5"    # Paid fallback
]

response = RainbowLLM.chat(models: cheap_models)
  .with_temperature(0.5)
  .ask(user_input)

Performance-Based Routing

# Route to fastest providers for time-sensitive requests
fast_models = [
  "cerebras/llama-3.3-70b",  # Fast cloud
  "openai/gpt-5",         # Fast but more expensive
  "ollama/llama3.2"        # Local but slower
]

response = RainbowLLM.chat(models: fast_models)
  .with_timeout(1)  # seconds: give up on a slow provider quickly and fail over
  .ask(time_sensitive_question)

Multi-Provider Load Balancing

# Distribute load across providers to avoid rate limits
providers = [
  "openai/model-1",
  "anthropic/model-2",
  "cerebras/model-3",
  "ollama/model-4"
]

# RainbowLLM will try each in order until one succeeds
response = RainbowLLM.chat(models: providers)
  .with_temperature(0.7)
  .ask(request)

🔧 Development

Want to contribute or run tests locally?

# Clone the repo
git clone https://github.com/a-chris/rainbow_llm.git
cd rainbow_llm

# Install dependencies
bin/setup

# Run tests
rake test

# Launch interactive console
bin/console

# Install locally
bundle exec rake install

🤝 Contributing

We welcome contributions! Here's how you can help:

  • Report bugs: Open an issue with detailed reproduction steps
  • Suggest features: What would make RainbowLLM even better?
  • Submit pull requests: Fix bugs, add features, improve docs
  • Spread the word: Star the repo, share with friends!

Development setup:

# After cloning
bin/setup  # Installs dependencies
rake test # Runs the test suite

📜 License

RainbowLLM is open source software licensed under the MIT License.


Need help? Open an issue or contact @a-chris