lex-llm-openai
LegionIO LLM provider extension for OpenAI.
This gem provides the :openai provider family implementation, enabling LegionIO to route chat, streaming, embedding, moderation, image, and audio requests to OpenAI-compatible APIs through the shared lex-llm provider contract.
Namespace: Legion::Extensions::Llm::Openai
Version: 0.3.11
Load: require 'legion/extensions/llm/openai'
Quick Start
require 'legion/extensions/llm/openai'
# Configure via the shared LLM configuration API
Legion::Extensions::Llm.configure do |config|
  config.openai_api_key = ENV.fetch("OPENAI_API_KEY")
  config.default_model = "gpt-5.5"
end
# Use through the standard provider interface
provider = Legion::Extensions::Llm::Openai::Provider.new
provider.chat(model: "gpt-5.5", messages: [{ role: "user", content: "Hello" }])
What It Provides
| Capability | Endpoint | Notes |
|---|---|---|
| Chat completions | POST /v1/chat/completions | Includes function calling, vision, structured output |
| Streaming chat | Same endpoint, stream: true | Token usage reported via stream_usage_supported? |
| Model listing | GET /v1/models | Enriched with static capability map; publishes registry events |
| Model retrieval | GET /v1/models/{model} | |
| Embeddings | POST /v1/embeddings | |
| Moderation | POST /v1/moderations | |
| Image generation | POST /v1/images/generations | |
| Image editing | POST /v1/images/edits | |
| Image variation | POST /v1/images/variations | |
| Audio transcription | POST /v1/audio/transcriptions | Whisper, gpt-4o-transcribe |
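To make the streaming row concrete, here is a minimal sketch of the raw HTTP request behind a streaming chat call. The helper name build_streaming_chat_request is hypothetical and not part of this gem; the sketch also assumes usage reporting is requested through the OpenAI API's stream_options.include_usage field, which this README does not confirm the provider sets. The request is built but never sent.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper (not part of lex-llm-openai): builds, but does not send,
# a streaming chat completion request against an OpenAI-compatible endpoint.
def build_streaming_chat_request(api_key:, model:, messages:, base: "https://api.openai.com")
  uri = URI.join(base, "/v1/chat/completions")
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{api_key}"
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(
    model: model,
    messages: messages,
    stream: true,                            # same endpoint, streaming enabled
    stream_options: { include_usage: true }  # assumption: ask for token usage in the final chunk
  )
  request
end

request = build_streaming_chat_request(
  api_key: "sk-example",
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Hello" }]
)
```

In normal use the provider's chat interface handles this; the sketch only shows what "same endpoint, stream: true" means on the wire.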
Architecture
Legion::Extensions::Llm::Openai
|-- Openai # Root module — settings, discovery, auto-registration
| |-- default_settings # Provider family defaults (endpoint, models, fleet)
| |-- discover_instances # Credential scanning across env, Codex, Claude, settings
| |-- normalize_instance_config # Normalizes generic keys to canonical OpenAI keys
| `-- sanitize_instance_config # Strips temporary credential fields
|
|-- Provider # OpenAI provider implementation
| |-- Capabilities # Model family predicates (chat?, streaming?, vision?, etc.)
| |-- CAPABILITY_MAP # Static capability matrix for 14 known model families
| |-- list_models # Enriches raw API response with capability metadata
| |-- retrieve_model # Fetches single model detail
| |-- chat_url, models_url, etc. # Endpoint builders
| `-- maybe_normalize_temperature # Adjusts temperature for o*/gpt-5 reasoning models
|
|-- Actor::FleetWorker # Subscription actor for fleet request consumption
| |-- enabled? # Checks if any instance has respond_to_requests: true
| `-- Delegates to lex-llm ProviderResponder
|
|-- Actor::DiscoveryRefresh # Periodic actor that refreshes the model discovery cache
| |-- time # Reads discovery_interval from settings (default 1800s)
| `-- Calls Legion::LLM::Discovery.refresh_discovered_models!
|
`-- Runners::FleetWorker # Execution entrypoint for fleet requests
    `-- handle_fleet_request # Routes to lex-llm ProviderResponder.call
Design Boundaries
- Response normalization and request payload mapping live in Lex-llm::Provider::OpenAICompatible (mixed in)
- Fleet responder logic (ack/reject, response publication) lives in Lex-llm::Fleet::ProviderResponder
- Registry event publishing is delegated to Lex-llm::RegistryPublisher
- This extension depends only on lex-llm at runtime; it does not depend on legion-llm
Instance Discovery
Openai.discover_instances scans 7 credential sources, deduplicates by key, and injects default_model:
| Priority | Source | Key |
|---|---|---|
| 1 | OPENAI_API_KEY environment variable | :env |
| 2 | CODEX_API_KEY environment variable | :codex_env |
| 3 | Codex bearer token (~/.codex/auth.json) | :codex |
| 4 | Codex OpenAI key (~/.codex/auth.json) | :codex_key |
| 5 | Claude config (openaiApiKey) | :claude |
| 6 | Extension settings (extensions.llm.openai) | :settings |
| 7 | Named instances in extension settings | Named keys |
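The priority and deduplication behavior can be sketched roughly as follows. This is an illustration, not the extension's code: the Candidate struct and dedupe_instances helper are hypothetical, and the sketch assumes that "deduplicates by key" means a credential already seen from a higher-priority source is skipped.

```ruby
# Hypothetical stand-in for one discovered credential source.
Candidate = Struct.new(:key, :api_key)

# Walks candidates in priority order, skips empty and already-seen credentials,
# and injects default_model into every surviving instance.
def dedupe_instances(candidates, default_model:)
  seen = {}
  candidates.each_with_object({}) do |candidate, instances|
    next if candidate.api_key.nil? || candidate.api_key.empty?
    next if seen[candidate.api_key] # assumption: dedupe on the credential value

    seen[candidate.api_key] = true
    instances[candidate.key] = { api_key: candidate.api_key, default_model: default_model }
  end
end

candidates = [
  Candidate.new(:env, "sk-aaa"),       # priority 1
  Candidate.new(:codex_env, nil),      # priority 2, not set
  Candidate.new(:codex_key, "sk-aaa"), # priority 4, duplicate of :env
  Candidate.new(:settings, "sk-bbb")   # priority 6
]
instances = dedupe_instances(candidates, default_model: "gpt-5.5")
```

Here only the :env and :settings instances survive, because :codex_env has no credential and :codex_key repeats the one already claimed by :env.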
Configuration
Via Legion Settings (YAML)
extensions:
  llm:
    openai:
      api_key: "sk-..."
      default_model: "gpt-5.5"
      endpoint: "https://api.openai.com"
      discovery_interval: 1800 # Seconds between model list refresh (used by DiscoveryRefresh actor)
      instances:
        primary:
          openai_api_key: "sk-..."
          openai_api_base: "https://api.openai.com"
          fleet:
            enabled: true
            respond_to_requests: true
            capabilities:
              - chat
              - stream_chat
              - embed
              - image
Via Ruby Configuration API
Legion::Extensions::Llm.configure do |config|
  config.openai_api_key = ENV.fetch("OPENAI_API_KEY")
  config.openai_api_base = nil # defaults to https://api.openai.com
  config.openai_organization_id = nil # optional OpenAI-Organization header
  config.openai_project_id = nil # optional OpenAI-Project header
  config.openai_use_system_role = true # include system messages in requests
  config.default_model = "gpt-5.5"
  config. = "text-embedding-3-small"
  config.default_moderation_model = "omni-moderation-latest"
  config.default_image_model = "gpt-image-1"
  config.default_transcription_model = "gpt-4o-transcribe"
end
Default Settings
Legion::Extensions::Llm::Openai.default_settings
# {
#   provider_family: :openai,
#   instances: {
#     default: {
#       endpoint: "https://api.openai.com",
#       default_model: "gpt-5.5",
#       tier: :frontier,
#       transport: :http,
#       credentials: {
#         api_key: "env://OPENAI_API_KEY",
#         organization_id: nil,
#         project_id: nil
#       },
#       usage: {
#         inference: true,
#         embedding: true,
#         moderation: true,
#         image: true,
#         audio: true
#       },
#       limits: { concurrency: 4 },
#       fleet: {
#         enabled: false,
#         respond_to_requests: false,
#         capabilities: [:chat, :stream_chat, :embed, :image],
#         lanes: [],
#         concurrency: 4,
#         queue_suffix: nil
#       }
#     }
#   }
# }
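The default api_key value uses an env:// reference rather than a literal secret. How the extension resolves that reference is not shown here; the sketch below is one plausible reading, with resolve_credential as a hypothetical helper that looks the named variable up in the environment and passes any other value through unchanged.

```ruby
# Hypothetical resolver (an assumption, not the extension's implementation):
# "env://NAME" is looked up in the given environment; anything else is returned as-is.
def resolve_credential(value, env: ENV)
  return value unless value.is_a?(String) && value.start_with?("env://")

  env[value.delete_prefix("env://")]
end

fake_env = { "OPENAI_API_KEY" => "sk-from-env" }

from_env = resolve_credential("env://OPENAI_API_KEY", env: fake_env)
literal  = resolve_credential("sk-literal", env: fake_env)
missing  = resolve_credential("env://NOT_SET", env: fake_env)
```

Passing the environment as an argument keeps the lookup easy to test without touching the real process environment.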
Model Capability Map
The provider maintains a static CAPABILITY_MAP covering 14 OpenAI model families. Each entry declares capabilities, input/output modalities, and context window size.
| Prefix | Capabilities | Input | Output | Context |
|---|---|---|---|---|
| gpt-4o | completion, streaming, function_calling, vision, structured_output | text, image, audio | text | 128K |
| gpt-4.1 | completion, streaming, function_calling, vision, structured_output | text, image | text | 1M |
| gpt-4 | completion, streaming, function_calling, vision | text, image | text | 128K |
| gpt-5 | completion, streaming, function_calling, vision, structured_output, reasoning | text, image | text | 1M |
| o4 | completion, streaming, function_calling, vision, reasoning | text, image | text | 200K |
| o3 | completion, streaming, function_calling, vision, reasoning | text, image | text | 200K |
| o1 | completion, streaming, function_calling, vision, reasoning | text, image | text | 200K |
| text-embedding-* | embedding | text | embeddings | 8K |
| omni-moderation | moderation | text, image | moderation | - |
| text-moderation | moderation | text | moderation | - |
| gpt-image | image_generation | text, image | image | - |
| dall-e | image_generation | text | image | - |
| whisper | audio_transcription | audio | text | - |
| tts | audio_generation | text | audio | - |
Unknown models default to { capabilities: [:completion, :streaming], modalities: { input: ["text"], output: ["text"] } }.
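A prefix-keyed map like this needs a rule for overlapping prefixes, since gpt-4 is also a prefix of gpt-4o and gpt-4.1. The sketch below uses longest-prefix matching with the documented fallback for unknown models. It is an assumption about the lookup strategy, not the provider's actual code, and CAPABILITY_SKETCH and capabilities_for are hypothetical names covering only a few rows of the table.

```ruby
# Hypothetical subset of the capability table above (context kept as the table's labels).
CAPABILITY_SKETCH = {
  "gpt-4o" => { capabilities: %i[completion streaming function_calling vision structured_output], context: "128K" },
  "gpt-4.1" => { capabilities: %i[completion streaming function_calling vision structured_output], context: "1M" },
  "gpt-4" => { capabilities: %i[completion streaming function_calling vision], context: "128K" },
  "text-embedding-" => { capabilities: %i[embedding], context: "8K" }
}.freeze

# Fallback documented for unknown models.
DEFAULT_CAPABILITIES = {
  capabilities: %i[completion streaming],
  modalities: { input: ["text"], output: ["text"] }
}.freeze

# Picks the longest matching prefix so "gpt-4o-mini" resolves to gpt-4o, not gpt-4.
def capabilities_for(model)
  prefix = CAPABILITY_SKETCH.keys.select { |candidate| model.start_with?(candidate) }.max_by(&:length)
  prefix ? CAPABILITY_SKETCH[prefix] : DEFAULT_CAPABILITIES
end

mini    = capabilities_for("gpt-4o-mini")
turbo   = capabilities_for("gpt-4-turbo")
unknown = capabilities_for("my-local-model")
```

An ordered first-match scan would work equally well as long as the more specific prefixes are listed before gpt-4, which is the order the table uses.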
Capability Predicates
Provider::Capabilities provides module functions for model routing decisions:
| Method | Matches |
|---|---|
| chat?(model) | Any model that is not embedding, moderation, image, audio, tts, realtime, or sora |
| streaming?(model) | Same as chat? |
| functions?(model) | Models starting with gpt or o\d |
| vision?(model) | Models starting with gpt, o\d, or omni-moderation |
| embeddings?(model) | Models starting with text-embedding- |
| moderation?(model) | Models containing moderation |
| images?(model) | Models starting with gpt-image or dall-e |
| audio_transcription?(model) | Models matching gpt-4o.*transcribe or whisper |
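The matching rules in the table can be approximated with plain string and regex checks. The module below is a sketch written from the table, not a copy of Provider::Capabilities: the module name is hypothetical, and the exact patterns the gem uses (particularly for the audio, tts, realtime, and sora exclusions in chat?) are assumptions.

```ruby
# Hypothetical re-implementation of the predicates, derived from the table above.
module CapabilityPredicatesSketch
  module_function

  def embeddings?(model)
    model.start_with?("text-embedding-")
  end

  def moderation?(model)
    model.include?("moderation")
  end

  def images?(model)
    model.start_with?("gpt-image", "dall-e")
  end

  def audio_transcription?(model)
    model.match?(/gpt-4o.*transcribe|whisper/)
  end

  def functions?(model)
    model.match?(/\A(gpt|o\d)/)
  end

  def vision?(model)
    model.match?(/\A(gpt|o\d|omni-moderation)/)
  end

  # Assumption: "audio, tts, realtime, or sora" are detected by substring.
  def chat?(model)
    return false if embeddings?(model) || moderation?(model) || images?(model) || audio_transcription?(model)

    !model.match?(/audio|tts|realtime|sora/)
  end

  def streaming?(model)
    chat?(model)
  end
end

chat_ok   = CapabilityPredicatesSketch.chat?("gpt-5.5")
embed_no  = CapabilityPredicatesSketch.chat?("text-embedding-3-small")
vision_ok = CapabilityPredicatesSketch.vision?("omni-moderation-latest")
```

A router could call these before dispatch, for example rejecting a chat request for a model where chat? is false.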
Fleet Responder
Provider instances can opt in to consuming Legion LLM fleet requests via the shared ProviderResponder. The fleet actor starts automatically when any instance has respond_to_requests: true.
Fleet YAML Configuration
extensions:
  llm:
    openai:
      instances:
        local:
          fleet:
            enabled: true
            respond_to_requests: true
            capabilities:
              - chat
              - stream_chat
              - embed
              - image
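The "starts when any instance has respond_to_requests: true" rule can be sketched against a parsed settings hash. The fleet_responder_enabled? helper and the second batch instance are hypothetical; the real Actor::FleetWorker#enabled? checks discovered instances and may read settings differently.

```ruby
require "yaml"

# Example settings with one opted-in instance and one hypothetical opted-out instance.
settings = YAML.safe_load(<<~SETTINGS)
  extensions:
    llm:
      openai:
        instances:
          local:
            fleet:
              enabled: true
              respond_to_requests: true
          batch:
            fleet:
              enabled: false
              respond_to_requests: false
SETTINGS

# Hypothetical check: true when at least one instance opts in to fleet requests.
def fleet_responder_enabled?(settings)
  instances = settings.dig("extensions", "llm", "openai", "instances") || {}
  instances.values.any? { |instance| instance.dig("fleet", "respond_to_requests") == true }
end

enabled = fleet_responder_enabled?(settings)
```

With this configuration the check is true because the local instance opts in, even though batch does not.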
Fleet Components
| Class | Role |
|---|---|
| Actor::FleetWorker | Subscription actor; checks enabled? against discovered instances |
| Runners::FleetWorker.handle_fleet_request | Execution entrypoint; delegates to ProviderResponder.call |
Observability
All classes include Legion::Logging::Helper:
- Structured error handling: Every rescue calls handle_exception with operation context
- Debug-level request telemetry: Model listing, retrieval, fleet dispatch, discovery refresh
- Info-level action logging: Registry publishing, instance discovery results
- Automatic segment derivation: Log lines tagged with provider family and component type
Dependencies
| Gem | Purpose |
|---|---|
| lex-llm (>= 0.4.3) | Provider contract, OpenAICompatible mixin, fleet responder, registry publisher |
| legion-transport (>= 1.4.14) | AMQP subscriptions and replies |
| legion-json (>= 1.2.1) | JSON serialization |
| legion-logging (>= 1.3.2) | Structured logging |
| legion-settings (>= 1.3.14) | Configuration management |
Key Files
| File | Purpose |
|---|---|
| lib/legion/extensions/llm/openai.rb | Root module, settings, instance discovery, auto-registration |
| lib/legion/extensions/llm/openai/provider.rb | Provider implementation, capability map, API methods |
| lib/legion/extensions/llm/openai/actors/discovery_refresh.rb | Periodic model cache refresh actor |
| lib/legion/extensions/llm/openai/actors/fleet_worker.rb | Fleet request subscription actor |
| lib/legion/extensions/llm/openai/runners/fleet_worker.rb | Fleet request execution runner |
| lib/legion/extensions/llm/openai/version.rb | Version constant |
Development
bundle install
bundle exec rspec
bundle exec rubocop
License
MIT