Class: RubyLLM::Agents::Audio::SpeechClient

Inherits: Object
Defined in:
lib/ruby_llm/agents/audio/speech_client.rb

Overview

Direct HTTP client for text-to-speech APIs.

Supports the OpenAI and ElevenLabs providers over direct HTTP, since the base RubyLLM gem does not provide a speak() method.

Examples:

OpenAI

client = SpeechClient.new(provider: :openai)
response = client.speak("Hello", model: "tts-1", voice: "nova")
response.audio  # => binary audio data

ElevenLabs

client = SpeechClient.new(provider: :elevenlabs)
response = client.speak("Hello",
  model: "eleven_v3",
  voice: "Rachel",
  voice_id: "21m00Tcm4TlvDq8ikWAM",
  voice_settings: { stability: 0.5, similarity_boost: 0.75 }
)

Defined Under Namespace

Classes: Response, StreamChunk
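The Response class is defined inside SpeechClient and its exact shape is not shown on this page. Based on the overview example (response.audio returns binary data), a minimal stand-in could look like the following sketch; the content_type field is purely illustrative and not confirmed by the source:

```ruby
# Hypothetical stand-in for SpeechClient::Response; the real class is defined
# in speech_client.rb and may carry additional fields.
Response = Struct.new(:audio, :content_type, keyword_init: true)

# Binary audio bytes, as returned by the TTS API.
response = Response.new(audio: "ID3\x04".b, content_type: "audio/mpeg")
response.audio.bytesize  # => 4
```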

Constant Summary collapse

SUPPORTED_PROVIDERS =
%i[openai elevenlabs].freeze

Instance Method Summary collapse

Constructor Details

#initialize(provider:) ⇒ SpeechClient

Returns a new instance of SpeechClient.

Parameters:

  • provider (Symbol)

    :openai or :elevenlabs

Raises:

  • (ArgumentError) if provider is not one of SUPPORTED_PROVIDERS

# File 'lib/ruby_llm/agents/audio/speech_client.rb', line 46

def initialize(provider:)
  validate_provider!(provider)
  @provider = provider
end
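The body of validate_provider! is not shown on this page. A minimal sketch of what it plausibly does, checking against SUPPORTED_PROVIDERS; the ArgumentError class and message are assumptions, not confirmed by the source:

```ruby
# Hypothetical sketch of validate_provider!; the real implementation lives in
# speech_client.rb and may differ.
SUPPORTED_PROVIDERS = %i[openai elevenlabs].freeze

def validate_provider!(provider)
  return if SUPPORTED_PROVIDERS.include?(provider)

  raise ArgumentError,
        "Unsupported provider: #{provider.inspect} " \
        "(expected one of #{SUPPORTED_PROVIDERS.join(', ')})"
end
```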

Instance Method Details

#speak(text, model:, voice:, voice_id: nil, speed: nil, response_format: "mp3", voice_settings: nil) ⇒ Response

Synthesize speech (non-streaming)

Parameters:

  • text (String)

    text to convert

  • model (String)

    model identifier

  • voice (String)

    voice name

  • voice_id (String, nil) (defaults to: nil)

    ElevenLabs voice ID (falls back to voice when nil)

  • speed (Float, nil) (defaults to: nil)

    speed multiplier

  • response_format (String) (defaults to: "mp3")

    output format

  • voice_settings (Hash, nil) (defaults to: nil)

    ElevenLabs voice settings

Returns:

  • (Response)

# File 'lib/ruby_llm/agents/audio/speech_client.rb', line 61

def speak(text, model:, voice:, voice_id: nil, speed: nil,
  response_format: "mp3", voice_settings: nil)
  case @provider
  when :openai
    openai_speak(text, model: model, voice: voice_id || voice,
      speed: speed, response_format: response_format)
  when :elevenlabs
    elevenlabs_speak(text, model: model, voice_id: voice_id || voice,
      speed: speed, response_format: response_format,
      voice_settings: voice_settings)
  end
end
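For both providers, #speak resolves the voice with voice_id || voice, so an explicit voice_id always wins. A self-contained illustration of that fallback (resolve_voice is a hypothetical helper for this page, not part of the gem):

```ruby
# Hypothetical helper mirroring the `voice_id || voice` fallback in #speak.
def resolve_voice(voice:, voice_id: nil)
  voice_id || voice
end

resolve_voice(voice: "nova")                                     # => "nova"
resolve_voice(voice: "Rachel", voice_id: "21m00Tcm4TlvDq8ikWAM") # => "21m00Tcm4TlvDq8ikWAM"
```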

#speak_streaming(text, model:, voice:, voice_id: nil, speed: nil, response_format: "mp3", voice_settings: nil) {|StreamChunk| ... } ⇒ Response

Synthesize speech with streaming

Parameters:

  • text (String)

    text to convert

  • model (String)

    model identifier

  • voice (String)

    voice name

  • voice_id (String, nil) (defaults to: nil)

    voice ID

  • speed (Float, nil) (defaults to: nil)

    speed multiplier

  • response_format (String) (defaults to: "mp3")

    output format

  • voice_settings (Hash, nil) (defaults to: nil)

    ElevenLabs voice settings

Yields:

  • (StreamChunk) each audio chunk as it arrives

Returns:

  • (Response)

# File 'lib/ruby_llm/agents/audio/speech_client.rb', line 85

def speak_streaming(text, model:, voice:, voice_id: nil, speed: nil,
  response_format: "mp3", voice_settings: nil, &block)
  case @provider
  when :openai
    openai_speak_streaming(text, model: model, voice: voice_id || voice,
      speed: speed, response_format: response_format, &block)
  when :elevenlabs
    elevenlabs_speak_streaming(text, model: model, voice_id: voice_id || voice,
      speed: speed, response_format: response_format,
      voice_settings: voice_settings, &block)
  end
end
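A caller typically accumulates the yielded chunks into a single buffer. The sketch below stubs the HTTP layer entirely to show the block contract; fake_speak_streaming and the StreamChunk shape (a single #audio accessor) are assumptions, not the gem's real API:

```ruby
# Stand-in for the gem's StreamChunk; assumes chunks expose #audio.
StreamChunk = Struct.new(:audio)

# Stub that yields fixed-size "audio" chunks in place of a real TTS API call.
def fake_speak_streaming(data, chunk_size: 4)
  data.bytes.each_slice(chunk_size) do |slice|
    yield StreamChunk.new(slice.pack("C*"))
  end
end

buffer = +""
fake_speak_streaming("binary-audio-data") { |chunk| buffer << chunk.audio }
buffer  # => "binary-audio-data"
```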