Class: SmartPrompt::ZhipuAIAdapter
- Inherits: LLMAdapter
  - Object
  - LLMAdapter
  - SmartPrompt::ZhipuAIAdapter
- Defined in: lib/smart_prompt/zhipu_adapter.rb
Overview
Adapter for 智谱 AI (Zhipu AI; BigModel / GLM), covering all REST model categories behind one provider domain. One adapter owns the whole provider: every category shares the same base URL `https://open.bigmodel.cn/api/paas/v4` and Bearer-token auth, so a single config block serves them all just by changing `model`.
1. Text chat (chat): POST {base}/chat/completions (OpenAI-compatible; reasoning models return message.reasoning_content, the exact field the engine already reads, so no remap is needed)
2. Image-and-text multimodal (vision): same endpoint, OpenAI Vision content array
3. Embeddings: POST {base}/embeddings (embedding-3, custom dimensions)
4. Text-to-image (image): POST {base}/images/generations (response is NESTED: data.images[].url)
5. Text-to-video (video): POST {base}/videos/generations returns a task id; poll GET {base}/async-result/{task_id} until SUCCESS, then read video_result.url (async)
6. Speech synthesis (TTS): POST {base}/audio/speech (glm-tts)
7. Speech recognition (ASR): POST {base}/audio/transcriptions (glm-asr-2512, multipart)
8. Rerank: POST {base}/rerank
We talk to the endpoints with Net::HTTP directly (like the SenseNova / image / tts / stt / video adapters) so we can control SSE streaming, the nested image shape, and the async video flow. No new gem deps.
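The category-to-endpoint mapping above can be summarised as a small lookup table. The sketch below is illustrative only: `ENDPOINT_PATHS` and `endpoint_for` are hypothetical names that do not appear in the adapter, and the paths are copied from the list above.

```ruby
# Hypothetical lookup table; the paths are taken from the category list above.
BASE_URL = "https://open.bigmodel.cn/api/paas/v4"

ENDPOINT_PATHS = {
  chat: "/chat/completions",
  vision: "/chat/completions", # vision shares the chat endpoint
  embeddings: "/embeddings",
  image: "/images/generations",
  video: "/videos/generations",
  tts: "/audio/speech",
  asr: "/audio/transcriptions",
  rerank: "/rerank"
}.freeze

# Build the full URL for a category, tolerating a trailing slash on the base.
def endpoint_for(category, base_url = BASE_URL)
  path = ENDPOINT_PATHS.fetch(category) do
    raise ArgumentError, "unknown category: #{category}"
  end
  "#{base_url.chomp('/')}#{path}"
end
```

Because every category hangs off the same base, switching providers or proxies only requires changing the base URL.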
Constant Summary
- DEFAULT_BASE_URL =
"https://open.bigmodel.cn/api/paas/v4".freeze
- DEFAULT_CODING_BASE_URL =
CodeGeeX-4 / coding models use a separate base.
"https://open.bigmodel.cn/api/coding/paas/v4".freeze
- SUPPORTED_IMAGE_FORMATS =
%w[jpg jpeg png gif bmp webp].freeze
- CHAT_OPTIONAL_KEYS =
Zhipu chat sampling parameters forwarded from config when present.
%w[ top_p max_tokens do_sample stop presence_penalty frequency_penalty thinking ].freeze
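As a rough illustration of how `CHAT_OPTIONAL_KEYS` might be used, the sketch below copies only those keys from a config hash into a request body. The helper name `forward_chat_options` is hypothetical; the adapter's real `build_chat_body` is not shown on this page and may differ.

```ruby
CHAT_OPTIONAL_KEYS = %w[
  top_p max_tokens do_sample stop presence_penalty frequency_penalty thinking
].freeze

# Hypothetical helper: copy whitelisted sampling parameters that are set in
# the config into the request body, leaving everything else untouched.
def forward_chat_options(body, config)
  CHAT_OPTIONAL_KEYS.each do |key|
    body[key] = config[key] unless config[key].nil?
  end
  body
end
```

Checking for `nil` rather than truthiness keeps explicit `false` values such as `"do_sample" => false`.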
Instance Attribute Summary
Attributes inherited from LLMAdapter
Instance Method Summary
- #check_video_status(task_id) ⇒ Object
  Poll an async task.
- #download_video(video_url, output_path) ⇒ Object
- #embeddings(text, model) ⇒ Object
  embedding-3 (default 2048 dims); supports a custom `dimensions` (256/512/1024/2048) via config.
- #generate_image(prompt, params = {}) ⇒ Object
  Text-to-image.
- #generate_video(prompt, params = {}) ⇒ Object
  Submit a text-to-video (or image-to-video) job.
- #initialize(config) ⇒ ZhipuAIAdapter (constructor)
  A new instance of ZhipuAIAdapter.
- #rerank(query, documents, model: nil) ⇒ Object
  Rerank documents against a query (bonus).
- #save_image(image_data, output_dir = "./output", filename_prefix = "zhipu_image") ⇒ Object
  Save one or many generated images to disk (Array from #generate_image or a single hash).
- #send_request(messages, model = nil, temperature = nil, tools = nil, proc = nil) ⇒ Object
  Chat / multimodal.
- #synthesize_speech(text, voice: nil, model: nil, response_format: "wav", **opts) ⇒ Object
  Returns a base64 data URL for the synthesized audio.
- #synthesize_to_file(text, output_path, voice: nil, model: nil, response_format: "wav", **opts) ⇒ Object
- #transcribe_audio(audio_file, model: nil, language: nil, **opts) ⇒ Object
  Transcribe an audio file (local path).
- #wait_for_video_completion(task_id, check_interval: 10, timeout: 600) ⇒ Object
  Block until the task finishes (or times out), then return a result hash that includes the video URL.
Constructor Details
#initialize(config) ⇒ ZhipuAIAdapter
Returns a new instance of ZhipuAIAdapter.
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 39
    def initialize(config)
      super
      SmartPrompt.logger.info "Start create the SmartPrompt ZhipuAIAdapter."
      api_key = @config["api_key"]
      if api_key.is_a?(String) && api_key.start_with?("ENV[") && api_key.end_with?("]")
        api_key = eval(api_key)
      end
      # Match the other adapters: tolerate a missing key at construction so examples/config
      # can load without a live key; the first request fails with a clear auth error.
      SmartPrompt.logger.warn "Zhipu api_key is empty — API calls will fail until it is set." if api_key.nil? || api_key.to_s.strip.empty?
      @api_key = api_key
      @base_url = (@config["url"] || DEFAULT_BASE_URL).to_s.chomp("/")
      @coding_base = (@config["coding_url"] || DEFAULT_CODING_BASE_URL).to_s.chomp("/")
      # Optional per-method URL overrides (default to the standard paths off @base_url).
      @image_url = (@config["image_url"] || "#{@base_url}/images/generations").to_s
      @video_url = (@config["video_url"] || "#{@base_url}/videos/generations").to_s
      @query_url = (@config["query_url"] || "#{@base_url}/async-result").to_s
      SmartPrompt.logger.info "Zhipu base_url=#{@base_url}"
    end
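The constructor resolves `ENV[...]` placeholders in `api_key` with `eval`. A possible alternative, sketched below under the assumption that placeholders always look like `ENV["NAME"]` or `ENV['NAME']`, is to extract the variable name with a regular expression and look it up directly. `resolve_api_key` and `ENV_PLACEHOLDER` are hypothetical names, not part of the adapter.

```ruby
# Matches strings such as ENV["ZHIPU_API_KEY"] or ENV['ZHIPU_API_KEY'].
ENV_PLACEHOLDER = /\AENV\[\s*["']([^"']+)["']\s*\]\z/

# Hypothetical helper: resolve an ENV[...] placeholder by lookup rather than
# eval. Anything that is not a placeholder is returned unchanged.
# `env` is injectable so the lookup can be exercised without touching ENV.
def resolve_api_key(raw, env = ENV)
  return raw unless raw.is_a?(String)

  match = ENV_PLACEHOLDER.match(raw)
  match ? env[match[1]] : raw
end
```

A config value of `ENV["ZHIPU_API_KEY"]` would then resolve to the environment variable, while a literal key passes straight through. The variable name `ZHIPU_API_KEY` is only an example.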
Instance Method Details
#check_video_status(task_id) ⇒ Object
Poll an async task. Returns the raw status hash (task_status etc.).
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 193
    def check_video_status(task_id)
      SmartPrompt.logger.info "ZhipuAIAdapter: polling video task #{task_id}"
      http_get_json("#{@query_url}/#{URI.encode_www_form_component(task_id)}")
    rescue LLMAPIError, Error
      raise
    rescue => e
      raise LLMAPIError, "Failed to query Zhipu video task: #{e.message}"
    end
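The polling URL is built by appending the escaped task id to the query URL. A minimal stand-alone sketch of that step follows; `async_result_url` is a hypothetical helper name, and the task id values in the usage note are made up.

```ruby
require "uri"

# Hypothetical helper mirroring the URL construction in #check_video_status:
# the task id is percent-encoded so characters such as "/" cannot alter the path.
def async_result_url(query_url, task_id)
  "#{query_url.chomp('/')}/#{URI.encode_www_form_component(task_id)}"
end
```

For a plain id such as `abc123` the URL simply ends in `/async-result/abc123`; an id containing a slash has it encoded as `%2F`.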
#download_video(video_url, output_path) ⇒ Object
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 225
    def download_video(video_url, output_path)
      uri = URI.parse(video_url)
      http = Net::HTTP.new(uri.host, uri.port)
      http.use_ssl = (uri.scheme == "https")
      response = http.request(Net::HTTP::Get.new(uri.request_uri))
      raise Error, "Failed to download video: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
      FileUtils.mkdir_p(File.dirname(output_path))
      File.binwrite(output_path, response.body)
      SmartPrompt.logger.info "Zhipu video saved to #{output_path}"
      output_path
    rescue => e
      raise e.is_a?(SmartPrompt::Error) ? e : Error, "Error downloading Zhipu video: #{e.message}"
    end
#embeddings(text, model) ⇒ Object
embedding-3 (default 2048 dims); supports a custom `dimensions` (256/512/1024/2048) via config. Returns the first embedding vector.
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 96
    def embeddings(text, model)
      model_name = model || @config["embedding_model"] || @config["model"]
      SmartPrompt.logger.info "ZhipuAIAdapter: embeddings model=#{model_name}"
      body = {
        "model" => model_name,
        "input" => text.is_a?(Array) ? text : [text.to_s]
      }
      body["dimensions"] = @config["dimensions"] if @config["dimensions"]
      body["encoding_format"] = @config["encoding_format"] if @config["encoding_format"]
      response = begin
        http_post_json("#{@base_url}/embeddings", body)
      rescue LLMAPIError, Error
        raise
      rescue => e
        raise LLMAPIError, "Failed to call Zhipu embeddings: #{e.message}"
      end
      items = response["data"]
      unless items.is_a?(Array) && items.any? && items[0]["embedding"]
        raise LLMAPIError, "No embedding vector in Zhipu response: #{response.inspect}"
      end
      items[0]["embedding"]
    end
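The request and response handling in `#embeddings` can be exercised without a network call. The sketch below restates the two steps as pure functions; `build_embeddings_body` and `first_embedding` are hypothetical names, and `ArgumentError` stands in for the adapter's `LLMAPIError`.

```ruby
# Hypothetical helper: wrap a single string in an array, pass arrays through,
# and add "dimensions" only when one is requested.
def build_embeddings_body(text, model, dimensions: nil)
  body = { "model" => model, "input" => text.is_a?(Array) ? text : [text.to_s] }
  body["dimensions"] = dimensions if dimensions
  body
end

# Hypothetical helper: return the first vector from an OpenAI-style
# embeddings response, or raise if the response carries none.
def first_embedding(response)
  items = response["data"]
  unless items.is_a?(Array) && items.any? && items[0].is_a?(Hash) && items[0]["embedding"]
    raise ArgumentError, "no embedding vector in response"
  end
  items[0]["embedding"]
end
```

Note that only the first vector is returned, so passing an array of several texts still yields a single embedding.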
#generate_image(prompt, params = {}) ⇒ Object
Text-to-image. The Zhipu response is NESTED: data.images[].url (not OpenAI's data[]), so we parse defensively. Returns an Array of image hashes (with a `b64_json:` key).
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 124
    def generate_image(prompt, params = {})
      SmartPrompt.logger.info "ZhipuAIAdapter: generating image"
      raise Error, "Prompt cannot be empty" if prompt.nil? || prompt.to_s.strip.empty?
      model_name = params[:model] || @config["image_model"] || @config["model"]
      raise Error, "No model configured for image generation" if model_name.nil? || model_name.to_s.strip.empty?
      body = { "model" => model_name, "prompt" => prompt.to_s }
      body["size"] = params[:size] if params[:size]
      body["user"] = params[:user] if params[:user]
      body["response_format"] = params[:response_format] if params[:response_format]
      SmartPrompt.logger.info "Zhipu image params: #{body.except('prompt').inspect}"
      response = begin
        http_post_json(@image_url, body)
      rescue LLMAPIError, Error
        raise
      rescue => e
        raise Error, "Failed to call Zhipu image generation: #{e.message}"
      end
      images = parse_image_response(response)
      SmartPrompt.logger.info "ZhipuAIAdapter: generated #{images.size} image(s)"
      images
    end
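The private `parse_image_response` helper is not shown on this page. One way to parse the nested shape defensively, while still accepting the flat OpenAI shape, might look like the sketch below. `extract_images` is a hypothetical name and the `url:` / `b64_json:` output keys are assumptions, not the adapter's confirmed return format.

```ruby
# Hypothetical parser: accept either the nested shape
#   { "data" => { "images" => [ { "url" => ... } ] } }
# or the flat OpenAI shape
#   { "data" => [ { "url" => ... } ] }
# and normalise both to an array of { url:, b64_json: } hashes.
def extract_images(response)
  data = response.is_a?(Hash) ? response["data"] : nil
  items =
    if data.is_a?(Hash) && data["images"].is_a?(Array)
      data["images"]
    elsif data.is_a?(Array)
      data
    else
      []
    end
  items.select { |item| item.is_a?(Hash) }
       .map { |item| { url: item["url"], b64_json: item["b64_json"] } }
end
```

Returning an empty array for unrecognised shapes leaves it to the caller to decide whether that is an error.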
#generate_video(prompt, params = {}) ⇒ Object
Submit a text-to-video (or image-to-video) job. Returns a hash with the task id (`task_id:`), the model used (`model:`), and the raw response (`raw:`).
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 165
    def generate_video(prompt, params = {})
      SmartPrompt.logger.info "ZhipuAIAdapter: submitting video job"
      model_name = params[:model] || @config["video_model"] || @config["model"]
      raise Error, "No model configured for video generation" if model_name.nil? || model_name.to_s.strip.empty?
      body = { "model" => model_name, "prompt" => prompt.to_s }
      %i[quality fps duration with_audio resolution request_id seed].each do |k|
        body[k.to_s] = params[k] unless params[k].nil?
      end
      body["image_url"] = normalize_image_url(params[:image_url]) if params[:image_url]
      SmartPrompt.logger.info "Zhipu video params: #{body.except('prompt').inspect}"
      response = begin
        http_post_json(@video_url, body)
      rescue LLMAPIError, Error
        raise
      rescue => e
        raise Error, "Failed to submit Zhipu video job: #{e.message}"
      end
      task_id = response["id"] || response["task_id"]
      raise LLMAPIError, "No task id in Zhipu video response: #{response.inspect}" unless task_id
      SmartPrompt.logger.info "ZhipuAIAdapter: video task #{task_id} submitted"
      { task_id: task_id, model: model_name, raw: response }
    end
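The optional-parameter loop filters on `nil?` rather than truthiness, which matters for boolean options. The sketch below isolates that step; `build_video_body` and `VIDEO_OPTIONAL_KEYS` are hypothetical names, and the model name in the usage note is a placeholder.

```ruby
VIDEO_OPTIONAL_KEYS = %i[quality fps duration with_audio resolution request_id seed].freeze

# Hypothetical helper mirroring the body construction in #generate_video:
# symbol keys in params become string keys in the body, and only nil is skipped.
def build_video_body(model, prompt, params = {})
  body = { "model" => model, "prompt" => prompt.to_s }
  VIDEO_OPTIONAL_KEYS.each do |key|
    body[key.to_s] = params[key] unless params[key].nil?
  end
  body
end
```

Calling `build_video_body("some-video-model", "a cat", with_audio: false, fps: nil)` keeps `"with_audio" => false` in the body and drops `fps`, so an explicit opt-out still reaches the API.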
#rerank(query, documents, model: nil) ⇒ Object
Rerank documents against a query (bonus).
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 293
    def rerank(query, documents, model: nil)
      model_name = model || @config["rerank_model"] || @config["model"]
      body = { "model" => model_name, "query" => query, "documents" => documents }
      response = http_post_json("#{@base_url}/rerank", body)
      (response["results"] || []).map { |r| { index: r["index"], relevance_score: r["relevance_score"] || r["score"] } }
    rescue LLMAPIError, Error
      raise
    rescue => e
      raise LLMAPIError, "Failed to call Zhipu rerank: #{e.message}"
    end
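`#rerank` returns index/score pairs rather than the documents themselves. A caller might map them back to the original texts as in the sketch below; `top_documents` is a hypothetical helper and the sample scores in the usage note are invented.

```ruby
# Hypothetical helper: given the original documents and the array of
# { index:, relevance_score: } hashes that #rerank returns, pick the n
# highest-scoring documents in descending score order.
def top_documents(documents, results, n = 3)
  results
    .sort_by { |r| -r[:relevance_score].to_f }
    .first(n)
    .map { |r| documents[r[:index]] }
end
```

With documents `["a", "b", "c"]` and scores 0.2, 0.9 and 0.5 for indexes 0, 1 and 2, `top_documents(docs, results, 2)` selects the documents at indexes 1 and 2.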
#save_image(image_data, output_dir = "./output", filename_prefix = "zhipu_image") ⇒ Object
Save one or many generated images to disk (Array from #generate_image or a single hash).
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 152
    def save_image(image_data, output_dir = "./output", filename_prefix = "zhipu_image")
      FileUtils.mkdir_p(output_dir)
      images = image_data.is_a?(Array) ? image_data : [image_data]
      saved = images.each_with_index.map do |img, index|
        save_single_image(img, output_dir, "#{filename_prefix}_#{index + 1}")
      end
      SmartPrompt.logger.info "Saved #{saved.size} Zhipu image(s) to #{output_dir}"
      saved
    end
#send_request(messages, model = nil, temperature = nil, tools = nil, proc = nil) ⇒ Object
Chat / multimodal. Non-streaming returns a full OpenAI-format hash (so last_response carries usage + reasoning_content); streaming calls proc with each OpenAI-shaped chunk.
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 67
    def send_request(messages, model = nil, temperature = nil, tools = nil, proc = nil)
      model_name = model || @config["model"]
      body = build_chat_body(messages, model_name, temperature, tools)
      SmartPrompt.logger.info "ZhipuAIAdapter: chat request model=#{model_name} stream=#{!proc.nil?}"
      url = chat_url_for(model_name)
      if proc
        body["stream"] = true
        stream_chat(url, body) { |data| proc.call(build_stream_chunk(data), 0) }
        SmartPrompt.logger.info "ZhipuAIAdapter: streaming request finished"
        nil
      else
        raw = http_post_json(url, body)
        response = build_completion_response(raw)
        @last_response = response
        SmartPrompt.logger.info "ZhipuAIAdapter: received chat response"
        response
      end
    rescue LLMAPIError, Error
      raise
    rescue => e
      SmartPrompt.logger.error "Zhipu chat error: #{e.message}"
      raise LLMAPIError, "Failed to call Zhipu chat: #{e.message}"
    end
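The streaming branch relies on a private `stream_chat` helper that is not shown here. As a rough idea of what parsing an OpenAI-style SSE stream involves, the sketch below splits already-received text into JSON events. `each_sse_event` is a hypothetical name, the `data: [DONE]` terminator is assumed from the OpenAI convention, and a real implementation would also need to buffer partial lines across network reads.

```ruby
require "json"

# Hypothetical SSE parser: yield each JSON payload found on a "data:" line
# and stop at the "[DONE]" sentinel. Assumes `raw` holds complete lines.
def each_sse_event(raw)
  raw.each_line do |line|
    line = line.strip
    next unless line.start_with?("data:")

    payload = line.sub(/\Adata:\s*/, "")
    break if payload == "[DONE]"

    yield JSON.parse(payload)
  end
end
```

Each yielded hash would correspond to one chunk passed to the caller's `proc` in streaming mode.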
#synthesize_speech(text, voice: nil, model: nil, response_format: "wav", **opts) ⇒ Object
Returns a base64 data URL for the synthesized audio. GLM-TTS accepts wav/pcm only (mp3/flac are rejected), so default to wav.
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 242
    def synthesize_speech(text, voice: nil, model: nil, response_format: "wav", **opts)
      SmartPrompt.logger.info "ZhipuAIAdapter: TTS"
      raise Error, "Text cannot be empty" if text.nil? || text.to_s.strip.empty?
      model_name = model || @config["tts_model"] || "glm-tts"
      body = { "model" => model_name, "input" => text.to_s }
      body["voice"] = voice if voice
      body["response_format"] = response_format
      body["speed"] = opts[:speed] if opts[:speed]
      body["emotion"] = opts[:emotion] if opts[:emotion]
      audio = http_post_binary("#{@base_url}/audio/speech", body)
      "data:audio/#{response_format};base64,#{Base64.strict_encode64(audio)}"
    rescue LLMAPIError, Error
      raise
    rescue => e
      raise Error, "Failed to call Zhipu TTS: #{e.message}"
    end
#synthesize_to_file(text, output_path, voice: nil, model: nil, response_format: "wav", **opts) ⇒ Object
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 261
    def synthesize_to_file(text, output_path, voice: nil, model: nil, response_format: "wav", **opts)
      data_url = synthesize_speech(text, voice: voice, model: model, response_format: response_format, **opts)
      FileUtils.mkdir_p(File.dirname(output_path))
      audio_bytes = Base64.decode64(data_url.sub(/\Adata:audio\/\w+;base64,/, ""))
      File.binwrite(output_path, audio_bytes)
      SmartPrompt.logger.info "Zhipu audio saved to #{output_path}"
      { file_path: output_path, format: response_format }
    end
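`#synthesize_speech` wraps the audio bytes in a data URL and `#synthesize_to_file` unwraps them again. The round trip can be sketched without any API call. The helper names below are hypothetical, and `Array#pack("m0")` / `String#unpack1("m")` are used in place of the `Base64` module purely to keep the example dependency-free.

```ruby
# Hypothetical helper: encode raw bytes as a base64 audio data URL.
# pack("m0") produces strict base64 with no line breaks.
def to_audio_data_url(bytes, format = "wav")
  "data:audio/#{format};base64,#{[bytes].pack('m0')}"
end

# Hypothetical helper: strip the data-URL prefix and decode the payload
# back into a binary string.
def from_audio_data_url(data_url)
  data_url.sub(/\Adata:audio\/\w+;base64,/, "").unpack1("m")
end
```

Decoding the data URL yields the original bytes, which is what gets written to disk with `File.binwrite`.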
#transcribe_audio(audio_file, model: nil, language: nil, **opts) ⇒ Object
Transcribe an audio file (local path). Returns a hash of the form { text: ... }.
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 273
    def transcribe_audio(audio_file, model: nil, language: nil, **opts)
      SmartPrompt.logger.info "ZhipuAIAdapter: ASR #{File.basename(audio_file)}"
      raise Error, "Audio file not found: #{audio_file}" unless File.exist?(audio_file)
      model_name = model || @config["asr_model"] || "glm-asr-2512"
      form = { "model" => model_name }
      form["language"] = language if language
      form["prompt"] = opts[:prompt] if opts[:prompt]
      form["response_format"] = opts[:response_format] if opts[:response_format]
      response = http_post_multipart("#{@base_url}/audio/transcriptions", form, audio_file)
      { text: response["text"] }
    rescue LLMAPIError, Error
      raise
    rescue => e
      raise e.is_a?(SmartPrompt::Error) ? e : Error, "Failed to call Zhipu ASR: #{e.message}"
    end
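The private `http_post_multipart` helper is not shown on this page. One way to build such a request with the standard library is `Net::HTTP::Post#set_form` with the `multipart/form-data` encoding, sketched below. `build_transcription_request` is a hypothetical name, and the form field name `file` is an assumption borrowed from the OpenAI-style transcription API rather than confirmed by this page.

```ruby
require "net/http"
require "uri"
require "stringio"

# Hypothetical request builder: text fields plus one file part, sent as
# multipart/form-data with Bearer-token auth. The request is only built
# here, not sent.
def build_transcription_request(url, api_key, audio_io, filename,
                                model: "glm-asr-2512", language: nil)
  request = Net::HTTP::Post.new(URI.parse(url))
  request["Authorization"] = "Bearer #{api_key}"
  fields = [["model", model]]
  fields << ["language", language] if language
  # "file" is an assumed field name; the IO is streamed when the request runs.
  fields << ["file", audio_io, { filename: filename }]
  request.set_form(fields, "multipart/form-data")
  request
end
```

The returned request could then be passed to `Net::HTTP#request` on a connection with `use_ssl` enabled.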
#wait_for_video_completion(task_id, check_interval: 10, timeout: 600) ⇒ Object
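Block until the task finishes (or times out), then return a result hash containing the video URL (`video_url:`), the cover image URL (`cover_image_url:`), and the raw status (`raw:`). Raises LLMAPIError on failure or timeout.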
    # File 'lib/smart_prompt/zhipu_adapter.rb', line 203
    def wait_for_video_completion(task_id, check_interval: 10, timeout: 600)
      start = Time.now
      loop do
        status = check_video_status(task_id)
        case task_status_of(status)
        when "SUCCESS"
          url = video_url_of(status)
          raise LLMAPIError, "Video succeeded but no url in: #{status.inspect}" unless url
          SmartPrompt.logger.info "ZhipuAIAdapter: video ready #{url}"
          return { task_id: task_id, status: "SUCCESS", video_url: url,
                   cover_image_url: cover_url_of(status), raw: status }
        when "FAIL", "FAILED"
          raise LLMAPIError, "Zhipu video generation failed: #{status.inspect}"
        else
          if Time.now - start > timeout
            raise LLMAPIError, "Zhipu video generation timeout after #{timeout}s"
          end
          SmartPrompt.logger.info "Zhipu video task #{task_id} still processing..."
          sleep(check_interval)
        end
      end
    end
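The same poll-until-terminal-state pattern can be written with the status fetcher and the sleep injected, which makes it testable without waiting. The sketch below is an illustration, not the adapter's code: `poll_until_done` is a hypothetical name, the `"task_status"` key is assumed from the status hash described for `#check_video_status`, and `RuntimeError` stands in for `LLMAPIError`.

```ruby
# Hypothetical polling loop: the block fetches the current status hash.
# Returns the status on SUCCESS, raises on FAIL/FAILED or once the timeout
# has elapsed; otherwise sleeps and tries again.
def poll_until_done(check_interval: 10, timeout: 600,
                    sleeper: ->(seconds) { sleep(seconds) })
  start = Time.now
  loop do
    status = yield
    case status["task_status"]
    when "SUCCESS"
      return status
    when "FAIL", "FAILED"
      raise "task failed: #{status.inspect}"
    else
      raise "timeout after #{timeout}s" if Time.now - start > timeout

      sleeper.call(check_interval)
    end
  end
end
```

As in the adapter, the timeout is only checked after a non-terminal poll, so the total wait can exceed `timeout` by up to one interval plus one request.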