Class: SmartPrompt::ImageGenerationAdapter
- Inherits: LLMAdapter
- Inheritance chain: Object, LLMAdapter, SmartPrompt::ImageGenerationAdapter
- Defined in: lib/smart_prompt/image_generation_adapter.rb
Overview
Adapter for SiliconFlow’s image generation API.
SiliconFlow exposes image generation through a single endpoint:
POST {url}/images/generations
Unlike OpenAI’s image API, SiliconFlow uses its own parameter names (`image_size`, `batch_size`, `negative_prompt`, `num_inference_steps`, `guidance_scale`, `cfg`, …) and returns an `images` array instead of a `data` array. The OpenAI gem’s `images.generate` helper therefore does not fit, so, like the TTS/Video adapters, this adapter talks to the endpoint directly with Net::HTTP.
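As a rough illustration of the direct Net::HTTP approach described above, the sketch below builds (without sending) a POST request for `{url}/images/generations` and reads the `images` array from a response body. The helper names `build_image_request` and `extract_images` are hypothetical, and the Bearer `Authorization` header is an assumption; neither is confirmed by the source shown on this page.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: builds the POST request but does not send it.
# Assumption: the API expects a Bearer token and a JSON body.
def build_image_request(base_url, api_key, payload)
  uri = URI("#{base_url.chomp("/")}/images/generations")
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{api_key}"
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(payload)
  request
end

# Hypothetical helper: reads the `images` array (not OpenAI's `data` array).
# Only the url field is extracted here; other fields are ignored.
def extract_images(body)
  parsed = JSON.parse(body)
  Array(parsed["images"]).map { |img| { url: img["url"] } }
end
```

Sending the request would then be a matter of `Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }`; the adapter's own private `submit_image_request` may differ in its details.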
Constant Summary
- SUPPORTED_IMAGE_FORMATS = %w[jpg jpeg png gif bmp webp].freeze
- DEFAULT_IMAGE_SIZE = "1024x1024"
  Default resolution for text-to-image generation, written as "widthxheight". Edit models (Qwen/Qwen-Image-Edit*) ignore this field, so it is only sent for text-to-image calls.
Instance Attribute Summary
Attributes inherited from LLMAdapter
Instance Method Summary
- #edit_image(prompt, params = {}) ⇒ Object
  Image editing / image-to-image generation (for Qwen/Qwen-Image-Edit-* and Kolors composable models).
- #generate_image(prompt, params = {}) ⇒ Object
  Text-to-image generation.
- #initialize(config) ⇒ ImageGenerationAdapter (constructor)
  A new instance of ImageGenerationAdapter.
- #save_image(image_data, output_dir = "./output", filename_prefix = "generated_image") ⇒ Object
  Save one or many generated images to disk.
Constructor Details
#initialize(config) ⇒ ImageGenerationAdapter
Returns a new instance of ImageGenerationAdapter.
# File 'lib/smart_prompt/image_generation_adapter.rb', line 29
def initialize(config)
  super
  api_key = @config["api_key"]
  if api_key.is_a?(String) && api_key.start_with?("ENV[") && api_key.end_with?("]")
    api_key = eval(api_key)
  end
  @api_key = api_key
  @base_url = @config["url"].to_s.chomp("/")
  @model = @config["model"]
  begin
    # Created for parity with the other non-chat adapters; the actual image
    # requests are issued directly below via Net::HTTP.
    @client = OpenAI::Client.new(
      access_token: @api_key,
      uri_base: @config["url"],
      request_timeout: 240,
    )
  rescue OpenAI::ConfigurationError => e
    SmartPrompt.logger.error "Failed to initialize ImageGeneration client: #{e.message}"
    raise LLMAPIError, "Invalid ImageGeneration configuration: #{e.message}"
  rescue OpenAI::Error => e
    SmartPrompt.logger.error "Failed to initialize ImageGeneration client: #{e.message}"
    raise LLMAPIError, "ImageGeneration authentication failed: #{e.message}"
  rescue SocketError => e
    SmartPrompt.logger.error "Failed to initialize ImageGeneration client: #{e.message}"
    raise LLMAPIError, "Network error: Unable to connect to ImageGeneration API"
  rescue => e
    SmartPrompt.logger.error "Failed to initialize ImageGeneration client: #{e.message}"
    raise Error, "Unexpected error initializing ImageGeneration client: #{e.message}"
  ensure
    SmartPrompt.logger.info "Successfully created an ImageGeneration client (model=#{@model})."
  end
end
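The constructor accepts an `api_key` that is either a literal key or a string of the form `ENV[...]`, which it resolves with `eval`. The sketch below shows one possible way to resolve the same kind of reference without `eval`, using a regular expression. This is a swapped-in technique, not the adapter's implementation: `resolve_api_key` and `ENV_REFERENCE` are hypothetical names, and unlike the source it only accepts a quoted variable name.

```ruby
# Hypothetical pattern for strings such as ENV["SILICONFLOW_API_KEY"].
ENV_REFERENCE = /\AENV\[\s*["']([A-Za-z_][A-Za-z0-9_]*)["']\s*\]\z/

# Hypothetical helper: returns the referenced environment value when the
# string looks like ENV["NAME"], otherwise returns the value unchanged.
# `env` defaults to the process environment and can be replaced by a Hash.
def resolve_api_key(value, env = ENV)
  return value unless value.is_a?(String)

  match = ENV_REFERENCE.match(value)
  match ? env[match[1]] : value
end
```

For example, `resolve_api_key('ENV["MY_KEY"]', { "MY_KEY" => "secret" })` returns `"secret"`, while a plain key string is passed through untouched.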
Instance Method Details
#edit_image(prompt, params = {}) ⇒ Object
Image editing / image-to-image generation (for Qwen/Qwen-Image-Edit-* and Kolors composable models). image (and optionally image2/image3) may be a local file path, a base64 data URL, or a public http(s) URL.
# File 'lib/smart_prompt/image_generation_adapter.rb', line 108
def edit_image(prompt, params = {})
  SmartPrompt.logger.info "ImageGenerationAdapter: Editing image"
  raise Error, "Prompt cannot be empty" if prompt.nil? || prompt.to_s.strip.empty?
  raise Error, "An input image is required for image editing" if params[:image].nil? && params[:image_file].nil?

  normalized = params.dup
  normalized[:image] = normalize_input_image(normalized[:image] || normalized[:image_file])
  normalized[:image2] = normalize_input_image(normalized[:image2]) if normalized[:image2]
  normalized[:image3] = normalize_input_image(normalized[:image3]) if normalized[:image3]

  # Edit models reject image_size, so we deliberately omit it here.
  parameters = build_parameters(prompt, normalized)
  parameters[:image] = normalized[:image]
  parameters[:image2] = normalized[:image2] if normalized[:image2]
  parameters[:image3] = normalized[:image3] if normalized[:image3]

  SmartPrompt.logger.info "Image edit parameters: #{parameters.except(:prompt, :image, :image2, :image3).inspect}"

  begin
    response = submit_image_request("/images/generations", parameters)
    @last_response = response
    images = parse_images(response)
    SmartPrompt.logger.info "Successfully edited image, generated #{images.size} result(s)"
    images
  rescue LLMAPIError, Error
    raise
  rescue JSON::ParserError => e
    SmartPrompt.logger.error "Failed to parse image edit response: #{e.message}"
    raise LLMAPIError, "Failed to parse image edit response"
  rescue => e
    SmartPrompt.logger.error "Unexpected error during image editing: #{e.message}"
    raise Error, "Unexpected error during image editing: #{e.message}"
  end
end
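The private `normalize_input_image` helper is not shown on this page. As a minimal sketch of what accepting "a local file path, a base64 data URL, or a public http(s) URL" could look like, the hypothetical `normalize_input_image_sketch` below passes URLs and data URLs through unchanged and converts a local file into a base64 data URL. The extension-to-MIME table is an assumption built from the formats listed in SUPPORTED_IMAGE_FORMATS.

```ruby
# Assumed MIME types for the extensions in SUPPORTED_IMAGE_FORMATS.
MIME_TYPES = {
  "jpg" => "image/jpeg", "jpeg" => "image/jpeg", "png" => "image/png",
  "gif" => "image/gif", "bmp" => "image/bmp", "webp" => "image/webp",
}.freeze

# Hypothetical stand-in for the adapter's private normalize_input_image.
def normalize_input_image_sketch(input)
  value = input.to_s
  # Data URLs and http(s) URLs are already in a form the API can consume.
  return value if value.start_with?("data:", "http://", "https://")

  ext = File.extname(value).delete_prefix(".").downcase
  mime = MIME_TYPES.fetch(ext) do
    raise ArgumentError, "Unsupported image format: #{ext.inspect}"
  end
  # pack("m0") produces base64 without line breaks.
  encoded = [File.binread(value)].pack("m0")
  "data:#{mime};base64,#{encoded}"
end
```

With a helper along these lines, a call such as `edit_image("make the sky purple", image: "./photo.png")` would send the file contents inline, whereas a public URL would be forwarded as-is.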
#generate_image(prompt, params = {}) ⇒ Object
Text-to-image generation.
params accepts SiliconFlow-native keys plus a couple of friendly aliases:
model:, negative_prompt:,
image_size: (alias: size:),
batch_size: (alias: n:),
seed:, num_inference_steps:, guidance_scale:, cfg:
Returns an Array of hashes, e.g. [{ url: "…", b64_json: nil, seed: 123 }].
# File 'lib/smart_prompt/image_generation_adapter.rb', line 74
def generate_image(prompt, params = {})
  SmartPrompt.logger.info "ImageGenerationAdapter: Generating image from text"
  raise Error, "Prompt cannot be empty" if prompt.nil? || prompt.to_s.strip.empty?

  parameters = build_parameters(prompt, params)
  parameters[:image_size] = resolve_image_size(params)
  # batch_size only applies to a subset of models (e.g. Kolors); send it
  # only when the caller explicitly asks for it.
  batch = params[:batch_size] || params[:n]
  parameters[:batch_size] = batch if batch

  SmartPrompt.logger.info "Image generation parameters: #{parameters.except(:prompt).inspect}"

  begin
    response = submit_image_request("/images/generations", parameters)
    @last_response = response
    images = parse_images(response)
    SmartPrompt.logger.info "Successfully generated #{images.size} image(s)"
    images
  rescue LLMAPIError, Error
    raise
  rescue JSON::ParserError => e
    SmartPrompt.logger.error "Failed to parse image generation response: #{e.message}"
    raise LLMAPIError, "Failed to parse image generation response"
  rescue => e
    SmartPrompt.logger.error "Unexpected error during image generation: #{e.message}"
    raise Error, "Unexpected error during image generation: #{e.message}"
  end
end
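To make the alias handling concrete, here is a small sketch of how `size:` and `n:` might map onto `image_size` and `batch_size`. The batch logic mirrors the source above; `resolve_image_size_sketch` and `text_to_image_options` are hypothetical names, and the precedence of `image_size:` over `size:` is an assumption, since the private `resolve_image_size` helper is not shown on this page.

```ruby
DEFAULT_IMAGE_SIZE = "1024x1024"

# Hypothetical stand-in for the private resolve_image_size helper.
# Assumption: the native key wins over its alias, then the default applies.
def resolve_image_size_sketch(params)
  (params[:image_size] || params[:size] || DEFAULT_IMAGE_SIZE).to_s
end

# Collects the size and batch options that a text-to-image call would send.
def text_to_image_options(params)
  options = { image_size: resolve_image_size_sketch(params) }
  # As in the source, batch_size is sent only when explicitly requested.
  batch = params[:batch_size] || params[:n]
  options[:batch_size] = batch if batch
  options
end
```

Under these assumptions, a call such as `generate_image("a lighthouse at dusk", size: "512x512", n: 2)` would send `image_size: "512x512"` and `batch_size: 2`, while a call with no size option would fall back to DEFAULT_IMAGE_SIZE and omit `batch_size`.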
#save_image(image_data, output_dir = "./output", filename_prefix = "generated_image") ⇒ Object
Save one or many generated images to disk. Accepts the Array returned by #generate_image/#edit_image or a single image hash. Returns the list of written file paths.
# File 'lib/smart_prompt/image_generation_adapter.rb', line 147
def save_image(image_data, output_dir = "./output", filename_prefix = "generated_image")
  SmartPrompt.logger.info "ImageGenerationAdapter: Saving image to file"

  begin
    FileUtils.mkdir_p(output_dir)

    images = image_data.is_a?(Array) ? image_data : [image_data]
    saved_files = images.each_with_index.map do |img, index|
      save_single_image(img, output_dir, "#{filename_prefix}_#{index + 1}")
    end

    SmartPrompt.logger.info "Successfully saved #{saved_files.size} image(s) to #{output_dir}"
    saved_files
  rescue => e
    SmartPrompt.logger.error "Error saving image: #{e.message}"
    raise Error, "Error saving image: #{e.message}"
  end
end
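The following sketch illustrates the naming scheme used above (`<prefix>_1`, `<prefix>_2`, …) and the single-hash versus Array handling. It is not the adapter's code: `save_single_image_sketch` is a hypothetical stand-in for the private `save_single_image`, it handles only the `:b64_json` payload (downloading from `:url` is left out), and the `.png` extension is an assumption.

```ruby
require "fileutils"

# Hypothetical stand-in for save_single_image; decodes :b64_json only.
def save_single_image_sketch(image, output_dir, basename, extension = "png")
  data = image[:b64_json]
  raise ArgumentError, "image hash has no :b64_json payload" if data.nil?

  path = File.join(output_dir, "#{basename}.#{extension}")
  File.binwrite(path, data.unpack1("m")) # unpack1("m") decodes base64
  path
end

# Mirrors the control flow of save_image: wrap a single hash in an Array,
# then number the files starting from 1.
def save_images_sketch(image_data, output_dir = "./output", filename_prefix = "generated_image")
  FileUtils.mkdir_p(output_dir)
  images = image_data.is_a?(Array) ? image_data : [image_data]
  images.each_with_index.map do |img, index|
    save_single_image_sketch(img, output_dir, "#{filename_prefix}_#{index + 1}")
  end
end
```

In practice the result of `generate_image` or `edit_image` can be passed straight to `save_image`, which returns the list of written paths.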