PromptBuilder


This gem provides a Ruby DSL for building and parsing LLM API request payloads. The goal of this gem is to provide a single, consistent interface for constructing requests and parsing responses across multiple LLM APIs without locking you into a specific provider or HTTP client. Chat sessions are designed to be serializable so they can be persisted into databases or caches.

The Open Responses API is used as the internal data model. The Open Responses reference documentation provides details on how to use the API and the terminology.

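Requests can be generated for and responses can be parsed from these common LLM API formats:

  • Open Responses API
  • OpenAI Chat Completions API
  • Anthropic Messages API
  • Google Gemini API
  • Amazon Bedrock Converse API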

This gem does not include any HTTP client code. It is designed to be used with whatever HTTP library you prefer. You build a request payload, send it to the API yourself, and then parse the response back into Ruby objects.

It was designed to work with the patient_http gem for making asynchronous requests to LLM APIs, and it is used in the patient_llm gem.

Usage

Sessions

The PromptBuilder::Session class is the main entry point. A session holds the model configuration, conversation history, and tool definitions needed to build a request payload.

session = PromptBuilder::Session.new(
  model: "gpt-5.4",
  instructions: "You are a helpful assistant.",
  temperature: 0.7
)

# Add a user message to the conversation history
session.user("What is the capital of France?")

You can also pass an input shorthand to create a user message in one step:

session = PromptBuilder::Session.new(
  model: "gpt-5.4",
  input: "What is the capital of France?"
)

Conversation History

Build up a multi-turn conversation by adding messages:

session = PromptBuilder::Session.new(model: "gpt-5.4")
session.system("You are a helpful assistant.")
session.user("Hello!")
session.assistant("Hi there! How can I help you today?")
session.user("What's the weather like?")

Messages support the roles user, assistant, system, and developer.

Serializing Requests

Once you've built a session, serialize it to a request payload for the API you want to call. Five target formats are supported:

Open Responses API:

payload = session.to_h
# or
payload = session.request_payload(:open_responses)

OpenAI Chat Completions API:

payload = session.request_payload(:chat_completion)

Anthropic Messages API:

payload = session.request_payload(:messages)

Google Gemini API:

payload = session.request_payload(:gemini)

Amazon Bedrock Converse API:

payload = session.request_payload(:converse)

The payload is a plain Ruby Hash that you can convert to JSON and send to the API with any HTTP client:

require "net/http"
require "json"

uri = URI("https://api.openai.com/v1/responses")
request = Net::HTTP::Post.new(uri)
request["Authorization"] = "Bearer #{api_key}"
request["Content-Type"] = "application/json"
request.body = JSON.generate(session.to_h)

response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
  http.request(request)
end
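
The request construction above can be wrapped in a small helper so the same code serves any endpoint. The following is a minimal sketch using only the standard library; build_llm_request is a hypothetical helper name (not part of this gem), and the payload literal is a stand-in for session.to_h.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical helper (not part of prompt_builder): builds a POST request for a
# payload Hash without sending it, so it can be inspected or tested offline.
def build_llm_request(url, payload, api_key:)
  uri = URI(url)
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{api_key}"
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(payload)
  request
end

# Stand-in for session.to_h; the real payload shape comes from the gem.
payload = {"model" => "gpt-5.4", "input" => "What is the capital of France?"}
request = build_llm_request("https://api.openai.com/v1/responses", payload, api_key: "test-key")

puts request.path            # => /v1/responses
puts request["Content-Type"] # => application/json
```

Sending the request is then the same Net::HTTP.start block shown above, or the equivalent call in whichever HTTP client you prefer.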

Parsing Responses

Parse an API response back into a PromptBuilder::Response object using Response.parse with a serializer symbol:

# Open Responses API
response = PromptBuilder::Response.parse(JSON.parse(response_body), :open_responses)

# OpenAI Chat Completions API
response = PromptBuilder::Response.parse(JSON.parse(response_body), :chat_completion)

# Anthropic Messages API
response = PromptBuilder::Response.parse(JSON.parse(response_body), :messages)

# Google Gemini API
response = PromptBuilder::Response.parse(JSON.parse(response_body), :gemini)

# Amazon Bedrock Converse API
response = PromptBuilder::Response.parse(JSON.parse(response_body), :converse)

You can also pass a serializer class directly:

response = PromptBuilder::Response.parse(JSON.parse(response_body), PromptBuilder::Serializers::ChatCompletion)

The Response object provides convenient accessors:

response.text            # => "The capital of France is Paris."
response.completed?      # => true
response.has_tool_calls? # => false
response.usage           # => #<PromptBuilder::Usage input_tokens=25 output_tokens=12 ...>

Agentic Tool Loops

You can register tool definitions on a session, add API responses to the conversation, and manually append tool outputs to build an agentic loop:

session = PromptBuilder::Session.new(model: "gpt-5.4")

session.register_tool(
  "get_weather",
  description: "Get the current weather for a city.",
  parameters: {
    "type" => "object",
    "properties" => {
      "city" => {"type" => "string", "description" => "The city name"}
    },
    "required" => ["city"]
  }
)

session.user("What's the weather in Paris?")

loop do
  payload = session.request_payload(:chat_completion)
  response_body = call_api(payload)  # Your HTTP call; assumed here to return the parsed JSON Hash
  response = PromptBuilder::Response.parse(response_body, :chat_completion)

  session.add_response(response)

  unless response.has_tool_calls?
    puts response.text
    break
  end

  # Invoke tool handlers for each tool call in this response and append the
  # output back to the session before the next iteration.
  response.tool_calls.each do |call|
    result = call_tool(call.name, call.parsed_arguments)  # invoke your logic for the tool
    session.add_function_call_output(call_id: call.call_id, result: result.to_s)
  end
end

The add_response method appends the model's output items (messages, tool calls, reasoning, etc.) to the session's conversation history. You add FunctionCallOutput items manually after invoking each tool, then loop until the model produces a final text response.
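
The loop above leaves call_tool to your application. One possible shape for it is a lookup table of callables keyed by tool name. This is only a sketch: tool_handlers, the stubbed weather string, and the error-string convention are all assumptions for illustration, not behavior defined by this gem.

```ruby
# Hypothetical tool table: maps a tool name to a callable that receives the
# parsed arguments Hash. The weather lookup is a stub for illustration.
def tool_handlers
  {
    "get_weather" => ->(args) { "Sunny and 22C in #{args.fetch("city")}" }
  }
end

# Dispatches a tool call by name. Unknown tools and missing arguments return an
# error string so the model can see the failure instead of the loop raising.
def call_tool(name, arguments)
  handler = tool_handlers[name]
  return "Error: unknown tool #{name}" unless handler

  handler.call(arguments)
rescue KeyError => e
  "Error: missing argument (#{e.message})"
end

puts call_tool("get_weather", {"city" => "Paris"}) # => Sunny and 22C in Paris
puts call_tool("get_time", {})                     # => Error: unknown tool get_time
```

Returning an error string rather than raising keeps the conversation going, since the result is appended as a tool output either way.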

Tool Registry

For tools that are shared across multiple sessions, you can use a ToolRegistry:

registry = PromptBuilder::ToolRegistry.new

registry.register(
  "search",
  description: "Search the knowledge base.",
  parameters: {
    "type" => "object",
    "properties" => {
      "query" => {"type" => "string"}
    },
    "required" => ["query"]
  }
) do |args|
  KnowledgeBase.search(args["query"])
end

# Apply all tools from the registry to a session
session = PromptBuilder::Session.new(model: "gpt-5.4")
session.register_tools(registry)

There is also a global registry available on the PromptBuilder module:

PromptBuilder.register_tool("search", description: "Search the knowledge base.") do |args|
  KnowledgeBase.search(args["query"])
end

session.register_tools(PromptBuilder.tool_registry)

Content Types

Message content can be a plain string or an array of structured content objects for multi-modal input. Content can be provided as raw Hashes or as PromptBuilder::Content objects.

Images

Send an image by URL or as base64-encoded data:

# Image from a URL
session.user([
  PromptBuilder::Content::InputText.new(text: "What is in this image?"),
  PromptBuilder::Content::InputImage.new(url: "https://example.com/photo.jpg")
])

# Image with a detail level hint
session.user([
  PromptBuilder::Content::InputText.new(text: "Describe this image in detail."),
  PromptBuilder::Content::InputImage.new(
    url: "https://example.com/photo.jpg",
    detail: "high"
  )
])

# Image from raw binary data using a data URL
session.user([
  PromptBuilder::Content::InputText.new(text: "What is in this image?"),
  PromptBuilder::Content::InputImage.new(
    url: PromptBuilder.data_url(File.binread("photo.png"), "image/png")
  )
])
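
For reference, a base64 data URL has the form data:<media_type>;base64,<encoded bytes>. The sketch below builds one with core Ruby only; example_data_url is a hypothetical stand-in, and the output of PromptBuilder.data_url may differ in details.

```ruby
# Illustrative stand-in for building a base64 data URL; not the gem's own
# implementation of PromptBuilder.data_url.
def example_data_url(bytes, media_type)
  # pack("m0") produces strict base64 with no line breaks
  "data:#{media_type};base64,#{[bytes].pack("m0")}"
end

puts example_data_url("hello", "text/plain") # => data:text/plain;base64,aGVsbG8=
```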

Files

Attach a file by URL or as raw binary data:

# File from a URL
session.user([
  PromptBuilder::Content::InputText.new(text: "Summarize this document."),
  PromptBuilder::Content::InputFile.new(url: "https://example.com/report.pdf")
])

# File from raw binary data with a filename and media type
session.user([
  PromptBuilder::Content::InputText.new(text: "What does this spreadsheet contain?"),
  PromptBuilder::Content::InputFile.new(
    url: PromptBuilder.data_url(File.binread("data.csv"), "text/csv"),
    filename: "data.csv"
  )
])

# Reference a previously uploaded file by id (OpenAI Files API, Gemini Files API)
session.user([
  PromptBuilder::Content::InputText.new(text: "Summarize this."),
  PromptBuilder::Content::InputFile.new(file_id: "file_abc123", media_type: "application/pdf")
])

Videos

session.user([
  PromptBuilder::Content::InputText.new(text: "Summarize what happens in this video."),
  PromptBuilder::Content::InputVideo.new(url: "https://example.com/clip.mp4")
])

Using Hashes

You can also pass plain Hashes instead of content objects:

session.user([
  {"type" => "input_text", "text" => "What is in this image?"},
  {"type" => "input_image", "url" => "https://example.com/photo.jpg"}
])

Supported content types

| Type | Class | Description |
|---|---|---|
| input_text | Content::InputText | Text input |
| input_image | Content::InputImage | Image input (URL or base64) |
| input_file | Content::InputFile | File input (URL or base64) |
| input_video | Content::InputVideo | Video input (URL) |
| output_text | Content::OutputText | Text output from the model |
| refusal | Content::RefusalContent | Refusal content from the model |

Configuration Options

Sessions support a wide range of configuration options that map to common API parameters:

session = PromptBuilder::Session.new(
  model: "gpt-5.4",
  instructions: "You are a helpful assistant.",
  temperature: 0.7,
  top_p: 0.9,
  max_output_tokens: 1024,
  frequency_penalty: 0.5,
  presence_penalty: 0.5,
  parallel_tool_calls: true,
  reasoning: {"effort" => "high"},
  text: {"format" => {"type" => "json_object"}},
  tool_choice: "auto",
  truncation: "auto",
  prompt_cache_key: "account-123",
  prompt_cache_retention: "24h",
  store: true,
  metadata: {"user_id" => "123"}
)

Provider-Specific Extra Parameters

The extra attribute on sessions, content blocks, and tool definitions lets you pass provider-specific parameters that are not part of the Open Responses canonical format. Each serializer recognizes a defined set of extra keys and maps them to the appropriate location in the target format. Unrecognized keys are silently ignored.

Session Extra

Pass provider-specific top-level request parameters via session.extra:

# Anthropic Messages: top_k, stop_sequences, cache_control
session = PromptBuilder::Session.new(
  model: "claude-sonnet-4-20250514",
  extra: {
    "top_k" => 40,
    "stop_sequences" => ["\n\nHuman:"],
    "cache_control" => {"type" => "ephemeral"}
  }
)

# OpenAI Chat Completions: stop, seed, logit_bias, n, web_search_options
session = PromptBuilder::Session.new(
  model: "gpt-4o",
  extra: {
    "stop" => ["END", "\n\n"],
    "seed" => 42,
    "web_search_options" => {"search_context_size" => "medium"}
  }
)

# Google Gemini: safety_settings, cached_content, stop_sequences, top_k, seed
session = PromptBuilder::Session.new(
  model: "gemini-2.0-flash",
  extra: {
    "safety_settings" => [
      {"category" => "HARM_CATEGORY_HATE_SPEECH", "threshold" => "BLOCK_ONLY_HIGH"}
    ],
    "cached_content" => "cachedContents/abc123",
    "stop_sequences" => ["END"],
    "top_k" => 40,
    "seed" => 42
  }
)

# Amazon Bedrock Converse: stop_sequences, guardrail_config, additional_model_request_fields
session = PromptBuilder::Session.new(
  model: "anthropic.claude-v2",
  extra: {
    "stop_sequences" => ["\n\nHuman:"],
    "guardrail_config" => {
      "guardrailIdentifier" => "my-guardrail",
      "guardrailVersion" => "1"
    },
    "additional_model_request_fields" => {"top_k" => 50}
  }
)

Content Extra

Content blocks support provider-specific attributes via keyword arguments that are captured in the extra hash:

# Anthropic Messages: cache_control on content blocks for prompt caching
session.system([
  PromptBuilder::Content::InputText.new(
    text: "You are an assistant with a large knowledge base...",
    cache_control: {"type" => "ephemeral"}
  )
])

# Anthropic Messages: citations opt-in on document blocks
session.user([
  PromptBuilder::Content::InputFile.new(
    url: "https://example.com/report.pdf",
    citations: {"enabled" => true}
  ),
  PromptBuilder::Content::InputText.new(text: "Summarize with citations.")
])

# Bedrock Converse: cache_point markers
session.user([
  PromptBuilder::Content::InputText.new(
    text: "Long context that should be cached...",
    cache_point: true
  )
])

Tool Definition Extra

Tool definitions accept provider-specific parameters as keyword arguments:

# Anthropic Messages: cache_control on tool definitions
session.register_tool(
  "search",
  description: "Search the knowledge base.",
  parameters: {"type" => "object", "properties" => {"query" => {"type" => "string"}}},
  cache_control: {"type" => "ephemeral"}
)

Recognized Extra Keys by Serializer

| Serializer | Session Extra Keys | Content Extra Keys | Tool Extra Keys |
|---|---|---|---|
| Chat Completions | stop, seed, logit_bias, n, prediction, web_search_options, modalities, audio | file_id, media_type | |
| Messages | top_k, stop_sequences, cache_control | cache_control, citations, file_id, media_type | cache_control |
| Gemini | safety_settings, cached_content, stop_sequences, top_k, seed, candidate_count, response_modalities, media_resolution | file_id, media_type, thought_signature | |
| Converse | stop_sequences, guardrail_config, additional_model_request_fields, additional_model_response_field_paths, performance_config, prompt_variables | media_type, cache_point | |
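
The "unrecognized keys are silently ignored" rule can be pictured as a simple key filter. This is a sketch of the idea rather than the gem's code: messages_session_extra_keys and recognized_extra are hypothetical names, and the key list is copied from the Messages row above.

```ruby
# Session extra keys listed for the Messages serializer in the table above.
def messages_session_extra_keys
  %w[top_k stop_sequences cache_control]
end

# Keeps only recognized keys; anything else is dropped without an error,
# mirroring the "silently ignored" behavior described for extras.
def recognized_extra(extra, recognized_keys)
  extra.slice(*recognized_keys)
end

extra = {"top_k" => 40, "seed" => 42, "stop_sequences" => ["END"]}
filtered = recognized_extra(extra, messages_session_extra_keys)
puts filtered.keys.join(", ") # => top_k, stop_sequences
```

Because nothing is raised for seed here, a session carrying extras for several providers can be serialized to any of them, at the cost of typos in key names going unnoticed.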

Serialization and Persistence

Sessions and responses can be serialized to and from Hashes for storage or transmission:

# Serialize a session to a Hash
hash = session.to_h

# Restore a session from a Hash
restored_session = PromptBuilder::Session.from_h(hash)

This makes it straightforward to persist conversation state in a database or cache between requests.
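
As an illustration, the Hash can be stored as a JSON string and decoded again before calling Session.from_h. The sketch below assumes the serialized Hash uses string keys and JSON-compatible values (JSON.parse returns string keys); the store variable is a hypothetical in-memory stand-in for a database or cache, and session_hash stands in for session.to_h.

```ruby
require "json"

# Hypothetical in-memory store standing in for a database or cache.
store = {}

# Stand-in for session.to_h; the real Hash comes from the gem.
session_hash = {
  "model" => "gpt-5.4",
  "input" => [{"role" => "user", "content" => "Hello!"}]
}

# Persist as a JSON string keyed by a conversation id
store["conversation:42"] = JSON.generate(session_hash)

# Later: load and decode; this Hash would be passed to Session.from_h
restored_hash = JSON.parse(store.fetch("conversation:42"))
puts restored_hash == session_hash # => true
```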

Serializer Compatibility

The Open Responses format is the canonical data model for this gem. When serializing to other formats, some features may not be available because either the target API does not support them or because the Open Responses format does not expose parameters unique to the target API. If you attempt to use a feature that is not supported by a particular serializer, it will be silently omitted from the serialized output.

Session Fields

The following table shows which session configuration fields are supported by each serializer. ❌ means the field is silently omitted from the serialized output. Partial support is noted inline.

Session Field ChatCompletion Messages Gemini Converse
background
frequency_penalty ✅ → generationConfig.frequencyPenalty
include
max_tool_calls
metadata user_id key only ✅ → requestMetadata¹³
parallel_tool_calls
presence_penalty ✅ → generationConfig.presencePenalty
prompt_cache_key
prompt_cache_retention
reasoning effort key only budget_tokens/display/effort/type only budget_tokens/effort/summary: "auto" only
safety_identifier ✅ → metadata.user_id
service_tier auto/standard_only only unspecified/standard/flex/priority only ✅ → serviceTier.type
store
stream endpoint-selected⁸
stream_options include_usage/include_obfuscation only
text format/verbosity only format.type=json_schema only format key only format.type == json_schema only
top_logprobs ✅ → responseLogprobs/logprobs
truncation

Content Types

Content Type ChatCompletion Messages Gemini Converse
InputText
InputImage user messages only⁷ user messages only⁵ user messages only⁶ base64 or S3 URI only
InputFile user messages only¹⁰ user messages only¹ user messages only⁶ base64 or S3 URI only²
InputVideo URL required (Google-hosted URI only)⁶ S3 URI only
OutputText ✅ (annotations dropped on request)¹¹ ✅ (annotations ↔ citations)¹²
RefusalContent dropped⁹ dropped⁹ dropped⁹ dropped⁹
Reasoning items ✅³ ✅⁴
Compaction items
ItemReference items

¹ Messages format uses the media type from the data URL for base64 sources; set InputFile.media_type to override. file_id is mapped to a file source for the Anthropic Files API beta; set the appropriate anthropic-beta: files-api-2025-04-14 header in your HTTP client.

² Converse format infers the document type from the filename or URL extension.

³ Messages format only emits thinking blocks that include a cryptographic signature; unsigned blocks are silently dropped.

⁴ Gemini format passes thoughtSignature through transparently on reasoning, text, and function-call parts. redacted_thinking blocks are silently skipped.

⁵ Anthropic does not support a detail field on images; it is silently dropped. file_id is mapped to a file source (Anthropic Files API beta).

⁶ Gemini supports inline base64 data or any URL. For files, set media_type or use a recognized filename/URL extension.

⁷ Chat Completions does not accept InputImage.file_id; the image_file content type is Assistants API only. Use a URL or base64 data instead; a file_id-only image is omitted.

⁸ Gemini selects streaming by endpoint (:streamGenerateContent), not a request body field, so session.stream is silently ignored when serializing to Gemini.

⁹ RefusalContent blocks are dropped silently by all request serializers so a parsed refusal can sit in session history without breaking subsequent request_payload calls. A message left empty after stripping is omitted entirely.

¹⁰ Chat Completions sends InputFile as a {"type": "file", "file": {...}} content block. file_id (Files API) and base64 data URL (with optional filename/media_type) are supported. A URL-only InputFile (non-data URL) is omitted because Chat Completions has no remote-URL form for files.

¹¹ OutputText.annotations (e.g. URL citations from web_search_options) are parsed onto the assistant message and round-trip through session history, but are dropped silently on Chat Completions and Converse request serialization since those formats have no request-side equivalent. OutputText.logprobs is likewise dropped.

¹² Messages text-block citations are parsed into OutputText.annotations and emitted back as citations when serializing Messages history. Document/tool-result citation opt-ins are still not modeled.

¹³ Converse requestMetadata only accepts string key/value pairs. This serializer stringifies scalar metadata values and silently omits nested arrays or objects.

Features Not Accessible Through Open Responses

These target API features are not available through the Open Responses canonical format because Open Responses does not expose the equivalent parameters:

Feature ChatCompletion Messages Gemini Converse
Audio input
Audio output / TTS
Output modalities selection
top_k sampling
seed
Stop sequences
Structured output JSON schema
Top-level cache_control
logit_bias
Multiple candidates (n) ✅ (request unsupported; parsing keeps only the first candidate)
Speculative decoding (prediction)
tool_choice allowed_tools shape
Custom (non-function) tool types
Strict tool definitions ✅ (VALIDATED exists, but per-tool strict is not supported by this gem)
Deprecated Chat functions / function_call / max_tokens / user fields
Built-in web search (web_search_options)
Configurable safety thresholds
Guardrail policies (guardrailConfig, guardContent)
Cross-region routing (inference profiles)
Prompt caching (cache_control markers)
Prompt caching (cachePoint blocks)
Citations on documents and tool results
additionalModelRequestFields / additionalModelResponseFieldPaths
Bedrock Prompt Management variables (promptVariables)
search_result content blocks
Audio content blocks
MCP connectors (mcp_servers)
Code execution container reuse
Geographic inference routing (inference_geo)
Beta API headers (anthropic-beta, etc.)
Built-in code execution
Built-in computer use
Built-in bash / text editor / memory tools
Built-in URL context retrieval (urlContext)
Built-in Google Maps grounding
Built-in semantic file search (fileSearch)
Context caching (cachedContent resource)
Tool-call mode VALIDATED
toolConfig.retrievalConfig / includeServerSideToolInvocations
responseJsonSchema (newer JSON Schema variant)
mediaResolution (image/video token budget)
audioTimestamp, speechConfig
enableEnhancedCivicAnswers
routingConfig / modelSelectionConfig
thinkingConfig.includeThoughts detail control ✅ (reasoning.summary: "auto" only maps to true)
thinkingConfig.thinkingLevel ✅ (use reasoning.effort)
Per-Part videoMetadata (offset/FPS)
Top-level labels (Vertex flavor)

For Messages specifically:

  • Prompt caching — Anthropic's cache_control: {"type": "ephemeral"} markers on system blocks, message content blocks, tool definitions, document blocks, and the top-level request have no Open Responses equivalent. The prompt_cache_key and prompt_cache_retention session fields are silently omitted from the Messages payload.
  • Citations — text-block citations are preserved via OutputText.annotations. The citations: {"enabled": true} opt-in on document blocks and tool results is not modeled by this gem.
  • Geographic inference routing — Anthropic's inference_geo request field has no canonical Open Responses field. Response-side usage.inference_geo is preserved in response.usage.input_tokens_details.
  • API versioning / beta headers — this gem produces no HTTP, so anthropic-version, anthropic-beta, and similar headers must be set on your HTTP client. Features behind a beta header (Files API, MCP, code execution containers, extended caching, etc.) still produce valid request payloads through this gem when their request-body shape is supported.

Chat Completions-specific notes

Request-side mappings worth calling out:

| Canonical field / value | Chat Completions mapping |
|---|---|
| instructions | leading {"role": "system", "content": ...} message (use session.developer(...) if your model prefers developer) |
| max_output_tokens | max_completion_tokens |
| safety_identifier | safety_identifier (the legacy user field is not used) |
| prompt_cache_key / prompt_cache_retention | passed through directly |
| text.format | response_format (the OR-canonical flat json_schema is reshaped into {"type": "json_schema", "json_schema": {...}}) |
| text.verbosity | top-level verbosity |
| reasoning.effort | reasoning_effort |
| tool_choice: {"type": "function", "name": ...} | {"type": "function", "function": {"name": ...}} |
| InputFile.file_id | {"type": "file", "file": {"file_id": ...}} |
| InputFile with data URL (+ optional filename/media_type) | {"type": "file", "file": {"filename": ..., "file_data": "data:<media_type>;base64,..."}} |
| InputImage with data URL (+ media_type) | {"type": "image_url", "image_url": {"url": "data:<media_type>;base64,..."}} |
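
The text.format reshaping can be sketched as follows. This illustrates the mapping described above and is not the gem's serializer code: chat_completion_response_format is a hypothetical name, and moving every key other than type under json_schema is an assumption about which keys are carried over.

```ruby
# Reshapes a flat Open Responses text.format Hash into the nested
# Chat Completions response_format shape. Other format types pass through.
def chat_completion_response_format(format)
  return format unless format["type"] == "json_schema"

  # Assumption: every key except "type" moves under "json_schema"
  nested = format.reject { |key, _value| key == "type" }
  {"type" => "json_schema", "json_schema" => nested}
end

flat = {
  "type" => "json_schema",
  "name" => "answer",
  "schema" => {"type" => "object", "properties" => {"city" => {"type" => "string"}}},
  "strict" => true
}

nested = chat_completion_response_format(flat)
puts nested["json_schema"].keys.join(", ") # => name, schema, strict
```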

Chat Completions request features with no canonical field include audio input/output (audio, modalities, input_audio content), web_search_options, custom tools, tool_choice: {"type": "allowed_tools", ...}, seed, stop, logit_bias, n, prediction, and the deprecated functions, function_call, max_tokens, and user fields. Some of these (stop, seed, logit_bias, n, prediction, web_search_options, modalities, audio) are listed as recognized session extra keys for this serializer under Recognized Extra Keys by Serializer. Use the modern canonical fields when available (tools, tool_choice, max_output_tokens, safety_identifier, prompt_cache_key).

Response-side behavior and limitations:

  • Responses with multiple choices (n > 1) parse only the first choice; additional candidates are dropped.
  • Streaming chunks (chat.completion.chunk deltas) raise UnsupportedFormatError — this gem expects a fully assembled non-streaming response body.
  • system_fingerprint is exposed on response.provider_data.
  • service_tier is populated when present on the response.
  • message.annotations (URL citations from web_search_options) are copied onto OutputText.annotations.
  • Audio responses, custom tool calls, legacy message.function_call, and unknown response content block types are silently skipped because they have no canonical representation in this gem.
  • finish_reason mappings: stop/tool_calls/function_call → completed, length → incomplete, content_filter → failed. The legacy function_call reason is included so older models still surface as completed.
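
The finish_reason mapping in the list above amounts to a small lookup. A sketch, with chat_completion_status as a hypothetical name; returning nil for reasons not listed is an assumption, not documented behavior.

```ruby
# Maps a Chat Completions finish_reason to an Open Responses status,
# following the mapping listed above.
def chat_completion_status(finish_reason)
  case finish_reason
  when "stop", "tool_calls", "function_call" then "completed"
  when "length" then "incomplete"
  when "content_filter" then "failed"
  end # Assumption: any other value yields nil
end

puts chat_completion_status("tool_calls") # => completed
puts chat_completion_status("length")     # => incomplete
```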

Anthropic Messages-specific mappings

A few features map between the canonical Open Responses format and the Messages API in non-obvious ways:

| Canonical field / value | Messages mapping |
|---|---|
| InputImage.file_id | {"type": "image", "source": {"type": "file", "file_id": ...}} (Files API beta) |
| InputFile.file_id | {"type": "document", "source": {"type": "file", "file_id": ...}} (Files API beta) |
| FunctionCallOutput.status: incomplete, failed, or error | tool_result.is_error: true |
| safety_identifier | metadata.user_id |
| parallel_tool_calls: false | tool_choice.disable_parallel_tool_use: true (forces tool_choice.type to auto if unset) |
| reasoning.budget_tokens | thinking.budget_tokens (with thinking.type defaulted to enabled) |
| reasoning.effort | output_config.effort (low, medium, high, xhigh, or max) |
| reasoning.type: "adaptive" | thinking.type: "adaptive" |
| text.format.type: "json_schema" | output_config.format (type and schema only; name, description, and strict are ignored) |
| Tools::Definition#strict | tool definition strict: true |
| tool_choice: "required" | {"type": "any"} |
| tool_choice: {"type": "function", "name": ...} | {"type": "tool", "name": ...} |

Response stop reasons are mapped to Open Responses statuses as follows: end_turn, tool_use, stop_sequence, and pause_turn → completed; max_tokens → incomplete; refusal → failed. When stop_sequence is matched, the matched text is exposed via response.incomplete_details["stop_sequence"]. Anthropic stop_details and response container metadata are exposed via response.incomplete_details. Additional usage details (cache_creation breakdown, service_tier, inference_geo, server_tool_use, and cache token counts) are surfaced through response.usage.input_tokens_details.

Built-in tool response content blocks (server_tool_use, web_search_tool_result, web_fetch_tool_result, code_execution_tool_result, bash_code_execution_tool_result, tool_search_tool_result, mcp_tool_use, mcp_tool_result, container_upload, search_result) are silently skipped on parse, since this gem has no canonical representation for them.

Gemini-specific notes

Request-side mappings worth calling out:

| Canonical field / value | Gemini mapping |
|---|---|
| instructions + system/developer messages | merged into systemInstruction.parts[] |
| max_output_tokens | generationConfig.maxOutputTokens |
| temperature / top_p | generationConfig.temperature / topP |
| presence_penalty / frequency_penalty | generationConfig.presencePenalty / frequencyPenalty |
| top_logprobs: N | generationConfig.{responseLogprobs: true, logprobs: N} |
| text.format == "text" | generationConfig.responseMimeType = "text/plain" |
| text.format == "json_object" | generationConfig.responseMimeType = "application/json" |
| text.format == "json_schema" | generationConfig.responseMimeType = "application/json" + responseSchema (read from format.json_schema.schema or format.schema) |
| reasoning.budget_tokens | generationConfig.thinkingConfig.thinkingBudget |
| tool_choice: "auto" / "none" / "required" | toolConfig.functionCallingConfig.mode = AUTO / NONE / ANY |
| tool_choice: {"type": "function", "name": ...} | mode = ANY + allowedFunctionNames: [name] |
| Tool definitions | single tools[0].functionDeclarations[] entry |
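
The tool_choice rows above can be read as the following function. It is a sketch of the mapping only: gemini_function_calling_config is a hypothetical name, and returning nil for unsupported shapes stands in for the "omitted" behavior described in this section.

```ruby
# Maps a canonical tool_choice value to a Gemini functionCallingConfig Hash,
# following the mapping table above. Unsupported shapes return nil.
def gemini_function_calling_config(tool_choice)
  case tool_choice
  when "auto" then {"mode" => "AUTO"}
  when "none" then {"mode" => "NONE"}
  when "required" then {"mode" => "ANY"}
  when Hash
    return nil unless tool_choice["type"] == "function"

    # A named function becomes ANY restricted to that single function
    {"mode" => "ANY", "allowedFunctionNames" => [tool_choice["name"]]}
  end
end

config = gemini_function_calling_config({"type" => "function", "name" => "get_weather"})
puts config["mode"]                       # => ANY
puts config["allowedFunctionNames"].first # => get_weather
```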

Content and message restrictions:

  • Assistant messages map to role: "model"; consecutive same-role turns are merged automatically.
  • InputImage, InputFile, and InputVideo are only supported in user messages; the same content on an assistant message is omitted.
  • InputImage accepts any URL, a base64 data URL (with required media_type), or a Files API file_id.
  • InputFile accepts any URL, a base64 data URL, or file_id. media_type is required when not inferable from a filename or URL extension. Recognized extensions: pdf, txt, md/markdown, html/htm, csv, json, xml, rtf.
  • InputVideo requires a URL; raw bytes are not modeled.
  • RefusalContent is dropped silently so a parsed Chat Completions refusal can sit in session history without breaking subsequent serialization.
  • Reasoning items round-trip via parts[].thought with thoughtSignature preserved. redacted_thinking, summary, and unknown reasoning block types are silently skipped.
  • FunctionCall.arguments must parse to a JSON object — Gemini's functionCall.args is a Struct.
  • FunctionCallOutput content must be text-only (InputText/OutputText or a plain string); other content types are omitted. When output parses to a JSON object it is forwarded as the functionResponse.response Struct; otherwise it is wrapped as {"result": ...}. The output must reference a prior FunctionCall so its name can be resolved (Gemini's functionResponse.name is the tool name, not the call id).
  • Compaction and ItemReference items are silently skipped.
  • tool_choice without registered tools is omitted (along with unsupported tool_choice shapes).
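
The FunctionCallOutput rule in the list above (forward a JSON object as is, wrap anything else as {"result": ...}) can be sketched like this. gemini_function_response is a hypothetical name, and wrapping the original string rather than the parsed value is an assumption.

```ruby
require "json"

# Returns the Hash to use as functionResponse.response: a parsed JSON object
# is forwarded unchanged, anything else is wrapped under "result".
def gemini_function_response(output)
  parsed = JSON.parse(output)
  parsed.is_a?(Hash) ? parsed : {"result" => output}
rescue JSON::ParserError
  # Plain text that is not valid JSON is wrapped as well
  {"result" => output}
end

puts gemini_function_response('{"temp": 22}').keys.first # => temp
puts gemini_function_response("Sunny").keys.first        # => result
```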

Response-side limitations:

  • Unknown response Part shapes (inlineData, fileData, executableCode, codeExecutionResult, videoMetadata, server-side toolCall/toolResponse, or any Part without a recognized content key) are silently skipped.
  • Only candidates[0] is parsed. When candidateCount > 1, additional candidates are dropped (their index is preserved on provider_data).
  • Function-call id from the response is dropped; the parser synthesizes gemini_call_<seed>_<n> so multiple calls in one response share a deterministic seed.
  • finishReason mappings: STOP → completed; MAX_TOKENS → incomplete; SAFETY, RECITATION, OTHER, BLOCKLIST, PROHIBITED_CONTENT, SPII, MALFORMED_FUNCTION_CALL, IMAGE_SAFETY, LANGUAGE, UNEXPECTED_TOOL_CALL, TOO_MANY_TOOL_CALLS, MODEL_ARMOR → failed. FINISH_REASON_UNSPECIFIED is treated as nil.
  • An empty candidates array combined with promptFeedback.blockReason is mapped to failed.
  • usageMetadata.cachedContentTokenCount, toolUsePromptTokenCount, promptTokensDetails, cacheTokensDetails, and toolUsePromptTokensDetails populate response.usage.input_tokens_details. thoughtsTokenCount and candidatesTokensDetails populate response.usage.output_tokens_details.
  • Response metadata with no canonical Open Responses slot is exposed on response.provider_data: groundingMetadata, citationMetadata, urlContextMetadata, urlRetrievalMetadata, safetyRatings, groundingAttributions, avgLogprobs, logprobsResult, finishMessage, candidate index, top-level createTime, and full promptFeedback.
  • Streaming chunks are not parsed; this gem expects a fully assembled non-streaming response body.

Converse-specific notes

Request-side mappings worth calling out:

| Canonical field / value | Converse mapping |
|---|---|
| instructions + system/developer messages | merged into top-level system array |
| max_output_tokens | inferenceConfig.maxTokens |
| temperature / top_p | inferenceConfig.temperature / inferenceConfig.topP |
| metadata | requestMetadata with scalar values stringified |
| service_tier | serviceTier.type |
| text.format.type == "json_schema" | outputConfig.textFormat with structure.jsonSchema |
| tool_choice: "auto" | {"auto": {}} |
| tool_choice: "required" | {"any": {}} |
| tool_choice: {"type": "function", "name": ...} | {"tool": {"name": ...}} |
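
The metadata row above (scalar values stringified, nested arrays or objects omitted) behaves roughly like this sketch. converse_request_metadata is a hypothetical name, and stringifying the keys as well is an assumption.

```ruby
# Builds a string-only Hash for requestMetadata: nested arrays and Hashes are
# dropped, remaining scalar values (and keys) are converted with to_s.
def converse_request_metadata(metadata)
  metadata
    .reject { |_key, value| value.is_a?(Hash) || value.is_a?(Array) }
    .to_h { |key, value| [key.to_s, value.to_s] }
end

metadata = {"user_id" => 123, "beta" => true, "tags" => ["a"], "nested" => {"x" => 1}}
result = converse_request_metadata(metadata)
puts result.keys.join(", ") # => user_id, beta
puts result["user_id"]      # => 123
```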

Content and message restrictions:

  • The first message must have role user; the request raises if it doesn't. Consecutive same-role messages are merged automatically.
  • system and developer messages must contain text only. Non-text content is omitted because Converse system blocks only map to text, guardContent, or cachePoint, and the latter two are not modeled by this gem.
  • InputImage requires either base64 data or an s3:// URI; content with an arbitrary public URL is omitted. file_id and detail have no Converse equivalent and are silently ignored.
  • InputFile requires either a base64 data URL or an s3:// URI; document name is auto-derived from filename / URL and sanitized to Bedrock's allowed character set ([A-Za-z0-9 \-()\[\]]{1,256}), with collisions disambiguated within a single request.
  • InputFile.file_id is silently ignored because Converse does not accept provider file IDs for documents.
  • InputVideo requires an s3:// URI; a non-S3 URL is omitted, and raw bytes (source.bytes) cannot be sent because the canonical InputVideo content type does not model byte data.
  • InputImage, InputFile, and InputVideo are only supported in user messages; the same content on an assistant message is omitted. RefusalContent, OutputText.annotations, and OutputText.logprobs are also dropped because Converse has no request-side representation for them.
  • Reasoning items are silently skipped on the request side, so multi-turn extended-thinking + tool-use loops are not supported through this serializer (the response parser does decode reasoningContent blocks back into Reasoning items, but they cannot be sent back).
  • Compaction and ItemReference items are silently skipped.
  • FunctionCall.arguments must parse to a JSON object; non-object JSON values raise (Bedrock's toolUse.input requires an object).
  • FunctionCallOutput.output content supports InputText/OutputText, InputImage, InputFile, and InputVideo, mapping to Converse text, image, document, and video tool result blocks. Converse also supports json and searchResult tool result blocks, but this gem has no canonical content types for them, so unsupported content is omitted. FunctionCallOutput.status is mapped: completed → success, failed/incomplete → error, anything else passes through or is dropped.
  • tool_choice: "none" and tool_choice without registered tools are both omitted.
  • Converse API request features not exposed by this gem include stopSequences, additionalModelRequestFields, additionalModelResponseFieldPaths, guardrailConfig / guardContent, cachePoint, Prompt Management promptVariables, audio blocks, searchResult blocks, and performanceConfig.latency.
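
The document-name handling mentioned in the InputFile item above (sanitize to [A-Za-z0-9 \-()\[\]]{1,256}, then disambiguate collisions within a request) could look something like the sketch below. It is not the gem's implementation: the helper names, the "-" replacement character, the "document" fallback, and the numeric suffix scheme are all assumptions.

```ruby
# Replaces characters outside Bedrock's allowed set with "-" and caps the
# length at 256. The replacement character and fallback name are assumptions.
def sanitize_document_name(name)
  cleaned = name.gsub(/[^A-Za-z0-9 \-()\[\]]/, "-")
  cleaned = "document" if cleaned.empty?
  cleaned[0, 256]
end

# Returns a name not yet present in `seen` (names used so far in this request),
# appending a numeric suffix on collision, and records it in `seen`.
def unique_document_name(name, seen)
  base = sanitize_document_name(name)
  candidate = base
  counter = 1
  while seen.include?(candidate)
    counter += 1
    suffix = "-#{counter}"
    # Trim the base so the suffixed name still fits within 256 characters
    candidate = base[0, 256 - suffix.length] + suffix
  end
  seen << candidate
  candidate
end

seen = []
first = unique_document_name("report.pdf", seen)
second = unique_document_name("report.pdf", seen)
puts first  # => report-pdf
puts second # => report-pdf-2
```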

Response-side limitations:

  • Unknown content block keys (e.g. citationsContent, guardContent) are silently skipped.
  • metrics.latencyMs, trace (guardrail and prompt-router trace events), additionalModelResponseFields, performanceConfig, and the raw serviceTier object are exposed on response.provider_data. response.service_tier is also populated from serviceTier.type.
  • stopReason mappings: end_turn / tool_use / stop_sequence → completed, max_tokens / model_context_window_exceeded → incomplete, guardrail_intervened / content_filtered / malformed_model_output / malformed_tool_use → failed. Unlike the Messages serializer, the matched stop sequence text is not surfaced separately because Converse does not echo it back unless you request provider-specific fields through additionalModelResponseFieldPaths, which this gem cannot emit.
  • usage.cacheReadInputTokens and usage.cacheWriteInputTokens populate response.usage.input_tokens_details["cached_tokens"] and ["cache_creation_input_tokens"]. Cache writes still require cachePoint markers in the request, which this gem cannot produce.

Installation

Add this line to your application's Gemfile:

gem "prompt_builder"

Then execute:

$ bundle

Or install it yourself as:

$ gem install prompt_builder

Contributing

Open a pull request on GitHub.

Please use the standardrb syntax and lint your code with standardrb --fix before submitting.

License

The gem is available as open source under the terms of the MIT License.