Class: PromptBuilder::Serializers::Gemini::Request

Inherits:

Base

Overview

Request serializer for the Google Gemini API format.

These session fields are not supported and are silently omitted from the serialized output:

background — Gemini has no background/async mode on the generate endpoint
include — response-field inclusion is an Open Responses-only concept
max_tool_calls — per-request tool-call caps are not supported
metadata — arbitrary metadata is not supported
parallel_tool_calls — parallel tool call control is not supported
prompt_cache_key / prompt_cache_retention — explicit prompt cache keys are not supported
safety_identifier — no equivalent user-safety field on the generate endpoint
stream_options — stream event options are not supported
truncation — server-side context truncation is not supported
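
As a rough illustration of what "silently omitted" means for the fields above, the sketch below drops them from a plain hash before serialization. The `UNSUPPORTED_FIELDS` constant and the `strip_unsupported` helper are hypothetical names introduced for this example; they are not part of PromptBuilder, and the actual serializer may work differently.

```ruby
# Hypothetical illustration only; not the PromptBuilder implementation.
# The field names are taken from the list of unsupported session fields.
UNSUPPORTED_FIELDS = %i[
  background include max_tool_calls metadata parallel_tool_calls
  prompt_cache_key prompt_cache_retention safety_identifier
  stream_options truncation
].freeze

# Returns a copy of the session hash without the unsupported fields.
# No error is raised and no warning is logged: the fields simply disappear.
def strip_unsupported(session)
  session.reject { |key, _value| UNSUPPORTED_FIELDS.include?(key.to_sym) }
end

session = {
  model: "example-model",      # placeholder model name
  temperature: 0.2,
  metadata: { trace: "abc" },  # dropped
  truncation: "auto"           # dropped
}

puts strip_unsupported(session).keys.inspect
```

Because the omission is silent, a caller that depends on one of these fields (for example `metadata` for tracing) would need to check for it before choosing this serializer.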

Input content restrictions:

InputImage content is only supported in user messages (assistant images are omitted)
InputImage with image_url accepts either base64 data or a URL; content with neither image_url nor file_id raises an error
InputFile content is only supported in user messages (assistant files are omitted)
InputFile requires media_type when file_data is provided, or a recognized extension on filename / file_url
InputVideo requires video_url
RefusalContent is dropped silently (a parsed Chat Completions refusal can stay in session history without breaking subsequent request_payload calls)
redacted_thinking and unknown reasoning blocks are silently skipped
Summary blocks on Reasoning items are skipped
FunctionCallOutput array contents must be text-only; other content is omitted
Compaction and ItemReference items are silently skipped
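
The InputFile rule above (a media_type is needed with file_data, otherwise a recognized extension on filename / file_url) could be approximated as follows. This is a minimal sketch under stated assumptions: the `EXTENSION_MEDIA_TYPES` table and the `infer_media_type` helper are hypothetical, and the set of extensions the real serializer recognizes is not documented here.

```ruby
require "uri"

# Assumed extension table for illustration; the real recognized set may differ.
EXTENSION_MEDIA_TYPES = {
  ".pdf" => "application/pdf",
  ".txt" => "text/plain",
  ".mp3" => "audio/mpeg"
}.freeze

# Looks at the filename first, then at the path component of file_url.
# Returns nil when no recognized extension is found.
def infer_media_type(filename: nil, file_url: nil)
  name = filename || (file_url && URI.parse(file_url).path)
  return nil if name.nil?

  EXTENSION_MEDIA_TYPES[File.extname(name).downcase]
end

puts infer_media_type(filename: "Report.PDF").inspect
puts infer_media_type(file_url: "https://example.com/a/b.mp3?x=1").inspect
puts infer_media_type(filename: "noext").inspect
```

Parsing the URL before taking the extension keeps query strings out of the lookup; whether the serializer does the same is an assumption.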

The following Gemini parameters cannot be set directly through the Open Responses canonical format (where a canonical equivalent exists, it is noted):

thinkingConfig.thinkingBudget — use reasoning.budget_tokens instead
thinkingConfig.thinkingLevel — use reasoning.effort instead
thinkingConfig.includeThoughts — use reasoning.summary = "auto" instead
topK — top-K sampling parameter
seed — for reproducible outputs (model-dependent)
stopSequences — custom stop sequences
candidateCount — requesting multiple response candidates
responseModalities — selecting TEXT/IMAGE/AUDIO output channels
responseJsonSchema — newer JSON Schema variant of responseSchema
safetySettings — configurable harm-category safety thresholds
mediaResolution — controls token cost of image/video inputs
audioTimestamp, speechConfig — audio output configuration
enableEnhancedCivicAnswers
routingConfig, modelSelectionConfig
Video metadata controls (videoMetadata offset, FPS) on Parts
Audio input model capability controls (audio can be sent as InputFile with an audio MIME type)
Built-in Gemini tools: googleSearch / googleSearchRetrieval, codeExecution, urlContext, computerUse, fileSearch, googleMaps, mcpServers
functionCallingConfig.mode = VALIDATED
toolConfig.retrievalConfig, includeServerSideToolInvocations
cachedContent — referencing a CachedContent resource by name
Top-level labels (Vertex flavor only)
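
The three thinkingConfig entries at the top of this list name canonical reasoning fields to use instead. The sketch below shows one way such a mapping might look. The `thinking_config_from` helper is hypothetical, and any value translation the real serializer performs (for example on effort levels) is not shown.

```ruby
# Hypothetical mapping sketch; not the PromptBuilder implementation.
# Canonical field                Gemini field
# reasoning.budget_tokens    ->  thinkingConfig.thinkingBudget
# reasoning.effort           ->  thinkingConfig.thinkingLevel
# reasoning.summary = "auto" ->  thinkingConfig.includeThoughts
def thinking_config_from(reasoning)
  config = {}
  config[:thinkingBudget]  = reasoning[:budget_tokens] if reasoning[:budget_tokens]
  config[:thinkingLevel]   = reasoning[:effort]        if reasoning[:effort]
  config[:includeThoughts] = true                      if reasoning[:summary] == "auto"
  config
end

puts thinking_config_from({ effort: "low", summary: "auto" }).inspect
puts thinking_config_from({ budget_tokens: 1024 }).inspect
```

The remaining parameters in the list have no canonical counterpart noted, so the sketch leaves them out.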

SUPPORTED_REASONING_EFFORTS =

%w[minimal low medium high].freeze

SUPPORTED_REASONING_SUMMARIES =

%w[auto].freeze

SUPPORTED_SERVICE_TIERS =

%w[unspecified standard flex priority].freeze
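
A sketch of how these constants might be used to check a value before building a request. The constants are re-declared locally so the example runs on its own; the `supported?` helper is hypothetical, and how the serializer reacts to an unsupported value (raising, omitting, or falling back) is not stated here.

```ruby
# Local copies of the documented constants so this example is self-contained.
SUPPORTED_REASONING_EFFORTS   = %w[minimal low medium high].freeze
SUPPORTED_REASONING_SUMMARIES = %w[auto].freeze
SUPPORTED_SERVICE_TIERS       = %w[unspecified standard flex priority].freeze

# Hypothetical helper: true when the value (string or symbol) is in the list.
def supported?(value, allowed)
  allowed.include?(value.to_s)
end

puts supported?("low", SUPPORTED_REASONING_EFFORTS)
puts supported?("detailed", SUPPORTED_REASONING_SUMMARIES)
puts supported?(:flex, SUPPORTED_SERVICE_TIERS)
```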