Class: Legion::LLM::API::ClientTranslators::OpenAIResponses

Inherits: Object

Extended by: Legion::Logging::Helper

Includes: SharedExtractors, Legion::Logging::Helper

Defined in: lib/legion/llm/api/client_translators/openai_responses.rb

Overview

OpenAI /v1/responses (Responses API) client translator.

Per Phase 5:

- parse_request(body, env) -> Canonical::Request (handles
  input/instructions/reasoning/tools shapes specific to /v1/responses)
- format_response(canonical_or_pipeline_response) -> Hash with output[]
  containing {thinking, function_call, message} items
- format_error(error, status_code:, type:)
- events_emitter(out, ...) -> Events emitter conforming to the
  StreamAssembler contract.

Defined Under Namespace

Classes: Events

Constant Summary

Canonical = Legion::Extensions::Llm::Canonical

Instance Method Summary

#build_inference_request(canonical_request, request_id:, server_caller:, modality: nil) ⇒ Object
#build_tool_choice(raw) ⇒ Object

OpenAI Responses tool_choice shapes: "auto" / "none" / "required" → corresponding symbol; {type: 'function', name: 'X'} → {type: :function, name: 'X'}; {type: 'function', function: {name: 'X'}} → flattened to {type: :function, name: 'X'}.
#ensure_reasoning_summary(body) ⇒ Object

When the caller asks for reasoning (`reasoning.effort` set) but didn't pin a summary mode, default to `summary: 'auto'`.
#events_emitter(out, request_id:, model:, conv_id: nil) ⇒ Object
#format_chunk(canonical_chunk) ⇒ Object

Format a single Canonical::Chunk to a /v1/responses SSE event hash.
#format_error(error, status_code: 500, type: 'server_error') ⇒ Object
#format_response(pipeline_response, model:, request_id:) ⇒ Object
#format_tool_call_delta_chunk(canonical_chunk) ⇒ Object

G24 — server-executed tool chunks surface as completed function_call output items so the client sees the call name AND result inline.
#g24_format ⇒ Object

G24 — declares which execution-proxy contract shape this translator surfaces.
#parse_request(body, env = {}) ⇒ Object
#server_tool_chunk?(tool_call) ⇒ Boolean

Methods included from SharedExtractors

#args_as_json_string, #args_as_object, #extract_content_text, #extract_thinking_text, #legion_routing_explicit_from_env, #legion_routing_from_env, #text_content_type?, #token_value

Instance Method Details

#build_inference_request(canonical_request, request_id:, server_caller:, modality: nil) ⇒ `Object`

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 86

def build_inference_request(canonical_request, request_id:, server_caller:, modality: nil)
  tool_defs = build_tool_definitions(canonical_request.tools)

  extra = {}
  tier = canonical_request.metadata[:tier]
  extra[:tier] = tier.to_sym if tier
  routing_explicit = canonical_request.metadata[:routing_explicit]
  extra[:routing_explicit] = routing_explicit if routing_explicit

  messages = inference_messages(canonical_request.messages)

  Legion::LLM::Inference::Request.build(
    id:              request_id,
    messages:        messages,
    system:          canonical_request.system,
    routing:         canonical_request.routing,
    tools:           tool_defs,
    tool_choice:     canonical_request.tool_choice,
    caller:          server_caller,
    conversation_id: canonical_request.conversation_id,
    stream:          canonical_request.stream == true,
    modality:        modality,
    thinking:        thinking_to_inference(canonical_request.thinking),
    cache:           { strategy: :default, cacheable: true },
    extra:           extra,
    metadata:        canonical_request.metadata.except(:upstream_body)
  )
end

#build_tool_choice(raw) ⇒ `Object`

OpenAI Responses tool_choice shapes:

"auto" / "none" / "required"                 → corresponding symbol
{type: 'function', name: 'X'}                → {type: :function, name: 'X'}
{type: 'function', function: {name: 'X'}}    → flattened to {type: :function, name: 'X'}

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 119

def build_tool_choice(raw)
  return nil if raw.nil?

  case raw
  when Hash
    symbolized = raw.transform_keys(&:to_sym)
    type = symbolized[:type].to_s
    if type == 'function'
      fn = symbolized[:function].is_a?(Hash) ? symbolized[:function].transform_keys(&:to_sym) : nil
      name = symbolized[:name] || fn&.[](:name)
      return { type: :function, name: name.to_s } if name

      symbolized
    else
      %w[auto none required any].include?(type) ? type.to_sym : symbolized
    end
  when String, Symbol
    raw.to_sym
  end
end
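
The mapping above can be exercised outside the class. The following is a minimal sketch that copies the method body into a free-standing function; the name `normalize_tool_choice` and the tool name `lookup` are hypothetical and used only for illustration.

```ruby
# Free-standing copy of the normalization logic shown above (illustration only).
def normalize_tool_choice(raw)
  case raw
  when Hash
    symbolized = raw.transform_keys(&:to_sym)
    type = symbolized[:type].to_s
    if type == 'function'
      fn = symbolized[:function].is_a?(Hash) ? symbolized[:function].transform_keys(&:to_sym) : nil
      name = symbolized[:name] || fn&.[](:name)
      return { type: :function, name: name.to_s } if name

      symbolized
    else
      %w[auto none required any].include?(type) ? type.to_sym : symbolized
    end
  when String, Symbol
    raw.to_sym
  end
end

# Bare strings become symbols.
mode = normalize_tool_choice('auto') # => :auto

# The flat shape and the nested `function: {name:}` shape normalize identically.
flat   = normalize_tool_choice({ 'type' => 'function', 'name' => 'lookup' })
nested = normalize_tool_choice({ type: 'function', function: { 'name' => 'lookup' } })

# A hash that only carries a known mode collapses to that mode's symbol.
required = normalize_tool_choice({ type: 'required' }) # => :required

# nil falls through the case statement and stays nil.
missing = normalize_tool_choice(nil) # => nil
```

Note that the normalized `type` is the symbol `:function`, while `name` is always a String.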

#ensure_reasoning_summary(body) ⇒ `Object`

When the caller asks for reasoning (`reasoning.effort` set) but didn't pin a summary mode, default to `summary: 'auto'`. OpenAI's /v1/responses lane omits reasoning summary content unless the request opts in. Without this default, codex→openai cells return only the message item (no reasoning), and the e2e validator reports "reasoning never produced" (P5-final-cells.md B3).

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 75

def ensure_reasoning_summary(body)
  reasoning = body[:reasoning]
  return body unless reasoning.is_a?(Hash)

  normalized = reasoning.transform_keys { |k| k.respond_to?(:to_sym) ? k.to_sym : k }
  return body.merge(reasoning: normalized) if normalized.key?(:summary)
  return body unless normalized[:effort]

  body.merge(reasoning: normalized.merge(summary: 'auto'))
end
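
To make the three branches concrete, here is a small sketch that runs the same logic as a free-standing function. It assumes the body has already been symbolized at the top level, as `parse_request` does before calling this method; the model name is a placeholder.

```ruby
# Free-standing copy of the method above (illustration only).
def ensure_reasoning_summary(body)
  reasoning = body[:reasoning]
  return body unless reasoning.is_a?(Hash)

  normalized = reasoning.transform_keys { |k| k.respond_to?(:to_sym) ? k.to_sym : k }
  return body.merge(reasoning: normalized) if normalized.key?(:summary)
  return body unless normalized[:effort]

  body.merge(reasoning: normalized.merge(summary: 'auto'))
end

# Effort set, no summary: the default is injected.
defaulted = ensure_reasoning_summary({ model: 'example-model', reasoning: { 'effort' => 'high' } })
# defaulted[:reasoning] == { effort: 'high', summary: 'auto' }

# Summary already pinned: the caller's choice is kept (keys are symbolized).
pinned = ensure_reasoning_summary({ reasoning: { 'effort' => 'low', 'summary' => 'detailed' } })
# pinned[:reasoning] == { effort: 'low', summary: 'detailed' }

# No reasoning block at all: the body is returned untouched.
plain = ensure_reasoning_summary({ model: 'example-model' })
# plain == { model: 'example-model' }
```

The method never mutates its argument; each branch either returns the original hash or a merged copy.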

#events_emitter(out, request_id:, model:, conv_id: nil) ⇒ `Object`




# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 189

def events_emitter(out, request_id:, model:, conv_id: nil)
  Events.new(out: out, request_id: request_id, model: model.to_s, conv_id: conv_id)
end

#format_chunk(canonical_chunk) ⇒ `Object`

Format a single Canonical::Chunk to a /v1/responses SSE event hash.

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 194

def format_chunk(canonical_chunk)
  return nil if canonical_chunk.nil?

  case canonical_chunk.type
  when :text_delta
    { type: 'response.output_text.delta', delta: extract_content_text(canonical_chunk.delta),
      output_index: canonical_chunk.block_index || 0 }
  when :thinking_delta
    { type: 'response.thinking.delta', delta: canonical_chunk.delta.to_s,
      output_index: canonical_chunk.block_index || 0 }
  when :tool_call_delta
    format_tool_call_delta_chunk(canonical_chunk)
  when :done
    { type: 'response.completed' }
  end
end
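
The hashes returned by `format_chunk` still have to be written to the wire. The sketch below shows one conventional way to frame such a hash as a Server-Sent Event. It is an assumption about how a caller might serialize the hash, not a description of what the `Events` class does; `sse_frame` is a hypothetical helper.

```ruby
require 'json'

# Hypothetical helper: frame an event hash as a Server-Sent Event.
# format_chunk returns nil for a nil chunk or an unrecognized chunk type,
# so nil is passed through and the caller can skip it.
def sse_frame(event)
  return nil if event.nil?

  "event: #{event[:type]}\ndata: #{JSON.generate(event)}\n\n"
end

text_event = { type: 'response.output_text.delta', delta: 'Hel', output_index: 0 }
frame = sse_frame(text_event)
# frame is:
#   event: response.output_text.delta
#   data: {"type":"response.output_text.delta","delta":"Hel","output_index":0}
#   (followed by a blank line)

skipped = sse_frame(nil) # => nil
```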

#format_error(error, status_code: 500, type: 'server_error') ⇒ `Object`




# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 185

def format_error(error, status_code: 500, type: 'server_error')
  [status_code, { error: { message: error.respond_to?(:message) ? error.message : error.to_s, type: type } }]
end
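
The return value is a two-element array of status code and payload hash. The sketch below copies the method into a free-standing function to show both accepted inputs, an exception and a plain string; the error messages and the `invalid_request_error` type are example values.

```ruby
require 'json'

# Free-standing copy of the method above (illustration only).
def format_error(error, status_code: 500, type: 'server_error')
  [status_code, { error: { message: error.respond_to?(:message) ? error.message : error.to_s, type: type } }]
end

# An exception contributes its #message.
status, payload = format_error(ArgumentError.new('input is required'),
                               status_code: 400, type: 'invalid_request_error')
body_json = JSON.generate(payload)
# status    == 400
# body_json == '{"error":{"message":"input is required","type":"invalid_request_error"}}'

# Anything without #message is stringified; the defaults apply.
default_status, default_payload = format_error('upstream timed out')
# default_status  == 500
# default_payload == { error: { message: 'upstream timed out', type: 'server_error' } }
```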

#format_response(pipeline_response, model:, request_id:) ⇒ `Object`

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 140

def format_response(pipeline_response, model:, request_id:)
  routing = pipeline_response.respond_to?(:routing) ? pipeline_response.routing || {} : {}
  tokens = pipeline_response.respond_to?(:tokens) ? pipeline_response.tokens || {} : {}
  raw_msg = pipeline_response.respond_to?(:message) ? pipeline_response.message : nil
  content = extract_content_text(raw_msg)
  resolved_model = (routing[:model] || routing['model'] || model).to_s

  actionable_tool_calls = build_output_tool_calls(pipeline_response)
  server_tool_items = build_output_server_tool_items(pipeline_response)
  reasoning = build_output_reasoning(pipeline_response)

  output = [
    *reasoning,
    *server_tool_items,
    *actionable_tool_calls,
    {
      type:    'message',
      id:      "msg_#{SecureRandom.hex(12)}",
      role:    'assistant',
      content: [{ type: 'output_text', text: content }],
      status:  'completed'
    }
  ]

  # G24 — server-executed tools are completed non-actionable items.
  # `requires_action` is reserved for client-callable tools awaiting
  # client execution. When ALL tool calls are server-resolved, the
  # response status is `completed` and there is no action_required.
  status = actionable_tool_calls.any? ? 'requires_action' : 'completed'

  result = {
    id:         request_id,
    object:     'response',
    created_at: Time.now.to_i,
    model:      resolved_model,
    output:     output,
    usage:      build_usage(tokens),
    status:     status
  }

  result[:action_required] = { type: 'function_calls', function_calls: actionable_tool_calls } if actionable_tool_calls.any?

  result
end
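
From the consumer side, the hash built above can be read as follows. This is a sketch under stated assumptions: the sample response is hand-written, the `function_call` item shape is assumed to match the one emitted by `format_tool_call_delta_chunk`, and `output_text` and `pending_calls` are hypothetical helper names.

```ruby
# Hand-written sample in the shape produced by format_response (values are made up).
call = { type: 'function_call', call_id: 'call_1', name: 'get_weather', arguments: '{"city":"Oslo"}' }
response = {
  id:              'resp_example',
  object:          'response',
  status:          'requires_action',
  output:          [
    call,
    { type: 'message', role: 'assistant', status: 'completed',
      content: [{ type: 'output_text', text: 'Checking the weather.' }] }
  ],
  action_required: { type: 'function_calls', function_calls: [call] }
}

# Concatenate the text of every output_text part in every message item.
def output_text(response)
  response[:output]
    .select { |item| item[:type] == 'message' }
    .flat_map { |item| item[:content] }
    .select { |part| part[:type] == 'output_text' }
    .map { |part| part[:text] }
    .join
end

# Client-callable tool calls are only present when status is 'requires_action'.
def pending_calls(response)
  return [] unless response[:status] == 'requires_action'

  response.dig(:action_required, :function_calls) || []
end

text  = output_text(response)                        # => "Checking the weather."
names = pending_calls(response).map { |c| c[:name] } # => ["get_weather"]
```

Server-executed tool items never appear in `action_required`, so a response whose tools were all resolved server-side yields an empty list here.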

#format_tool_call_delta_chunk(canonical_chunk) ⇒ `Object`

G24 — server-executed tool chunks surface as completed function_call output items so the client sees the call name AND result inline. Plain client-callable chunks remain plain function_call_arguments.delta events.

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 215

def format_tool_call_delta_chunk(canonical_chunk)
  tc = canonical_chunk.tool_call
  args = tc.respond_to?(:arguments) ? tc.arguments : {}
  output_index = canonical_chunk.block_index || 0

  if server_tool_chunk?(tc)
    {
      type:         'response.output_item.done',
      output_index: output_index,
      item:         {
        type:      'function_call',
        id:        "fc_#{SecureRandom.hex(12)}",
        call_id:   tc.respond_to?(:id) ? tc.id : nil,
        name:      tc.respond_to?(:name) ? tc.name.to_s : '',
        arguments: args_as_json_string(args),
        status:    'completed'
      }
    }
  else
    { type:         'response.function_call_arguments.delta',
      output_index: output_index,
      delta:        args_as_json_string(args) }
  end
end

#g24_format ⇒ `Object`

G24 — declares which execution-proxy contract shape this translator surfaces. Consumed by the lex-llm conformance shared examples.




# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 34

def g24_format
  :openai_responses
end

#parse_request(body, env = {}) ⇒ `Object`

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 38

def parse_request(body, env = {})
  log.debug('[llm][client_translator][openai_responses] action=parse_request')
  body = symbolize(body)

  messages = build_messages(body[:input], body[:instructions])
  tools = build_tools(body[:tools])
  params = build_params(body)
  thinking = build_thinking(body[:reasoning])
  tool_choice = build_tool_choice(body[:tool_choice])
  upstream_body = ensure_reasoning_summary(body)

  Canonical::Request.build(
    id:              env['HTTP_X_CLIENT_REQUEST_ID'] || "resp_#{SecureRandom.hex(16)}",
    messages:        messages,
    tools:           tools,
    tool_choice:     tool_choice,
    params:          params,
    thinking:        thinking,
    stream:          body[:stream] == true,
    conversation_id: env['HTTP_X_LEGION_CONVERSATION_ID'] || env['HTTP_THREAD_ID'] || body[:conversation],
    routing:         legion_routing_from_env(env),
    metadata:        {
      client_model:     body[:model],
      tier:             env['HTTP_X_LEGION_TIER'],
      routing_explicit: legion_routing_explicit_from_env(env),
      external_refs:    external_refs(body, env),
      upstream_body:    upstream_body # preserved for native call_responses path until executor is canonical
    }.compact
  )
end
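
The identifier fallbacks in `parse_request` can be isolated into a short sketch. The `env` keys follow the Rack convention in which an `X-Client-Request-Id` request header arrives as `HTTP_X_CLIENT_REQUEST_ID`; the helper names `request_id_for` and `conversation_id_for` are hypothetical.

```ruby
require 'securerandom'

# Client-supplied request id wins; otherwise a random "resp_" id is generated.
def request_id_for(env)
  env['HTTP_X_CLIENT_REQUEST_ID'] || "resp_#{SecureRandom.hex(16)}"
end

# Precedence: Legion header, then Thread-Id header, then the body's :conversation.
def conversation_id_for(body, env)
  env['HTTP_X_LEGION_CONVERSATION_ID'] || env['HTTP_THREAD_ID'] || body[:conversation]
end

explicit_id  = request_id_for({ 'HTTP_X_CLIENT_REQUEST_ID' => 'req-123' }) # => "req-123"
generated_id = request_id_for({}) # "resp_" followed by 32 hex characters

from_header = conversation_id_for({ conversation: 'conv-body' }, { 'HTTP_THREAD_ID' => 'conv-thread' })
# => "conv-thread"
from_body = conversation_id_for({ conversation: 'conv-body' }, {})
# => "conv-body"

# Hash#compact drops nil entries, which is why an absent tier header
# leaves no :tier key in the metadata.
metadata = { client_model: 'example-model', tier: nil }.compact
# => { client_model: 'example-model' }
```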

#server_tool_chunk?(tool_call) ⇒ `Boolean`

Returns:

(Boolean)

# File 'lib/legion/llm/api/client_translators/openai_responses.rb', line 240

def server_tool_chunk?(tool_call)
  source = tool_call.respond_to?(:source) ? tool_call.source : nil
  return false if source.nil?

  type = source.is_a?(Hash) ? (source[:type] || source['type']) : source
  %i[special registry extension mcp].include?(type&.to_sym)
end
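
A short sketch of how the predicate classifies different `source` values. The `ToolCall` struct is a hypothetical stand-in for whatever object the pipeline passes in; only its `source` reader matters here.

```ruby
# Hypothetical stand-in for the real tool-call object.
ToolCall = Struct.new(:name, :source, keyword_init: true)

# Free-standing copy of the method above (illustration only).
def server_tool_chunk?(tool_call)
  source = tool_call.respond_to?(:source) ? tool_call.source : nil
  return false if source.nil?

  type = source.is_a?(Hash) ? (source[:type] || source['type']) : source
  %i[special registry extension mcp].include?(type&.to_sym)
end

# Hash sources are read with either symbol or string keys.
server_tool_chunk?(ToolCall.new(name: 'search', source: { 'type' => 'mcp' }))  # => true
server_tool_chunk?(ToolCall.new(name: 'search', source: { type: 'client' }))   # => false

# A bare symbol or string source is treated as the type itself.
server_tool_chunk?(ToolCall.new(name: 'search', source: :registry))            # => true

# A missing source, or an object without #source, is never a server tool.
server_tool_chunk?(ToolCall.new(name: 'search'))                               # => false
server_tool_chunk?(Object.new)                                                 # => false
```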
