Module: Legion::LLM::API::DebugFormats

Extended by:: Legion::Logging::Helper

Defined in:: lib/legion/llm/api/debug_formats.rb

Overview

Shared X-Legion-Format / X-Legion-Debug surface for the three client routes (G21).

Two opt-in debug modes, both gated by ‘llm.api.debug_formats.enabled` (default: true in lite/dev, false otherwise — the envelope leaks routing/escalation internals):

X-Legion-Format: canonical
  Run the full pipeline but skip the client-format translation.
  Sync: return Canonical::Response#to_h + contract version.
  Streaming: emit each canonical chunk as `data: <chunk-json>\n\n`
  followed by `data: [DONE]\n\n` — same envelope across all three
  routes (no Anthropic-style typed events, no /v1/responses
  sequence_number ceremony) so the canonical layer is its own
  bisection point.

X-Legion-Debug: echo-request
  The parsed Canonical::Request#to_h is folded into the response
  metadata under `_legion_debug.echo_request`. Combined with
  X-Legion-Format: canonical, this is the equivalence-invariant
  check — the same semantic payload sent to /v1/messages and
  /v1/responses must echo back IDENTICAL Canonical::Request hashes.

Modes are independent: ‘format=canonical` works without the echo, and `echo-request` works on a normal client-format response.

Defined Under Namespace

Classes: CanonicalEvents

Constant Summary collapse

Canonical =

Legion::Extensions::Llm::Canonical

FORMAT_HEADER =

'HTTP_X_LEGION_FORMAT'

DEBUG_HEADER =

'HTTP_X_LEGION_DEBUG'

FORMAT_CANONICAL =

'canonical'

DEBUG_ECHO_REQUEST =

'echo-request'

Class Method Summary collapse

.attach_echo_request(client_format_body, canonical_request) ⇒ Object

Sync: fold the parsed canonical request into a client-format response’s metadata.
.canonical_event_emitter(out) ⇒ Object

Streaming canonical SSE — emit each chunk via the assembler-equivalent stream interface, then a final ‘data: [DONE]nn`.
.canonical_format?(env) ⇒ Boolean
.canonical_stop_reason(pipeline_response, tool_calls) ⇒ Object
.canonical_thinking(value) ⇒ Object
.canonical_tool_call(tool_call) ⇒ Object
.canonical_usage(tokens, _pipeline_response) ⇒ Object
.canonicalize_response(pipeline_response) ⇒ Object

Convert an Inference::Response (the executor’s envelope) into a Canonical::Response (the provider-boundary contract).
.echo_request?(env) ⇒ Boolean
.emit_echo_request_sse(out, canonical_request) ⇒ Object

Streaming: emit a one-shot SSE event with the canonical request echo so the client can correlate the two endpoints.
.enabled? ⇒ Boolean

Settings dig — debug surface enabled?.
.render_canonical_response(pipeline_response, canonical_request:, env:) ⇒ Object

Render a non-streaming canonical response as JSON.
.sanitize_routing(routing) ⇒ Object
.token_value(tokens, *keys) ⇒ Object

Class Method Details

.attach_echo_request(client_format_body, canonical_request) ⇒ `Object`

Sync: fold the parsed canonical request into a client-format response’s metadata. Mutates a copy of the body to keep callers simple. Returns the merged hash.

# File 'lib/legion/llm/api/debug_formats.rb', line 86

def self.attach_echo_request(client_format_body, canonical_request)
  {
    **client_format_body,
    _legion_debug: { echo_request: canonical_request.to_h }
  }
end

.canonical_event_emitter(out) ⇒ `Object`

Streaming canonical SSE — emit each chunk via the assembler-equivalent stream interface, then a final ‘data: [DONE]nn`. Same envelope on every route.



79
80
81

# File 'lib/legion/llm/api/debug_formats.rb', line 79

def self.canonical_event_emitter(out)
  CanonicalEvents.new(out)
end

.canonical_format?(env) ⇒ `Boolean`

Returns:

(Boolean)



52
53
54

# File 'lib/legion/llm/api/debug_formats.rb', line 52

def self.canonical_format?(env)
  enabled? && env[FORMAT_HEADER].to_s.downcase == FORMAT_CANONICAL
end

.canonical_stop_reason(pipeline_response, tool_calls) ⇒ `Object`

# File 'lib/legion/llm/api/debug_formats.rb', line 318

def self.canonical_stop_reason(pipeline_response, tool_calls)
  return :tool_use if tool_calls.any? { |tc| tc.source != :special && tc.source != :registry && tc.source != :extension && tc.source != :mcp && tc.result.nil? }

  stop = pipeline_response.respond_to?(:stop) ? pipeline_response.stop : nil
  reason = stop.is_a?(Hash) ? (stop[:reason] || stop['reason']) : stop
  sym = reason&.to_sym
  return sym if Canonical::Response::STOP_REASONS.include?(sym)

  :end_turn
end

.canonical_thinking(value) ⇒ `Object`

# File 'lib/legion/llm/api/debug_formats.rb', line 263

def self.canonical_thinking(value)
  return nil if value.nil?

  if value.is_a?(Hash)
    normalized = value.transform_keys { |k| k.respond_to?(:to_sym) ? k.to_sym : k }
    content = (normalized[:content] || normalized[:text] || normalized[:thinking]).to_s
    signature = normalized[:signature].to_s
  else
    content = value.respond_to?(:content) ? value.content.to_s : value.to_s
    signature = value.respond_to?(:signature) ? value.signature.to_s : ''
  end
  return nil if content.empty? && signature.empty?

  Canonical::Thinking.from_hash(content: content, signature: signature)
end

.canonical_tool_call(tool_call) ⇒ `Object`

# File 'lib/legion/llm/api/debug_formats.rb', line 279

def self.canonical_tool_call(tool_call)
  return tool_call if tool_call.is_a?(Canonical::ToolCall)

  h = tool_call.respond_to?(:to_h) && !tool_call.is_a?(Hash) ? tool_call.to_h : tool_call
  h = if h.is_a?(Hash)
        h.transform_keys { |k| k.respond_to?(:to_sym) ? k.to_sym : k }
      else
        {}
      end

  source = h[:source]
  source_sym = if source.is_a?(Hash)
                 (source[:type] || source['type'])&.to_sym
               else
                 source&.to_sym
               end

  Canonical::ToolCall.build(
    id:        h[:id],
    name:      h[:name].to_s,
    arguments: h[:arguments] || {},
    source:    source_sym,
    status:    h[:status],
    result:    h[:result]
  )
end

.canonical_usage(tokens, _pipeline_response) ⇒ `Object`

# File 'lib/legion/llm/api/debug_formats.rb', line 306

def self.canonical_usage(tokens, _pipeline_response)
  return nil if tokens.nil? || (tokens.respond_to?(:empty?) && tokens.empty?)

  Canonical::Usage.from_hash(
    input_tokens:       token_value(tokens, :input, :input_tokens) || 0,
    output_tokens:      token_value(tokens, :output, :output_tokens) || 0,
    cache_read_tokens:  token_value(tokens, :cache_read, :cache_read_tokens) || 0,
    cache_write_tokens: token_value(tokens, :cache_write, :cache_write_tokens) || 0,
    thinking_tokens:    token_value(tokens, :thinking, :thinking_tokens) || 0
  )
end

.canonicalize_response(pipeline_response) ⇒ `Object`

Convert an Inference::Response (the executor’s envelope) into a Canonical::Response (the provider-boundary contract). Inference::Response carries the executor envelope — text, tool_calls, thinking, usage live on it; Canonical::Response is the projection.

# File 'lib/legion/llm/api/debug_formats.rb', line 106

def self.canonicalize_response(pipeline_response)
  message = pipeline_response.respond_to?(:message) ? pipeline_response.message : nil
  text = if message.is_a?(Hash)
           (message[:content] || message['content']).to_s
         else
           message.to_s
         end

  tokens = pipeline_response.respond_to?(:tokens) ? pipeline_response.tokens || {} : {}
  usage = canonical_usage(tokens, pipeline_response)

  tool_calls = (pipeline_response.respond_to?(:tools) ? Array(pipeline_response.tools) : []).map do |tc|
    canonical_tool_call(tc)
  end

  thinking = canonical_thinking(pipeline_response.respond_to?(:thinking) ? pipeline_response.thinking : nil)
  stop_reason = canonical_stop_reason(pipeline_response, tool_calls)
  routing = pipeline_response.respond_to?(:routing) ? pipeline_response.routing || {} : {}
  model = (routing[:model] || routing['model']).to_s

  metadata = {}
  metadata[:request_id] = pipeline_response.request_id if pipeline_response.respond_to?(:request_id)
  metadata[:conversation_id] = pipeline_response.conversation_id if pipeline_response.respond_to?(:conversation_id)
  metadata[:routing] = sanitize_routing(routing) if routing.any?

  Canonical::Response.build(
    text:        text,
    thinking:    thinking,
    tool_calls:  tool_calls,
    usage:       usage,
    stop_reason: stop_reason,
    model:       model,
    routing:     sanitize_routing(routing),
    metadata:    metadata
  )
end

.echo_request?(env) ⇒ `Boolean`

Returns:

(Boolean)



56
57
58

# File 'lib/legion/llm/api/debug_formats.rb', line 56

def self.echo_request?(env)
  enabled? && env[DEBUG_HEADER].to_s.downcase == DEBUG_ECHO_REQUEST
end

.emit_echo_request_sse(out, canonical_request) ⇒ `Object`

Streaming: emit a one-shot SSE event with the canonical request echo so the client can correlate the two endpoints. Best-effort; non-fatal on write failure.

# File 'lib/legion/llm/api/debug_formats.rb', line 96

def self.emit_echo_request_sse(out, canonical_request)
  out << "event: legion.debug.echo_request\ndata: #{Legion::JSON.dump(canonical_request.to_h)}\n\n"
rescue IOError, Errno::EPIPE
  nil
end

.enabled? ⇒ `Boolean`

Settings dig — debug surface enabled?

Returns:

(Boolean)

# File 'lib/legion/llm/api/debug_formats.rb', line 47

def self.enabled?
  settings = Legion::Settings[:llm][:api][:debug_formats]
  settings.is_a?(Hash) ? settings[:enabled] == true : false
end

.render_canonical_response(pipeline_response, canonical_request:, env:) ⇒ `Object`

Render a non-streaming canonical response as JSON. Returns

status, headers, body_string: suitable for use in a Sinatra action.

# File 'lib/legion/llm/api/debug_formats.rb', line 62

def self.render_canonical_response(pipeline_response, canonical_request:, env:)
  canonical_response = canonicalize_response(pipeline_response)
  payload = {
    object:           'canonical.response',
    contract_version: Canonical::CONTRACT_VERSION,
    response:         canonical_response.to_h
  }
  payload[:_legion_debug] = { echo_request: canonical_request.to_h } if echo_request?(env)
  [200,
   { 'Content-Type'              => 'application/json',
     'X-Legion-Contract-Version' => Canonical::CONTRACT_VERSION },
   Legion::JSON.dump(payload)]
end

.sanitize_routing(routing) ⇒ `Object`

# File 'lib/legion/llm/api/debug_formats.rb', line 329

def self.sanitize_routing(routing)
  return {} if routing.nil? || routing.empty?

  {
    provider: routing[:provider] || routing['provider'],
    model:    routing[:model]    || routing['model'],
    tier:     routing[:tier]     || routing['tier'],
    instance: routing[:instance] || routing['instance']
  }.compact
end

.token_value(tokens, *keys) ⇒ `Object`

# File 'lib/legion/llm/api/debug_formats.rb', line 340

def self.token_value(tokens, *keys)
  return nil if tokens.nil?

  keys.each do |key|
    value = if tokens.is_a?(Hash)
              tokens[key] || tokens[key.to_s]
            elsif tokens.respond_to?(key)
              tokens.public_send(key)
            end
    return value.to_i unless value.nil?
  end
  nil
end

Module: Legion::LLM::API::DebugFormats

Overview

Defined Under Namespace

Constant Summary collapse

Class Method Summary collapse

Class Method Details

.attach_echo_request(client_format_body, canonical_request) ⇒ Object

.canonical_event_emitter(out) ⇒ Object

.canonical_format?(env) ⇒ Boolean

.canonical_stop_reason(pipeline_response, tool_calls) ⇒ Object

.canonical_thinking(value) ⇒ Object

.canonical_tool_call(tool_call) ⇒ Object

.canonical_usage(tokens, _pipeline_response) ⇒ Object

.canonicalize_response(pipeline_response) ⇒ Object

.echo_request?(env) ⇒ Boolean

.emit_echo_request_sse(out, canonical_request) ⇒ Object

.enabled? ⇒ Boolean

.render_canonical_response(pipeline_response, canonical_request:, env:) ⇒ Object

.sanitize_routing(routing) ⇒ Object

.token_value(tokens, *keys) ⇒ Object

.attach_echo_request(client_format_body, canonical_request) ⇒ `Object`

.canonical_event_emitter(out) ⇒ `Object`

.canonical_format?(env) ⇒ `Boolean`

.canonical_stop_reason(pipeline_response, tool_calls) ⇒ `Object`

.canonical_thinking(value) ⇒ `Object`

.canonical_tool_call(tool_call) ⇒ `Object`

.canonical_usage(tokens, _pipeline_response) ⇒ `Object`

.canonicalize_response(pipeline_response) ⇒ `Object`

.echo_request?(env) ⇒ `Boolean`

.emit_echo_request_sse(out, canonical_request) ⇒ `Object`

.enabled? ⇒ `Boolean`

.render_canonical_response(pipeline_response, canonical_request:, env:) ⇒ `Object`

.sanitize_routing(routing) ⇒ `Object`

.token_value(tokens, *keys) ⇒ `Object`