Class: Pikuri::Agent::ContextWindowDetector

Inherits: Object

Defined in: lib/pikuri/agent/context_window_detector.rb

Overview

Resolves the model’s context-window cap from three sources, in order: an explicit override, the value ruby_llm reports for the model, or a llama.cpp /props probe. Returns nil if none of those produces a value.

Used by Agent#initialize at construction time to feed Listener::TokenLog a cap it can render alongside the running context size, so that the ctx=12.2k/32.0k line tells the operator how close the conversation is to the limit.

Precedence

override — the Agent.new(context_window:) kwarg. Wins over everything; an explicit value is the operator’s statement of truth.
ruby_llm_reported — RubyLLM::Model::Info#context_window from #chat’s resolved model. Populated for models in ruby_llm’s bundled registry (OpenAI, Anthropic, Gemini, …); nil for custom local model ids that fall through to Model::Info.default.
llama_probe_url — HTTP GET against llama.cpp’s non-standard /props endpoint. The server exposes the launched n_ctx at default_generation_settings.n_ctx there. Probed only when the first two are nil. Provider-specific to llama.cpp; the caller (typically bin/pikuri-chat) derives the right URL from its configured base.
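
The precedence above can be sketched with a small stand-in class that mirrors the constructor and #detect documented further down this page. It is an illustration only: PrecedenceSketch and its probe: keyword are hypothetical names (not part of Pikuri), and the probe is a lambda so the example runs without a server, whereas the real class issues an HTTP GET.

```ruby
# Stand-in for the detector's precedence logic. `PrecedenceSketch` and the
# `probe:` keyword are hypothetical; the real class probes over HTTP.
class PrecedenceSketch
  def initialize(override:, ruby_llm_reported:, llama_probe_url:, probe: ->(_url) { nil })
    @override = override
    @ruby_llm_reported = ruby_llm_reported
    @llama_probe_url = llama_probe_url
    @probe = probe
  end

  def detect
    return @override if @override                   # 1. explicit override wins
    return @ruby_llm_reported if @ruby_llm_reported # 2. registry value
    return nil if @llama_probe_url.nil? || @llama_probe_url.empty?

    @probe.call(@llama_probe_url)                   # 3. probe as a last resort
  end
end

fake_probe = ->(_url) { 8_192 } # pretend the server was launched with n_ctx 8192
url = 'http://localhost:8080/props' # assumed local llama.cpp address

PrecedenceSketch.new(override: 32_000, ruby_llm_reported: 128_000,
                     llama_probe_url: url, probe: fake_probe).detect # => 32000
PrecedenceSketch.new(override: nil, ruby_llm_reported: nil,
                     llama_probe_url: url, probe: fake_probe).detect # => 8192
```

Note that the probe is never called when either of the first two sources is non-nil, which keeps construction free of network I/O for registry models.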

llama.cpp router mode

A llama.cpp router (the multi-instance front that proxies to N on-demand model servers) answers a bare /props with {"role":"router", …, "n_ctx":0}: there is no single loaded model at the router itself, so its top-level n_ctx is 0. The real per-model cap is one proxied hop away: GET /props?model=<id> routes the probe to that model’s instance, whose /props carries the launched n_ctx. So when the bare probe reports role: router and a model_id is known, this re-probes with the model id before giving up. A plain single-model server is untouched: its bare /props already carries a positive n_ctx, so the router branch never runs.
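
The two-hop logic described above might look roughly like the following. This is a sketch under assumptions, not the class's actual private implementation: the helper names (positive_n_ctx, with_model_param, resolve_n_ctx) are hypothetical, the HTTP call is replaced by a fetch lambda that returns an already-parsed Hash, and the model id "qwen" is a placeholder.

```ruby
require 'uri'

# Returns default_generation_settings.n_ctx when it is a positive Integer,
# otherwise nil. A router body has no such positive value.
def positive_n_ctx(props)
  n = props.dig('default_generation_settings', 'n_ctx')
  n.is_a?(Integer) && n.positive? ? n : nil
end

# Appends model=<id> to the probe URL, preserving any existing query string.
def with_model_param(url, model_id)
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query || '') << ['model', model_id]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# `fetch` is a hypothetical callable: URL in, parsed /props Hash out.
def resolve_n_ctx(url, model_id, fetch)
  props = fetch.call(url)
  n_ctx = positive_n_ctx(props)
  return n_ctx if n_ctx                          # single-model server: done
  return nil unless props['role'] == 'router'    # not a router: give up
  return nil if model_id.nil? || model_id.empty? # no id: second hop disabled

  positive_n_ctx(fetch.call(with_model_param(url, model_id)))
end

# Canned responses standing in for a router and one proxied model instance.
responses = {
  'http://localhost:8080/props' => { 'role' => 'router', 'n_ctx' => 0 },
  'http://localhost:8080/props?model=qwen' =>
    { 'default_generation_settings' => { 'n_ctx' => 32_768 } }
}
fetch = ->(url) { responses.fetch(url) }

resolve_n_ctx('http://localhost:8080/props', 'qwen', fetch) # => 32768
resolve_n_ctx('http://localhost:8080/props', nil, fetch)    # => nil
```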

Failure handling

The probe is best-effort. HTTP error, timeout, non-JSON body, or a missing/invalid n_ctx field all return nil and log one warn line via Pikuri.logger_for('ContextWindowDetector'). This is the CLAUDE.md “secondary to the loop” carve-out: a wedged or non-llama.cpp server should not abort agent construction over a cosmetic readout.
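
A best-effort probe of this kind can be sketched with Ruby's standard Net::HTTP. Everything named here is an assumption for illustration: best_effort_n_ctx, parse_n_ctx, the SKETCH_* constants, the 2-second timeout values, and the plain stderr Logger standing in for Pikuri.logger_for. The point is the shape: every failure funnels into one rescue that logs a single warn line and returns nil.

```ruby
require 'json'
require 'logger'
require 'net/http'
require 'uri'

SKETCH_LOGGER = Logger.new($stderr) # stand-in for the subsystem logger
SKETCH_OPEN_TIMEOUT = 2             # assumed value, in seconds
SKETCH_READ_TIMEOUT = 2             # assumed value, in seconds

# Parses a /props body and returns a positive n_ctx, raising on anything else
# (non-JSON raises JSON::ParserError; a wrong shape raises from #dig).
def parse_n_ctx(body)
  n = JSON.parse(body).dig('default_generation_settings', 'n_ctx')
  raise ArgumentError, "missing or invalid n_ctx: #{n.inspect}" unless n.is_a?(Integer) && n.positive?

  n
end

# Returns the probed n_ctx, or nil after logging one warn line on any failure.
def best_effort_n_ctx(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: SKETCH_OPEN_TIMEOUT,
                             read_timeout: SKETCH_READ_TIMEOUT) do |http|
    http.get(uri.request_uri)
  end
  raise "HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  parse_n_ctx(response.body)
rescue StandardError => e
  # Timeouts, refused connections, bad URLs, and parse errors all land here.
  SKETCH_LOGGER.warn("context-window probe failed for #{url}: #{e.class}: #{e.message}")
  nil
end
```

Rescuing StandardError is broader than usual, but it matches the stated intent: the readout is cosmetic, so no probe failure should propagate into agent construction.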

Constant Summary

LOGGER = Pikuri.logger_for('ContextWindowDetector')

Subsystem logger; set its level with PIKURI_LOG_CONTEXTWINDOWDETECTOR or the global PIKURI_LOG. Returns: (Logger)

OPEN_TIMEOUT

Connect timeout in seconds for the llama.cpp /props probe. Short on purpose: this runs synchronously during Agent.new and a wedged server should not stall startup noticeably. Returns: (Integer)

READ_TIMEOUT

Read timeout in seconds for the llama.cpp /props probe; matches OPEN_TIMEOUT for the same reason. Returns: (Integer)

Instance Method Summary

#detect ⇒ Integer?

Resolved cap, or nil if no source produced one.

#initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil) ⇒ ContextWindowDetector (constructor)

A new instance of ContextWindowDetector.

Constructor Details

#initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil) ⇒ ContextWindowDetector

Returns a new instance of ContextWindowDetector.

Parameters:

override (Integer, nil): explicit cap from the caller; wins if non-nil

ruby_llm_reported (Integer, nil): value read from RubyLLM::Chat#model.context_window

llama_probe_url (String, nil): full URL to llama.cpp /props; nil or an empty string skips the probe

model_id (String, nil) (defaults to: nil): the chat model id, used only to follow a llama.cpp router via /props?model=<id> when the bare probe reports role: router. nil or an empty string disables that second hop.

# File 'lib/pikuri/agent/context_window_detector.rb', line 86

def initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil)
  @override = override
  @ruby_llm_reported = ruby_llm_reported
  @llama_probe_url = llama_probe_url
  @model_id = model_id
end

Instance Method Details

#detect ⇒ Integer?

Returns resolved cap, or nil if no source produced one.

Returns:

(Integer, nil): resolved cap, or nil if no source produced one

# File 'lib/pikuri/agent/context_window_detector.rb', line 95

def detect
  return @override if @override
  return @ruby_llm_reported if @ruby_llm_reported
  return nil if @llama_probe_url.nil? || @llama_probe_url.empty?

  probe_llama_cpp
end