Class: Pikuri::Agent::ContextWindowDetector
- Inherits: Object
  - Object
  - Pikuri::Agent::ContextWindowDetector
- Defined in: lib/pikuri/agent/context_window_detector.rb
Overview
Resolves the model’s context-window cap from three sources, in order: an explicit override, the value ruby_llm reports for the model, or a llama.cpp /props probe. Returns nil if none of those produce a value.
Used by Agent#initialize at construction time to give Listener::TokenLog a cap it can render alongside the running context size, so that the ctx=12.2k/32.0k line tells the operator how close the conversation is to the limit.
Precedence
- override: the Agent.new(context_window:) kwarg. Wins over everything; an explicit value is the operator's statement of truth.
- ruby_llm_reported: RubyLLM::Model::Info#context_window from #chat's resolved model. Populated for models in ruby_llm's bundled registry (OpenAI, Anthropic, Gemini, …); nil for custom local model ids that fall through to Model::Info.default.
- llama_probe_url: HTTP GET against llama.cpp's non-standard /props endpoint. The server exposes the launched n_ctx at default_generation_settings.n_ctx there. Probed only when the first two are nil. Provider-specific to llama.cpp; the caller (typically bin/pikuri-chat) derives the right URL from its configured base.
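The precedence above can be sketched as a small standalone function. This is an illustration only: resolve_cap and its probe callable are hypothetical names, with the callable standing in for the llama.cpp /props request so the example needs no server.

```ruby
# Hypothetical helper mirroring the documented precedence order.
# `probe` is any callable (or nil); it stands in for the /props request
# and is only invoked when the first two sources are both nil.
def resolve_cap(override:, reported:, probe:)
  return override if override
  return reported if reported

  probe&.call
end

probe_calls = 0
probe = lambda do
  probe_calls += 1
  8192
end

resolve_cap(override: 32_000, reported: 128_000, probe: probe) # explicit override wins
resolve_cap(override: nil, reported: 128_000, probe: probe)    # registry value is used
resolve_cap(override: nil, reported: nil, probe: probe)        # only now is the probe called
```

Passing the probe as a callable keeps the network request lazy, which matches the "probed only when the first two are nil" rule.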
llama.cpp router mode
A llama.cpp router (the multi-instance front that proxies to N on-demand model servers) answers a bare /props with {"role":"router", …, "n_ctx":0}. No single model is loaded at the router itself, so its top-level n_ctx is 0. The real per-model cap is one proxied hop away: GET /props?model=<id> routes the probe to that model's instance, whose /props carries the launched n_ctx. When the bare probe reports role: router and a model_id is known, the detector therefore re-probes with the model id before giving up. A plain single-model server is unaffected: its bare /props already carries a positive n_ctx, so the router branch never runs.
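A minimal sketch of the router fallback, under stated assumptions: the helper names (positive_n_ctx, model_props_url, probe_with_router_fallback) are hypothetical, and fetch is an injected callable that returns the parsed /props Hash (or nil) for a URL. It is not the private probe method of this class, only an illustration of the two-step probe described above.

```ruby
require 'uri'

# Return default_generation_settings.n_ctx when it is a positive Integer.
def positive_n_ctx(props)
  return nil unless props.is_a?(Hash)

  settings = props['default_generation_settings']
  value = settings.is_a?(Hash) ? settings['n_ctx'] : nil
  value.is_a?(Integer) && value.positive? ? value : nil
end

# Build the per-model probe URL, e.g. .../props?model=<id>.
def model_props_url(base_url, model_id)
  uri = URI.parse(base_url)
  uri.query = URI.encode_www_form(model: model_id)
  uri.to_s
end

# Bare probe first; re-probe with the model id only when the server
# identifies itself as a router and a model id is known.
def probe_with_router_fallback(base_url, model_id, fetch)
  props = fetch.call(base_url)
  cap = positive_n_ctx(props)
  return cap if cap
  return nil unless props.is_a?(Hash) && props['role'] == 'router'
  return nil if model_id.nil? || model_id.empty?

  positive_n_ctx(fetch.call(model_props_url(base_url, model_id)))
end
```

Injecting fetch keeps the branching logic separate from the HTTP transport, so the router and single-server paths can be exercised with canned responses.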
Failure handling
The probe is best-effort. An HTTP error, a timeout, a non-JSON body, or a missing or invalid n_ctx field all return nil and log one warn line via Pikuri.logger_for('ContextWindowDetector'). This is the CLAUDE.md "secondary to the loop" carve-out: a wedged or non-llama.cpp server should not abort agent construction over a cosmetic readout.
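The best-effort behaviour might look roughly like the sketch below. It assumes Ruby's standard Net::HTTP, JSON, and Logger; parse_n_ctx, best_effort_probe, PROBE_LOGGER, and PROBE_TIMEOUT are hypothetical names, and the plain Logger stands in for Pikuri.logger_for, whose interface is not shown here.

```ruby
require 'json'
require 'logger'
require 'net/http'
require 'uri'

PROBE_LOGGER = Logger.new($stderr) # stand-in for the subsystem logger
PROBE_TIMEOUT = 2                  # seconds, for both connect and read

# Extract a positive Integer n_ctx from a /props body, or nil.
def parse_n_ctx(body)
  props = JSON.parse(body.to_s)
  settings = props.is_a?(Hash) ? props['default_generation_settings'] : nil
  value = settings.is_a?(Hash) ? settings['n_ctx'] : nil
  value.is_a?(Integer) && value.positive? ? value : nil
rescue JSON::ParserError
  nil # non-JSON body
end

# GET the URL with short timeouts; every failure becomes nil plus one warn line.
def best_effort_probe(url, logger: PROBE_LOGGER)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: PROBE_TIMEOUT,
                             read_timeout: PROBE_TIMEOUT) do |http|
    http.get(uri.request_uri)
  end
  cap = response.is_a?(Net::HTTPSuccess) ? parse_n_ctx(response.body) : nil
  logger.warn("context-window probe of #{url} yielded no usable n_ctx") if cap.nil?
  cap
rescue StandardError => e
  logger.warn("context-window probe of #{url} failed: #{e.class}: #{e.message}")
  nil
end
```

Rescuing StandardError is deliberately broad in this sketch: timeouts, refused connections, and malformed URLs all collapse to nil rather than propagating into agent construction.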
Constant Summary
- LOGGER = Pikuri.logger_for('ContextWindowDetector')
  Subsystem logger; set its level with PIKURI_LOG_CONTEXTWINDOWDETECTOR or the global PIKURI_LOG.
- OPEN_TIMEOUT = 2
  Connect timeout in seconds for the llama.cpp /props probe. Short on purpose: this runs synchronously during Agent.new and a wedged server should not stall startup noticeably.
- READ_TIMEOUT = 2
  Read timeout in seconds for the llama.cpp /props probe; matches OPEN_TIMEOUT for the same reason.
Instance Method Summary
- #detect ⇒ Integer?
  Resolved cap, or nil if no source produced one.
- #initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil) ⇒ ContextWindowDetector (constructor)
  A new instance of ContextWindowDetector.
Constructor Details
#initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil) ⇒ ContextWindowDetector
Returns a new instance of ContextWindowDetector.
# File 'lib/pikuri/agent/context_window_detector.rb', line 86
def initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil)
  @override = override
  @ruby_llm_reported = ruby_llm_reported
  @llama_probe_url = llama_probe_url
  @model_id = model_id
end
Instance Method Details
#detect ⇒ Integer?
Returns the resolved cap, or nil if no source produced one.
# File 'lib/pikuri/agent/context_window_detector.rb', line 95
def detect
  return @override if @override
  return @ruby_llm_reported if @ruby_llm_reported
  return nil if @llama_probe_url.nil? || @llama_probe_url.empty?

  probe_llama_cpp
end