Class: Pikuri::Agent::ContextWindowDetector

Inherits:
Object
  • Object
Defined in:
lib/pikuri/agent/context_window_detector.rb

Overview

Resolves the model’s context-window cap from three sources, tried in order: an explicit override, the value ruby_llm reports for the model, and a llama.cpp /props probe. Returns nil if none of them produces a value.

Used by #initialize at construction time to feed Listener::TokenLog a cap it can render alongside the running context size (so the ctx=12.2k/32.0k line tells the operator how close the conversation is to the limit).

Precedence

  1. override — the Agent.new(context_window:) kwarg. Wins over everything; an explicit value is the operator’s statement of truth.

  2. ruby_llm_reported — RubyLLM::Model::Info#context_window from #chat’s resolved model. Populated for models in ruby_llm’s bundled registry (OpenAI, Anthropic, Gemini, …); nil for custom local model ids that fall through to Model::Info.default.

  3. llama_probe_url — HTTP GET against llama.cpp’s non-standard /props endpoint. The server exposes the launched n_ctx at default_generation_settings.n_ctx there. Probed only when the first two are nil. Provider-specific to llama.cpp; the caller (typically bin/pikuri-chat) derives the right URL from its configured base.
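
The precedence above amounts to "first non-nil source wins, and the probe runs only as a last resort". A minimal sketch of that rule, assuming a hypothetical resolve_cap helper and a callable standing in for the /props request (neither is part of Pikuri):

```ruby
# Hypothetical helper, not part of Pikuri: returns the first non-nil cap.
# `probe` is any callable standing in for the llama.cpp /props request;
# it is invoked only when both earlier sources are nil.
def resolve_cap(override:, reported:, probe: nil)
  return override if override
  return reported if reported

  probe&.call
end

probe = -> { 32_768 } # made-up value in place of a live server

resolve_cap(override: 8_192, reported: 128_000, probe: probe) # override wins
resolve_cap(override: nil, reported: 128_000, probe: probe)   # registry value
resolve_cap(override: nil, reported: nil, probe: probe)       # probe result
resolve_cap(override: nil, reported: nil)                     # nil, no source
```

Passing the probe as a callable keeps the network request lazy: an operator who supplies an explicit cap never pays for the HTTP round trip.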

llama.cpp router mode

A llama.cpp router (the multi-instance front that proxies to N on-demand model servers) answers a bare /props with {"role":"router", …, "n_ctx":0}. There is no single loaded model at the router itself, so its top-level n_ctx is 0. The real per-model cap is one proxied hop away: GET /props?model=<id> routes the probe to that model’s instance, whose /props carries the launched n_ctx. So when the bare probe reports role: router and a model_id is known, this re-probes with the model id before giving up. A plain single-model server is unaffected: its bare /props already carries a positive n_ctx, so the router branch never runs.
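
The two-hop logic can be sketched as follows. This is an illustration under assumptions, not the class's actual private implementation: positive_n_ctx and probe_with_router_hop are hypothetical names, the HTTP request is replaced by an injected callable, the probe URL is assumed to carry no query string of its own, and error handling is left out.

```ruby
require 'json'
require 'uri'

# Hypothetical helper: the launched n_ctx if present and positive, else nil.
def positive_n_ctx(props)
  n_ctx = props.dig('default_generation_settings', 'n_ctx')
  n_ctx.is_a?(Integer) && n_ctx.positive? ? n_ctx : nil
end

# Hypothetical helper: `fetch` is any callable mapping a URL string to a
# response body; it stands in for the real HTTP GET.
def probe_with_router_hop(url, model_id, fetch)
  props = JSON.parse(fetch.call(url))
  cap = positive_n_ctx(props)
  return cap if cap                                # plain single-model server
  return nil unless props['role'] == 'router'
  return nil if model_id.nil? || model_id.empty?   # no id, no second hop

  routed = "#{url}?model=#{URI.encode_www_form_component(model_id)}"
  positive_n_ctx(JSON.parse(fetch.call(routed)))
end

# Canned responses in place of a live router (host, model id, and n_ctx
# are made up for the example).
bodies = {
  'http://localhost:8080/props' => '{"role":"router","n_ctx":0}',
  'http://localhost:8080/props?model=qwen3-8b' =>
    '{"default_generation_settings":{"n_ctx":32768}}'
}
fetch = ->(url) { bodies.fetch(url) }

puts probe_with_router_hop('http://localhost:8080/props', 'qwen3-8b', fetch)
# prints 32768
```

Against a single-model server the first response already yields a positive n_ctx, so the second request is never issued.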

Failure handling

The probe is best-effort. HTTP error, timeout, non-JSON body, or a missing/invalid n_ctx field all return nil and log one warn line via Pikuri.logger_for('ContextWindowDetector'). This is the CLAUDE.md “secondary to the loop” carve-out: a wedged or non-llama.cpp server should not abort agent construction over a cosmetic readout.
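
A best-effort probe of this shape might look like the sketch below. It is an assumption-laden illustration, not the source of probe_llama_cpp: best_effort_n_ctx is a hypothetical name, a plain stdlib Logger stands in for Pikuri.logger_for, and the 2-second default mirrors OPEN_TIMEOUT and READ_TIMEOUT.

```ruby
require 'json'
require 'logger'
require 'net/http'
require 'uri'

# Hypothetical sketch: every failure mode collapses into one warn line and
# a nil result, so the caller never sees an exception.
def best_effort_n_ctx(url, timeout: 2, logger: Logger.new($stderr))
  uri = URI.parse(url)
  body = Net::HTTP.start(uri.host, uri.port,
                         use_ssl: uri.scheme == 'https',
                         open_timeout: timeout,
                         read_timeout: timeout) do |http|
    response = http.get(uri.request_uri)
    raise "HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

    response.body
  end

  n_ctx = JSON.parse(body).dig('default_generation_settings', 'n_ctx')
  unless n_ctx.is_a?(Integer) && n_ctx.positive?
    raise "missing or invalid n_ctx: #{n_ctx.inspect}"
  end

  n_ctx
rescue StandardError => e
  # Covers refused connections, timeouts, bad URLs, non-2xx responses,
  # JSON::ParserError, and the explicit raises above.
  logger.warn("context-window probe failed for #{url}: #{e.class}: #{e.message}")
  nil
end
```

Rescuing StandardError broadly is deliberate here: the readout is cosmetic, so any failure is reported once and then ignored rather than classified.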

Constant Summary

LOGGER =

Subsystem logger; set its level with PIKURI_LOG_CONTEXTWINDOWDETECTOR or the global PIKURI_LOG.

Returns:

  • (Logger)
Pikuri.logger_for('ContextWindowDetector')
OPEN_TIMEOUT =

Connect timeout in seconds for the llama.cpp /props probe. Short on purpose: this runs synchronously during Agent.new and a wedged server should not stall startup noticeably.

Returns:

  • (Integer)
2
READ_TIMEOUT =

Read timeout in seconds for the llama.cpp /props probe; matches OPEN_TIMEOUT for the same reason.

Returns:

  • (Integer)
2

Instance Method Summary

Constructor Details

#initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil) ⇒ ContextWindowDetector

Returns a new instance of ContextWindowDetector.

Parameters:

  • override (Integer, nil)

    explicit cap from the caller; wins if non-nil

  • ruby_llm_reported (Integer, nil)

    value off RubyLLM::Chat#model.context_window

  • llama_probe_url (String, nil)

    full URL to llama.cpp /props; nil or empty string skips the probe

  • model_id (String, nil) (defaults to: nil)

    the chat model id, used only to follow a llama.cpp router via /props?model=<id> when the bare probe reports role: router. nil or empty disables that second hop.



# File 'lib/pikuri/agent/context_window_detector.rb', line 86

def initialize(override:, ruby_llm_reported:, llama_probe_url:, model_id: nil)
  @override = override
  @ruby_llm_reported = ruby_llm_reported
  @llama_probe_url = llama_probe_url
  @model_id = model_id
end

Instance Method Details

#detect ⇒ Integer?

Returns resolved cap, or nil if no source produced one.

Returns:

  • (Integer, nil)

    resolved cap, or nil if no source produced one



# File 'lib/pikuri/agent/context_window_detector.rb', line 95

def detect
  return @override if @override
  return @ruby_llm_reported if @ruby_llm_reported
  return nil if @llama_probe_url.nil? || @llama_probe_url.empty?

  probe_llama_cpp
end