Module: Rubino::LLM::ThinkingSupport
Defined in: lib/rubino/llm/thinking_support.rb
Overview
Session-scoped memory of providers that rejected an Anthropic-style thinking budget, plus the detector for that rejection (#75), plus the static per-provider capability gate (#2).
Process-level (not per-adapter) because Lifecycle rebuilds the adapter every turn — and one CLI process serves one chat session, so this is exactly “remember for the session”. RubyLLMAdapter consults it before rendering a budget and marks it on a recognised rejection, so the provider is never sent a budget again this session.
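The consult, mark, and retry cycle described above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: `ThinkingMemorySketch` is a reduced stand-in for the module, `RejectingBackend` and `run_turn` are hypothetical names, and initialising `@unsupported` in the module body is an assumption about how the real memory is set up.

```ruby
# Reduced stand-in for Rubino::LLM::ThinkingSupport (assumption: the real
# module keeps its memory in a module-level hash like this one).
module ThinkingMemorySketch
  @unsupported = {}

  class << self
    def unsupported?(provider)
      @unsupported.key?(provider.to_s)
    end

    def mark_unsupported!(provider)
      @unsupported[provider.to_s] = true
    end

    def rejection?(error)
      msg = error.message.to_s.downcase
      msg.include?("thinking") && (msg.include?("not support") || msg.include?("unsupported"))
    end
  end
end

# Hypothetical backend that rejects every request carrying a thinking budget.
class RejectingBackend
  attr_reader :calls

  def initialize
    @calls = []
  end

  def complete(prompt, budget: nil)
    @calls << budget
    raise ArgumentError, "thinking is not supported by this model" if budget

    "reply to #{prompt}"
  end
end

# Hypothetical turn: skip the budget if the provider is already marked,
# otherwise retry once without it on a recognised rejection and remember that.
def run_turn(backend, provider, prompt, budget: 1024)
  budget = nil if ThinkingMemorySketch.unsupported?(provider)
  backend.complete(prompt, budget: budget)
rescue StandardError => e
  raise unless budget && ThinkingMemorySketch.rejection?(e)

  ThinkingMemorySketch.mark_unsupported!(provider)
  backend.complete(prompt, budget: nil)
end

backend = RejectingBackend.new
first_reply = run_turn(backend, :minimax, "hi")     # budget sent, rejected, retried
second_reply = run_turn(backend, :minimax, "again") # budget never sent
```

Because the memory lives on the module rather than on an adapter instance, the second turn skips the budget even if a fresh adapter object handles it.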
Class Method Summary
- .budget_via_params?(provider_cfg, chat) ⇒ Boolean
  providers.<name>.supports_thinking: true is the user’s explicit promise that the backend accepts an Anthropic-style thinking block.
- .mark_unsupported!(provider, notify: nil) ⇒ Object
  Records the rejection and tells the user once with a dim note (only the marking path emits it).
- .rejection?(error) ⇒ Boolean
  True when error reads as a provider’s “thinking (budget) is not supported” rejection.
- .reset! ⇒ Object
  Test seam: forget all recorded rejections (a fresh “session”).
- .supports?(provider_cfg, _model_id = nil) ⇒ Boolean
  Per-provider thinking CAPABILITY gate (#2).
- .unsupported?(provider) ⇒ Boolean
Class Method Details
.budget_via_params?(provider_cfg, chat) ⇒ Boolean
providers.<name>.supports_thinking: true is the user’s explicit promise that the backend accepts an Anthropic-style thinking block. ruby_llm 1.16 only renders with_thinking when the model’s REGISTRY entry declares a budget_tokens reasoning option; an assume-model-exists model (MiniMax-M3 on the anthropic-compatible path) declares none, so with_thinking raised client-side before any request, the #75 rejection detector matched the message, and the documented opt-in silently died every turn (#175). On that path the adapter puts the wire payload on with_params instead, which ruby_llm deep-merges into the request body unconditionally.
    # File 'lib/rubino/llm/thinking_support.rb', line 54
    def budget_via_params?(provider_cfg, chat)
      model = chat.respond_to?(:model) ? chat.model : nil
      model_id = model.respond_to?(:id) ? model.id : model
      return false unless supports?(provider_cfg, model_id)
      !(model.respond_to?(:reasoning_option) && model.reasoning_option("budget_tokens"))
    rescue StandardError
      true
    end
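To make the routing decision concrete, the sketch below copies the two documented predicates into a standalone module and drives them with stub objects. `BudgetRouteSketch`, `RegistryModel`, and `FakeChat` are hypothetical names, and the stub's `reasoning_option` only imitates the duck-typed call the listing relies on; it is not ruby_llm's real model class.

```ruby
# Local copies of the documented predicates so the sketch runs on its own.
module BudgetRouteSketch
  module_function

  def supports?(provider_cfg, _model_id = nil)
    configured = provider_cfg["supports_thinking"]
    return configured unless configured.nil?

    true
  end

  def budget_via_params?(provider_cfg, chat)
    model = chat.respond_to?(:model) ? chat.model : nil
    model_id = model.respond_to?(:id) ? model.id : model
    return false unless supports?(provider_cfg, model_id)

    !(model.respond_to?(:reasoning_option) && model.reasoning_option("budget_tokens"))
  rescue StandardError
    true
  end
end

# Hypothetical registry-backed model: answers reasoning_option lookups.
RegistryModel = Struct.new(:id, :options) do
  def reasoning_option(name)
    options[name]
  end
end

# Hypothetical chat wrapper exposing only #model.
FakeChat = Struct.new(:model)

registered = FakeChat.new(RegistryModel.new("registered-model", { "budget_tokens" => true }))
assumed    = FakeChat.new("MiniMax-M3") # bare id: nothing declares a reasoning option

via_registry = BudgetRouteSketch.budget_via_params?({}, registered)
via_params   = BudgetRouteSketch.budget_via_params?({}, assumed)
opted_out    = BudgetRouteSketch.budget_via_params?({ "supports_thinking" => false }, assumed)
```

Under these assumptions only the model with no declared `budget_tokens` option is routed through `with_params`; a model that declares the option keeps the normal `with_thinking` path, and an explicit opt-out disables both.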
.mark_unsupported!(provider, notify: nil) ⇒ Object
Records the rejection and tells the user once with a dim note (only the marking path emits it). Cosmetic: a UI failure must never break the retried turn.
    # File 'lib/rubino/llm/thinking_support.rb', line 67
    def mark_unsupported!(provider, notify: nil)
      @unsupported[provider.to_s] = true
      notify&.note("provider doesn't support thinking — effort off")
    rescue StandardError
      nil
    end
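The "cosmetic" guarantee follows from the order of operations: the mark is written before the notifier is called, so a failing notifier cannot undo it. The sketch below illustrates this with a stand-in module; `SessionMemorySketch`, `RecordingNotifier`, and `BrokenNotifier` are hypothetical, the note text is a placeholder, and the module-level `@unsupported = {}` is an assumption about the real initialisation.

```ruby
# Stand-in for the session memory (assumed to start as an empty hash).
module SessionMemorySketch
  @unsupported = {}

  class << self
    def mark_unsupported!(provider, notify: nil)
      @unsupported[provider.to_s] = true   # recorded first
      notify&.note("thinking unsupported") # placeholder note text
    rescue StandardError
      nil                                  # UI failures are swallowed
    end

    def unsupported?(provider)
      @unsupported.key?(provider.to_s)
    end

    def reset!
      @unsupported = {}
    end
  end
end

# Hypothetical notifier that records what it was asked to show.
class RecordingNotifier
  attr_reader :notes

  def initialize
    @notes = []
  end

  def note(text)
    @notes << text
  end
end

# Hypothetical notifier whose UI call always fails.
class BrokenNotifier
  def note(_text)
    raise IOError, "terminal went away"
  end
end

ui = RecordingNotifier.new
SessionMemorySketch.mark_unsupported!(:minimax, notify: ui)
broken_result = SessionMemorySketch.mark_unsupported!(:other, notify: BrokenNotifier.new)
marked_despite_failure = SessionMemorySketch.unsupported?(:other)
```

Provider names are normalised with `to_s`, so `:minimax` and `"minimax"` refer to the same entry.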
.rejection?(error) ⇒ Boolean
True when error reads as a provider’s “thinking (budget) is not supported” rejection. Kept narrow: the message must name thinking plus a not-supported phrasing.
    # File 'lib/rubino/llm/thinking_support.rb', line 82
    def rejection?(error)
      msg = error.message.to_s.downcase
      msg.include?("thinking") && (msg.include?("not support") || msg.include?("unsupported"))
    end
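A few sample messages show how narrow the match is. The predicate below is a local copy named `rejection_like?` (a hypothetical name), and the error texts are invented for illustration rather than taken from any provider.

```ruby
# Local copy of the documented predicate.
def rejection_like?(error)
  msg = error.message.to_s.downcase
  msg.include?("thinking") && (msg.include?("not support") || msg.include?("unsupported"))
end

# Invented messages paired with the verdict the predicate should give.
samples = {
  "Thinking is not supported for this model"      => true,  # names thinking + "not support"
  "unsupported parameter: thinking.budget_tokens" => true,  # names thinking + "unsupported"
  "thinking budget exceeds max_tokens"            => false, # thinking, but no rejection phrase
  "streaming is not supported"                    => false  # rejection phrase, but not thinking
}

results = samples.keys.to_h { |text| [text, rejection_like?(StandardError.new(text))] }
```

Both conditions must hold, so an unrelated "not supported" error or a thinking-related validation error is left to propagate normally.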
.reset! ⇒ Object
Test seam: forget all recorded rejections (a fresh “session”).
    # File 'lib/rubino/llm/thinking_support.rb', line 75
    def reset!
      @unsupported = {}
    end
.supports?(provider_cfg, _model_id = nil) ⇒ Boolean
Per-provider thinking CAPABILITY gate (#2). providers.<name>.supports_thinking (true/false) is the explicit override; unset, thinking defaults ON for every provider. Thinking only travels on the anthropic-family path (the budget is zeroed elsewhere by the adapter), so this is effectively “request reasoning on anthropic-compatible backends”, which MiniMax-M3 streams as proper `thinking` deltas (verified), exactly like the reference agent’s default `reasoning_effort: medium`. WITHOUT it M3 produces ~10s of dead air while it reasons toward a tool-call (the pre-tool-call freeze). A backend that genuinely rejects the budget is caught by #rejection? (#75) and the adapter retries once without it, then memoizes, so default-on is safe. (The earlier MiniMax-default-false workaround predated the with_params injection path, which routes the block cleanly instead of leaking it.)
    # File 'lib/rubino/llm/thinking_support.rb', line 37
    def supports?(provider_cfg, _model_id = nil)
      configured = provider_cfg["supports_thinking"]
      return configured unless configured.nil?
      true
    end
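The default-on behaviour can be checked with plain hashes. The sketch uses a local copy named `supports_sketch?` (hypothetical); the last case assumes a symbol-keyed hash, which may never occur if the real config loader always produces string keys.

```ruby
# Local copy of the documented gate.
def supports_sketch?(provider_cfg, _model_id = nil)
  configured = provider_cfg["supports_thinking"]
  return configured unless configured.nil?

  true
end

unset        = supports_sketch?({})                               # no override: default ON
opted_in     = supports_sketch?({ "supports_thinking" => true })  # explicit true
opted_out    = supports_sketch?({ "supports_thinking" => false }) # explicit false wins
symbol_keyed = supports_sketch?({ supports_thinking: false })     # string lookup misses the symbol key
```

The `nil?` check matters here: an explicit `false` must be returned as-is, which a plain `configured || true` would not do.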
.unsupported?(provider) ⇒ Boolean
    # File 'lib/rubino/llm/thinking_support.rb', line 21
    def unsupported?(provider)
      @unsupported.key?(provider.to_s)
    end