Class: OllamaAgent::Providers::QuotaTracker
- Inherits: Object
- Defined in: lib/ollama_agent/providers/quota_tracker.rb
Overview
Per-credential usage accounting against declared provider limits.
Tracks daily token and request consumption plus live RPM/TPM windows. Since most providers don’t expose a real-time quota API, usage is estimated locally from response metadata (total_tokens, i.e. prompt_tokens + completion_tokens, returned by each successful call).
Daily counters auto-reset at midnight UTC.
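The midnight-UTC reset can be sketched as follows. This is a minimal illustration, not the library's code: `next_midnight_utc` and `DailyCounter` are hypothetical names, and the private helpers this page refers to (`next_midnight`, `maybe_reset_daily!`) may be implemented differently.

```ruby
# Hypothetical helper: the next 00:00:00 UTC strictly after `now`.
def next_midnight_utc(now = Time.now)
  utc = now.getutc # non-mutating conversion to UTC
  today = Time.utc(utc.year, utc.month, utc.day)
  today + 86_400 # UTC has no DST, so a day is always 86,400 seconds
end

# Hypothetical counter that zeroes itself once the reset time has passed.
class DailyCounter
  attr_reader :tokens, :resets_at

  def initialize(now: Time.now)
    @tokens = 0
    @resets_at = next_midnight_utc(now)
  end

  # Check for a due reset before every mutation, so a counter that sat idle
  # across midnight starts the new day from zero.
  def add(count, now: Time.now)
    reset_if_due(now)
    @tokens += count
  end

  private

  def reset_if_due(now)
    return if now < @resets_at

    @tokens = 0
    @resets_at = next_midnight_utc(now)
  end
end
```

Checking lazily on each call avoids a background timer; the cost is that the counter only resets when something touches it.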
Constant Summary
- NEAR_EXHAUSTION_PCT = 0.90
Instance Method Summary
- #daily_utilisation ⇒ Float
  Quota utilisation as a fraction [0.0, 1.0] of the daily token limit.
- #exhausted? ⇒ Boolean
  True when daily hard limits are hit (requests will be rejected by the provider).
- #initialize(limits: {}) ⇒ QuotaTracker (constructor)
  A new instance of QuotaTracker.
- #near_exhaustion? ⇒ Boolean
  True when usage is approaching exhaustion (>= 90% of daily token limit).
- #record(usage) ⇒ Object
  Record usage from a successful response.
- #summary ⇒ Hash
  Full usage snapshot for TUI and telemetry.
Constructor Details
#initialize(limits: {}) ⇒ QuotaTracker
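Returns a new instance of QuotaTracker. Keys in limits are converted to symbols, so string-keyed hashes are accepted; the limit keys reported by #summary are :daily_tokens, :daily_requests, :rpm and :tpm. Daily counters start at zero and the first reset is scheduled for the next midnight.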
    # File 'lib/ollama_agent/providers/quota_tracker.rb', line 30

    def initialize(limits: {})
      @limits = limits.transform_keys(&:to_sym)
      @daily_tokens = 0
      @daily_requests = 0
      @daily_reset_at = next_midnight
      @rpm_window = RateWindow.new(window_seconds: 60)
      @tpm_window = RateWindow.new(window_seconds: 60)
      @mutex = Mutex.new
    end
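The constructor builds two 60-second RateWindow objects, but RateWindow itself is not documented on this page. The sketch below shows one plausible way such a sliding window could work; `SketchRateWindow` and its `now:` parameter are hypothetical, and only the method names `record` and `current_rate` and the `window_seconds:` keyword come from the source above.

```ruby
# Hypothetical stand-in for RateWindow: sums the amounts recorded during the
# last `window_seconds`. The real class may use a different strategy.
class SketchRateWindow
  def initialize(window_seconds:)
    @window_seconds = window_seconds
    @events = [] # [timestamp, amount] pairs, oldest first
    @mutex = Mutex.new
  end

  # `now:` is injectable so the window can be exercised without sleeping.
  def record(amount, now: monotonic_now)
    @mutex.synchronize do
      prune(now)
      @events << [now, amount]
    end
  end

  def current_rate(now: monotonic_now)
    @mutex.synchronize do
      prune(now)
      @events.sum { |_, amount| amount }
    end
  end

  private

  # A monotonic clock is unaffected by wall-clock adjustments.
  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  # Drop events that have aged out of the window.
  def prune(now)
    cutoff = now - @window_seconds
    @events.shift while @events.first && @events.first[0] <= cutoff
  end
end
```

With a 60-second window, recording 1 per request yields requests per minute, and recording the token count per request yields tokens per minute.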
Instance Method Details
#daily_utilisation ⇒ Float
Quota utilisation as a fraction [0.0, 1.0] of the daily token limit. Returns 0.0 if no daily token limit is configured.
    # File 'lib/ollama_agent/providers/quota_tracker.rb', line 80

    def daily_utilisation
      @mutex.synchronize do
        maybe_reset_daily!
        daily_pct
      end
    end
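The private `daily_pct` helper is not shown on this page. A minimal sketch consistent with the documented contract (a fraction in [0.0, 1.0], and 0.0 when no daily token limit is configured) might look like this; `daily_pct_sketch` is a hypothetical name, and whether the real helper clamps values above the limit is an assumption.

```ruby
# Hypothetical version of the daily_pct computation.
def daily_pct_sketch(daily_tokens, limit)
  # No limit (or a non-positive one) means utilisation is reported as zero.
  return 0.0 if limit.nil? || limit.to_f <= 0

  # Clamp so overshooting the limit still reports at most 1.0 (assumption).
  (daily_tokens / limit.to_f).clamp(0.0, 1.0)
end

daily_pct_sketch(450, 1000)  # 450 of 1000 tokens used
daily_pct_sketch(450, nil)   # no limit configured
daily_pct_sketch(2000, 1000) # over the limit
```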
#exhausted? ⇒ Boolean
True when the daily token limit or the daily request limit has been hit (further requests will be rejected by the provider).
    # File 'lib/ollama_agent/providers/quota_tracker.rb', line 60

    def exhausted?
      @mutex.synchronize do
        maybe_reset_daily!
        daily_token_limit_hit? || daily_request_limit_hit?
      end
    end
#near_exhaustion? ⇒ Boolean
True when usage is approaching exhaustion (>= 90% of daily token limit). Used for predictive rerouting before a hard failure occurs.
    # File 'lib/ollama_agent/providers/quota_tracker.rb', line 70

    def near_exhaustion?
      @mutex.synchronize do
        maybe_reset_daily!
        daily_pct >= NEAR_EXHAUSTION_PCT
      end
    end
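Predictive rerouting could be built on the two predicates roughly as sketched below. How the library actually selects a credential is not shown on this page; `FakeTracker` and `pick_credential` are hypothetical, and the stand-in only mimics `exhausted?` and `near_exhaustion?`.

```ruby
# Hypothetical stand-in exposing the two predicates documented on this page.
FakeTracker = Struct.new(:pct, :hard_limit_hit) do
  def exhausted?
    hard_limit_hit
  end

  def near_exhaustion?
    pct >= 0.90
  end
end

# Prefer a credential with headroom, fall back to one that is merely near
# exhaustion, and return nil when every credential is exhausted.
def pick_credential(trackers)
  usable = trackers.reject { |_, tracker| tracker.exhausted? }
  healthy = usable.find { |_, tracker| !tracker.near_exhaustion? }
  (healthy || usable.first)&.first
end

trackers = {
  "key_a" => FakeTracker.new(0.95, false), # near exhaustion
  "key_b" => FakeTracker.new(0.10, false), # plenty of headroom
  "key_c" => FakeTracker.new(1.0, true)    # hard limit hit
}
pick_credential(trackers)
```

In this example the call selects "key_b", since it is the only credential that is neither exhausted nor near exhaustion.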
#record(usage) ⇒ Object
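Record usage from a successful response. The token count is read from total_tokens (symbol or string key) and defaults to 0 when the key is missing; the request is still counted in that case. A nil usage is ignored entirely.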
    # File 'lib/ollama_agent/providers/quota_tracker.rb', line 42

    def record(usage)
      return unless usage

      tokens = (usage[:total_tokens] || usage["total_tokens"] || 0).to_i

      @mutex.synchronize do
        maybe_reset_daily!
        @daily_tokens += tokens
        @daily_requests += 1
      end

      # Rate windows have their own mutex
      @rpm_window.record(1)
      @tpm_window.record(tokens)
    end
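The token lookup in #record can be tried in isolation. The helper below copies the expression from the method body into a standalone function; `extract_total_tokens` is a hypothetical name used only for illustration.

```ruby
# Mirrors the lookup in #record: symbol key first, then string key, then 0.
# Returns nil for a nil usage, matching the early return in #record.
def extract_total_tokens(usage)
  return nil unless usage

  (usage[:total_tokens] || usage["total_tokens"] || 0).to_i
end

extract_total_tokens({ total_tokens: 42 })       # symbol key
extract_total_tokens({ "total_tokens" => "17" }) # string key, string value
extract_total_tokens({ prompt_tokens: 5 })       # no total_tokens key
extract_total_tokens(nil)                        # no usage at all
```

Note that a hash carrying only prompt_tokens and completion_tokens contributes 0 tokens under this lookup, because the two fields are not summed.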
#summary ⇒ Hash
Full usage snapshot for TUI and telemetry.
    # File 'lib/ollama_agent/providers/quota_tracker.rb', line 89

    def summary
      @mutex.synchronize do
        maybe_reset_daily!
        {
          daily_tokens: @daily_tokens,
          daily_tokens_limit: @limits[:daily_tokens],
          daily_requests: @daily_requests,
          daily_requests_limit: @limits[:daily_requests],
          rpm: @rpm_window.current_rate,
          rpm_limit: @limits[:rpm],
          tpm: @tpm_window.current_rate,
          tpm_limit: @limits[:tpm],
          daily_pct: daily_pct,
          resets_at: @daily_reset_at
        }
      end
    end
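A consumer such as a TUI status line might render the snapshot along these lines. This is only a sketch: `format_quota_line` is hypothetical, the sample values are made up for illustration, and it assumes `resets_at` is a Time, which the source does not state explicitly.

```ruby
# Hypothetical formatter for a hash shaped like the #summary result.
def format_quota_line(summary)
  limit = summary[:daily_tokens_limit] || "unlimited" # nil when no limit is set
  resets = summary[:resets_at].getutc.strftime("%Y-%m-%d %H:%M UTC")
  format("tokens %d/%s (%.1f%%), resets %s",
         summary[:daily_tokens], limit, summary[:daily_pct] * 100, resets)
end

# Illustrative values only; the keys match the hash built by #summary.
sample = {
  daily_tokens: 450, daily_tokens_limit: 1000,
  daily_requests: 12, daily_requests_limit: nil,
  rpm: 3, rpm_limit: 60, tpm: 120, tpm_limit: nil,
  daily_pct: 0.45, resets_at: Time.utc(2024, 5, 2)
}
puts format_quota_line(sample)
# prints "tokens 450/1000 (45.0%), resets 2024-05-02 00:00 UTC"
```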