Module: RubynCode::Observability::CostCalculator

Defined in: lib/rubyn_code/observability/cost_calculator.rb

Overview

Maps model identifiers and token counts to USD cost.

Pricing is based on per-million-token rates. Cache reads are billed at 10% of the input rate (CACHE_READ_DISCOUNT = 0.1); cache writes are billed at 125% of the input rate, a 25% premium (CACHE_WRITE_PREMIUM = 1.25).

Constant Summary

# Per-million-token rates in USD: { model_prefix => [input_rate, output_rate] }
PRICING = {
  # Anthropic — Claude 4.6
  'claude-haiku-4-5' => [1.00, 5.00],
  'claude-sonnet-4-6' => [3.00, 15.00],
  'claude-opus-4-6' => [15.00, 75.00],
  # OpenAI — GPT-5.4
  'gpt-5.4' => [2.50, 10.00],
  'gpt-5.4-mini' => [0.15, 0.60],
  'gpt-5.4-nano' => [0.10, 0.40],
  # OpenAI — legacy
  'gpt-4o' => [2.50, 10.00],
  'gpt-4o-mini' => [0.15, 0.60],
  'o3' => [2.00, 8.00],
  'o4-mini' => [1.10, 4.40]
}.freeze
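
Because the table is keyed by model prefix, and some keys are themselves prefixes of others ('gpt-5.4' also prefixes 'gpt-5.4-mini'), the lookup order matters. The following is a minimal sketch of one way to resolve rates, assuming exact keys win and the longest matching prefix is used otherwise. The helper `rates_for` and the trimmed `SAMPLE_PRICING` table are hypothetical names for illustration; the module's actual lookup may differ.

```ruby
# Hypothetical excerpt of the pricing table, for illustration only.
SAMPLE_PRICING = {
  'gpt-5.4' => [2.50, 10.00],
  'gpt-5.4-mini' => [0.15, 0.60],
  'claude-sonnet-4-6' => [3.00, 15.00]
}.freeze

# Returns [input_rate, output_rate] for a model id, or nil when nothing matches.
# Assumption: an exact key wins; otherwise the longest matching prefix is used.
def rates_for(model, table = SAMPLE_PRICING)
  return table[model] if table.key?(model)

  prefix = table.keys
                .select { |key| model.start_with?(key) }
                .max_by(&:length)
  prefix && table[prefix]
end

# A dated variant resolves to 'gpt-5.4-mini' rather than the shorter 'gpt-5.4'.
rates_for('gpt-5.4-mini-2026-01-01')
```

Picking the longest prefix avoids billing a mini model at the full model's rate when both keys match.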

CACHE_READ_DISCOUNT = 0.1

CACHE_WRITE_PREMIUM = 1.25
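
To make the two multipliers concrete, here is a sketch of how per-million rates and the cache constants might combine into a single cost. The formula is an assumption inferred from the constant values (cache reads at input_rate * 0.1, cache writes at input_rate * 1.25), and it assumes `input_tokens` does not already include cached tokens. The helper `cost_usd` and the constant `TOKENS_PER_MILLION` are hypothetical names.

```ruby
CACHE_READ_DISCOUNT = 0.1
CACHE_WRITE_PREMIUM = 1.25
TOKENS_PER_MILLION = 1_000_000.0 # hypothetical helper constant

# Assumed formula: every token class is priced per million tokens, with
# cache reads and writes scaled relative to the input rate.
def cost_usd(input_rate:, output_rate:, input_tokens:, output_tokens:,
             cache_read_tokens: 0, cache_write_tokens: 0)
  input_cost = input_tokens * input_rate
  output_cost = output_tokens * output_rate
  cache_read_cost = cache_read_tokens * input_rate * CACHE_READ_DISCOUNT
  cache_write_cost = cache_write_tokens * input_rate * CACHE_WRITE_PREMIUM
  (input_cost + output_cost + cache_read_cost + cache_write_cost) / TOKENS_PER_MILLION
end

# Rates for 'claude-sonnet-4-6' from the table: 3.00 input, 15.00 output.
# 3.00 (input) + 1.50 (output) + 0.15 (cache read) + 0.75 (cache write)
cost = cost_usd(input_rate: 3.00, output_rate: 15.00,
                input_tokens: 1_000_000, output_tokens: 100_000,
                cache_read_tokens: 500_000, cache_write_tokens: 200_000)
cost.round(2) # => 5.4
```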

Class Method Summary

.calculate(model:, input_tokens:, output_tokens:, cache_read_tokens: 0, cache_write_tokens: 0) ⇒ Float

Calculates the USD cost for a single API call.

Class Method Details

.calculate(model:, input_tokens:, output_tokens:, cache_read_tokens: 0, cache_write_tokens: 0) ⇒ Float

Calculates the USD cost for a single API call.

Parameters:

model (String): the model identifier (exact or prefix match)
input_tokens (Integer): number of input tokens
output_tokens (Integer): number of output tokens
cache_read_tokens (Integer) (defaults to: 0): tokens served from cache
cache_write_tokens (Integer) (defaults to: 0): tokens written to cache

Returns:

(Float): cost in USD
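
The call shape can be illustrated with a stand-in that mirrors the documented signature. `CostCalculatorStandIn` is a hypothetical module defined only so the example runs on its own; its one-entry table, exact-match lookup, and cost formula are assumptions, not the behavior of RubynCode::Observability::CostCalculator itself.

```ruby
# Stand-in with the documented keyword signature (hypothetical implementation).
module CostCalculatorStandIn
  PRICING = { 'claude-haiku-4-5' => [1.00, 5.00] }.freeze

  def self.calculate(model:, input_tokens:, output_tokens:,
                     cache_read_tokens: 0, cache_write_tokens: 0)
    input_rate, output_rate = PRICING.fetch(model)
    # Assumed weighting: cache reads at 0.1x and cache writes at 1.25x the input rate.
    weighted_input = input_tokens +
                     (cache_read_tokens * 0.1) +
                     (cache_write_tokens * 1.25)
    ((weighted_input * input_rate) + (output_tokens * output_rate)) / 1_000_000.0
  end
end

# The cache arguments are optional and default to 0.
plain = CostCalculatorStandIn.calculate(
  model: 'claude-haiku-4-5', input_tokens: 2_000, output_tokens: 500
)

# Summing per-call costs gives a running total for a session.
calls = [
  { input_tokens: 2_000, output_tokens: 500 },
  { input_tokens: 1_000, output_tokens: 200, cache_read_tokens: 4_000 }
]
total = calls.sum do |usage|
  CostCalculatorStandIn.calculate(model: 'claude-haiku-4-5', **usage)
end
```

With the real module, the same keyword arguments would be passed to RubynCode::Observability::CostCalculator.calculate.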