Class: Woods::CostModel::EmbeddingCost
- Inherits: Object
- Hierarchy: Object > Woods::CostModel::EmbeddingCost
- Defined in: lib/woods/cost_model/embedding_cost.rb
Overview
Calculates embedding costs for full-index, incremental, and query-time scenarios using the token-based pricing from ProviderPricing.
The cost model assumes a constant 450 tokens per chunk, derived from the BACKEND_MATRIX.md tables (e.g. 500 units × 2.5 chunks per unit = 1,250 chunks, and 1,250 chunks × 450 tokens = 562,500 tokens, roughly 562K).
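The arithmetic in the example above can be checked with a few lines of Ruby. This is a minimal sketch that only restates the documented constant and the example figures; it does not call the class itself.

```ruby
# Worked version of the overview example.
TOKENS_PER_CHUNK = 450 # constant documented for this class

units = 500
chunk_multiplier = 2.5

chunks = (units * chunk_multiplier).ceil # 1250 chunks
tokens = chunks * TOKENS_PER_CHUNK       # 562_500 tokens

puts "#{chunks} chunks, #{tokens} tokens"
```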
Constant Summary
- TOKENS_PER_CHUNK = 450
  Average tokens per chunk after hierarchical chunking with context prefix.
- TOKENS_PER_QUERY = 100
  Average tokens per retrieval query.
Instance Method Summary
- #full_index_cost(units:, chunk_multiplier: 2.5) ⇒ Float
  Cost to embed the full codebase index.
- #incremental_cost(changed_units: 5, chunk_multiplier: 2.5) ⇒ Float
  Cost to re-embed changed units from a single merge.
- #initialize(provider:) ⇒ EmbeddingCost (constructor)
  A new instance of EmbeddingCost.
- #monthly_query_cost(daily_queries:) ⇒ Float
  Monthly cost for query-time embedding.
- #total_tokens(units, chunk_multiplier) ⇒ Integer
  Total tokens for a given number of units and chunk multiplier.
- #yearly_incremental_cost(merges_per_year: 2400, changed_units_per_merge: 5, chunk_multiplier: 2.5) ⇒ Float
  Yearly embedding cost from incremental re-indexing.
Constructor Details
#initialize(provider:) ⇒ EmbeddingCost
Returns a new instance of EmbeddingCost.
    # File 'lib/woods/cost_model/embedding_cost.rb', line 23
    def initialize(provider:)
      @cost_per_million = ProviderPricing.cost_per_million(provider)
    end
Instance Method Details
#full_index_cost(units:, chunk_multiplier: 2.5) ⇒ Float
Cost to embed the full codebase index.
    # File 'lib/woods/cost_model/embedding_cost.rb', line 32
    def full_index_cost(units:, chunk_multiplier: 2.5)
      tokens = total_tokens(units, chunk_multiplier)
      token_cost(tokens)
    end
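The `token_cost` helper is not shown on this page. The sketch below assumes it scales the token count by the per-million price stored in `@cost_per_million`. Both `assumed_token_cost` and the price of 0.02 are hypothetical stand-ins for illustration, not values taken from ProviderPricing.

```ruby
TOKENS_PER_CHUNK = 450 # constant documented for this class

# Assumption: token_cost converts a token count to a cost using a
# price per one million tokens. The real helper is not shown here.
def assumed_token_cost(tokens, cost_per_million)
  tokens / 1_000_000.0 * cost_per_million
end

hypothetical_cost_per_million = 0.02 # illustrative price only

# Same steps as full_index_cost(units: 500) with the default multiplier.
tokens = (500 * 2.5).ceil * TOKENS_PER_CHUNK
cost = assumed_token_cost(tokens, hypothetical_cost_per_million)

puts format("%d tokens cost about %.5f", tokens, cost)
```

If the real helper rounds or applies a minimum charge, the result would differ from this estimate.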
#incremental_cost(changed_units: 5, chunk_multiplier: 2.5) ⇒ Float
Cost to re-embed changed units from a single merge.
    # File 'lib/woods/cost_model/embedding_cost.rb', line 42
    def incremental_cost(changed_units: 5, chunk_multiplier: 2.5)
      tokens = total_tokens(changed_units, chunk_multiplier)
      token_cost(tokens)
    end
#monthly_query_cost(daily_queries:) ⇒ Float
Monthly cost for query-time embedding.
    # File 'lib/woods/cost_model/embedding_cost.rb', line 51
    def monthly_query_cost(daily_queries:)
      monthly_tokens = daily_queries * 30 * TOKENS_PER_QUERY
      token_cost(monthly_tokens)
    end
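As a rough illustration of the token volume this method prices, the sketch below repeats the multiplication for a hypothetical workload of 1,000 queries per day. `DAYS_PER_MONTH` is a name introduced here for the literal 30 used in the method body.

```ruby
TOKENS_PER_QUERY = 100 # constant documented for this class
DAYS_PER_MONTH = 30    # hypothetical name for the literal 30 in the method

daily_queries = 1_000  # illustrative workload, not a documented default
monthly_tokens = daily_queries * DAYS_PER_MONTH * TOKENS_PER_QUERY

puts monthly_tokens
```

Because the month is fixed at 30 days, the figure is an approximation for months of other lengths.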
#total_tokens(units, chunk_multiplier) ⇒ Integer
Total tokens for a given number of units and chunk multiplier.
    # File 'lib/woods/cost_model/embedding_cost.rb', line 72
    def total_tokens(units, chunk_multiplier)
      chunks = (units * chunk_multiplier).ceil
      chunks * TOKENS_PER_CHUNK
    end
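The `ceil` call means fractional chunk counts are rounded up before they are converted to tokens. The standalone sketch below copies the method body shown above to make that visible; it is defined at the top level here only so that it runs without the rest of the class.

```ruby
TOKENS_PER_CHUNK = 450 # constant documented for this class

# Standalone copy of the documented method body.
def total_tokens(units, chunk_multiplier)
  chunks = (units * chunk_multiplier).ceil
  chunks * TOKENS_PER_CHUNK
end

puts total_tokens(5, 2.5) # 12.5 chunks rounds up to 13, giving 5850
puts total_tokens(4, 2.5) # exactly 10 chunks, giving 4500
```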
#yearly_incremental_cost(merges_per_year: 2400, changed_units_per_merge: 5, chunk_multiplier: 2.5) ⇒ Float
Yearly embedding cost from incremental re-indexing.
    # File 'lib/woods/cost_model/embedding_cost.rb', line 62
    def yearly_incremental_cost(merges_per_year: 2400, changed_units_per_merge: 5, chunk_multiplier: 2.5)
      tokens_per_merge = total_tokens(changed_units_per_merge, chunk_multiplier)
      token_cost(tokens_per_merge * merges_per_year)
    end
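With the default arguments, the yearly token volume can be worked out by hand. The sketch below repeats the two steps from the method body using the documented defaults; it stops at the token count because `token_cost` is not shown on this page.

```ruby
TOKENS_PER_CHUNK = 450 # constant documented for this class

merges_per_year = 2400
changed_units_per_merge = 5
chunk_multiplier = 2.5

# Rounding happens once per merge: 12.5 chunks becomes 13.
chunks_per_merge = (changed_units_per_merge * chunk_multiplier).ceil
tokens_per_merge = chunks_per_merge * TOKENS_PER_CHUNK
yearly_tokens = tokens_per_merge * merges_per_year

puts yearly_tokens
```

Because the chunk count is rounded up per merge, the yearly total covers 31,200 chunks rather than the 30,000 that rounding once over the whole year would give.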