Class: Woods::CostModel::Estimator
- Inherits: Object
- Inheritance chain: Object, Woods::CostModel::Estimator
- Defined in: lib/woods/cost_model/estimator.rb
Overview
Unified cost estimator that combines embedding, storage, and query costs into a single breakdown for a given configuration.
Instance Attribute Summary
- #chunk_multiplier ⇒ Float (readonly): Average chunks per unit.
- #daily_queries ⇒ Integer (readonly): Number of retrieval queries per day.
- #dimensions ⇒ Integer (readonly): Embedding vector dimensions.
- #embedding_provider ⇒ Symbol (readonly): Embedding provider key.
- #units ⇒ Integer (readonly): Number of extracted units.
Instance Method Summary
- #full_index_cost ⇒ Float: Cost to embed the full codebase index.
- #incremental_per_merge_cost(changed_units: 5) ⇒ Float: Cost to re-embed a single merge (default 5 changed units).
- #initialize(units:, embedding_provider:, chunk_multiplier: 2.5, dimensions: nil, daily_queries: 100) ⇒ Estimator (constructor): A new instance of Estimator.
- #monthly_query_cost ⇒ Float: Monthly cost for query-time embedding.
- #storage_bytes ⇒ Integer: Total storage in bytes for vector data.
- #storage_mb ⇒ Float: Total storage in megabytes for vector data.
- #to_h ⇒ Hash{Symbol => Numeric}: Full cost breakdown as a Hash.
- #total_chunks ⇒ Integer: Total number of chunks for the codebase.
- #yearly_incremental_cost(merges_per_year: 2400) ⇒ Float: Yearly embedding cost from incremental re-indexing.
Constructor Details
#initialize(units:, embedding_provider:, chunk_multiplier: 2.5, dimensions: nil, daily_queries: 100) ⇒ Estimator
Returns a new instance of Estimator.
# File 'lib/woods/cost_model/estimator.rb', line 42
def initialize(units:, embedding_provider:, chunk_multiplier: 2.5, dimensions: nil, daily_queries: 100)
  @units = units
  @chunk_multiplier = chunk_multiplier
  @embedding_provider = embedding_provider
  @dimensions = dimensions || ProviderPricing.default_dimensions(embedding_provider)
  @daily_queries = daily_queries
  @embedding_cost = EmbeddingCost.new(provider: embedding_provider)
  @storage_cost = StorageCost.new(dimensions: @dimensions)
end
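The constructor falls back to a provider-specific default when `dimensions` is nil. The following is a minimal sketch of that fallback only. `SketchPricing`, its provider keys, and its dimension values are hypothetical stand-ins, since the real `ProviderPricing` table is not shown on this page.

```ruby
# Hypothetical stand-in for ProviderPricing. The provider keys and dimension
# values are illustrative assumptions, not the library's real table.
module SketchPricing
  DEFAULT_DIMENSIONS = { example_small: 1536, example_large: 3072 }.freeze

  def self.default_dimensions(provider)
    DEFAULT_DIMENSIONS.fetch(provider) # raises KeyError for unknown providers
  end
end

# Mirrors the constructor's `dimensions || default` fallback.
def resolve_dimensions(provider:, dimensions: nil)
  dimensions || SketchPricing.default_dimensions(provider)
end

puts resolve_dimensions(provider: :example_small)                  # table default
puts resolve_dimensions(provider: :example_small, dimensions: 256) # explicit override wins
```

Because the resolved value is what gets passed to `StorageCost.new`, an explicit `dimensions:` argument changes the storage estimate without changing the provider used for embedding costs.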
Instance Attribute Details
#chunk_multiplier ⇒ Float (readonly)
Returns the average number of chunks per unit.
# File 'lib/woods/cost_model/estimator.rb', line 26
def chunk_multiplier
  @chunk_multiplier
end
#daily_queries ⇒ Integer (readonly)
Returns the number of retrieval queries per day.
# File 'lib/woods/cost_model/estimator.rb', line 35
def daily_queries
  @daily_queries
end
#dimensions ⇒ Integer (readonly)
Returns the embedding vector dimensions.
# File 'lib/woods/cost_model/estimator.rb', line 32
def dimensions
  @dimensions
end
#embedding_provider ⇒ Symbol (readonly)
Returns the embedding provider key.
# File 'lib/woods/cost_model/estimator.rb', line 29
def embedding_provider
  @embedding_provider
end
#units ⇒ Integer (readonly)
Returns the number of extracted units.
# File 'lib/woods/cost_model/estimator.rb', line 23
def units
  @units
end
Instance Method Details
#full_index_cost ⇒ Float
Cost to embed the full codebase index.
# File 'lib/woods/cost_model/estimator.rb', line 56
def full_index_cost
  @embedding_cost.full_index_cost(units: units, chunk_multiplier: chunk_multiplier)
end
#incremental_per_merge_cost(changed_units: 5) ⇒ Float
Cost to re-embed a single merge (default 5 changed units).
# File 'lib/woods/cost_model/estimator.rb', line 64
def incremental_per_merge_cost(changed_units: 5)
  @embedding_cost.incremental_cost(changed_units: changed_units, chunk_multiplier: chunk_multiplier)
end
#monthly_query_cost ⇒ Float
Monthly cost for query-time embedding.
# File 'lib/woods/cost_model/estimator.rb', line 71
def monthly_query_cost
  @embedding_cost.monthly_query_cost(daily_queries: daily_queries)
end
#storage_bytes ⇒ Integer
Total storage in bytes for vector data.
# File 'lib/woods/cost_model/estimator.rb', line 96
def storage_bytes
  @storage_cost.storage_bytes(chunks: total_chunks)
end
#storage_mb ⇒ Float
Total storage in megabytes for vector data.
# File 'lib/woods/cost_model/estimator.rb', line 103
def storage_mb
  @storage_cost.storage_mb(chunks: total_chunks)
end
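Both storage methods delegate to `StorageCost`, whose formula is not shown on this page. As a rough sketch of what such an estimate might look like, the example below assumes each vector dimension is stored as a 4-byte float32 with no index or metadata overhead, and that a megabyte is 1024 × 1024 bytes. `StorageSketch`, those assumptions, and the 1536-dimension figure are all hypothetical.

```ruby
# Assumption: each dimension is stored as a 4-byte float32 and nothing else
# (no index or metadata overhead). StorageCost may use a different formula.
module StorageSketch
  BYTES_PER_DIMENSION = 4

  def self.storage_bytes(chunks:, dimensions:)
    chunks * dimensions * BYTES_PER_DIMENSION
  end

  def self.storage_mb(chunks:, dimensions:)
    storage_bytes(chunks: chunks, dimensions: dimensions) / (1024.0 * 1024.0)
  end
end

chunks = (1000 * 2.5).ceil # total_chunks for 1,000 units at the default multiplier
puts StorageSketch.storage_bytes(chunks: chunks, dimensions: 1536)
puts StorageSketch.storage_mb(chunks: chunks, dimensions: 1536).round(2)
```

Under these assumptions storage scales linearly with both chunk count and dimensions, so halving the dimensions halves the estimate.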
#to_h ⇒ Hash{Symbol => Numeric}
Full cost breakdown as a Hash.
# File 'lib/woods/cost_model/estimator.rb', line 110
def to_h
  {
    full_index_cost: full_index_cost,
    incremental_per_merge_cost: incremental_per_merge_cost,
    monthly_query_cost: monthly_query_cost,
    yearly_incremental_cost: yearly_incremental_cost,
    storage_bytes: storage_bytes,
    storage_mb: storage_mb,
    total_chunks: total_chunks,
    units: units,
    chunk_multiplier: chunk_multiplier,
    embedding_provider: embedding_provider,
    dimensions: dimensions,
    daily_queries: daily_queries
  }
end
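A hash shaped like the one `#to_h` returns is easy to turn into a report. The sketch below is one possible formatter, not part of the library. `BreakdownReport` is a hypothetical helper, the sample values are made up for illustration, and the assumption that the four `*_cost` keys hold dollar amounts is mine. Note that `embedding_provider` is a Symbol rather than a Numeric, so non-cost values are rendered with `to_s`.

```ruby
# Hypothetical formatter for a hash with the same keys as Estimator#to_h.
module BreakdownReport
  # Keys assumed to hold dollar amounts; everything else is printed as-is.
  COST_KEYS = %i[
    full_index_cost incremental_per_merge_cost
    monthly_query_cost yearly_incremental_cost
  ].freeze

  def self.render(breakdown)
    breakdown.map do |key, value|
      text = COST_KEYS.include?(key) ? format('$%.4f', value) : value.to_s
      "#{key}: #{text}"
    end
  end
end

# Made-up sample values, used only to exercise the formatter.
sample = { full_index_cost: 0.025, total_chunks: 2500, embedding_provider: :example_small }
puts BreakdownReport.render(sample)
```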
#total_chunks ⇒ Integer
Total number of chunks for the codebase.
# File 'lib/woods/cost_model/estimator.rb', line 89
def total_chunks
  @total_chunks ||= (units * chunk_multiplier).ceil
end
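The chunk count is the product of `units` and `chunk_multiplier`, rounded up, so a fractional chunk always counts as a whole one. A small standalone sketch of the same arithmetic, with `ChunkSketch` as a hypothetical wrapper:

```ruby
# Same arithmetic as Estimator#total_chunks: multiply, then round up.
module ChunkSketch
  def self.total_chunks(units, chunk_multiplier)
    (units * chunk_multiplier).ceil
  end
end

puts ChunkSketch.total_chunks(1000, 2.5) # => 2500
puts ChunkSketch.total_chunks(1001, 2.5) # => 2503 (2502.5 rounds up)
puts ChunkSketch.total_chunks(3, 2.5)    # => 8 (7.5 rounds up)
```

In the estimator the result is memoized with `||=`, which is safe here because `units` and `chunk_multiplier` are read-only after construction.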
#yearly_incremental_cost(merges_per_year: 2400) ⇒ Float
Yearly embedding cost from incremental re-indexing.
# File 'lib/woods/cost_model/estimator.rb', line 79
def yearly_incremental_cost(merges_per_year: 2400)
  @embedding_cost.yearly_incremental_cost(
    merges_per_year: merges_per_year,
    chunk_multiplier: chunk_multiplier
  )
end
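The estimator forwards only `merges_per_year` and `chunk_multiplier` here, so the number of changed units per merge presumably comes from a default inside `EmbeddingCost`. The sketch below shows one plausible shape for that calculation: the per-merge cost multiplied by the merge count. `IncrementalSketch`, the tokens-per-chunk figure, and the price are all hypothetical, since the real pricing logic is not shown on this page.

```ruby
# All constants are illustrative assumptions, not values from EmbeddingCost.
module IncrementalSketch
  TOKENS_PER_CHUNK = 500          # assumed average chunk size in tokens
  PRICE_PER_MILLION_TOKENS = 0.02 # assumed price in USD

  # Cost to re-embed the chunks touched by one merge.
  def self.per_merge_cost(changed_units: 5, chunk_multiplier: 2.5)
    chunks = (changed_units * chunk_multiplier).ceil
    chunks * TOKENS_PER_CHUNK * PRICE_PER_MILLION_TOKENS / 1_000_000.0
  end

  # Yearly cost: per-merge cost scaled by the number of merges.
  def self.yearly_incremental_cost(merges_per_year: 2400, chunk_multiplier: 2.5)
    per_merge_cost(chunk_multiplier: chunk_multiplier) * merges_per_year
  end
end

puts IncrementalSketch.per_merge_cost.round(6)
puts IncrementalSketch.yearly_incremental_cost.round(4)
```

With these made-up numbers, 5 changed units become 13 chunks per merge, and the yearly figure is simply that per-merge cost times 2400.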