Class: OnnxRuntimeModel
- Inherits:
- Object (hierarchy: Object > OnnxRuntimeModel)
- Includes:
- EmbeddingModelProtocol
- Defined in:
- lib/kotoshu/embeddings/onnx_runtime_model.rb
Overview
OnnxRuntimeModel - ONNX Runtime wrapper for FastText embeddings
Provides embedding inference using ONNX Runtime. Supports single lookups, batch inference, and vocabulary-aware operations.
Constant Summary
- DEFAULT_DIMENSION =
Default dimension for FastText models
300
- BATCH_SIZE =
Batch size for batch inference
32
Instance Attribute Summary
-
#dimension ⇒ Integer
readonly
Embedding dimension.
-
#inference_count ⇒ Integer
readonly
Number of inference calls.
-
#language_code ⇒ String
readonly
Language code (ISO 639-1).
-
#loaded ⇒ Boolean
readonly
Whether the model is loaded.
-
#onnx_path ⇒ String
readonly
Path to ONNX model file.
Class Method Summary
-
.detect_language_from_path(path) ⇒ String
Detect language from file path.
-
.from_cache(language_code, cache = nil) ⇒ OnnxRuntimeModel?
Create model from cache.
-
.from_file(onnx_path, language_code: nil, dimension: nil) ⇒ OnnxRuntimeModel
Create model from file.
Instance Method Summary
-
#batch_size ⇒ Integer
Get batch size for batch inference.
-
#get_embedding(index) ⇒ Array<Float>
Get embedding for a single word index.
-
#get_embedding_for_word(word, vocabulary) ⇒ Array<Float>?
Get embedding for a word using vocabulary.
-
#get_embeddings(indices) ⇒ Array<Array<Float>>
Get embeddings for multiple indices (batched).
-
#get_embeddings_for_words(words, vocabulary) ⇒ Hash<String, Array<Float>>
Get embeddings for multiple words using vocabulary.
-
#initialize(language_code:, onnx_path:, dimension: DEFAULT_DIMENSION) ⇒ OnnxRuntimeModel
constructor
Create a new ONNX Runtime model.
-
#load! ⇒ self
Load the ONNX model into memory.
-
#model_info ⇒ Hash
Get model information.
-
#model_type ⇒ String
Get model type identifier.
-
#preload_embeddings!(vocabulary) ⇒ Hash<Integer, Array<Float>>
Preload all embeddings into memory.
-
#ready? ⇒ Boolean
Check if model is ready for inference.
-
#supports_batching? ⇒ Boolean
Check if batching is supported.
-
#to_s ⇒ String
(also: #inspect)
String representation.
-
#unload! ⇒ self
Unload the model from memory.
Methods included from Protocol
#assert_implemented_by!, #compliance_errors, #optional_methods, #required_methods
Constructor Details
#initialize(language_code:, onnx_path:, dimension: DEFAULT_DIMENSION) ⇒ OnnxRuntimeModel
Create a new ONNX Runtime model
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 51
def initialize(language_code:, onnx_path:, dimension: DEFAULT_DIMENSION)
  @language_code = language_code
  @onnx_path = onnx_path
  @dimension = dimension
  @session = nil
  @loaded = false
  @input_name = nil
  @output_name = nil
  @inference_count = 0
end
Instance Attribute Details
#dimension ⇒ Integer (readonly)
Returns the embedding dimension.
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 34
def dimension
  @dimension
end
#inference_count ⇒ Integer (readonly)
Returns the number of inference calls.
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 43
def inference_count
  @inference_count
end
#language_code ⇒ String (readonly)
Returns the language code (ISO 639-1).
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 31
def language_code
  @language_code
end
#loaded ⇒ Boolean (readonly)
Returns whether the model is loaded.
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 40
def loaded
  @loaded
end
#onnx_path ⇒ String (readonly)
Returns the path to the ONNX model file.
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 37
def onnx_path
  @onnx_path
end
Class Method Details
.detect_language_from_path(path) ⇒ String
Detect language from file path
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 379
def self.detect_language_from_path(path)
  basename = File.basename(path)
  if basename =~ /\.([a-z]{2})\./i
    Regexp.last_match(1).downcase
  else
    'en'
  end
end
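The detection rule can be tried in isolation. The sketch below is a standalone copy of the method body; the file names are hypothetical and chosen only to exercise the match, the case-insensitive match, and the 'en' fallback. Note that any two-letter segment between dots matches, whether or not it is a real language code.

```ruby
# Standalone copy of the path-based detection logic, for experimentation only.
def detect_language_from_path(path)
  basename = File.basename(path)
  if basename =~ /\.([a-z]{2})\./i
    Regexp.last_match(1).downcase
  else
    'en'
  end
end

# Hypothetical file names (not files shipped with the library).
puts detect_language_from_path('/models/cc.de.300.onnx')   # de
puts detect_language_from_path('/models/fasttext.FR.onnx') # fr
puts detect_language_from_path('/models/model.onnx')       # en
```

Because the fallback is silent, passing language_code: explicitly to .from_file is the safer choice when file names do not follow a dotted naming scheme.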
.from_cache(language_code, cache = nil) ⇒ OnnxRuntimeModel?
Create model from cache
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 259
def self.from_cache(language_code, cache = nil)
  require_relative '../cache/model_cache'

  cache ||= Cache::ModelCache.new
  onnx_path = cache.get_onnx_model(language_code)

  return nil unless onnx_path

  from_file(onnx_path, language_code: language_code)
end
.from_file(onnx_path, language_code: nil, dimension: nil) ⇒ OnnxRuntimeModel
Create model from file
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 240
def self.from_file(onnx_path, language_code: nil, dimension: nil)
  raise ArgumentError, "ONNX file not found: #{onnx_path}" unless File.exist?(onnx_path)

  language_code ||= detect_language_from_path(onnx_path)
  dimension ||= DEFAULT_DIMENSION

  new(
    language_code: language_code,
    onnx_path: onnx_path,
    dimension: dimension
  )
end
Instance Method Details
#batch_size ⇒ Integer
Get batch size for batch inference
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 206
def batch_size
  BATCH_SIZE
end
#get_embedding(index) ⇒ Array<Float>
Get embedding for a single word index
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 114
def get_embedding(index)
  ensure_loaded

  raise ArgumentError, "Invalid word index: #{index}" unless valid_index?(index)

  output = @session.run(
    [@output_name],
    { @input_name => [index] }
  )

  @inference_count += 1

  (output.first)
end
#get_embedding_for_word(word, vocabulary) ⇒ Array<Float>?
Get embedding for a word using vocabulary
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 172
def get_embedding_for_word(word, vocabulary)
  index = vocabulary.lookup(word)
  return nil unless index

  get_embedding(index)
end
#get_embeddings(indices) ⇒ Array<Array<Float>>
Get embeddings for multiple indices (batched)
More efficient than individual calls for batch operations.
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 136
def get_embeddings(indices)
  ensure_loaded
  return [] if indices.nil? || indices.empty?

  valid_indices = indices.select { |i| valid_index?(i) }
  return [] if valid_indices.empty?

  # Process in batches for memory efficiency
  valid_indices.each_slice(BATCH_SIZE).flat_map do |batch|
    run_batch_inference(batch)
  end
end
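The filter-then-slice pattern used by #get_embeddings can be sketched without ONNX Runtime. In the sketch below, fake_batch_inference is a hypothetical stand-in for the private run_batch_inference helper, and the range check is an assumption about what valid_index? does; neither is taken from the library source.

```ruby
BATCH_SIZE = 32

# Hypothetical stand-in for run_batch_inference: returns one fake
# 3-dimensional vector per index and records the size of each batch.
def fake_batch_inference(batch, seen_batch_sizes)
  seen_batch_sizes << batch.size
  batch.map { |i| [i.to_f, 0.0, 0.0] }
end

# Mirrors the control flow of #get_embeddings. The validity check
# (non-negative Integer below vocab_size) is an assumption.
def embeddings_in_batches(indices, vocab_size, seen_batch_sizes)
  return [] if indices.nil? || indices.empty?

  valid = indices.select { |i| i.is_a?(Integer) && i >= 0 && i < vocab_size }
  return [] if valid.empty?

  valid.each_slice(BATCH_SIZE).flat_map do |batch|
    fake_batch_inference(batch, seen_batch_sizes)
  end
end

sizes = []
vectors = embeddings_in_batches((0...70).to_a + [-1, 999], 100, sizes)
puts vectors.length  # 70
puts sizes.inspect   # [32, 32, 6]
```

Invalid indices are dropped rather than reported, so the result can be shorter than the input and positions in the result do not necessarily line up with positions in the argument.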
#get_embeddings_for_words(words, vocabulary) ⇒ Hash<String, Array<Float>>
Get embeddings for multiple words using vocabulary
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 185
def get_embeddings_for_words(words, vocabulary)
  result = {}
  words.each do |word|
    embedding = get_embedding_for_word(word, vocabulary)
    result[word] = embedding if embedding
  end
  result
end
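A consequence of this method is that words missing from the vocabulary are absent from the returned hash instead of mapping to nil. The sketch below illustrates that with two hypothetical stand-ins: TinyVocabulary, which implements only the #lookup call the model relies on, and FakeModel, which returns made-up vectors in place of real inference.

```ruby
# Hypothetical minimal vocabulary; only #lookup is needed here.
class TinyVocabulary
  def initialize(words)
    @index = words.each_with_index.to_h
  end

  # Returns the word's index, or nil when the word is unknown.
  def lookup(word)
    @index[word]
  end
end

# Hypothetical stand-in for a loaded model with fake 2-dimensional vectors.
class FakeModel
  def get_embedding(index)
    [index.to_f, index * 2.0]
  end

  def get_embedding_for_word(word, vocabulary)
    index = vocabulary.lookup(word)
    return nil unless index

    get_embedding(index)
  end

  def get_embeddings_for_words(words, vocabulary)
    words.each_with_object({}) do |word, result|
      embedding = get_embedding_for_word(word, vocabulary)
      result[word] = embedding if embedding
    end
  end
end

vocabulary = TinyVocabulary.new(%w[cat dog])
result = FakeModel.new.get_embeddings_for_words(%w[cat zebra dog], vocabulary)
puts result.keys.inspect   # ["cat", "dog"]
puts result.key?('zebra')  # false
```

Callers that need to know which words were skipped can compare the input list against result.keys.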
#load! ⇒ self
Load the ONNX model into memory
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 69
def load!
  return self if @loaded

  raise Kotoshu::Models::OnnxModel::OnnxUnavailable unless Kotoshu::Models::OnnxModel::ONNX_LOADED

  raise ArgumentError, "ONNX file not found: #{@onnx_path}" unless File.exist?(@onnx_path)

  @session = OnnxRuntime::InferenceSession.new(@onnx_path)

  # Detect input/output names
  @input_name = detect_input_name
  @output_name = detect_output_name

  @loaded = true
  self
end
#model_info ⇒ Hash
Get model information
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 222
def model_info
  {
    type: 'onnx',
    language: @language_code,
    dimension: @dimension,
    path: @onnx_path,
    loaded: @loaded,
    inference_count: @inference_count
  }
end
#model_type ⇒ String
Get model type identifier
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 214
def model_type
  'onnx'
end
#preload_embeddings!(vocabulary) ⇒ Hash<Integer, Array<Float>>
Preload all embeddings into memory
For small vocabularies, this provides O(1) lookup after loading.
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 156
def preload_embeddings!(vocabulary)
  ensure_loaded

  all_indices = (0...vocabulary.size).to_a
  embeddings = get_embeddings(all_indices)

  # Build index mapping
  all_indices.zip(embeddings).to_h
end
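The table-building step is a plain zip followed by to_h, which the sketch below reproduces. fake_embeddings is a hypothetical stand-in for #get_embeddings, and the vocabulary size of 100_000 in the last line is an illustrative figure, not a number from the library.

```ruby
# Hypothetical stand-in for get_embeddings: one small fake vector per index.
def fake_embeddings(indices)
  indices.map { |i| [i * 0.5, i * 0.25] }
end

# Same zip-and-to_h step as preload_embeddings!, driven by a plain size.
def build_lookup_table(vocabulary_size)
  all_indices = (0...vocabulary_size).to_a
  all_indices.zip(fake_embeddings(all_indices)).to_h
end

table = build_lookup_table(4)
puts table[2].inspect  # [1.0, 0.5]
puts table.key?(4)     # false

# Rough size check before preloading: number of Float values kept in memory
# for a hypothetical 100_000-word vocabulary at DEFAULT_DIMENSION (300).
puts 100_000 * 300     # 30000000
```

The zip pairs indices and vectors by position, so the mapping is only correct when every index in 0...vocabulary.size passes the model's validity check and none are filtered out.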
#ready? ⇒ Boolean
Check if model is ready for inference
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 102
def ready?
  @loaded && !@session.nil?
end
#supports_batching? ⇒ Boolean
Check if batching is supported
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 198
def supports_batching?
  true
end
#to_s ⇒ String Also known as: inspect
String representation
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 274
def to_s
  "OnnxRuntimeModel(language: #{@language_code}, dimension: #{@dimension}, loaded: #{@loaded})"
end
#unload! ⇒ self
Unload the model from memory
# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 90
def unload!
  @session = nil
  @input_name = nil
  @output_name = nil
  @loaded = false
  self
end
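Taken together, #load!, #ready? and #unload! form a small state machine: loading is idempotent, and both mutators return self so calls can be chained. The sketch below mirrors only that state handling. LifecycleSketch and its load_count counter are hypothetical, and a plain Object stands in for OnnxRuntime::InferenceSession.

```ruby
# Hypothetical stand-in mirroring the load!/ready?/unload! state handling.
class LifecycleSketch
  attr_reader :loaded, :load_count

  def initialize
    @session = nil
    @loaded = false
    @load_count = 0 # illustration only; counts real (non-skipped) loads
  end

  def load!
    return self if @loaded # a second call is a no-op

    @session = Object.new  # placeholder for the inference session
    @load_count += 1
    @loaded = true
    self
  end

  def ready?
    @loaded && !@session.nil?
  end

  def unload!
    @session = nil
    @loaded = false
    self
  end
end

model = LifecycleSketch.new
puts model.ready?              # false
puts model.load!.load!.ready?  # true
puts model.load_count          # 1
puts model.unload!.ready?      # false
```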