Class: OnnxRuntimeModel

Inherits: Object

Includes: EmbeddingModelProtocol

Defined in: lib/kotoshu/embeddings/onnx_runtime_model.rb

Overview

OnnxRuntimeModel - ONNX Runtime wrapper for FastText embeddings

Provides embedding inference using ONNX Runtime. Supports single lookups, batch inference, and vocabulary-aware operations.

Examples:

Single embedding lookup

model = OnnxRuntimeModel.from_file('fasttext.en.onnx', language_code: 'en')
model.load!
embedding = model.get_embedding(1234)

Batch lookup

embeddings = model.get_embeddings([1, 2, 3, 4, 5])

With vocabulary

embedding = model.get_embedding_for_word('hello', vocabulary)

Constant Summary

DEFAULT_DIMENSION

Default dimension for FastText models (300).
BATCH_SIZE

Batch size for batch inference.

Instance Attribute Summary

#dimension ⇒ Integer readonly

Embedding dimension.
#inference_count ⇒ Integer readonly

Number of inference calls.
#language_code ⇒ String readonly

Language code (ISO 639-1).
#loaded ⇒ Boolean readonly

Whether the model is loaded.
#onnx_path ⇒ String readonly

Path to ONNX model file.

Class Method Summary

.detect_language_from_path(path) ⇒ String

Detect language from file path.
.from_cache(language_code, cache = nil) ⇒ OnnxRuntimeModel?

Create model from cache.
.from_file(onnx_path, language_code: nil, dimension: nil) ⇒ OnnxRuntimeModel

Create model from file.

Instance Method Summary

#batch_size ⇒ Integer

Get batch size for batch inference.
#get_embedding(index) ⇒ Array<Float>

Get embedding for a single word index.
#get_embedding_for_word(word, vocabulary) ⇒ Array<Float>?

Get embedding for a word using vocabulary.
#get_embeddings(indices) ⇒ Array<Array<Float>>

Get embeddings for multiple indices (batched).
#get_embeddings_for_words(words, vocabulary) ⇒ Hash<String, Array<Float>>

Get embeddings for multiple words using vocabulary.
#initialize(language_code:, onnx_path:, dimension: DEFAULT_DIMENSION) ⇒ OnnxRuntimeModel constructor

Create a new ONNX Runtime model.
#load! ⇒ self

Load the ONNX model into memory.
#model_info ⇒ Hash

Get model information.
#model_type ⇒ String

Get model type identifier.
#preload_embeddings!(vocabulary) ⇒ Hash<Integer, Array<Float>>

Preload all embeddings into memory.
#ready? ⇒ Boolean

Check if model is ready for inference.
#supports_batching? ⇒ Boolean

Check if batching is supported.
#to_s ⇒ String (also: #inspect)

String representation.
#unload! ⇒ self

Unload the model from memory.

Methods included from Protocol

#assert_implemented_by!, #compliance_errors, #optional_methods, #required_methods

Constructor Details

#initialize(language_code:, onnx_path:, dimension: DEFAULT_DIMENSION) ⇒ `OnnxRuntimeModel`

Create a new ONNX Runtime model

Parameters:

language_code (String) —

ISO 639-1 language code
onnx_path (String) —

Path to .onnx file
dimension (Integer) (defaults to: DEFAULT_DIMENSION) —

Embedding dimension (default: 300)

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 51

def initialize(language_code:, onnx_path:, dimension: DEFAULT_DIMENSION)
  @language_code = language_code
  @onnx_path = onnx_path
  @dimension = dimension
  @session = nil
  @loaded = false
  @input_name = nil
  @output_name = nil
  @inference_count = 0
end

Instance Attribute Details

#dimension ⇒ `Integer` (readonly)

Returns the embedding dimension.

Returns:

(Integer) —

Embedding dimension



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 34

def dimension
  @dimension
end

#inference_count ⇒ `Integer` (readonly)

Returns the number of inference calls.

Returns:

(Integer) —

Number of inference calls



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 43

def inference_count
  @inference_count
end

#language_code ⇒ `String` (readonly)

Returns the language code (ISO 639-1).

Returns:

(String) —

Language code (ISO 639-1)



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 31

def language_code
  @language_code
end

#loaded ⇒ `Boolean` (readonly)

Returns whether the model is loaded.

Returns:

(Boolean) —

Whether the model is loaded



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 40

def loaded
  @loaded
end

#onnx_path ⇒ `String` (readonly)

Returns the path to the ONNX model file.

Returns:

(String) —

Path to ONNX model file



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 37

def onnx_path
  @onnx_path
end

Class Method Details

.detect_language_from_path(path) ⇒ `String`

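Detect language from file path

The language is taken from the first two-letter segment enclosed by dots in the file's basename (for example, `en` in `fasttext.en.onnx`). The match is case-insensitive and the result is downcased.

Parameters:

path (String) —

Path to the model file

Returns:

(String) —

Two-letter language code, or 'en' if the basename contains no such segment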

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 379

def self.detect_language_from_path(path)
  basename = File.basename(path)

  if basename =~ /\.([a-z]{2})\./i
    Regexp.last_match(1).downcase
  else
    'en'
  end
end
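
The pattern accepts any two letters between dots, so it helps to see how it behaves on a few names. The standalone sketch below copies the logic into a hypothetical helper, `detect_language`, so it can be run without the library; the sample paths are made up for illustration.

```ruby
# Standalone copy of the detection logic shown above.
# `detect_language` is a hypothetical helper, not part of the library.
def detect_language(path)
  basename = File.basename(path)
  match = basename.match(/\.([a-z]{2})\./i)
  match ? match[1].downcase : 'en'
end

samples = [
  'fasttext.en.onnx',         # two-letter segment between dots
  '/models/cc.DE.300.onnx',   # case-insensitive match, downcased result
  'embeddings.onnx',          # no two-letter segment: falls back to 'en'
  '/data.fr/embeddings.onnx'  # only the basename is inspected
]

samples.each do |path|
  puts "#{path} -> #{detect_language(path)}"
end
```

The detected code is not checked against a list of ISO 639-1 codes, so a name such as `model.qq.onnx` would yield `qq`. Passing `language_code:` explicitly to `.from_file` avoids relying on the file name.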

.from_cache(language_code, cache = nil) ⇒ `OnnxRuntimeModel`?

Create model from cache

Parameters:

language_code (String) —

ISO 639-1 language code
cache (Cache::ModelCache) (defaults to: nil) —

Cache instance

Returns:

(OnnxRuntimeModel, nil)

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 259

def self.from_cache(language_code, cache = nil)
  require_relative '../cache/model_cache'

  cache ||= Cache::ModelCache.new

  onnx_path = cache.get_onnx_model(language_code)
  return nil unless onnx_path

  from_file(onnx_path, language_code: language_code)
end

.from_file(onnx_path, language_code: nil, dimension: nil) ⇒ `OnnxRuntimeModel`

Create model from file

Parameters:

onnx_path (String) —

Path to .onnx file
language_code (String) (defaults to: nil) —

Language code (auto-detected if nil)
dimension (Integer) (defaults to: nil) —

Embedding dimension

Returns:

(OnnxRuntimeModel)

Raises:

(ArgumentError) —

if the ONNX file does not exist

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 240

def self.from_file(onnx_path, language_code: nil, dimension: nil)
  raise ArgumentError, "ONNX file not found: #{onnx_path}" unless File.exist?(onnx_path)

  language_code ||= detect_language_from_path(onnx_path)
  dimension ||= DEFAULT_DIMENSION

  new(
    language_code: language_code,
    onnx_path: onnx_path,
    dimension: dimension
  )
end

Instance Method Details

#batch_size ⇒ `Integer`

Get batch size for batch inference

Returns:

(Integer)



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 206

def batch_size
  BATCH_SIZE
end

#get_embedding(index) ⇒ `Array<Float>`

Get embedding for a single word index

Parameters:

index (Integer) —

Word index in vocabulary

Returns:

(Array<Float>) —

Embedding vector

Raises:

(RuntimeError) —

if model is not loaded
(ArgumentError) —

if index is invalid

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 114

def get_embedding(index)
  ensure_loaded

  raise ArgumentError, "Invalid word index: #{index}" unless valid_index?(index)

  output = @session.run(
    [@output_name],
    { @input_name => [index] }
  )

  @inference_count += 1

  extract_embedding(output.first)
end

#get_embedding_for_word(word, vocabulary) ⇒ `Array<Float>`?

Get embedding for a word using vocabulary

Parameters:

word (String) —

The word to lookup
vocabulary (Vocabulary) —

Vocabulary for word-to-index mapping

Returns:

(Array<Float>, nil) —

Embedding vector or nil if word not found

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 172

def get_embedding_for_word(word, vocabulary)
  index = vocabulary.lookup(word)
  return nil unless index

  get_embedding(index)
end

#get_embeddings(indices) ⇒ `Array<Array<Float>>`

Get embeddings for multiple indices (batched)

More efficient than individual calls for batch operations.

Parameters:

indices (Array<Integer>) —

Word indices

Returns:

(Array<Array<Float>>) —

Array of embedding vectors for the valid indices. Invalid indices are skipped, and nil or empty input yields an empty array.

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 136

def get_embeddings(indices)
  ensure_loaded
  return [] if indices.nil? || indices.empty?

  valid_indices = indices.select { |i| valid_index?(i) }
  return [] if valid_indices.empty?

  # Process in batches for memory efficiency
  valid_indices.each_slice(BATCH_SIZE).flat_map do |batch|
    run_batch_inference(batch)
  end
end
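
The slicing pattern used here can be tried in isolation. The sketch below is a simplified stand-in, not the library's code: `SKETCH_BATCH_SIZE`, `fake_batch_inference` and the validity check are assumptions that replace `BATCH_SIZE`, the private `run_batch_inference` helper and `valid_index?`, none of which are shown on this page.

```ruby
SKETCH_BATCH_SIZE = 3 # illustrative value only

# Hypothetical stand-in for run_batch_inference: records each batch and
# returns one fake 2-dimensional vector per index.
def fake_batch_inference(batch, calls)
  calls << batch
  batch.map { |i| [i.to_f, i * 0.5] }
end

def batched_embeddings(indices, calls)
  return [] if indices.nil? || indices.empty?

  # Assumed validity rule: non-negative integers only.
  valid = indices.select { |i| i.is_a?(Integer) && i >= 0 }
  return [] if valid.empty?

  # Slice into fixed-size batches and flatten the per-batch results.
  valid.each_slice(SKETCH_BATCH_SIZE).flat_map do |batch|
    fake_batch_inference(batch, calls)
  end
end

calls = []
vectors = batched_embeddings([0, 1, -5, 2, 3, 4, 5], calls)
puts "batches: #{calls.inspect}"
puts "vectors returned: #{vectors.length}"
```

Because invalid indices are dropped before slicing, the result can be shorter than the input. Callers that need each vector to line up with its input position should validate the indices first.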

#get_embeddings_for_words(words, vocabulary) ⇒ `Hash<String, Array<Float>>`

Get embeddings for multiple words using vocabulary

Parameters:

words (Array<String>) —

Words to lookup
vocabulary (Vocabulary) —

Vocabulary for word-to-index mapping

Returns:

(Hash<String, Array<Float>>) —

Word to embedding mapping

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 185

def get_embeddings_for_words(words, vocabulary)
  result = {}
  words.each do |word|
    embedding = get_embedding_for_word(word, vocabulary)
    result[word] = embedding if embedding
  end
  result
end
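
Words that the vocabulary does not know are left out of the result rather than mapped to nil. The sketch below shows that behaviour with hypothetical stand-ins: a plain Hash plays the role of the vocabulary, and `fake_embedding` replaces the model's inference call.

```ruby
# Hypothetical stand-ins for the vocabulary and the model.
VOCAB_SKETCH = { 'hello' => 0, 'world' => 1 }.freeze

def fake_embedding(index)
  [index.to_f, index + 0.5]
end

def embeddings_for_words(words, vocab)
  words.each_with_object({}) do |word, result|
    index = vocab[word] # nil when the word is unknown
    next unless index

    result[word] = fake_embedding(index)
  end
end

found = embeddings_for_words(%w[hello missing world], VOCAB_SKETCH)
puts found.keys.inspect
puts found.key?('missing')
```

To find out which words were skipped, compare the input list against the keys of the returned hash, for example with `words - found.keys`.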

#load! ⇒ `self`

Load the ONNX model into memory

Returns:

(self)

Raises:

(Kotoshu::Models::OnnxModel::OnnxUnavailable) —

if onnxruntime gem is missing
(ArgumentError) —

if model file doesn’t exist

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 69

def load!
  return self if @loaded

  raise Kotoshu::Models::OnnxModel::OnnxUnavailable unless Kotoshu::Models::OnnxModel::ONNX_LOADED

  raise ArgumentError, "ONNX file not found: #{@onnx_path}" unless File.exist?(@onnx_path)

  @session = OnnxRuntime::InferenceSession.new(@onnx_path)

  # Detect input/output names
  @input_name = detect_input_name
  @output_name = detect_output_name

  @loaded = true
  self
end
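
`load!` returns early when the model is already loaded and returns `self`, so calls can be repeated and chained. The sketch below mirrors that lifecycle in a simplified stand-in class; `LifecycleSketch` and its placeholder session object are hypothetical, and no ONNX Runtime session is created.

```ruby
# Simplified stand-in for the load!/unload!/ready? lifecycle.
class LifecycleSketch
  attr_reader :load_count

  def initialize
    @session = nil
    @loaded = false
    @load_count = 0
  end

  def load!
    return self if @loaded # repeated calls are no-ops

    @session = Object.new # placeholder for OnnxRuntime::InferenceSession
    @load_count += 1
    @loaded = true
    self
  end

  def unload!
    @session = nil
    @loaded = false
    self
  end

  def ready?
    @loaded && !@session.nil?
  end
end

model = LifecycleSketch.new
puts model.ready?             # not ready before loading
puts model.load!.load!.ready? # chaining works because load! returns self
puts model.load_count         # counts how many sessions were created
puts model.unload!.ready?     # not ready after unloading
```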

#model_info ⇒ `Hash`

Get model information

Returns:

(Hash) —

Model metadata with the keys :type, :language, :dimension, :path, :loaded and :inference_count

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 222

def model_info
  {
    type: 'onnx',
    language: @language_code,
    dimension: @dimension,
    path: @onnx_path,
    loaded: @loaded,
    inference_count: @inference_count
  }
end

#model_type ⇒ `String`

Get model type identifier

Returns:

(String)



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 214

def model_type
  'onnx'
end

#preload_embeddings!(vocabulary) ⇒ `Hash<Integer, Array<Float>>`

Preload all embeddings into memory

For small vocabularies, this provides O(1) lookup after loading.

Parameters:

vocabulary (Vocabulary) —

Vocabulary with complete word list

Returns:

(Hash<Integer, Array<Float>>) —

Index to embedding mapping

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 156

def preload_embeddings!(vocabulary)
  ensure_loaded

  all_indices = (0...vocabulary.size).to_a
  embeddings = get_embeddings(all_indices)

  # Build index mapping
  all_indices.zip(embeddings).to_h
end
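
The mapping is built by pairing each index with the vector at the same position, which relies on `get_embeddings` returning exactly one vector per index. Since `get_embeddings` filters its input through the private `valid_index?` check, the sketch below uses plain arrays (no library calls) to show what `Array#zip` produces in both cases. Whether an index in `0...vocabulary.size` can actually be rejected is not visible from this page.

```ruby
indices = [0, 1, 2, 3]

# Case 1: one vector per index, so the pairing is exact.
full = [[0.1], [0.2], [0.3], [0.4]]
aligned = indices.zip(full).to_h

# Case 2: suppose index 1 had been filtered out before inference.
# The result is one element short, so later vectors shift left and
# the last index is paired with nil.
short = [[0.1], [0.3], [0.4]]
shifted = indices.zip(short).to_h

puts aligned.inspect
puts shifted.inspect
```

A cheap sanity check after preloading is to confirm that the returned hash contains no nil values.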

#ready? ⇒ `Boolean`

Check if model is ready for inference

Returns:

(Boolean)



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 102

def ready?
  @loaded && !@session.nil?
end

#supports_batching? ⇒ `Boolean`

Check if batching is supported

Returns:

(Boolean)



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 198

def supports_batching?
  true
end

#to_s ⇒ `String` Also known as: inspect

String representation

Returns:

(String)



# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 274

def to_s
  "OnnxRuntimeModel(language: #{@language_code}, dimension: #{@dimension}, loaded: #{@loaded})"
end

#unload! ⇒ `self`

Unload the model from memory

Returns:

(self)

# File 'lib/kotoshu/embeddings/onnx_runtime_model.rb', line 90

def unload!
  @session = nil
  @input_name = nil
  @output_name = nil
  @loaded = false
  self
end
