gte

gte is a Ruby gem with a Rust extension that computes text embeddings quickly using ONNX Runtime. It is inspired by https://github.com/fbilhaut/gte-rs

Quick Start

require "gte"

model = GTE.new(ENV.fetch("GTE_MODEL_DIR"))
vector = model["query: hello world"]

For Puma or other threaded servers, load the model once per process and reuse that instance across threads:

MODEL = GTE.new(ENV.fetch("GTE_MODEL_DIR"))
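
If loading at boot is undesirable, the same process-local reuse can be sketched with a lazily built, mutex-guarded instance. This is only an illustration: build_lazy_loader is a hypothetical helper, not part of the gem, and it assumes the model object is safe to share between threads, as the constant above implies.

```ruby
# Hypothetical helper (not part of the gem): returns a lambda that runs the
# given block at most once per process and then hands back the same object.
def build_lazy_loader(&loader)
  lock = Mutex.new
  instance = nil
  # The mutex ensures two threads cannot both run the loader on first use.
  -> { lock.synchronize { instance ||= loader.call } }
end

# Assumed usage with the constructor from Quick Start:
#   embedder = build_lazy_loader { GTE.new(ENV.fetch("GTE_MODEL_DIR")) }
#   vector = embedder.call["query: hello world"]
```

The first caller pays the load cost; every later call, from any thread, gets the already-built instance.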

Model Directory

A model directory must include tokenizer.json and an ONNX model file. The model file is looked up at the following paths, in this order:

  1. onnx/text_model.onnx
  2. text_model.onnx
  3. onnx/model.onnx
  4. model.onnx
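
The lookup order above can be sketched in plain Ruby to check a directory before handing it to the gem. This is a minimal sketch, not the gem's actual implementation: resolve_model_path is a hypothetical helper, and the error messages are made up for illustration.

```ruby
# Hypothetical pre-flight check (not part of the gem): returns the first
# model file found in the documented order, or raises if the directory
# is incomplete.
def resolve_model_path(dir)
  # Candidate paths, relative to the model directory, in lookup order.
  candidates = [
    "onnx/text_model.onnx",
    "text_model.onnx",
    "onnx/model.onnx",
    "model.onnx"
  ]

  tokenizer = File.join(dir, "tokenizer.json")
  raise ArgumentError, "missing tokenizer.json in #{dir}" unless File.file?(tokenizer)

  found = candidates.map { |rel| File.join(dir, rel) }.find { |path| File.file?(path) }
  raise ArgumentError, "no ONNX model found in #{dir}" if found.nil?

  found
end
```

Calling resolve_model_path(ENV.fetch("GTE_MODEL_DIR")) before GTE.new gives an early, readable error when a file is missing.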

Development

Run the following commands inside a nix develop shell:

bundle exec rake compile
cargo test --manifest-path ext/gte/Cargo.toml --no-default-features
bundle exec rspec

Benchmark

The repo includes several benchmark entry points:

bundle exec rake bench:pure_compare
bundle exec rake bench:puma_compare
bundle exec rake bench:matrix_sweep
bundle exec ruby bench/memory_probe.rb --compare-pure

For release tracking and regression detection, record a run entry in RUNS.md:

bundle exec rake bench:record_run