gte
gte is a Ruby gem with a Rust extension for fast text embeddings with ONNX Runtime.
Inspired by https://github.com/fbilhaut/gte-rs
Quick Start
require "gte"
model = GTE.new(ENV.fetch("GTE_MODEL_DIR"))
vector = model["query: hello world"]
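Embeddings are usually compared with cosine similarity. The sketch below is a hypothetical helper, not part of the gem. It assumes the value returned by model[...] can be treated as a plain Array of Floats, which the source does not confirm.

```ruby
# Hypothetical helper (not part of the gte gem).
# Cosine similarity between two equal-length numeric arrays.
# Assumption: an embedding can be used as (or converted to) an Array of Floats.
def cosine_similarity(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  # Treat a zero vector as having no similarity to anything.
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end
```

With two vectors obtained as in the Quick Start, cosine_similarity(first, second) would give a score between -1.0 and 1.0, provided both behave like arrays.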
For Puma or other thread pools, load the model once per process and reuse it across threads:
MODEL = GTE.new(ENV.fetch("GTE_MODEL_DIR"))
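Where loading at boot is not desirable, a lazily initialized holder is one alternative to a constant. The sketch below is a hypothetical helper, not part of the gem. The name ProcessLocal and the rebuild-after-fork behavior are assumptions for illustration; the source does not say whether a loaded model may be shared across a fork.

```ruby
# Hypothetical helper (not part of the gte gem): builds a value once per
# process and returns the same instance to every thread in that process.
class ProcessLocal
  def initialize(&loader)
    @loader = loader
    @mutex = Mutex.new
    @pid = nil
    @value = nil
  end

  def get
    @mutex.synchronize do
      # Build on first use, and again if the process ID changed (after a fork),
      # so that each worker process ends up with its own instance.
      if @pid != Process.pid
        @value = @loader.call
        @pid = Process.pid
      end
      @value
    end
  end
end

# Possible usage with the gem (left commented out so the sketch runs alone):
#   EMBEDDER = ProcessLocal.new { GTE.new(ENV.fetch("GTE_MODEL_DIR")) }
#   vector = EMBEDDER.get["query: hello world"]
```

The mutex only guards construction and lookup. Whether concurrent calls on the returned model are safe depends on the gem itself.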
Model Directory
A model directory must include tokenizer.json and one ONNX model, resolved in this order:
1. onnx/text_model.onnx
2. text_model.onnx
3. onnx/model.onnx
4. model.onnx
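The lookup order above can be sketched in plain Ruby. This is an illustration of the documented order only, not the gem's actual implementation (which lives in the Rust extension); resolve_model_files and ONNX_CANDIDATES are hypothetical names.

```ruby
require "pathname"

# Candidate ONNX paths, in the priority order documented above.
ONNX_CANDIDATES = %w[
  onnx/text_model.onnx
  text_model.onnx
  onnx/model.onnx
  model.onnx
].freeze

# Hypothetical helper: returns [tokenizer_path, onnx_path] for a model
# directory, or raises ArgumentError if a required file is missing.
def resolve_model_files(dir)
  root = Pathname.new(dir)
  tokenizer = root / "tokenizer.json"
  raise ArgumentError, "missing tokenizer.json in #{root}" unless tokenizer.file?

  # The first candidate that exists on disk wins.
  onnx = ONNX_CANDIDATES.map { |rel| root / rel }.find(&:file?)
  raise ArgumentError, "no ONNX model found in #{root}" unless onnx

  [tokenizer, onnx]
end
```

A check like this can be handy before deployment to confirm that a directory has the expected layout.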
Development
Run commands inside nix develop.
bundle exec rake compile
cargo test --manifest-path ext/gte/Cargo.toml --no-default-features
bundle exec rspec
Benchmark
The repo includes the following benchmark entry points:
bundle exec rake bench:pure_compare
bundle exec rake bench:puma_compare
bundle exec rake bench:matrix_sweep
bundle exec ruby bench/memory_probe.rb --compare-pure
For release tracking and regression detection, record a run entry in RUNS.md:
bundle exec rake bench:record_run