Class: Leann::Embedding::FastEmbed

Inherits:
Base
  • Object
Defined in:
lib/leann/embedding/fastembed.rb

Overview

FastEmbed provider for local embeddings

Uses ONNX Runtime for fast, local embedding generation without requiring an API key or external service.

Examples:

provider = Leann::Embedding::FastEmbed.new(model: "BAAI/bge-small-en-v1.5")
embeddings = provider.compute(["Hello", "World"])

Constant Summary

MAX_BATCH_SIZE = 64
Supported models with their dimensions

MODELS = {
  "BAAI/bge-small-en-v1.5" => 384,
  "BAAI/bge-base-en-v1.5" => 768,
  "intfloat/multilingual-e5-small" => 384,
  "nomic-ai/nomic-embed-text-v1.5" => 768
}.freeze
DEFAULT_MODEL = "BAAI/bge-small-en-v1.5"

Instance Attribute Summary

Attributes inherited from Base

#model

Instance Method Summary

Methods inherited from Base

#compute_one

Constructor Details

#initialize(model: nil, cache_dir: nil, threads: nil) ⇒ FastEmbed

Returns a new instance of FastEmbed.

Parameters:

  • model (String, nil) (defaults to: nil)

    FastEmbed model name; DEFAULT_MODEL is used when nil

  • cache_dir (String, nil) (defaults to: nil)

    Model cache directory; falls back to ENV["FASTEMBED_CACHE_PATH"] when nil

  • threads (Integer, nil) (defaults to: nil)

    Number of ONNX threads



# File 'lib/leann/embedding/fastembed.rb', line 32

def initialize(model: nil, cache_dir: nil, threads: nil)
  model ||= DEFAULT_MODEL
  super(model: model)

  @cache_dir = cache_dir || ENV["FASTEMBED_CACHE_PATH"]
  @threads = threads
  @client = nil

  check_gem!
end
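
The cache directory precedence in the constructor can be checked in isolation. The following is a minimal standalone sketch, not Leann code: `resolve_cache_dir` is a hypothetical helper that mirrors the `cache_dir || ENV["FASTEMBED_CACHE_PATH"]` line, and the environment is passed in as a plain hash so the example does not depend on the real process environment.

```ruby
# Hypothetical helper (not part of Leann): reproduces the constructor's
# `cache_dir || ENV["FASTEMBED_CACHE_PATH"]` precedence.
# `env` defaults to ENV but accepts any hash-like object for illustration.
def resolve_cache_dir(cache_dir, env = ENV)
  cache_dir || env["FASTEMBED_CACHE_PATH"]
end

env_with_path = { "FASTEMBED_CACHE_PATH" => "/var/cache/fastembed" }

# An explicit argument wins over the environment variable.
explicit = resolve_cache_dir("/tmp/models", env_with_path)

# With no argument, the environment variable is used.
from_env = resolve_cache_dir(nil, env_with_path)

# With neither set, the result is nil.
unset = resolve_cache_dir(nil, {})
```

Because `||` only falls through on `nil` or `false`, an empty string passed as `cache_dir` would be kept as-is rather than replaced by the environment value.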

Instance Method Details

#compute(texts) ⇒ Array<Array<Float>>

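Computes embeddings for the given texts in batches of at most MAX_BATCH_SIZE. Returns an empty array for empty input. Prints one progress dot per batch to standard output, followed by a summary line when at least MAX_BATCH_SIZE texts were given.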

Parameters:

  • texts (Array<String>)

Returns:

  • (Array<Array<Float>>)


# File 'lib/leann/embedding/fastembed.rb', line 47

def compute(texts)
  return [] if texts.empty?

  all_embeddings = []

  in_batches(texts, MAX_BATCH_SIZE) do |batch|
    batch_embeddings = compute_batch(batch)
    all_embeddings.concat(batch_embeddings)
    print "." # Progress indicator
  end

  puts " Done! (#{all_embeddings.size} embeddings)" unless texts.size < MAX_BATCH_SIZE

  # FastEmbed returns normalized vectors by default
  all_embeddings
end
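
The batching in `compute` can be illustrated without the gem. This is a minimal sketch under two stated assumptions: `in_batches`, which is defined outside this page, is taken to behave like `Enumerable#each_slice`; and `compute_batch` is replaced by a stub that returns one fake vector per text. The name `compute_quietly` is hypothetical, and the sketch omits the progress output.

```ruby
# Assumed stand-in for the in_batches helper: yields slices of at most
# `size` items.
def in_batches(items, size, &block)
  items.each_slice(size, &block)
end

# Stub for compute_batch: returns one fake 3-dimensional vector per text
# instead of calling an embedding model.
def compute_batch(batch)
  batch.map { |text| [text.length.to_f, 0.0, 0.0] }
end

# Same control flow as #compute, without the progress output.
def compute_quietly(texts, batch_size: 64)
  return [] if texts.empty?

  all_embeddings = []
  in_batches(texts, batch_size) do |batch|
    all_embeddings.concat(compute_batch(batch))
  end
  all_embeddings
end

texts = Array.new(130) { |i| "doc #{i}" }

# Record how the 130 texts are split with a batch size of 64.
batch_sizes = []
in_batches(texts, 64) { |batch| batch_sizes << batch.size }

embeddings = compute_quietly(texts)
```

With 130 texts and a batch size of 64, the input is split into batches of 64, 64, and 2, and the results are concatenated in input order, so `embeddings[i]` corresponds to `texts[i]`.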

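#dimensions ⇒ Integer

Returns the embedding dimensions for the configured model: the MODELS entry when the model is listed, otherwise the result of detect_dimensions. The value is memoized after the first call.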

Returns:

  • (Integer)


# File 'lib/leann/embedding/fastembed.rb', line 66

def dimensions
  @dimensions ||= MODELS[model] || detect_dimensions
end
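
The lookup-then-detect pattern in `dimensions` can be sketched as a standalone class. `detect_dimensions` is not shown on this page, so the strategy used here (embed one short text and measure the vector length) is an assumption, and `DimensionProbe`, `KNOWN_DIMENSIONS`, and the embedder block are hypothetical names introduced only for illustration.

```ruby
# Hypothetical class (not part of Leann) showing memoized lookup with a
# detection fallback.
class DimensionProbe
  KNOWN_DIMENSIONS = {
    "BAAI/bge-small-en-v1.5" => 384,
    "BAAI/bge-base-en-v1.5" => 768
  }.freeze

  attr_reader :probe_count

  # The block stands in for an embedding call: it receives an array of
  # texts and returns one vector per text.
  def initialize(model, &embedder)
    @model = model
    @embedder = embedder
    @probe_count = 0
  end

  # Table lookup first, detection second; the result is cached by ||=.
  def dimensions
    @dimensions ||= KNOWN_DIMENSIONS[@model] || detect_dimensions
  end

  private

  # Assumed detection strategy: embed a single probe text and count the
  # elements of the resulting vector.
  def detect_dimensions
    @probe_count += 1
    @embedder.call(["probe"]).first.size
  end
end

# A listed model never touches the embedder.
known = DimensionProbe.new("BAAI/bge-small-en-v1.5") { |_texts| raise "not expected" }
known_dims = known.dimensions

# An unlisted model is probed once, then served from the cached value.
custom = DimensionProbe.new("example/custom-model") do |texts|
  texts.map { Array.new(512, 0.0) }
end
custom_dims = [custom.dimensions, custom.dimensions]
```

The memoization matters mainly for unlisted models, where each detection would otherwise cost an extra embedding call.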