robot_lab-document_store
Embedding-based semantic document search for the RobotLab LLM agent framework.
Caution: This gem is under active development. APIs may change without notice.
What it provides
RobotLab::DocumentStore is a thread-safe, in-memory vector store backed by fastembed embeddings and cosine similarity search. It supports:
- store(key, text) embeds and stores a document under a symbol key
- search(query, limit:) returns the top-N most similar documents by cosine similarity
- delete(key) and clear remove individual entries or wipe the store
- Asymmetric embedding: passage embeddings for storage, query embeddings for retrieval
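To make the operations listed above concrete, here is a minimal sketch of a mutex-guarded in-memory store ranked by cosine similarity. It is not the gem's implementation: the class name ToyVectorStore is hypothetical, and vectors are passed in by hand, whereas the real store computes them with fastembed. The sketch only illustrates how a lock can serialize access and how cosine scores can order results.

```ruby
# Toy illustration only, NOT the gem's implementation.
# Assumption: the caller supplies vectors directly instead of an embedding model.
class ToyVectorStore
  def initialize
    @entries = {}      # key (Symbol) => { text:, vector: }
    @lock = Mutex.new  # serializes access from multiple threads
  end

  def store(key, text, vector)
    @lock.synchronize { @entries[key] = { text: text, vector: vector } }
  end

  def delete(key)
    @lock.synchronize { @entries.delete(key) }
  end

  def clear
    @lock.synchronize { @entries.clear }
  end

  # Returns up to `limit` entries, highest cosine similarity first.
  def search(query_vector, limit: 5)
    snapshot = @lock.synchronize { @entries.dup }
    scored = snapshot.map do |key, entry|
      { key: key, text: entry[:text], score: cosine(query_vector, entry[:vector]) }
    end
    scored.sort_by { |r| -r[:score] }.first(limit)
  end

  private

  # Cosine similarity: dot product divided by the product of the vector lengths.
  def cosine(a, b)
    dot = a.zip(b).sum { |x, y| x * y }
    norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
    norm.zero? ? 0.0 : dot / norm
  end
end

toy = ToyVectorStore.new
toy.store(:alpha, "doc about Ruby", [1.0, 0.0])
toy.store(:beta, "doc about Python", [0.0, 1.0])
toy.store(:gamma, "doc about both", [1.0, 1.0])

top = toy.search([0.0, 2.0], limit: 2)
puts top.map { |r| r[:key] }.inspect
```

Because cosine similarity ignores vector length, the query [0.0, 2.0] matches :beta exactly even though its magnitude differs.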
Installation
Add to your Gemfile:
gem "robot_lab-document_store"
Quick Example
require "robot_lab/document_store"
store = RobotLab::DocumentStore.new
store.store(:alpha, "Ruby is a dynamic, open source programming language.")
store.store(:beta, "Python is widely used in data science and machine learning.")
store.store(:gamma, "JavaScript runs in the browser and on Node.js servers.")
results = store.search("What language is popular for AI?", limit: 2)
results.each do |r|
  puts "#{r[:key]} (score: #{"%.3f" % r[:score]})"
end
# => beta (score: 0.872)
# => alpha (score: 0.641)
Custom Model
store = RobotLab::DocumentStore.new(
  model_name: "BAAI/bge-small-en-v1.5"
)
The default model is "BAAI/bge-base-en-v1.5".
Using with RobotLab Robots
DocumentStore works well as in-memory retrieval for RAG (retrieval-augmented generation) workflows. Load documents at startup and pass relevant excerpts into robot context:
require "robot_lab"
require "robot_lab/document_store"
store = RobotLab::DocumentStore.new
store.store(:faq_1, "Our return policy allows returns within 30 days.")
store.store(:faq_2, "Shipping typically takes 3-5 business days.")
robot = RobotLab.build(
  name: "support",
  system_prompt: "You are a support agent. Use provided context to answer questions."
)
query = "How long do I have to return an item?"
chunks = store.search(query, limit: 2).map { |r| r[:text] }.join("\n")
result = robot.run("Context:\n#{chunks}\n\nQuestion: #{query}")
puts result.last_text_content
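One possible refinement of the prompt assembly above is to drop weak matches before they reach the robot. The sketch below assumes search results are hashes with :key, :text, and :score, as the examples in this README read them. The build_prompt helper, the min_score default of 0.5, and the sample scores are all hypothetical; a sensible threshold depends on the embedding model.

```ruby
# Hypothetical helper, not part of the gem: builds a prompt from search
# results and skips entries whose score falls below min_score.
def build_prompt(results, query, min_score: 0.5)
  relevant = results.select { |r| r[:score] >= min_score }
  # With no usable context, send the bare question.
  return "Question: #{query}" if relevant.empty?

  context = relevant.map { |r| "[#{r[:key]}] #{r[:text]}" }.join("\n")
  "Context:\n#{context}\n\nQuestion: #{query}"
end

# Made-up results standing in for the output of store.search.
sample_results = [
  { key: :faq_1, text: "Our return policy allows returns within 30 days.", score: 0.81 },
  { key: :faq_2, text: "Shipping typically takes 3-5 business days.", score: 0.32 }
]

prompt = build_prompt(sample_results, "How long do I have to return an item?")
puts prompt
```

Tagging each excerpt with its key keeps the context traceable, which helps when checking which document the robot's answer came from.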
License
MIT License - Copyright (c) 2025 Dewayne VanHoozer
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/MadBomber/robot_lab-document_store.