pikuri-memory
Durable, cross-conversation memory for the pikuri AI-assistant toolkit: facts about the user and their work that persist across conversations, backed by mem0.
Wire it onto a pikuri-core agent the same way as pikuri-tasks /
pikuri-vectordb — c.add_extension inside the Agent.new block:
require 'pikuri-memory'
client = Pikuri::Memory::Mem0Client.new(endpoint: 'http://localhost:8888')
Pikuri::Agent.new(transport: ..., system_prompt: ...) do |c|
  c.add_extension Pikuri::Memory::Extension.new(client: client, user_id: 'martin')
end
What you get
Three retrieval tiers, the same layered shape pikuri-vectordb
uses (vectordb_search + vectordb_read):
- Resident persona: a small always-in-prompt summary of what the agent already knows about the user, appended once at construction.
- Automatic prefetch: every user turn is embedded and searched; a small, high-precision slice is injected as a :system <memory-context> block right after the turn.
- recall tool: explicit, topic-driven deepening when the agent wants more than the automatic slice surfaced.
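As a rough illustration of what a "small, high-precision slice" can mean for the prefetch tier, the sketch below keeps only hits above a score threshold and caps how many are injected. The Hit struct, prefetch_slice, and both default values are hypothetical and are not part of pikuri-memory's API; it also assumes that a higher score means a closer match.

```ruby
# Hypothetical search hit: a remembered fact plus a similarity score
# (assumed: higher score = more similar).
Hit = Struct.new(:text, :score)

# Keep only confident matches, best first, capped at `limit`.
# min_score and limit are illustrative defaults, not pikuri's values.
def prefetch_slice(hits, min_score: 0.75, limit: 3)
  hits.select { |h| h.score >= min_score }
      .sort_by { |h| -h.score }
      .first(limit)
end

hits = [
  Hit.new('User prefers Ruby', 0.90),
  Hit.new('User once mentioned Go', 0.50),
  Hit.new('User works on pikuri', 0.80),
  Hit.new('User is called Martin', 0.95),
  Hit.new('User runs llama.cpp locally', 0.76)
]

slice = prefetch_slice(hits)
puts slice.map(&:text)
```

Anything the threshold or the cap drops is still reachable through the explicit recall tool, which is why the automatic slice can afford to be strict.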
Recall is automatic and synchronous (a vector search takes milliseconds). Capture is automatic and asynchronous: a background worker drains user turns into mem0's extraction call (which takes seconds), so a turn never blocks on "what should I remember?".
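The asynchronous capture side can be pictured as a plain queue-plus-thread worker. This is a minimal sketch using only Ruby's built-in Queue and Thread; CaptureWorker and its methods are hypothetical names, and the sleep stands in for the slow extraction call. It is not the gem's actual worker.

```ruby
# Hypothetical sketch of a background capture worker.
class CaptureWorker
  def initialize(&extract)
    @queue = Queue.new
    @extract = extract
    @thread = Thread.new do
      # Queue#pop blocks until a turn arrives; nil is the shutdown sentinel.
      while (turn = @queue.pop)
        @extract.call(turn)
      end
    end
  end

  # Returns immediately; the slow work happens on the worker thread.
  def enqueue(turn)
    @queue << turn
    nil
  end

  # Process everything already queued, then stop the worker.
  def shutdown
    @queue << nil
    @thread.join
  end
end

captured = []
worker = CaptureWorker.new do |turn|
  sleep 0.05 # stand-in for a slow extraction call
  captured << turn
end

worker.enqueue('I prefer tabs over spaces')
worker.enqueue('My project is written in Ruby')
worker.shutdown
```

The caller only pays for a queue push, so the turn proceeds while extraction runs in the background; turns are processed in the order they were enqueued.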
Safety
Automatic capture and recall are safe only on an agent with no
untrusted ingest and no egress (pikuri's @private member): a
poisoned memory plus an outbound leg is the lethal trifecta. Two
structural defenses keep the capture pipeline honest:
- Only the user's own words are fed to extraction — assistant turns, tool results, and recalled context are never captured.
- Recalled context lands as :system, never as a user turn, so it cannot be re-extracted into a self-reinforcing feedback loop.
Do not port memory onto an egress-capable agent without re-deriving the recall-poisoning mitigations.
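The two structural defenses can be sketched as a role filter plus a :system wrapper. The message shape (a hash with :role and :content), the helper names, and the closing </memory-context> tag are assumptions made for illustration; they are not taken from pikuri's code.

```ruby
# Assumed message shape: { role: Symbol, content: String }.
CAPTURABLE_ROLES = %i[user].freeze

# Only the user's own words are eligible for extraction.
def capturable?(message)
  CAPTURABLE_ROLES.include?(message[:role])
end

# Recalled memories are wrapped as a :system message, so the filter
# above never feeds them back into extraction.
def memory_context_message(memories)
  body = memories.map { |m| "- #{m}" }.join("\n")
  { role: :system, content: "<memory-context>\n#{body}\n</memory-context>" }
end

conversation = [
  { role: :user, content: 'I live in Prague.' },
  memory_context_message(['User lives in Prague']),
  { role: :assistant, content: 'Noted.' },
  { role: :tool, content: '{"weather":"sunny"}' }
]

to_capture = conversation.select { |m| capturable?(m) }
puts to_capture.map { |m| m[:content] }
```

Because the filter is an allowlist of roles, a new message kind is excluded from capture by default until someone deliberately adds it.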
Storage: mem0 + Qdrant
Memory is stored in mem0 with
Qdrant as the vector backend (mem0's pgvector path has a top-k
inversion bug — it returns the farthest matches), a local
OpenAI-compatible LLM + embedder (llama.cpp via OPENAI_BASE_URL),
and a non-reasoning extraction model (e.g. Qwen2.5-7B-Instruct —
a thinking model burns its token budget on chain-of-thought and
returns empty JSON).
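To make the top-k inversion symptom concrete, the sketch below ranks a toy corpus by cosine distance both ways. It only illustrates what "returns the farthest matches" looks like; it is not mem0's code and says nothing about the cause of the bug. All names and vectors here are made up.

```ruby
# Cosine distance: near 0.0 for the same direction, larger = less similar.
def cosine_distance(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  1.0 - dot / (norm.call(a) * norm.call(b))
end

# Correct top-k: the k smallest distances.
def top_k(query, corpus, k)
  corpus.min_by(k) { |_id, vec| cosine_distance(query, vec) }.map(&:first)
end

# Inverted top-k: the k largest distances, i.e. the farthest matches.
def inverted_top_k(query, corpus, k)
  corpus.max_by(k) { |_id, vec| cosine_distance(query, vec) }.map(&:first)
end

corpus = {
  'coffee'  => [1.0, 0.1],
  'tea'     => [0.9, 0.3],
  'taxes'   => [-1.0, 0.2],
  'weather' => [0.0, 1.0]
}
query = [1.0, 0.0]

puts top_k(query, corpus, 2).inspect
puts inverted_top_k(query, corpus, 2).inspect
```

A check like this (store a few obviously related and unrelated facts, then confirm that the related ones come back first) is a cheap smoke test for any vector backend you bring yourself.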
Two ways to get a server:
- Let pikuri manage it.
Pikuri::Memory::Mem0Server is a self-managed sidecar supervisor (the same pattern, and the same Qdrant engine, as pikuri-vectordb's Server::Qdrant): it clones mem0 at a pinned commit, patches DEFAULT_CONFIG to use Qdrant, and brings a docker compose stack (mem0 + Qdrant + a ~5 MB socat relay, no Postgres) up through Pikuri::Subprocess.spawn. #client returns a Mem0Client pointed at it. A localhost router works as-is: the relay carries the container's LLM/embedder calls to the host's loopback over a unix socket, so no rootless-vs-rootful daemon configuration is needed (see Mem0Server's "router relay" yardoc section).
server = Pikuri::Memory::Mem0Server.ensure_running(
  router_url: 'http://localhost:8080/v1',
  llm_model: 'bartowski/Qwen2.5-7B-Instruct-GGUF:Q5_K_M',
  embedder_model: 'nomic-ai/nomic-embed-text-v1.5-GGUF:Q8_0'
)
Pikuri::Memory::Extension.new(client: server.client, user_id: 'martin')
Needs docker (with the compose plugin), git, and socat. The
first run builds the mem0 image (a few minutes); the clone lives under
~/.cache/pikuri/mem0/temp/git and the Qdrant corpus + memory
history under ~/.cache/pikuri/mem0/data/ (bind-mounted into the
containers), so subsequent runs are fast and the data persists
across restarts.
- Bring your own. Point Mem0Client.new(endpoint:) at a mem0 server you already run (configured as above). This skips the supervisor entirely.
Install
# Gemfile
gem 'pikuri-memory'
Depends only on pikuri-core.