# Remina
Production-grade memory infrastructure for AI systems.
Remina provides battle-tested algorithms for memory management with full infrastructure flexibility. Think of it as SQLAlchemy for AI memory — the core logic is handled for you, while storage backends, vector databases, and embedding models remain fully pluggable.
## The Problem

AI applications without persistent memory run into compounding problems:
- Lost context — Conversations reset, users repeat themselves, personalization degrades
- Vendor lock-in — Existing solutions force specific infrastructure choices
- Scaling limitations — Naive approaches fail under production memory volumes
- Memory quality — No intelligent consolidation, deduplication, or relevance scoring
Remina addresses these at the infrastructure layer with a pluggable architecture and sophisticated memory algorithms.
## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                        Remina Memory                        │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                     Core Algorithms                     │ │
│ │  • Importance Scoring      • Fact Extraction            │ │
│ │  • Hybrid Retrieval        • Deduplication              │ │
│ └─────────────────────────────────────────────────────────┘ │
│                                                             │
│ ┌───────────┐  ┌───────────────┐  ┌───────────────────┐    │
│ │ L1 Cache  │  │  L2 Storage   │  │   Vector Store    │    │
│ │  (Redis)  │  │  (Pluggable)  │  │    (Pluggable)    │    │
│ │   FIXED   │  │               │  │                   │    │
│ └───────────┘  └───────────────┘  └───────────────────┘    │
│                                                             │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                Embedding + LLM Providers                │ │
│ │                       (Pluggable)                       │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
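The two-tier cache in the diagram follows a standard read-through pattern. The sketch below illustrates it with Redis as L1 and SQLite standing in for the pluggable L2; it is a generic illustration of the pattern, not Remina's actual internals, and every name in it is hypothetical.

```python
import json
import sqlite3

import redis  # L1 cache; assumes a local Redis instance

# Generic read-through sketch of the two-tier layout above, not Remina's
# real code. L1 (Redis) serves hot-path reads; L2 (here SQLite, but any
# pluggable backend) is the durable store of record.
r = redis.Redis()
db = sqlite3.connect("memories.db")
db.execute("CREATE TABLE IF NOT EXISTS memories (id TEXT PRIMARY KEY, payload TEXT)")

def get_memory(memory_id: str, ttl: int = 300) -> dict | None:
    cached = r.get(f"memory:{memory_id}")
    if cached is not None:
        return json.loads(cached)  # L1 hit: skip the durable store entirely
    row = db.execute(
        "SELECT payload FROM memories WHERE id = ?", (memory_id,)
    ).fetchone()
    if row is None:
        return None
    r.set(f"memory:{memory_id}", row[0], ex=ttl)  # populate L1 with a TTL
    return json.loads(row[0])
```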
## Capabilities

- Pluggable Architecture — Swap storage, vector DB, embeddings, and LLM providers independently
- Two-Tier Caching — Redis L1 for hot-path access, pluggable L2 for persistence
- Hybrid Retrieval — Combines vector similarity, temporal decay, and importance weighting (sketched after this list)
- Automatic Extraction — LLM-powered fact extraction from conversations
- Deduplication — Similarity-based duplicate prevention (see the same sketch)
- Importance Scoring — Relevance ranking based on recency, frequency, and base importance
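As a rough illustration of how these signals can combine, here is a sketch of a hybrid relevance score plus a similarity-threshold duplicate check. The weights, half-life, and threshold are illustrative assumptions, not Remina's documented values.

```python
import time

# Illustrative values only; Remina's real weights and thresholds are not
# documented here.
W_SIM, W_RECENCY, W_IMPORTANCE = 0.6, 0.25, 0.15
HALF_LIFE_DAYS = 30       # assumed temporal-decay half-life
DEDUP_THRESHOLD = 0.9     # assumed similarity cutoff for duplicates

def hybrid_score(similarity: float, last_accessed: float, importance: float) -> float:
    """Blend vector similarity, temporal decay, and importance into one rank.

    similarity    -- cosine similarity in [0, 1] from the vector store
    last_accessed -- Unix timestamp of the memory's last access
    importance    -- base importance in [0, 1] assigned at write time
    """
    age_days = (time.time() - last_accessed) / 86_400
    recency = 0.5 ** (age_days / HALF_LIFE_DAYS)  # exponential decay
    return W_SIM * similarity + W_RECENCY * recency + W_IMPORTANCE * importance

def is_duplicate(similarities: list[float]) -> bool:
    """Reject a candidate memory if it is near-identical to a stored one."""
    return any(s >= DEDUP_THRESHOLD for s in similarities)
```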
## Usage

```python
from remina import Memory

# Initialize with defaults (SQLite + Chroma + OpenAI)
memory = Memory()

# Extract and store facts from text
result = memory.add(
    messages="I'm John. I work as a software engineer at Google.",
    user_id="user_123",
)
# Extracts: ["Name is John", "Works as a software engineer at Google"]

# Semantic search
results = memory.search(
    query="What is the user's profession?",
    user_id="user_123",
)

# Retrieve all memories
all_memories = memory.get_all(user_id="user_123")
```
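A common next step is to feed retrieved memories back into the model prompt. The snippet below continues the example above as a minimal sketch; the shape of each search hit (a dict with a `memory` text field) is an assumption, so verify it against the actual return value of `memory.search()`.

```python
# Sketch: ground an LLM reply in retrieved memories.
# Assumption: each hit is a dict with a "memory" text field; check the
# actual return shape of memory.search() before relying on this.
results = memory.search(query="What is the user's profession?", user_id="user_123")
facts = "\n".join(f"- {hit['memory']}" for hit in results)

prompt = (
    "Answer using the known facts about the user.\n"
    f"Known facts:\n{facts}\n\n"
    "Question: What is the user's profession?"
)
```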
## Supported Providers

| Category | Providers |
|---|---|
| Storage (L2) | SQLite, PostgreSQL, MongoDB |
| Vector Stores | Chroma, Pinecone, Qdrant, pgvector |
| Embeddings | OpenAI, Gemini, Cohere, Ollama, HuggingFace |
| LLMs | OpenAI, Gemini, Anthropic, Ollama |
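Because every layer is pluggable, providers can be mixed per deployment. The configuration below is hypothetical: the `config=` parameter and key names are assumptions modeled on the defaults (SQLite + Chroma + OpenAI), not confirmed Remina API; it only illustrates swapping providers independently.

```python
from remina import Memory

# Hypothetical configuration shape: the config= parameter and all key names
# are assumptions, not confirmed Remina API.
memory = Memory(config={
    "storage": {"provider": "postgresql", "dsn": "postgresql://localhost/remina"},
    "vector_store": {"provider": "qdrant", "url": "http://localhost:6333"},
    "embeddings": {"provider": "ollama", "model": "nomic-embed-text"},
    "llm": {"provider": "anthropic", "model": "claude-3-5-sonnet-latest"},
})
```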
## License
Apache 2.0 — Free for commercial and personal use.