Introduction

Remina

Production-grade memory infrastructure for AI systems.

Remina provides battle-tested algorithms for memory management with full infrastructure flexibility. Think of it as SQLAlchemy for AI memory — the core logic is handled for you, while storage backends, vector databases, and embedding models remain fully pluggable.

The Problem

AI applications without a persistent memory layer face compounding problems:

  • Lost context — Conversations reset, users repeat themselves, personalization degrades
  • Vendor lock-in — Existing solutions force specific infrastructure choices
  • Scaling limitations — Naive approaches fail under production memory volumes
  • Memory quality — No intelligent consolidation, deduplication, or relevance scoring

Remina addresses these at the infrastructure layer with a pluggable architecture and sophisticated memory algorithms.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Remina Memory                            │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────────────────────────────────────────────┐│
│  │                  Core Algorithms                         ││
│  │  • Importance Scoring    • Fact Extraction              ││
│  │  • Hybrid Retrieval      • Deduplication                ││
│  └─────────────────────────────────────────────────────────┘│
│                                                              │
│  ┌───────────┐  ┌───────────────┐  ┌───────────────────┐   │
│  │ L1 Cache  │  │  L2 Storage   │  │   Vector Store    │   │
│  │  (Redis)  │  │  (Pluggable)  │  │   (Pluggable)     │   │
│  │  FIXED    │  │               │  │                   │   │
│  └───────────┘  └───────────────┘  └───────────────────┘   │
│                                                              │
│  ┌─────────────────────────────────────────────────────────┐│
│  │              Embedding + LLM Providers                   ││
│  │                    (Pluggable)                           ││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘
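
The two-tier layout above implies a read-through access pattern: check the Redis L1 first, fall back to the pluggable L2 store on a miss, and repopulate the cache on the way out. This README does not document Remina's internals, so the following is only a minimal sketch of that pattern; the L2Store interface, key scheme, and TTL are assumptions.

import json
import redis

# Hypothetical L2 interface; Remina's actual storage API is not shown here.
class L2Store:
    def fetch(self, memory_id: str) -> dict | None: ...

def get_memory(r: redis.Redis, l2: L2Store, memory_id: str, ttl: int = 300):
    key = f"memory:{memory_id}"  # assumed key scheme
    cached = r.get(key)
    if cached is not None:       # L1 hit: serve straight from Redis
        return json.loads(cached)
    record = l2.fetch(memory_id)  # L1 miss: fall back to the L2 store
    if record is not None:
        r.setex(key, ttl, json.dumps(record))  # repopulate L1 with a TTL
    return record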

Capabilities

  • Pluggable Architecture — Swap storage, vector DB, embeddings, and LLM providers independently
  • Two-Tier Caching — Redis L1 for hot-path access, pluggable L2 for persistence
  • Hybrid Retrieval — Combines vector similarity, temporal decay, and importance weighting (see the scoring sketch after this list)
  • Automatic Extraction — LLM-powered fact extraction from conversations
  • Deduplication — Similarity-based duplicate prevention (also sketched below)
  • Importance Scoring — Relevance ranking based on recency, frequency, and base importance
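
How these signals combine is not specified in this README. A common formulation, and a reasonable mental model, is a weighted sum of cosine similarity, an exponential recency decay, and a stored importance value, with deduplication as a similarity threshold at write time. The weights, decay rate, and cutoff below are illustrative assumptions, not Remina's actual parameters.

import math
import time

W_SIM, W_RECENCY, W_IMPORTANCE = 0.6, 0.2, 0.2  # assumed weights
DECAY_RATE = 0.01                               # assumed per-hour decay
DUP_THRESHOLD = 0.95                            # assumed similarity cutoff

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def hybrid_score(query_emb, mem_emb, last_accessed: float, importance: float) -> float:
    # Recency decays exponentially with hours since last access
    age_hours = (time.time() - last_accessed) / 3600
    recency = math.exp(-DECAY_RATE * age_hours)
    return (W_SIM * cosine(query_emb, mem_emb)
            + W_RECENCY * recency
            + W_IMPORTANCE * importance)

def is_duplicate(new_emb, existing_embs) -> bool:
    # Reject a new fact whose embedding is near-identical to a stored one
    return any(cosine(new_emb, e) >= DUP_THRESHOLD for e in existing_embs)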

Usage

from remina import Memory
 
# Initialize with defaults (SQLite + Chroma + OpenAI)
memory = Memory()
 
# Extract and store facts from text
result = memory.add(
    messages="I'm John. I work as a software engineer at Google.",
    user_id="user_123"
)
# Extracts: ["Name is John", "Works as a software engineer at Google"]
 
# Semantic search
results = memory.search(
    query="What is the user's profession?",
    user_id="user_123"
)
 
# Retrieve all memories
all_memories = memory.get_all(user_id="user_123")
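
Beyond the defaults, the pluggable layers can presumably be selected at construction time. The README does not show this API, so the keyword arguments below (storage, vector_store, embedder, llm) are hypothetical placeholders for whatever configuration mechanism Remina actually exposes:

from remina import Memory

# Hypothetical provider selection; argument names are illustrative only.
memory = Memory(
    storage="postgresql",    # L2 store (see Supported Providers)
    vector_store="qdrant",
    embedder="cohere",
    llm="anthropic",
)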

Supported Providers

Category        Providers
--------------  -------------------------------------------
Storage (L2)    SQLite, PostgreSQL, MongoDB
Vector Stores   Chroma, Pinecone, Qdrant, pgvector
Embeddings      OpenAI, Gemini, Cohere, Ollama, HuggingFace
LLMs            OpenAI, Gemini, Anthropic, Ollama

License

Apache 2.0 — Free for commercial and personal use.