# Core Concepts

Technical overview of Remina's architecture and algorithms.

## Architecture

Remina implements a layered architecture with provider abstraction:
```
┌──────────────────────────────────────────────────────────────┐
│                        Remina Engine                         │
│                                                              │
│   ┌──────────────────────────────────────────────────────┐   │
│   │                   Core Algorithms                    │   │
│   │  • Importance Scoring    • Memory Consolidation      │   │
│   │  • Hybrid Retrieval      • Contradiction Detection   │   │
│   │  • Graph Linking         • Adaptive Decay            │   │
│   │  • Deduplication         • Fact Extraction           │   │
│   └──────────────────────────────────────────────────────┘   │
│                                                              │
│   ┌───────────┐   ┌───────────────┐   ┌───────────────────┐  │
│   │ L1 Cache  │   │  L2 Storage   │   │   Vector Store    │  │
│   │  (Redis)  │   │  (Pluggable)  │   │    (Pluggable)    │  │
│   │   FIXED   │   │               │   │                   │  │
│   └───────────┘   └───────────────┘   └───────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```

## Memory Model
Each memory entity contains:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Memory:
    id: str                       # Unique identifier (UUID)
    user_id: str                  # Owner/namespace
    content: str                  # Memory content
    embedding: list[float]        # Vector representation

    # Metadata
    metadata: dict                # User-defined key-value pairs
    tags: list[str]               # Categorization tags
    source: str                   # Origin (conversation, manual, extraction)

    # Scoring factors
    importance: float             # Base importance (0-1)
    decay_rate: float             # Decay coefficient (default: 0.01)
    access_count: int             # Retrieval count

    # Timestamps
    created_at: datetime
    updated_at: datetime
    last_accessed_at: datetime

    # Graph
    links: list[str]              # Related memory IDs

    # State
    is_consolidated: bool         # Part of a merge operation
    consolidated_from: list[str]  # Source memory IDs if merged
```

## Two-Tier Caching
Remina uses a two-tier caching strategy optimized for AI workloads:

### L1 Cache (Redis) — Fixed

- Purpose: Hot-path access for recent/frequent memories
- Target latency: < 5 ms
- Behavior:
  - Automatic promotion on access
  - TTL-based eviction (default: 1 hour)
  - Per-user memory limits (default: 100)
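The L1 behaviors above can be sketched with an in-process stand-in for Redis. This is illustrative only (the class and method names are not Remina's API); it shows TTL-based eviction, a per-user entry limit, and promotion on access:

```python
import time
from collections import OrderedDict

# Illustrative in-memory stand-in for the Redis-backed L1 cache.
class L1Cache:
    def __init__(self, ttl_seconds: float = 3600, max_per_user: int = 100):
        self.ttl = ttl_seconds
        self.max_per_user = max_per_user
        self._store: dict[str, OrderedDict] = {}  # user_id -> {memory_id: (ts, content)}

    def put(self, user_id: str, memory_id: str, content: str) -> None:
        user = self._store.setdefault(user_id, OrderedDict())
        user[memory_id] = (time.monotonic(), content)
        user.move_to_end(memory_id)               # most recently used last
        while len(user) > self.max_per_user:      # enforce per-user limit
            user.popitem(last=False)              # evict least recently used

    def get(self, user_id: str, memory_id: str):
        user = self._store.get(user_id)
        entry = user.get(memory_id) if user else None
        if entry is None:
            return None
        stored_at, content = entry
        if time.monotonic() - stored_at > self.ttl:
            del user[memory_id]                   # TTL expired, evict
            return None
        user.move_to_end(memory_id)               # automatic promotion on access
        return content
```

In production the same policy is expressed with Redis TTLs rather than application code; the sketch only mirrors the documented behavior.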
### L2 Storage (Pluggable)

- Purpose: Persistent, durable storage
- Options: PostgreSQL, MongoDB, SQLite
- Behavior:
  - Full memory persistence
  - Queried on L1 cache miss
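The way the tiers interact on a miss can be sketched as a read-through lookup. All names here (`search_memories`, `get_results`, `promote`) are hypothetical; the real provider interfaces appear under Provider Abstraction:

```python
# Hypothetical read-through sketch of the search path.
async def search_memories(user_id: str, query_embedding: list[float],
                          l1, vector_store, l2) -> list:
    cached = l1.get_results(user_id, query_embedding)  # try L1 first
    if cached is not None:
        return cached                                  # hit: hot path
    hits = await vector_store.search(query_embedding)  # miss: semantic candidates
    memories = await l2.get([hit.id for hit in hits])  # hydrate from L2 storage
    l1.promote(user_id, memories)                      # promote for the next query
    return memories
```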
### Cache Flow

```
┌─────────────────────────────────────────────────────────────┐
│                       Search Request                        │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │  L1 Cache Hit?  │
                     └─────────────────┘
                       │             │
                      Yes            No
                       │             │
                       ▼             ▼
                ┌───────────┐  ┌───────────────┐
                │  Return   │  │ Query Vector  │
                │  Cached   │  │    Store      │
                └───────────┘  └───────────────┘
                                       │
                                       ▼
                               ┌───────────────┐
                               │   Query L2    │
                               │   Storage     │
                               └───────────────┘
                                       │
                                       ▼
                               ┌───────────────┐
                               │ Promote to L1 │
                               └───────────────┘
                                       │
                                       ▼
                               ┌───────────────┐
                               │    Return     │
                               └───────────────┘
```

## Fact Extraction
The `memory.add()` operation uses an LLM to extract discrete facts:

```python
# Input
memory.add(
    messages="I'm John. I'm a software engineer at Google. I prefer Python.",
    user_id="john_123"
)

# LLM extracts:
# - "Name is John"
# - "Is a software engineer"
# - "Works at Google"
# - "Prefers Python"
```

### Extraction Pipeline
1. Format Input — Normalize messages to prompt format
2. LLM Call — Send to configured LLM with extraction prompt
3. Parse Response — Extract JSON array of facts
4. Deduplicate — Compare against existing memories
5. Embed — Generate vector embeddings
6. Store — Persist to vector store and L2 storage
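The six stages can be sketched end to end. The helper objects (`llm`, `embedder`, `is_duplicate`, `persist`) and the prompt wording are illustrative assumptions, not Remina's internal API:

```python
import json

# Illustrative sketch of the six extraction-pipeline stages.
def extract_and_store(messages: str, user_id: str,
                      llm, embedder, is_duplicate, persist) -> list[str]:
    # 1. Format Input: normalize the raw messages into an extraction prompt
    prompt = [{"role": "user",
               "content": f"Extract discrete facts as a JSON array:\n{messages}"}]
    # 2. LLM Call: send to the configured LLM
    response = llm.generate_response(prompt)
    # 3. Parse Response: the model returns a JSON array of fact strings
    facts = json.loads(response["content"])
    # 4. Deduplicate: drop facts that match existing memories
    facts = [fact for fact in facts if not is_duplicate(fact)]
    # 5. Embed: one embedding per surviving fact
    embeddings = embedder.embed_batch(facts)
    # 6. Store: persist to the vector store and L2 storage
    persist(user_id, facts, embeddings)
    return facts
```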
## Hybrid Retrieval

Remina combines multiple signals for relevance ranking:

```python
final_score = (
    0.5 * semantic_score +    # Vector similarity
    0.3 * importance_score +  # Recency + frequency + base importance
    0.2 * keyword_score       # Direct term overlap
)
```

### Semantic Score
Cosine similarity between query embedding and memory embedding.
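For reference, cosine similarity is the dot product of the two vectors divided by the product of their magnitudes:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)  # 1.0 = identical direction, 0.0 = orthogonal
```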
### Importance Score

```python
importance_score = (
    weight_recency * recency_factor +
    weight_frequency * frequency_factor +
    weight_importance * base_importance
)
```

Components:

- Recency factor: Temporal decay based on `last_accessed_at`
- Frequency factor: Derived from `access_count`
- Base importance: User-defined or extraction-derived importance
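One plausible realization of these components is shown below. Only the weighted-sum structure comes from the formula above; the exponential recency decay, the log-scaled frequency saturation, and the example weights are assumptions for illustration:

```python
import math

def importance_score(hours_since_access: float, access_count: int,
                     base_importance: float, decay_rate: float = 0.01,
                     weight_recency: float = 0.4, weight_frequency: float = 0.3,
                     weight_importance: float = 0.3) -> float:
    # Recency: exponential decay from last access (assumed curve)
    recency_factor = math.exp(-decay_rate * hours_since_access)
    # Frequency: log-scaled access count, saturating around 100 accesses (assumed)
    frequency_factor = min(1.0, math.log1p(access_count) / math.log1p(100))
    return (weight_recency * recency_factor +
            weight_frequency * frequency_factor +
            weight_importance * base_importance)
```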
### Keyword Score

Term overlap between query and memory content.
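A minimal overlap measure, here Jaccard similarity over lowercased tokens (the exact measure Remina uses may differ):

```python
def keyword_score(query: str, content: str) -> float:
    q, c = set(query.lower().split()), set(content.lower().split())
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)  # shared terms / all terms
```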
## Deduplication

Before storing, Remina checks for semantic duplicates:

```python
# Default threshold: 0.9 (90% similarity)
if cosine_similarity(new_embedding, existing_embedding) > dedup_threshold:
    return None  # Skip storage — duplicate detected
```

This prevents redundant entries like:

- "Prefers Python" and "Loves Python programming"
- "Works at Google" and "Employed at Google"
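Put together, a duplicate check against all existing embeddings for a user might look like this sketch (a self-contained illustration, not Remina's internal code):

```python
import math

def is_duplicate(new_emb: list[float], existing_embs: list[list[float]],
                 dedup_threshold: float = 0.9) -> bool:
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(y * y for y in b)))
    # Duplicate if any stored embedding exceeds the similarity threshold
    return any(cos(new_emb, emb) > dedup_threshold for emb in existing_embs)
```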
## Memory Consolidation (Planned)

Automatic consolidation of related memories:

```python
# Before consolidation:
# - "Drinks coffee"
# - "Prefers dark roast"
# - "Has 3 cups daily"

# After consolidation:
# - "Coffee preference: dark roast, 3 cups daily"
```

## Provider Abstraction
All providers implement standardized interfaces:

### Storage Provider

```python
class StorageBase(ABC):
    async def save(self, memories: List[Memory]) -> None: ...
    async def get(self, ids: List[str]) -> List[Memory]: ...
    async def delete(self, ids: List[str]) -> None: ...
    async def query(self, user_id: str, filters: Dict = None,
                    limit: int = 100) -> List[Memory]: ...
    async def update(self, memory: Memory) -> None: ...
    async def count(self, user_id: str) -> int: ...
    async def close(self) -> None: ...
```

### Vector Store Provider
```python
class VectorStoreBase(ABC):
    async def upsert(self, id: str, embedding: List[float],
                     metadata: Dict) -> None: ...
    async def upsert_batch(self, items: List[Tuple]) -> None: ...
    async def search(self, embedding: List[float], limit: int = 10,
                     filters: Dict = None) -> List[VectorSearchResult]: ...
    async def delete(self, ids: List[str]) -> None: ...
    async def close(self) -> None: ...
```

### Embedding Provider
```python
class EmbeddingBase(ABC):
    def embed(self, text: str) -> List[float]: ...
    def embed_batch(self, texts: List[str]) -> List[List[float]]: ...

    @property
    def dimensions(self) -> int: ...

    @property
    def model_name(self) -> str: ...
```

### LLM Provider
```python
class LLMBase(ABC):
    def generate_response(self, messages: List[Dict],
                          tools: List[Dict] = None) -> Dict: ...

    @property
    def model_name(self) -> str: ...
```

## Error Handling
Remina uses structured exceptions with error codes:

```python
class ReminaError(Exception):
    message: str
    error_code: str
    details: Dict
    suggestion: str
    debug_info: Dict

# Specific exceptions
class ConfigurationError(ReminaError): ...
class StorageError(ReminaError): ...
class VectorStoreError(ReminaError): ...
class EmbeddingError(ReminaError): ...
class LLMError(ReminaError): ...
class CacheError(ReminaError): ...
class MemoryNotFoundError(ReminaError): ...
```

## Next Steps
- Configuration — Provider and settings configuration
- Providers — Detailed provider documentation
- API Reference — Complete API documentation