LLM Providers

LLM providers handle fact extraction from conversations.
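
As a sketch of what that means in practice, a short exchange might be distilled into standalone facts like these (illustrative only; the actual phrasing depends on the model and the extraction prompt):

# Illustrative input/output only; real facts depend on the model and prompt.
messages = [
    {"role": "user", "content": "I'm vegetarian and I just moved to Berlin."},
    {"role": "assistant", "content": "Noted! I'll keep that in mind."},
]

# A typical extraction result might look like:
facts = [
    "Is vegetarian",
    "Lives in Berlin",
]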

OpenAI

Industry-standard LLMs.

Installation

pip install remina-memory[openai]

Configuration

"llm": {
    "provider": "openai",
    "config": {
        "api_key": None,  # Uses OPENAI_API_KEY env
        "model": "gpt-4o-mini",
        "temperature": 0.1,
        "max_tokens": 2000,
        "base_url": None,
    }
}
| Option | Type | Default | Description |
|---|---|---|---|
| api_key | str | env var | OpenAI API key |
| model | str | gpt-4o-mini | Model name |
| temperature | float | 0.1 | Response randomness |
| max_tokens | int | 2000 | Max output tokens |
| base_url | str | None | Custom endpoint |
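
A minimal usage sketch showing how this config dict plugs in. Note that `from remina import Memory` and `Memory.from_config` are hypothetical names, not a confirmed remina-memory API; consult the package's quickstart for the real entry point:

# Hypothetical entry point: `Memory.from_config` is an assumption,
# not a confirmed remina-memory API.
from remina import Memory

config = {
    "llm": {
        "provider": "openai",
        "config": {"model": "gpt-4o-mini", "temperature": 0.1},
    }
}

memory = Memory.from_config(config)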

Models

| Model | Speed | Cost |
|---|---|---|
| gpt-4o-mini | Fast | $0.15/1M tokens |
| gpt-4o | Medium | $2.50/1M tokens |
| gpt-4-turbo | Medium | $10/1M tokens |

Google Gemini

Fast LLMs with a free tier.

Installation

pip install remina-memory[gemini]

Configuration

"llm": {
    "provider": "gemini",
    "config": {
        "api_key": None,  # Uses GOOGLE_API_KEY env
        "model": "gemini-2.0-flash",
        "temperature": 0.1,
        "max_tokens": 2000,
    }
}
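
When api_key is None, the provider reads GOOGLE_API_KEY from the environment. A minimal sketch of supplying it from Python instead of the shell:

import os

# The Gemini provider falls back to this variable when api_key is None.
os.environ["GOOGLE_API_KEY"] = "your-api-key"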

Models

| Model | Speed | Cost |
|---|---|---|
| gemini-2.0-flash | Very fast | Free tier |
| gemini-2.5-flash | Very fast | Free tier |
| gemini-1.5-pro | Medium | Pay-per-use |

Anthropic Claude

Excellent instruction following.

Installation

pip install remina-memory[anthropic]

Configuration

"llm": {
    "provider": "anthropic",
    "config": {
        "api_key": None,  # Uses ANTHROPIC_API_KEY env
        "model": "claude-3-5-sonnet-20240620",
        "temperature": 0.1,
        "max_tokens": 2000,
    }
}

Models

| Model | Speed | Cost |
|---|---|---|
| claude-3-5-sonnet-20240620 | Fast | $3/1M tokens |
| claude-3-opus-20240229 | Slow | $15/1M tokens |
| claude-3-haiku-20240307 | Very fast | $0.25/1M tokens |

Ollama (Local)

Local LLMs without API costs.

Installation

pip install remina-memory[ollama]

Configuration

"llm": {
    "provider": "ollama",
    "config": {
        "base_url": "http://localhost:11434",
        "model": "llama3.2",
        "temperature": 0.1,
    }
}

Setup

curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
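
Before pointing the provider at Ollama, you can verify the server is up and the model is pulled. A quick check using Ollama's standard REST endpoint for listing local models:

import json
from urllib.request import urlopen

# GET /api/tags lists the models available on the local Ollama server.
with urlopen("http://localhost:11434/api/tags") as resp:
    models = json.load(resp)["models"]

print([m["name"] for m in models])  # should include a llama3.2 entry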

Models

| Model | Size | RAM Required |
|---|---|---|
| llama3.2 | 2GB | 4GB |
| llama3.2:3b | 2GB | 4GB |
| mistral | 4GB | 8GB |
| mixtral | 26GB | 48GB |

LLM Interface

All LLM providers implement:

from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

class LLMBase(ABC):
    @abstractmethod
    def generate_response(
        self,
        messages: List[Dict[str, str]],
        tools: Optional[List[Dict]] = None,
        tool_choice: str = "auto",
        **kwargs,
    ) -> Dict[str, Any]:
        """Generate a completion for the given chat messages, optionally with tools."""
        ...

    @property
    @abstractmethod
    def model_name(self) -> str:
        """The configured model identifier."""
        ...
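
As a sketch, a minimal custom provider against this interface might look like the following. The echo behavior and the return shape ({"content": ..., "tool_calls": ...}) are assumptions for illustration; check the built-in providers for the exact response contract and for how custom providers are registered:

from typing import Any, Dict, List, Optional

class EchoLLM(LLMBase):
    """Toy provider that echoes the last user message; for testing only."""

    def __init__(self, model: str = "echo-1"):
        self._model = model

    def generate_response(
        self,
        messages: List[Dict[str, str]],
        tools: Optional[List[Dict]] = None,
        tool_choice: str = "auto",
        **kwargs,
    ) -> Dict[str, Any]:
        # Return the last message's content in a response-shaped dict
        # (assumed shape; mirror the built-in providers in real code).
        last = messages[-1]["content"] if messages else ""
        return {"content": last, "tool_calls": []}

    @property
    def model_name(self) -> str:
        return self._model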

Selection Guide

| Requirement | Recommendation |
|---|---|
| Best quality | OpenAI gpt-4o or Claude claude-3-5-sonnet |
| Best value | OpenAI gpt-4o-mini or Gemini gemini-2.0-flash |
| Free tier | Gemini |
| No API costs | Ollama |
| Privacy/on-premise | Ollama |
| Fastest | Gemini gemini-2.0-flash |

Temperature Setting

For fact extraction, use a low temperature (0.1) so extracted facts stay consistent across runs:

"llm": {
    "provider": "openai",
    "config": {
        "model": "gpt-4o-mini",
        "temperature": 0.1,
    }
}
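
If you build config dicts in several places, a small helper keeps extraction deterministic while letting other uses tune temperature separately. This is plain Python over the config shape shown above; no Remina-specific API is assumed:

def llm_config(provider: str, model: str, temperature: float = 0.1) -> dict:
    """Build the "llm" section of a config dict in the shape shown above."""
    return {
        "llm": {
            "provider": provider,
            "config": {"model": model, "temperature": temperature},
        }
    }

# Low temperature for fact extraction; raise it only for creative tasks.
extraction = llm_config("openai", "gpt-4o-mini", temperature=0.1)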