LLM Providers
LLM providers handle fact extraction from conversations.
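Before the per-provider details, here is a minimal end-to-end sketch of how an `llm` block plugs into the library. The `Memory` class, `from_config`, and `add` are assumed names modeled on similar memory libraries, not confirmed remina-memory API; only the `llm` config shape is documented on this page.

```python
# Hypothetical usage sketch: Memory, from_config, and add are assumptions.
# Only the "llm" block follows the shapes documented below.
from remina import Memory  # assumed import path

config = {
    "llm": {
        "provider": "openai",
        "config": {"model": "gpt-4o-mini", "temperature": 0.1},
    }
}

memory = Memory.from_config(config)
memory.add(
    [{"role": "user", "content": "I'm vegetarian and allergic to nuts."}],
    user_id="alice",
)
```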
OpenAI
Industry-standard LLMs.
Installation
```bash
pip install remina-memory[openai]
```

Configuration

```python
"llm": {
    "provider": "openai",
    "config": {
        "api_key": None,  # Uses OPENAI_API_KEY env
        "model": "gpt-4o-mini",
        "temperature": 0.1,
        "max_tokens": 2000,
        "base_url": None,
    }
}
```

| Option | Type | Default | Description |
|---|---|---|---|
| api_key | str | env var | OpenAI API key |
| model | str | gpt-4o-mini | Model name |
| temperature | float | 0.1 | Response randomness |
| max_tokens | int | 2000 | Max output tokens |
| base_url | str | None | Custom endpoint |
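There are two ways to supply credentials, sketched below: leave `api_key` as `None` and export `OPENAI_API_KEY`, or pass the key explicitly. `base_url` points the provider at any OpenAI-compatible endpoint; the proxy use case is an assumption, so verify it against your deployment.

```python
import os

# Option A: rely on the environment variable (leave api_key as None).
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder key

# Option B: pass the key (and optionally a custom endpoint) explicitly.
llm_config = {
    "provider": "openai",
    "config": {
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "gpt-4o-mini",
        "base_url": None,  # e.g. an OpenAI-compatible proxy URL (assumption)
    },
}
```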
Models
| Model | Speed | Cost (input) |
|---|---|---|
| gpt-4o-mini | Fast | $0.15/1M tokens |
| gpt-4o | Medium | $2.50/1M tokens |
| gpt-4-turbo | Medium | $10/1M tokens |
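For a rough feel for the numbers, here is the arithmetic behind the table. The conversation volume and prompt size are illustrative assumptions, and output tokens (billed separately, at a higher rate) are ignored.

```python
# Back-of-envelope input cost for a month of fact extraction,
# using the per-1M-input-token prices from the table above.
PRICE_PER_1M = {"gpt-4o-mini": 0.15, "gpt-4o": 2.50, "gpt-4-turbo": 10.0}

conversations_per_month = 10_000
tokens_per_conversation = 1_500  # assumed average prompt size

total_tokens = conversations_per_month * tokens_per_conversation
for model, price in PRICE_PER_1M.items():
    print(f"{model}: ${total_tokens / 1_000_000 * price:.2f}/month")
# gpt-4o-mini: $2.25/month; gpt-4o: $37.50/month; gpt-4-turbo: $150.00/month
```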
Google Gemini
Fast LLMs with a free tier.
Installation
```bash
pip install remina-memory[gemini]
```

Configuration

```python
"llm": {
    "provider": "gemini",
    "config": {
        "api_key": None,  # Uses GOOGLE_API_KEY env
        "model": "gemini-2.0-flash",
        "temperature": 0.1,
        "max_tokens": 2000,
    }
}
```

Models
| Model | Speed | Cost |
|---|---|---|
| gemini-2.0-flash | Very fast | Free tier |
| gemini-2.5-flash | Very fast | Free tier |
| gemini-1.5-pro | Medium | Pay-per-use |
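The free tier makes Gemini a convenient default for development. A minimal sketch, assuming a Google AI Studio key exported as `GOOGLE_API_KEY`:

```python
import os

# Free-tier sketch: export a Google AI Studio key, pick a flash model.
os.environ["GOOGLE_API_KEY"] = "AIza..."  # placeholder key

llm_config = {
    "provider": "gemini",
    "config": {"model": "gemini-2.0-flash", "temperature": 0.1},
}
```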
Anthropic Claude
Excellent instruction following.
Installation
```bash
pip install remina-memory[anthropic]
```

Configuration

```python
"llm": {
    "provider": "anthropic",
    "config": {
        "api_key": None,  # Uses ANTHROPIC_API_KEY env
        "model": "claude-3-5-sonnet-20240620",
        "temperature": 0.1,
        "max_tokens": 2000,
    }
}
```

Models
| Model | Speed | Cost (input) |
|---|---|---|
| claude-3-5-sonnet-20240620 | Fast | $3/1M tokens |
| claude-3-opus-20240229 | Slow | $15/1M tokens |
| claude-3-haiku-20240307 | Very fast | $0.25/1M tokens |
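Since the three hosted providers share the same config shape, a small helper can switch between them. A sketch; the defaults are illustrative, and `api_key` is omitted so each provider falls back to its environment variable as documented above.

```python
# Sketch: build the "llm" block for any hosted provider.
# Env-var fallbacks: OPENAI_API_KEY, GOOGLE_API_KEY, ANTHROPIC_API_KEY.
DEFAULT_MODELS = {
    "openai": "gpt-4o-mini",
    "gemini": "gemini-2.0-flash",
    "anthropic": "claude-3-5-sonnet-20240620",
}

def llm_config(provider: str, **overrides) -> dict:
    config = {"model": DEFAULT_MODELS[provider], "temperature": 0.1, **overrides}
    return {"provider": provider, "config": config}

print(llm_config("anthropic", max_tokens=2000))
```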
Ollama (Local)
Local LLMs without API costs.
Installation
```bash
pip install remina-memory[ollama]
```

Configuration

```python
"llm": {
    "provider": "ollama",
    "config": {
        "base_url": "http://localhost:11434",
        "model": "llama3.2",
        "temperature": 0.1,
    }
}
```

Setup

```bash
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.2
```

Models
| Model | Size | RAM Required |
|---|---|---|
| llama3.2 | 2GB | 4GB |
| llama3.2:3b | 2GB | 4GB |
| mistral | 4GB | 8GB |
| mixtral | 26GB | 48GB |
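Before pointing remina-memory at Ollama, you can confirm the server is up and the model is pulled. `/api/tags` is Ollama's endpoint for listing local models; the default port matches the `base_url` above.

```python
import json
import urllib.request

# Check that Ollama is listening on its default port and that llama3.2
# has been pulled (names come back tagged, e.g. "llama3.2:latest").
with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    names = [m["name"] for m in json.load(resp)["models"]]

print(names)
assert any(n.startswith("llama3.2") for n in names), "run: ollama pull llama3.2"
```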
LLM Interface
All LLM providers implement:
```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

class LLMBase(ABC):
    @abstractmethod
    def generate_response(
        self,
        messages: List[Dict[str, str]],
        tools: Optional[List[Dict]] = None,
        tool_choice: str = "auto",
        **kwargs,
    ) -> Dict[str, Any]:
        """Generate a completion for a list of chat messages."""

    @property
    @abstractmethod
    def model_name(self) -> str:
        """Identifier of the underlying model."""
```

Selection Guide
| Requirement | Recommendation |
|---|---|
| Best quality | OpenAI gpt-4o or Claude claude-3-5-sonnet |
| Best value | OpenAI gpt-4o-mini or Gemini gemini-2.0-flash |
| Free tier | Gemini |
| No API costs | Ollama |
| Privacy/on-premise | Ollama |
| Fastest | Gemini gemini-2.0-flash |
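If none of these fit, the `LLMBase` interface above is the extension point. A minimal sketch of a custom provider follows; how remina-memory registers custom providers (and the exact response shape it expects) is an assumption to verify against the library.

```python
from typing import Any, Dict, List, Optional

# Assumes the LLMBase class shown above is in scope; its import path
# within remina-memory is not confirmed here.

class EchoLLM(LLMBase):
    """Toy provider for tests: returns the last user message verbatim."""

    def generate_response(
        self,
        messages: List[Dict[str, str]],
        tools: Optional[List[Dict]] = None,
        tool_choice: str = "auto",
        **kwargs,
    ) -> Dict[str, Any]:
        last_user = next(m for m in reversed(messages) if m["role"] == "user")
        # Return shape is an assumption; match your other providers.
        return {"content": last_user["content"], "tool_calls": []}

    @property
    def model_name(self) -> str:
        return "echo-test"
```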
Temperature Setting
For fact extraction, use low temperature (0.1) for consistent results:
"llm": {
"provider": "openai",
"config": {
"model": "gpt-4o-mini",
"temperature": 0.1,
}
}