Hermes Agent ships with 8 external memory provider plugins that give the agent persistent, cross-session knowledge beyond the built-in MEMORY.md and USER.md. Only one external provider can be active at a time; the built-in memory remains active alongside it.
You can select the active memory provider via `hermes plugins` → Provider Plugins → Memory Provider.
Or set it manually in `~/.hermes/config.yaml`:

```yaml
memory:
  provider: openviking  # or honcho, mem0, hindsight, holographic, retaindb, byterover, supermemory
```
## How It Works
When a memory provider is active, Hermes automatically:
- Injects provider context into the system prompt (what the provider knows)
- Prefetches relevant memories before each turn (background, non-blocking)
- Syncs conversation turns to the provider after each response
- Extracts memories on session end (for providers that support it)
- Mirrors built-in memory writes to the external provider
- Adds provider-specific tools so the agent can search, store, and manage memories
The built-in memory (MEMORY.md / USER.md) continues to work exactly as before. The external provider is additive.
## Available Providers

### Honcho
AI-native cross-session user modeling with dialectic reasoning, session-scoped context injection, semantic search, and persistent conclusions. Base context now includes the session summary alongside user representation and peer cards, giving the agent awareness of what has already been discussed.
- **Best for:** Multi-agent systems with cross-session context, user-agent alignment
- **Requires:** `pip install honcho-ai` + API key or self-hosted instance
Architecture: Two-layer context injection — a base layer (session summary + representation + peer card, refreshed on contextCadence) plus a dialectic supplement (LLM reasoning, refreshed on dialecticCadence). The dialectic automatically selects cold-start prompts (general user facts) vs. warm prompts (session-scoped context) based on whether base context exists.
Three orthogonal config knobs control cost and depth independently:
contextCadence — how often the base layer refreshes (API call frequency)
dialecticCadence — how often the dialectic LLM fires (LLM call frequency)
dialecticDepth — how many .chat() passes per dialectic invocation (1–3, depth of reasoning)
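For instance, a `honcho.json` fragment tuning the three knobs independently (illustrative values, not recommendations): refresh the base layer every turn, fire the dialectic only every third turn, and run two reasoning passes per invocation.

```json
{
  "contextCadence": 1,
  "dialecticCadence": 3,
  "dialecticDepth": 2
}
```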
#### Setup Wizard

```bash
hermes memory setup   # select "honcho" — runs the Honcho-specific post-setup
```
The legacy hermes honcho setup command still works (it now redirects to hermes memory setup), but is only registered after Honcho is selected as the active memory provider.
Config: `$HERMES_HOME/honcho.json` (profile-local) or `~/.honcho/config.json` (global). Resolution order: `$HERMES_HOME/honcho.json` > `~/.hermes/honcho.json` > `~/.honcho/config.json`. See the config reference and the Honcho integration guide.
#### Full config reference
| Key | Default | Description |
|-----|---------|-------------|
| `apiKey` | -- | API key from [app.honcho.dev](https://app.honcho.dev) |
| `baseUrl` | -- | Base URL for self-hosted Honcho |
| `peerName` | -- | User peer identity |
| `aiPeer` | host key | AI peer identity (one per profile) |
| `workspace` | host key | Shared workspace ID |
| `contextTokens` | `null` (uncapped) | Token budget for auto-injected context per turn. Truncates at word boundaries |
| `contextCadence` | `1` | Minimum turns between `context()` API calls (base layer refresh) |
| `dialecticCadence` | `2` | Minimum turns between `peer.chat()` LLM calls. Recommended 1–5. Only applies to `hybrid`/`context` modes |
| `dialecticDepth` | `1` | Number of `.chat()` passes per dialectic invocation. Clamped 1–3. Pass 0: cold/warm prompt, pass 1: self-audit, pass 2: reconciliation |
| `dialecticDepthLevels` | `null` | Optional array of reasoning levels per pass, e.g. `["minimal", "low", "medium"]`. Overrides proportional defaults |
| `dialecticReasoningLevel` | `'low'` | Base reasoning level: `minimal`, `low`, `medium`, `high`, `max` |
| `dialecticDynamic` | `true` | When `true`, model can override reasoning level per-call via tool param |
| `dialecticMaxChars` | `600` | Max chars of dialectic result injected into system prompt |
| `recallMode` | `'hybrid'` | `hybrid` (auto-inject + tools), `context` (inject only), `tools` (tools only) |
| `writeFrequency` | `'async'` | When to flush messages: `async` (background thread), `turn` (sync), `session` (batch on end), or integer N |
| `saveMessages` | `true` | Whether to persist messages to Honcho API |
| `observationMode` | `'directional'` | `directional` (all on) or `unified` (shared pool). Override with `observation` object |
| `messageMaxChars` | `25000` | Max chars per message (chunked if exceeded) |
| `dialecticMaxInputChars` | `10000` | Max chars for dialectic query input to `peer.chat()` |
| `sessionStrategy` | `'per-directory'` | `per-directory`, `per-repo`, `per-session`, `global` |
#### Minimal `honcho.json` (cloud)
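A minimal sketch for the hosted service, assuming only `apiKey` and `peerName` need to be set explicitly (everything else falls back to the defaults in the table above; the values are placeholders):

```json
{
  "apiKey": "your-api-key",
  "peerName": "alice"
}
```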
> **Tip: Migrating from `hermes honcho`**
> If you previously used `hermes honcho setup`, your config and all server-side data are intact. Re-enable through the setup wizard, or set `memory.provider: honcho` manually, to reactivate via the new system.
#### Multi-peer setup
Honcho models conversations as peers exchanging messages — one user peer plus one AI peer per Hermes profile, all sharing a workspace. The workspace is the shared environment: the user peer is global across profiles, each AI peer is its own identity. Every AI peer builds an independent representation / card from its own observations, so a coder profile stays code-oriented while a writer profile stays editorial against the same user.
The mapping:

| Concept | What it is |
|---------|------------|
| Workspace | Shared environment. All Hermes profiles under one workspace see the same user identity. |
| User peer (`peerName`) | The human. Shared across profiles in the workspace. |
| AI peer (`aiPeer`) | One per Hermes profile. Host key `hermes` → default; `hermes.<profile>` for others. |
| Observation | Per-peer toggles controlling what Honcho models from whose messages. `directional` (default, all four on) or `unified` (single-observer pool). |
#### New profile, fresh Honcho peer

```bash
hermes profile create coder --clone
```
--clone creates a hermes.coder host block in honcho.json with aiPeer: "coder", shared workspace, inherited peerName, recallMode, writeFrequency, observation, etc. The AI peer is eagerly created in Honcho so it exists before the first message.
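As a sketch of the result (the exact block layout and the `alice`/`my-workspace` values are illustrative assumptions, not a dump of the real file), `honcho.json` might then contain:

```json
{
  "hermes": {
    "peerName": "alice",
    "workspace": "my-workspace",
    "aiPeer": "hermes",
    "recallMode": "hybrid",
    "writeFrequency": "async"
  },
  "hermes.coder": {
    "peerName": "alice",
    "workspace": "my-workspace",
    "aiPeer": "coder",
    "recallMode": "hybrid",
    "writeFrequency": "async"
  }
}
```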
#### Existing profiles, backfill Honcho peers

```bash
hermes honcho sync
```
Scans every Hermes profile, creates host blocks for any profile without one, inherits settings from the default hermes block, and creates the new AI peers eagerly. Idempotent — skips profiles that already have a host block.
#### Per-profile observation
Each host block can override the observation config independently. Example: a code-focused profile where the AI peer observes the user but doesn't self-model:
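A sketch of that override, assuming the `observation` object nests the per-peer flags described below (the field layout is illustrative): `observeMe: false` stops self-modeling, `observeOthers: true` keeps the AI peer observing the user.

```json
{
  "hermes.coder": {
    "observation": {
      "ai": { "observeMe": false, "observeOthers": true }
    }
  }
}
```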
| Flag | Meaning |
|------|---------|
| `observeMe` | Honcho builds a representation of this peer from its own messages |
| `observeOthers` | This peer observes the other peer's messages (feeds cross-peer reasoning) |
Presets via `observationMode`:

- `"directional"` (default) — all four flags on. Full mutual observation; enables cross-peer dialectic.
- `"unified"` — user `observeMe: true`, AI `observeOthers: true`, rest false. Single-observer pool; the AI models the user but not itself, and the user peer only self-models.
Server-side toggles set via the Honcho dashboard take precedence over local defaults and are synced back at session init.
See the Honcho page for the full observation reference.
### OpenViking

Context database by Volcengine (ByteDance) with a filesystem-style knowledge hierarchy, tiered retrieval, and automatic memory extraction into 6 categories.
- **Best for:** Self-hosted knowledge management with structured browsing
```bash
# Start the OpenViking server first
pip install openviking
openviking-server

# Then configure Hermes
hermes memory setup   # select "openviking"
# Or manually:
hermes config set memory.provider openviking
echo "OPENVIKING_ENDPOINT=http://localhost:1933" >> ~/.hermes/.env
```
### Mem0

```bash
hermes memory setup   # select "mem0"
# Or manually:
hermes config set memory.provider mem0
echo "MEM0_API_KEY=your-key" >> ~/.hermes/.env
```
Config: `$HERMES_HOME/mem0.json`

| Key | Default | Description |
|-----|---------|-------------|
| `user_id` | `hermes-user` | User identifier |
| `agent_id` | `hermes` | Agent identifier |
### Hindsight
Long-term memory with knowledge graph, entity resolution, and multi-strategy retrieval. The hindsight_reflect tool provides cross-memory synthesis that no other provider offers. Automatically retains full conversation turns (including tool calls) with session-level document tracking.
- **Best for:** Knowledge graph-based recall with entity relationships
```bash
hermes memory setup   # select "hindsight"
# Or manually:
hermes config set memory.provider hindsight
echo "HINDSIGHT_API_KEY=your-key" >> ~/.hermes/.env
```
The setup wizard installs dependencies automatically and only installs what's needed for the selected mode (hindsight-client for cloud, hindsight-all for local). Requires hindsight-client >= 0.4.22 (auto-upgraded on session start if outdated).
### Holographic

```bash
hermes memory setup   # select "holographic"
# Or manually:
hermes config set memory.provider holographic
```
Config: `config.yaml` under `plugins.hermes-memory-store`

| Key | Default | Description |
|-----|---------|-------------|
| `db_path` | `$HERMES_HOME/memory_store.db` | SQLite database path |
| `auto_extract` | `false` | Auto-extract facts at session end |
| `default_trust` | `0.5` | Default trust score (0.0–1.0) |
Unique capabilities:

- `probe` — entity-specific algebraic recall (all facts about a person/thing)
- `reason` — compositional AND queries across multiple entities
- `contradict` — automated detection of conflicting facts
- Trust scoring with asymmetric feedback (+0.05 helpful / -0.10 unhelpful)
### RetainDB
Cloud memory API with hybrid search (Vector + BM25 + Reranking), 7 memory types, and delta compression.
- **Best for:** Teams already using RetainDB's infrastructure
- **Requires:** RetainDB account + API key
- **Data storage:** RetainDB Cloud
- **Cost:** $20/month
Tools: `retaindb_profile` (user profile), `retaindb_search` (semantic search), `retaindb_context` (task-relevant context), `retaindb_remember` (store with type + importance), `retaindb_forget` (delete memories)
Setup:

```bash
hermes memory setup   # select "retaindb"
# Or manually:
hermes config set memory.provider retaindb
echo "RETAINDB_API_KEY=your-key" >> ~/.hermes/.env
```
### ByteRover
Persistent memory via the brv CLI — hierarchical knowledge tree with tiered retrieval (fuzzy text → LLM-driven search). Local-first with optional cloud sync.
- **Best for:** Developers who want portable, local-first memory with a CLI
- **Requires:** ByteRover CLI (`npm install -g byterover-cli` or install script)
- **Data storage:** Local (default) or ByteRover Cloud (optional sync)
- **Cost:** Free (local) or ByteRover pricing (cloud)
Tools: `brv_query` (search knowledge tree), `brv_curate` (store facts/decisions/patterns), `brv_status` (CLI version + tree stats)
Setup:

```bash
# Install the CLI first
curl -fsSL https://byterover.dev/install.sh | sh

# Then configure Hermes
hermes memory setup   # select "byterover"
# Or manually:
hermes config set memory.provider byterover
```
Key features:

- Automatic pre-compression extraction (saves insights before context compression discards them)
- Knowledge tree stored at `$HERMES_HOME/byterover/` (profile-scoped)
- SOC2 Type II certified cloud sync (optional)
### Supermemory
Semantic long-term memory with profile recall, semantic search, explicit memory tools, and session-end conversation ingest via the Supermemory graph API.
- **Best for:** Semantic recall with user profiling and session-level graph building
- **Profile-scoped containers** — use `{identity}` in `container_tag` (e.g. `hermes-{identity}` → `hermes-coder`) to isolate memories per Hermes profile
- **Multi-container mode** — enable `enable_custom_container_tags` with a `custom_containers` list to let the agent read/write across named containers. Automatic operations (sync, prefetch) stay on the primary container.
#### Multi-container example

```json
{
  "container_tag": "hermes",
  "enable_custom_container_tags": true,
  "custom_containers": ["project-alpha", "shared-knowledge"],
  "custom_container_instructions": "Use project-alpha for coding context."
}
```