Memory Providers

Hermes Agent ships with eight external memory provider plugins that give the agent persistent, cross-session knowledge beyond the built-in MEMORY.md and USER.md. Only one external provider can be active at a time; the built-in memory always stays active alongside it.

Quick Start

hermes memory setup      # interactive picker + configuration
hermes memory status     # check what's active
hermes memory off        # disable external provider

You can also select the active memory provider via hermes plugins → Provider Plugins → Memory Provider.

Or set manually in ~/.hermes/config.yaml:

memory:
  provider: openviking   # or honcho, mem0, hindsight, holographic, retaindb, byterover, supermemory
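A script that writes this setting can guard against typos by checking the name first. The sketch below is illustrative only: `KNOWN_PROVIDERS` and `validate_provider` are hypothetical names, not part of Hermes, and the set is copied from the comment in the config snippet above.

```python
# Hypothetical helper, not part of Hermes: checks a provider name against
# the eight names listed in the config.yaml comment above.
KNOWN_PROVIDERS = frozenset({
    "honcho", "openviking", "mem0", "hindsight",
    "holographic", "retaindb", "byterover", "supermemory",
})


def validate_provider(name: str) -> str:
    """Return the provider name with surrounding whitespace removed.

    Raises ValueError if the name is not one of the eight known providers.
    """
    cleaned = name.strip()
    if cleaned not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown memory provider: {name!r}")
    return cleaned
```

Whether Hermes itself accepts mixed-case names is not stated here, so the sketch matches names exactly.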

How It Works

When a memory provider is active, Hermes automatically:

Injects provider context into the system prompt (what the provider knows)
Prefetches relevant memories before each turn (background, non-blocking)
Syncs conversation turns to the provider after each response
Extracts memories on session end (for providers that support it)
Mirrors built-in memory writes to the external provider
Adds provider-specific tools so the agent can search, store, and manage memories

The built-in memory (MEMORY.md / USER.md) continues to work exactly as before. The external provider is additive.
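The per-turn flow described above can be pictured with a toy provider. This is a rough sketch under stated assumptions, not the Hermes plugin interface: `SketchProvider`, its method names, and `run_turn` are all hypothetical, and the prefetch runs inline here, whereas Hermes does it in the background.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SketchProvider:
    """Hypothetical in-memory provider; names are illustrative only."""
    facts: List[str] = field(default_factory=list)
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def system_context(self) -> str:
        # What the provider knows, injected into the system prompt.
        return f"{len(self.facts)} stored facts"

    def prefetch(self, user_message: str) -> List[str]:
        # Naive recall: facts sharing at least one word with the message.
        words = set(user_message.lower().split())
        return [f for f in self.facts if words & set(f.lower().split())]

    def sync_turn(self, user_message: str, reply: str) -> None:
        # Record the finished turn after the response.
        self.turns.append((user_message, reply))

    def mirror_write(self, entry: str) -> None:
        # Built-in memory writes are mirrored to the provider.
        self.facts.append(entry)


def run_turn(provider: SketchProvider, user_message: str,
             respond: Callable[[str, List[str], str], str]) -> str:
    """One turn: recall, build context, respond, then sync."""
    recalled = provider.prefetch(user_message)
    context = provider.system_context()
    reply = respond(context, recalled, user_message)
    provider.sync_turn(user_message, reply)
    return reply
```

Here `respond` stands in for the model call; it receives the provider context and the recalled memories along with the user message.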

Available Providers

Honcho

AI-native cross-session user modeling with dialectic reasoning, session-scoped context injection, semantic search, and persistent conclusions. Base context now includes the session summary alongside user representation and peer cards, giving the agent awareness of what has already been discussed.


Best for	Multi-agent systems with cross-session context, user-agent alignment
Requires	`pip install honcho-ai` + API key or self-hosted instance
Data storage	Honcho Cloud or self-hosted
Cost	Honcho pricing (cloud) / free (self-hosted)

Tools (5): honcho_profile (read/update peer card), honcho_search (semantic search), honcho_context (session context — summary, representation, card, messages), honcho_reasoning (LLM-synthesized), honcho_conclude (create/delete conclusions)

Architecture: Two-layer context injection — a base layer (session summary + representation + peer card, refreshed on contextCadence) plus a dialectic supplement (LLM reasoning, refreshed on dialecticCadence). The dialectic automatically selects cold-start prompts (general user facts) vs. warm prompts (session-scoped context) based on whether base context exists.

Three orthogonal config knobs control cost and depth independently:

contextCadence — how often the base layer refreshes (API call frequency)
dialecticCadence — how often the dialectic LLM fires (LLM call frequency)
dialecticDepth — how many .chat() passes per dialectic invocation (1–3, depth of reasoning)
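One way to read the two cadence knobs is as a "minimum turns between calls" gate, with the depth clamped to the 1–3 range. The sketch below illustrates that reading only; `CadenceGate` and `clamp_depth` are hypothetical names, and the actual Honcho plugin may count turns differently.

```python
from dataclasses import dataclass
from typing import Optional


def clamp_depth(depth: int) -> int:
    """Clamp dialecticDepth to the documented 1-3 range."""
    return max(1, min(3, depth))


@dataclass
class CadenceGate:
    """Fires at most once every `min_turns` turns (hypothetical helper)."""
    min_turns: int
    last_fired: Optional[int] = None

    def should_fire(self, turn: int) -> bool:
        # Fire on the first call, then only once enough turns have passed.
        if self.last_fired is None or turn - self.last_fired >= self.min_turns:
            self.last_fired = turn
            return True
        return False
```

Under this reading, a cadence of 1 fires on every turn and a cadence of 2 fires on every other turn, so the two layers can be tuned independently of each other and of the depth.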

Setup Wizard:

hermes memory setup        # select "honcho" — runs the Honcho-specific post-setup

The legacy hermes honcho setup command still works (it now redirects to hermes memory setup), but is only registered after Honcho is selected as the active memory provider.

Config: $HERMES_HOME/honcho.json (profile-local) or ~/.honcho/config.json (global). Resolution order: $HERMES_HOME/honcho.json > ~/.hermes/honcho.json > ~/.honcho/config.json. See the config reference and the Honcho integration guide.
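The resolution order can be checked from a script when it is unclear which file is being picked up. This is a minimal sketch that assumes "first existing file wins"; `candidate_paths` and `resolve_honcho_config` are hypothetical helpers, not Hermes functions.

```python
import os
from pathlib import Path
from typing import List, Optional


def candidate_paths(hermes_home: Optional[str], user_home: Path) -> List[Path]:
    """List config locations in the documented resolution order."""
    paths = []
    if hermes_home:  # $HERMES_HOME may be unset
        paths.append(Path(hermes_home) / "honcho.json")
    paths.append(user_home / ".hermes" / "honcho.json")
    paths.append(user_home / ".honcho" / "config.json")
    return paths


def resolve_honcho_config() -> Optional[Path]:
    """Return the first candidate that exists on disk, or None."""
    for path in candidate_paths(os.environ.get("HERMES_HOME"), Path.home()):
        if path.is_file():
            return path
    return None
```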

Full config reference

| Key | Default | Description |
|-----|---------|-------------|
| `apiKey` | -- | API key from [app.honcho.dev](https://app.honcho.dev) |
| `baseUrl` | -- | Base URL for self-hosted Honcho |
| `peerName` | -- | User peer identity |
| `aiPeer` | host key | AI peer identity (one per profile) |
| `workspace` | host key | Shared workspace ID |
| `contextTokens` | `null` (uncapped) | Token budget for auto-injected context per turn. Truncates at word boundaries |
| `contextCadence` | `1` | Minimum turns between `context()` API calls (base layer refresh) |
| `dialecticCadence` | `2` | Minimum turns between `peer.chat()` LLM calls. Recommended 1–5. Only applies to `hybrid`/`context` modes |
| `dialecticDepth` | `1` | Number of `.chat()` passes per dialectic invocation. Clamped 1–3. Pass 0: cold/warm prompt, pass 1: self-audit, pass 2: reconciliation |
| `dialecticDepthLevels` | `null` | Optional array of reasoning levels per pass, e.g. `["minimal", "low", "medium"]`. Overrides proportional defaults |
| `dialecticReasoningLevel` | `'low'` | Base reasoning level: `minimal`, `low`, `medium`, `high`, `max` |
| `dialecticDynamic` | `true` | When `true`, model can override reasoning level per-call via tool param |
| `dialecticMaxChars` | `600` | Max chars of dialectic result injected into system prompt |
| `recallMode` | `'hybrid'` | `hybrid` (auto-inject + tools), `context` (inject only), `tools` (tools only) |
| `writeFrequency` | `'async'` | When to flush messages: `async` (background thread), `turn` (sync), `session` (batch on end), or integer N |
| `saveMessages` | `true` | Whether to persist messages to Honcho API |
| `observationMode` | `'directional'` | `directional` (all on) or `unified` (shared pool). Override with `observation` object |
| `messageMaxChars` | `25000` | Max chars per message (chunked if exceeded) |
| `dialecticMaxInputChars` | `10000` | Max chars for dialectic query input to `peer.chat()` |
| `sessionStrategy` | `'per-directory'` | `per-directory`, `per-repo`, `per-session`, `global` |
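To make the `messageMaxChars` behavior concrete, here is a rough sketch of chunking an oversized message. The fixed-offset split is an assumption made for illustration; the plugin's real chunking strategy is not described here, and `chunk_message` is a hypothetical name.

```python
from typing import List


def chunk_message(text: str, max_chars: int = 25000) -> List[str]:
    """Split text into pieces of at most max_chars characters.

    Hypothetical helper: splits on fixed offsets, ignoring word boundaries.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

With the default limit, a 60,000-character message would be split into three pieces, and joining the pieces reproduces the original text.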

Minimal honcho.json (cloud)

{
  "apiKey": "your-key-from-app.honcho.dev",
  "hosts": {
    "hermes": {
      "enabled": true,
      "aiPeer": "hermes",
      "peerName": "your-name",
      "workspace": "hermes"
    }
  }
}

Minimal honcho.json (self-hosted)

{
  "baseUrl": "http://localhost:8000",
  "hosts": {
    "hermes": {
      "enabled": true,
      "aiPeer": "hermes",
      "peerName": "your-name",
      "workspace": "hermes"
    }
  }
}

Concept	What it is
Workspace	Shared environment. All Hermes profiles under one workspace see the same user identity.
User peer (`peerName`)	The human. Shared across profiles in the workspace.
AI peer (`aiPeer`)	One per Hermes profile. Host key `hermes` → default; `hermes.<profile>` for others.
Observation	Per-peer toggles controlling what Honcho models from whose messages. `directional` (default, all four on) or `unified` (single-observer pool).
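The host-key convention in the table can be used to generate a `honcho.json` covering several profiles. The sketch below is an assumption-laden illustration: `host_key` and `build_config` are hypothetical helpers, `None` stands for the default profile, and each AI peer is named after its host key (the documented default for `aiPeer`).

```python
import json
from typing import Dict, Iterable, Optional


def host_key(profile: Optional[str] = None) -> str:
    """Default profile maps to "hermes"; others to "hermes.<profile>"."""
    return f"hermes.{profile}" if profile else "hermes"


def build_config(api_key: str, peer_name: str,
                 profiles: Iterable[Optional[str]],
                 workspace: str = "hermes") -> Dict:
    """Build a honcho.json-shaped dict with one host block per profile."""
    hosts = {}
    for profile in profiles:
        key = host_key(profile)
        hosts[key] = {
            "enabled": True,
            "aiPeer": key,           # one AI peer per profile
            "peerName": peer_name,   # same human across profiles
            "workspace": workspace,  # shared workspace
        }
    return {"apiKey": api_key, "hosts": hosts}


if __name__ == "__main__":
    print(json.dumps(build_config("your-key", "your-name", [None, "coder"]), indent=2))
```

Keeping `peerName` and `workspace` identical across host blocks is what lets every profile see the same user identity, while the distinct `aiPeer` values keep the AI peers separate.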

Toggle	Effect
`observeMe`	Honcho builds a representation of this peer from its own messages
`observeOthers`	This peer observes the other peer's messages (feeds cross-peer reasoning)

Mem0 config

Key	Default	Description
`user_id`	`hermes-user`	User identifier
`agent_id`	`hermes`	Agent identifier

Hindsight config

Key	Default	Description
`mode`	`cloud`	`cloud` or `local`
`bank_id`	`hermes`	Memory bank identifier
`recall_budget`	`mid`	Recall thoroughness: `low` / `mid` / `high`
`memory_mode`	`hybrid`	`hybrid` (context + tools), `context` (auto-inject only), `tools` (tools only)
`auto_retain`	`true`	Automatically retain conversation turns
`auto_recall`	`true`	Automatically recall memories before each turn
`retain_async`	`true`	Process retain asynchronously on the server
`retain_context`	`conversation between Hermes Agent and the User`	Context label for retained memories
`retain_tags`	—	Default tags applied to retained memories; merged with per-call tool tags
`retain_source`	—	Optional `metadata.source` attached to retained memories
`retain_user_prefix`	`User`	Label used before user turns in auto-retained transcripts
`retain_assistant_prefix`	`Assistant`	Label used before assistant turns in auto-retained transcripts
`recall_tags`	—	Tags to filter on recall
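The `retain_*` keys above suggest how an auto-retained turn might be labeled and tagged. This is only a sketch: the `Prefix: text` layout and the de-duplicating merge are assumptions, and `format_turn` and `merge_tags` are hypothetical names rather than provider functions.

```python
from typing import Iterable, List


def format_turn(user_text: str, assistant_text: str,
                user_prefix: str = "User",
                assistant_prefix: str = "Assistant") -> str:
    """Label both sides of a turn with the configured prefixes."""
    return f"{user_prefix}: {user_text}\n{assistant_prefix}: {assistant_text}"


def merge_tags(default_tags: Iterable[str], call_tags: Iterable[str]) -> List[str]:
    """Combine retain_tags with per-call tags, keeping order, dropping repeats."""
    merged: List[str] = []
    for tag in [*default_tags, *call_tags]:
        if tag not in merged:
            merged.append(tag)
    return merged
```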

Holographic config

Key	Default	Description
`db_path`	`$HERMES_HOME/memory_store.db`	SQLite database path
`auto_extract`	`false`	Auto-extract facts at session end
`default_trust`	`0.5`	Default trust score (0.0–1.0)


OpenViking

Best for	Self-hosted knowledge management with structured browsing
Requires	`pip install openviking` + running server
Data storage	Self-hosted (local or cloud)
Cost	Free (open-source, AGPL-3.0)


Mem0

Best for	Hands-off memory management, since Mem0 handles extraction automatically
Requires	`pip install mem0ai` + API key
Data storage	Mem0 Cloud
Cost	Mem0 pricing


Hindsight

Best for	Knowledge graph-based recall with entity relationships
Requires	Cloud: API key from ui.hindsight.vectorize.io. Local: LLM API key (OpenAI, Groq, OpenRouter, etc.)
Data storage	Hindsight Cloud or local embedded PostgreSQL
Cost	Hindsight pricing (cloud) or free (local)


Holographic

Best for	Local-only memory with advanced retrieval, no external dependencies
Requires	Nothing (SQLite is always available). NumPy optional for HRR algebra.
Data storage	Local SQLite
Cost	Free


RetainDB

Best for	Teams already using RetainDB's infrastructure
Requires	RetainDB account + API key
Data storage	RetainDB Cloud
Cost	$20/month


ByteRover

Best for	Developers who want portable, local-first memory with a CLI
Requires	ByteRover CLI (`npm install -g byterover-cli` or install script)
Data storage	Local (default) or ByteRover Cloud (optional sync)
Cost	Free (local) or ByteRover pricing (cloud)


Supermemory

Best for	Semantic recall with user profiling and session-level graph building
Requires	`pip install supermemory` + API key
Data storage	Supermemory Cloud
Cost	Supermemory pricing

Supermemory config

Key	Default	Description
`container_tag`	`hermes`	Container tag used for search and writes. Supports `{identity}` template for profile-scoped tags.
`auto_recall`	`true`	Inject relevant memory context before turns
`auto_capture`	`true`	Store cleaned user-assistant turns after each response
`max_recall_results`	`10`	Max recalled items to format into context
`profile_frequency`	`50`	Include profile facts on first turn and every N turns
`capture_mode`	`all`	Skip tiny or trivial turns by default
`search_mode`	`hybrid`	Search mode: `hybrid`, `memories`, or `documents`
`api_timeout`	`5.0`	Timeout for SDK and ingest requests
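Two of the keys above, `container_tag` and `profile_frequency`, describe small rules that are easy to sketch. The code below is one possible reading, not the provider's implementation: `resolve_container_tag` and `include_profile` are hypothetical names, turns are assumed to be counted from 1, and "every N turns" is assumed to mean multiples of N.

```python
def resolve_container_tag(template: str, identity: str) -> str:
    """Fill the {identity} placeholder; tags without it pass through."""
    return template.replace("{identity}", identity)


def include_profile(turn: int, frequency: int = 50) -> bool:
    """True on the first turn and on every `frequency`-th turn after that."""
    if turn == 1:
        return True
    return frequency > 0 and turn % frequency == 0
```

A template such as `hermes-{identity}` gives each profile its own container, while the plain default `hermes` shares one container across profiles.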

Provider	Storage	Cost	Tools	Dependencies	Unique Feature
Honcho	Cloud/Self-hosted	Free/Paid	5	`honcho-ai`	Dialectic user modeling + session-scoped context
OpenViking	Self-hosted	Free	5	`openviking` + server	Filesystem hierarchy + tiered loading
Mem0	Cloud	Paid	3	`mem0ai`	Server-side LLM extraction
Hindsight	Cloud/Local	Free/Paid	3	`hindsight-client`	Knowledge graph + reflect synthesis
Holographic	Local	Free	2	None	HRR algebra + trust scoring
RetainDB	Cloud	$20/mo	5	`requests`	Delta compression
ByteRover	Local/Cloud	Free/Paid	3	`brv` CLI	Pre-compression extraction
Supermemory	Cloud	Paid	4	`supermemory`	Context fencing + session graph ingest + multi-container