SOUL.md: the file that makes your AI agent actually yours

The NousResearch Hermes agent has a file called SOUL.md. It doesn't live in the repository — it lives in ~/.hermes/. It's a user file, not a developer file. And it's the most interesting thing about how the agent is designed.

What SOUL.md is

SOUL.md lives at ~/.hermes/SOUL.md. It goes in slot #1 of the agent's system prompt, before project context, before tools, before anything else. It defines who the agent is: tone, communication style, what it avoids, how it handles uncertainty.

The default content is deliberately minimal:

"You are Hermes Agent, an intelligent AI assistant created by Nous Research. You are helpful, knowledgeable, and direct. You assist users with a wide range of tasks including answering questions, writing and editing code, analyzing information, creative work, and executing actions via your tools. You communicate clearly, admit uncertainty when appropriate, and prioritize being genuinely useful over being verbose."

That's the seed. If no SOUL.md exists, Hermes generates that baseline. If one does exist, its contents replace the default entirely. No merging. Complete replacement.
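
The replace-don't-merge rule can be sketched in a few lines of Python. This is an illustration only, not the Hermes source: the function name load_soul, the DEFAULT_SOUL constant, and the abbreviated default text are assumptions made for the example, and the sketch returns the default in memory rather than writing a file.

```python
from pathlib import Path

# Abbreviated stand-in for the default text quoted above (assumption for this sketch).
DEFAULT_SOUL = "You are Hermes Agent, an intelligent AI assistant created by Nous Research."


def load_soul(hermes_home: Path) -> str:
    """Return the identity text: the user's SOUL.md if present, else the default."""
    soul_path = hermes_home / "SOUL.md"
    if soul_path.is_file():
        # Complete replacement: nothing from the default is merged in.
        return soul_path.read_text(encoding="utf-8")
    return DEFAULT_SOUL
```

Calling load_soul(Path.home() / ".hermes") would then yield either the user's text or the baseline, never a blend of the two.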

The division of responsibilities is clean: SOUL.md gets personality, tone, and style. AGENTS.md gets project structure, coding conventions, file paths. If a directive should apply everywhere, it goes in SOUL.md. If it belongs to a specific project, it goes in AGENTS.md. The two files don't bleed into each other.

How context files stack

SOUL.md is the first of several context layers assembled into the system prompt at session start:

1. SOUL.md: identity, loaded from $HERMES_HOME only, never from the working directory
2. Project context (first match wins): .hermes.md → AGENTS.md → CLAUDE.md → .cursorrules
3. Progressive discovery: as the agent navigates subdirectories during a session, it discovers and injects context files when relevant without bloating the initial prompt
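
To make "first match wins" concrete, here is a minimal sketch of the project-context lookup. The filenames and their order come from the list above; the function name find_project_context and the single-directory search are assumptions, and the real lookup may well search more widely.

```python
from pathlib import Path
from typing import Optional

# Priority order as listed above; the first file found is used, the rest are ignored.
PROJECT_CONTEXT_CANDIDATES = (".hermes.md", "AGENTS.md", "CLAUDE.md", ".cursorrules")


def find_project_context(directory: Path) -> Optional[Path]:
    """Return the highest-priority context file in `directory`, or None if there is none."""
    for name in PROJECT_CONTEXT_CANDIDATES:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

The practical consequence: a repository that carries both AGENTS.md and CLAUDE.md contributes only AGENTS.md, and adding a .hermes.md overrides both.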

Each file is scanned for prompt injection before inclusion. Patterns like instruction overrides, hidden HTML elements, and credential exfiltration attempts get blocked. Files over 20,000 characters are truncated at a 70/20 head/tail ratio.
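
The truncation rule might look roughly like the sketch below. It rests on one loud assumption: that the 70 and 20 are percentages of the 20,000-character limit, with the remainder left for an omission marker. The function name, the marker text, and that reading of the ratio are all illustrative, not taken from the Hermes source.

```python
MAX_CHARS = 20_000
HEAD_PERCENT = 70  # assumption: share of the limit kept from the start of the file
TAIL_PERCENT = 20  # assumption: share of the limit kept from the end of the file


def truncate_context(text: str, max_chars: int = MAX_CHARS) -> str:
    """Keep the head and tail of an oversized file and drop the middle."""
    if len(text) <= max_chars:
        return text
    head_len = max_chars * HEAD_PERCENT // 100
    tail_len = max_chars * TAIL_PERCENT // 100
    omitted = len(text) - head_len - tail_len
    marker = f"\n[... {omitted} characters omitted ...]\n"  # hypothetical marker format
    return text[:head_len] + marker + text[len(text) - tail_len:]
```

Keeping both ends reflects a common pattern in context files: the opening usually states the rules, and the closing often holds late-added caveats.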

The system prompt assembled from these layers is then frozen for the life of the conversation. That is a deliberate choice, not an accident of implementation.

Why the system prompt can't change mid-conversation

The most unusual constraint in Hermes is this: per-conversation prompt caching is treated as sacred. The repository's AGENTS.md says so explicitly:

"A long-lived conversation reuses a cached prefix every turn. Anything that mutates past context, swaps toolsets, or rebuilds the system prompt mid-conversation invalidates that cache and multiplies the user's cost. We do not do it."

The only exception is context compression.

This shapes every architecture decision. Skills can be installed during a conversation, but they take effect next session (there's a --now flag for immediate effect, but it's opt-in). Toolsets are fixed at session start. Memory retrieval happens before the first turn, not between turns.
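
Why a rebuilt system prompt is so costly follows from how prefix caching works in general: only the leading part of a request that is identical to the previous one can be reused. The sketch below models that at the message level. The function name and the message strings are invented for the example, and real providers match on tokens rather than whole messages.

```python
from typing import List


def cached_prefix_len(previous: List[str], current: List[str]) -> int:
    """Count the leading messages that are identical between two requests."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


turn_1 = ["SYSTEM: soul + project context", "USER: hi"]

# Appending to the conversation keeps the whole earlier request as a reusable prefix.
appended = turn_1 + ["ASSISTANT: hello", "USER: next question"]

# Rebuilding the system prompt changes the first message, so nothing after it matches.
rebuilt = ["SYSTEM: soul + project context + new skill"] + appended[1:]

reused_when_appending = cached_prefix_len(turn_1, appended)
reused_when_rebuilt = cached_prefix_len(turn_1, rebuilt)
```

Here reused_when_appending covers both earlier messages, while reused_when_rebuilt is zero: one change at the front of the prompt discards the cache for everything behind it.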

The narrow waist principle follows from the same logic: every tool added to the core schema ships on every API call. Adding a new core tool is expensive in perpetuity, not just once.

The footprint ladder

New capabilities follow a ranked ladder. Start at the top and choose the first rung that works:

1. Extend existing code: zero new surface
2. CLI command + skill: zero model-tool footprint; the agent runs hermes <subcommand> guided by a skill
3. Service-gated tool: appears only when a prerequisite is configured; zero footprint otherwise
4. Plugin: lives in ~/.hermes/plugins/, discovered at runtime
5. MCP server: structured I/O as a tool without growing the core schema, reusable by any MCP host
6. New core tool: last resort, only for capabilities unreachable any other way
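
The ladder reads naturally as an ordered enumeration in which the lowest-numbered viable option wins. The sketch below is only a way of picturing the rule. Rung and choose_rung are hypothetical names, and in practice deciding which rungs are viable is a human design judgment, not something a function computes.

```python
from enum import IntEnum
from typing import Iterable


class Rung(IntEnum):
    """Rungs in the order given above; a lower value means a smaller footprint."""
    EXTEND_EXISTING_CODE = 1
    CLI_COMMAND_PLUS_SKILL = 2
    SERVICE_GATED_TOOL = 3
    PLUGIN = 4
    MCP_SERVER = 5
    NEW_CORE_TOOL = 6


def choose_rung(viable: Iterable[Rung]) -> Rung:
    """Pick the smallest-footprint viable rung; a core tool is the fallback of last resort."""
    return min(viable, default=Rung.NEW_CORE_TOOL)
```

For a capability that could ship as either a plugin or an MCP server, choose_rung([Rung.MCP_SERVER, Rung.PLUGIN]) selects the plugin.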

The result is that capability accumulates at the edges (platform adapters, skills, plugins) while the core stays narrow. "Smallest footprint" governs how capability is wired in, not whether the product is allowed to grow.

Skills and the curator

Skills are SKILL.md files. They live in ~/.hermes/skills/ and get invoked as slash commands. When the agent extracts a repeatable workflow into a skill, it gets tagged with created_by: "agent" and tracked.

The Curator runs in the background and manages that lifecycle: tracking usage counts, marking skills as stale after inactivity, auto-archiving what hasn't been used in a configurable number of days. It never deletes. The maximum destructive action is archive, to ~/.hermes/skills/.archive/. Pinned skills are exempt from all auto-transitions.
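
The lifecycle described above can be sketched as a small state transition. Everything named here is an assumption made for illustration: the SkillRecord fields, the state names, and the 30- and 90-day thresholds (the article says only that the window is configurable). The sketch changes a state label and does not move any files.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=30)    # hypothetical threshold
ARCHIVE_AFTER = timedelta(days=90)  # hypothetical threshold


@dataclass
class SkillRecord:
    name: str
    last_used: datetime
    use_count: int = 0
    pinned: bool = False
    state: str = "active"  # one of "active", "stale", "archived"


def curate(skill: SkillRecord, now: datetime) -> str:
    """Update a skill's state from its idle time. There is deliberately no delete path."""
    if skill.pinned:
        return skill.state  # pinned skills are exempt from every automatic transition
    idle = now - skill.last_used
    if idle >= ARCHIVE_AFTER:
        skill.state = "archived"  # the most destructive action available
    elif idle >= STALE_AFTER:
        skill.state = "stale"
    else:
        skill.state = "active"
    return skill.state
```

The point is the shape rather than the numbers: states only ever move between active, stale, and archived, and a pinned skill bypasses the whole function.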

The learning loop: agent does something useful → user or agent extracts it as a skill → skill gets invoked, usage tracked → curator prunes what doesn't get used.

That's the "agent that grows with you" tagline made concrete. The skills accumulate, get curated, and the set that survives is the set that was actually useful.

What this architecture is doing

The interesting design bet here is that agent identity and behavior configuration should be files, not settings. Not a config form, not a provider-side system prompt, not a cloud profile. Files that live on your machine, that you can read and edit, that are stable across conversations.

SOUL.md is the persona file. AGENTS.md is the project brief. The skills directory is the institutional knowledge. The plugins directory is the extension system.

All of it is on your filesystem. All of it is versioned or versionable. When something goes wrong, you can read exactly what the agent was told and why it behaved the way it did.

That's a different bet than most agent frameworks are making, and it's a considered one.