Concepts: Tool Types, MCP, and Agent Safety

The lab connects your agent to real infrastructure tools. Here's the conceptual model for how agents use tools — and why safety boundaries are non-negotiable before your agent touches production.


1. Tools Are the Only Way Agents Touch the World

A tool is anything an agent can call to interact with the outside world: run a command, read a file, call an API, query a service, or delegate work to another agent. Tools are the only mechanism by which an agent affects or observes the environment outside its context window.

This definition has an important implication: an agent without tools is a chatbot. It can reason about infrastructure all day but cannot observe real state, run diagnostics, or execute changes. Tools are what make an agent operational rather than advisory.

Tool Boundaries Are Governance Boundaries

The governance question "what can this agent do?" is answered entirely by which tools are available to it. You do not control agent behavior by writing clever prompts. You control it by controlling tool access:

  • Remove terminal from platform_toolsets.cli → agent cannot run any shell commands, regardless of what it decides
  • Add terminal but set approvals.mode: manual → agent can run shell commands, but DANGEROUS_PATTERNS-matching commands require human approval
  • Set command_allowlist: ["SELECT", "EXPLAIN"] → specific patterns are pre-approved at L4 governance

This is why platform_toolsets.cli is the first thing to review when evaluating an agent's governance posture. Tool access is the mechanical boundary. SOUL.md NEVER rules are the behavioral boundary. Both must be reviewed together — but the mechanical boundary is easier to audit and harder to accidentally override.

Tool Discovery vs Tool Invocation

The agent interacts with tools in two distinct phases:

Discovery: At session startup, the tool registry builds a list of all available tools (those whose toolset is in platform_toolsets.cli AND whose check_fn() returns True). This list is expressed as JSON Schema function definitions and passed to the LLM in the tools parameter of every API call. The Brain "knows" what tools exist because they are in its context.

Invocation: During the agent loop, the LLM emits a tool_call object specifying the tool name and arguments. The registry's dispatch() method routes this to the correct handler. The handler executes, returns a result string, and the loop continues.

The agent cannot call a tool that is not in its schema list. Even if the Brain generates text that looks like a tool call to a non-registered tool, the registry returns {"error": "Unknown tool: ..."}. This is the enforcement mechanism for tool-based governance.
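The discovery/invocation split can be sketched as a minimal registry. The names here (`ToolRegistry.register`, the schema shape, the sample tool) are illustrative assumptions, not Hermes's actual code:

```python
import json

class ToolRegistry:
    """Minimal sketch of a tool registry (illustrative, not Hermes's implementation)."""

    def __init__(self):
        self._handlers = {}   # tool name -> callable
        self._schemas = []    # JSON Schema function definitions shown to the LLM

    def register(self, name, handler, description, parameters):
        self._handlers[name] = handler
        self._schemas.append({
            "name": name,
            "description": description,
            "parameters": parameters,
        })

    def schemas(self):
        # Discovery: this list is what the LLM sees in the `tools` parameter
        return self._schemas

    def dispatch(self, name, arguments):
        # Invocation: unknown tools are rejected mechanically,
        # regardless of what text the Brain generated
        if name not in self._handlers:
            return json.dumps({"error": f"Unknown tool: {name}"})
        return self._handlers[name](**arguments)

registry = ToolRegistry()
registry.register(
    "read_file",
    lambda path: f"<contents of {path}>",
    "Read a file from disk",
    {"type": "object", "properties": {"path": {"type": "string"}}},
)

print(registry.dispatch("read_file", {"path": "/tmp/x"}))  # routed to the handler
print(registry.dispatch("rm_everything", {}))              # rejected: unknown tool
```

The enforcement property falls out of the data structure: a tool that was never registered has no handler entry, so there is nothing for `dispatch()` to route to.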


2. The Three Tool Integration Patterns

Pattern 1: CLI Tools via Terminal Toolset

The agent executes shell commands directly, using the terminal_tool registered in the terminal toolset. Any command that can be executed in a shell can be run by the agent: aws, kubectl, psql, curl, grep, terraform, ansible, etc.

How to configure it:

```yaml
platform_toolsets:
  cli: [terminal, file, web, skills]
```

The presence of terminal in the list enables the agent to execute shell commands.

When to use it:

  • Any standard DevOps CLI tool that the environment has installed
  • Commands with well-known output formats (AWS JSON responses, kubectl YAML/JSON, psql CSV)
  • Operations that map directly to existing operational procedures in SKILL.md

What it enables vs what it does not: CLI tools give the agent general command execution capability: powerful, flexible, and unstructured. What it does NOT do:

  • Provide structured input validation (the agent must format commands correctly)
  • Guarantee output parsing (the agent must interpret CLI output from the context)
  • Enforce safe operations (that is the job of DANGEROUS_PATTERNS and SOUL.md)

DevOps analogy: Direct SSH access to a server. Powerful, flexible, and requires strict access controls. You would not give a junior team member unrestricted SSH to production. Same principle applies to agent tool access.

The tradeoff: CLI is the lowest-friction integration pattern. Any tool your team already uses can be called by the agent without writing adapter code. The cost is that CLI tools have no type safety, no versioned API contract, and no structured output guarantee. The agent must interpret the output based on its training knowledge and the expected output blocks in SKILL.md.
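The "no structured output guarantee" point is visible in a small sketch: the agent receives raw stdout and must parse and validate it itself. The subprocess below is a stand-in for a real `aws rds describe-db-instances --output json` call:

```python
import json
import subprocess
import sys

# Stand-in for an AWS CLI call: a tiny subprocess that prints JSON to stdout.
# From the agent's perspective, all it ever gets back is raw text.
fake_cli = [
    sys.executable, "-c",
    'print(\'{"DBInstances": [{"DBInstanceIdentifier": "prod-db", '
    '"DBInstanceStatus": "available"}]}\')',
]

result = subprocess.run(fake_cli, capture_output=True, text=True, check=True)

# No type safety and no versioned contract: parsing is the caller's problem.
# If the CLI changes its output format, this breaks silently.
data = json.loads(result.stdout)
for inst in data["DBInstances"]:
    print(inst["DBInstanceIdentifier"], inst["DBInstanceStatus"])
```

This is why SKILL.md expected-output blocks matter for CLI-based skills: they are the only "schema" the agent has.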

Pattern 2: MCP (Model Context Protocol) Tools

MCP is a standardized protocol for tool integration. Instead of the agent calling tools directly, tools expose themselves as MCP servers. The agent connects to MCP servers using the MCP client protocol and calls tools through that standardized interface.

What it is: The agent calls a structured external service that implements the MCP protocol (standardized by Anthropic in late 2024). MCP servers expose typed function interfaces — the agent calls a function with typed inputs and receives a typed output, rather than running a shell command and parsing text.

The key difference from CLI: MCP is a protocol layer, not a direct invocation. The agent speaks MCP; the MCP server translates that into the actual tool call.

How it works: An MCP server is a process that the agent connects to over a socket or HTTP. The server exposes a tools/list endpoint that returns tool schemas. Hermes's MCP client (registered in the mcp toolset) discovers available tools from connected MCP servers and registers them in the same ToolRegistry used for CLI tools. From the Brain's perspective, MCP tools and CLI tools look identical — both appear as function schemas in the tool list.
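The discovery handshake can be illustrated with simplified JSON-RPC message shapes. The server's tool (`create_incident`) is hypothetical, and the real protocol includes an initialization and capability-negotiation phase omitted here:

```python
import json

# What the MCP client sends to a connected server (simplified)
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# What a hypothetical ticketing-system MCP server might return
response = {
    "jsonrpc": "2.0", "id": 1,
    "result": {"tools": [{
        "name": "create_incident",
        "description": "Open an incident in the ticketing system",
        "inputSchema": {
            "type": "object",
            "properties": {"title": {"type": "string"},
                           "severity": {"type": "string"}},
            "required": ["title"],
        },
    }]},
}

# The client registers each discovered tool alongside CLI tools, so the Brain
# sees one uniform list of function schemas regardless of the tool's origin
tool_schemas = [
    {"name": t["name"], "description": t["description"], "parameters": t["inputSchema"]}
    for t in response["result"]["tools"]
]
print(json.dumps(tool_schemas, indent=2))
```

The translation step is the whole point: the schema shape the Brain consumes is the same one CLI tools use, so MCP tools need no special handling in the agent loop.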

When to use it:

  • Complex integrations that benefit from structured I/O (observability platforms, ticketing systems, notification services)
  • Services where you want a versioned API contract that is stable across agent updates
  • When you want to share tools across multiple agent frameworks — MCP is a cross-platform protocol

The tradeoff: MCP provides better structure, versioning, and reusability than CLI tools. The cost is setup complexity: you need a running MCP server, a connection configuration, and the server must implement the protocol correctly. For simple DevOps CLI tools that already work well from the command line, MCP adds overhead without significant benefit. For complex integrations (PagerDuty, Datadog, Slack), MCP is the right choice.

DevOps analogy: The Container Runtime Interface (CRI) in Kubernetes. CRI is a standardized interface that lets Kubernetes talk to any container runtime (containerd, CRI-O, etc.) without knowing the runtime's internal API. MCP does the same for AI tool integration: standardize the interface so tools are swappable without changing the agent.

Pattern 3: Mock Wrapper Scripts

A thin shell script placed earlier in PATH than the real CLI tool. The wrapper intercepts calls to the CLI tool (e.g., aws, psql, kubectl) and routes them either to pre-baked mock data files or to the real tool, based on an environment variable (HERMES_LAB_MODE).

How it works:

```bash
if [[ "$HERMES_LAB_MODE" != "mock" ]]; then
  # Pass through to the real aws CLI. Skip this wrapper itself, which
  # shadows the real binary earlier in PATH (a plain `command -v aws`
  # would find the wrapper and exec-loop forever).
  exec "$(type -aP aws | grep -vxF "$0" | head -n1)" "$@"
fi

# MOCK MODE: serve pre-baked JSON for known subcommands
case "$1 $2" in
  "rds describe-db-instances")
    cat "$MOCK_DATA_DIR/rds/describe-db-instances.json"
    ;;
  "ce get-cost-and-usage")
    cat "$MOCK_DATA_DIR/cost-explorer/normal-spend.json"
    ;;
  *)
    echo "aws wrapper: no mock data for: $*" >&2
    exit 1
    ;;
esac
```

The agent never knows it is in mock mode. It runs the same aws rds describe-db-instances command it would run in live mode. The wrapper transparently substitutes mock data.

The scenario selection mechanism:

Mock wrappers support a second environment variable, HERMES_LAB_SCENARIO:

```bash
SCENARIO="${HERMES_LAB_SCENARIO:-clean}"
if [[ "$SCENARIO" == "messy" ]]; then
  cat "$MOCK_DATA_DIR/rds/describe-db-instances-slow.json"
else
  cat "$MOCK_DATA_DIR/rds/describe-db-instances.json"
fi
```

This allows switching between a clean baseline scenario (clean) and a problematic scenario (messy) without changing any agent configuration.

When to use it:

  • Lab environments where real infrastructure is not available
  • Testing agent behavior against specific scenarios without a live system
  • Simulating failure scenarios that would be unsafe or expensive to reproduce on real infrastructure

Setup:

```bash
export HERMES_LAB_MODE=mock
export HERMES_LAB_SCENARIO=messy   # or: clean
export PATH="$(pwd)/course/infrastructure/wrappers:$PATH"
```

Adding the wrappers/ directory at the front of PATH ensures that when the agent runs aws rds describe-db-instances, the OS finds wrappers/aws first and executes the mock wrapper instead of the real AWS CLI.
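This first-match-wins behavior can be demonstrated with a throwaway wrapper; Python's `shutil.which` performs the same left-to-right PATH scan the OS does (the directory and file created here are temporary, for illustration only):

```python
import os
import shutil
import stat
import tempfile

# Create a throwaway "wrappers" directory containing a fake `aws` script
wrappers = tempfile.mkdtemp()
wrapper_path = os.path.join(wrappers, "aws")
with open(wrapper_path, "w") as f:
    f.write("#!/bin/sh\necho mock-aws\n")
os.chmod(wrapper_path, os.stat(wrapper_path).st_mode | stat.S_IEXEC)

# Prepend the wrappers directory, exactly as the lab's PATH export does
env_path = wrappers + os.pathsep + os.environ.get("PATH", "")

# PATH lookup stops at the first executable match, so the wrapper shadows
# any real `aws` binary later in the search order
found = shutil.which("aws", path=env_path)
print(found == wrapper_path)
```

Appending the directory to the end of PATH instead would silently disable the interception, which is why the export puts `wrappers` first.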

Choosing the Right Pattern

| Factor | CLI | MCP | Mock Wrapper |
|---|---|---|---|
| Setup complexity | Low | Medium-High | Low |
| Reusability across AI systems | Low | High | N/A (testing only) |
| Works offline | Yes | Depends | Yes |
| Appropriate for production | Yes | Yes | No |
| Best for | Standard DevOps CLIs (aws, kubectl, psql) | Shared team tools, complex integrations | Lab environments, CI testing |

Use CLI tools when: The tool already exists as a CLI binary your team uses operationally. No additional setup required beyond adding terminal to platform_toolsets.cli.

Use MCP when: You need structured I/O that a CLI tool cannot provide cleanly, or when you want to reuse the same integration across multiple agent frameworks.

Use wrappers when: You are in a lab environment, testing environment, or demonstration environment. Wrappers are also the right choice for smoke testing: HERMES_LAB_MODE=mock lets you verify the skill procedure is correct before running it against real infrastructure.


3. Platform Toolsets: What Each Enables

The platform_toolsets.cli list in config.yaml controls which logical tool groups are available:

| Toolset | What it enables | Typical use |
|---|---|---|
| terminal | Shell command execution (terminal_command, execute_script) | Domain specialist agents running CLI tools |
| file | File read/write/create/search (read_file, write_file, search_files) | Agents that need to read config files, write reports, or create output files |
| web | Web search and page retrieval (web_search, read_webpage) | Any agent that may need to look up documentation, check external APIs, or verify public information |
| skills | Skills discovery (skills_search) | Agents with a large skill library that need to dynamically select the right skill |
| mcp | All tools from connected MCP servers | Agents using structured external integrations |
| memory | Cross-session memory storage/retrieval | Agents that need to persist findings across multiple sessions |
| delegate | Subagent delegation (delegate_task) | Fleet coordinator and multi-agent orchestration patterns |

Domain specialist agents (Track A, B, C) use: [terminal, file, web, skills]

Fleet coordinator uses: [web, skills] — no terminal, so it cannot execute commands directly

The minimum functional set: An agent needs at least [terminal, skills] to run CLI-based diagnostic skills. Remove terminal and the agent can read its skills but cannot execute the commands in them.
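A minimal configuration check along these lines can catch a broken toolset list before the agent ever starts. The toolset names come from the table above; the function itself is an illustrative sketch, not part of Hermes:

```python
# Known toolset names, per the platform_toolsets table above
KNOWN_TOOLSETS = {"terminal", "file", "web", "skills", "mcp", "memory", "delegate"}

# Minimum set for an agent that runs CLI-based diagnostic skills
MINIMUM_FOR_CLI_SKILLS = {"terminal", "skills"}

def check_toolsets(cli_toolsets):
    """Return 'ok' or a human-readable problem description."""
    unknown = set(cli_toolsets) - KNOWN_TOOLSETS
    if unknown:
        return f"unknown toolsets: {sorted(unknown)}"
    missing = MINIMUM_FOR_CLI_SKILLS - set(cli_toolsets)
    if missing:
        return f"cannot execute CLI skills, missing: {sorted(missing)}"
    return "ok"

print(check_toolsets(["terminal", "file", "web", "skills"]))  # ok
print(check_toolsets(["web", "skills"]))  # missing terminal: skills are read-only
```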


4. DANGEROUS_PATTERNS: The Mechanical Safety Gate

Before any terminal command executes, check_all_command_guards(command, env_type) runs. This function normalizes the command (strips ANSI escapes, null bytes, Unicode homoglyphs — obfuscation bypass prevention) then matches it against the DANGEROUS_PATTERNS list in tools/approval.py.
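A simplified sketch of the normalize-then-match flow. The pattern list here is abbreviated and the normalization skips Unicode homoglyph handling; the real implementation lives in tools/approval.py:

```python
import re

# Abbreviated stand-in for the real DANGEROUS_PATTERNS list (~30 entries)
DANGEROUS_PATTERNS = [
    (r"\brm\s+-[a-z]*r[a-z]*f", "recursive force delete"),
    (r"\bDROP\s+TABLE\b", "SQL DROP"),
    (r"\bsystemctl\s+stop\b", "service stop"),
]

ANSI_ESCAPES = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")

def normalize(command):
    # Strip obfuscation vectors before matching: ANSI escapes and null bytes
    # (the real normalizer also maps Unicode homoglyphs to ASCII)
    command = ANSI_ESCAPES.sub("", command)
    return command.replace("\x00", "")

def check_command(command):
    cmd = normalize(command)
    for pattern, label in DANGEROUS_PATTERNS:
        if re.search(pattern, cmd, re.IGNORECASE):
            return ("blocked", label)
    return ("allowed", None)

print(check_command("ls -la"))                           # allowed
print(check_command("rm -rf /var/data"))                 # blocked
print(check_command("\x1b[31mDROP TABLE\x1b[0m users"))  # blocked despite ANSI obfuscation
```

Normalizing before matching is what makes the gate robust: a command wrapped in terminal color codes still hits the same regex as its plain form.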

The DANGEROUS_PATTERNS list includes approximately 30 patterns covering:

| Category | Example patterns |
|---|---|
| Destructive filesystem | `rm -rf`, `rm -r`, `find -delete`, `xargs rm` |
| Database destruction | `DROP TABLE`, `DROP DATABASE`, `TRUNCATE TABLE`, `DELETE FROM` (without `WHERE`) |
| System file writes | `> /etc/`, `tee /etc/`, `sed -i /etc/`, `cp ... /etc/` |
| System service control | `systemctl stop`, `systemctl disable`, `systemctl mask` |
| Process termination | `kill -9 -1` (all processes), `pkill -9` |
| Remote code execution | `curl ... \| sh` (pipe-to-shell) |
| Shell injection | `bash -c`, `python -c`, `bash -lc` |
| Sensitive path writes | `~/.ssh/`, `~/.hermes/.env` |
| Self-termination | `pkill hermes`, `killall gateway` |

What is NOT in DANGEROUS_PATTERNS (by design):

Some commands that could cause damage in the wrong context are intentionally not in the list. Examples:

  • kubectl delete pod, kubectl drain, kubectl cordon
  • aws ec2 terminate-instances
  • CREATE INDEX, ALTER TABLE
  • aws rds modify-db-instance

These are governed by SOUL.md NEVER rules rather than DANGEROUS_PATTERNS. This separation is intentional: DANGEROUS_PATTERNS covers commands that are catastrophically, universally dangerous. Domain-specific dangerous commands (dangerous in context, but potentially legitimate in other contexts) are handled by SOUL.md NEVER rules.

The implication: SOUL.md NEVER rules are load-bearing for Track B (FinOps) and Track C (Kubernetes), where they are the only control for the most dangerous domain commands.

Approval Modes

When DANGEROUS_PATTERNS detects a match, the approvals.mode from config.yaml determines the response:

manual (L2 governance): The agent thread blocks. The user sees:

```
⚠️  DANGEROUS COMMAND: SQL DROP
DROP TABLE users
[o]nce | [s]ession | [a]lways | [d]eny
```

  • once: approve this instance only; future matches require re-approval
  • session: approve for the duration of this Hermes session
  • always: add to command_allowlist in config (permanent, survives session restart)
  • deny: block the command; the agent receives {"approved": False, ...} and must report it cannot proceed

The 5-minute timeout (timeout: 300) is important for lab flows with multiple approval steps — without it, the agent would be blocked indefinitely.
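The four responses differ mainly in how long the approval persists. A sketch, with illustrative names rather than Hermes's actual data structures:

```python
# Illustrative persistence model for the four manual-approval responses
session_approvals = set()   # cleared when the Hermes session ends
command_allowlist = []      # would be persisted to config.yaml; survives restarts

def record_approval(command, choice):
    """Map a user's approval choice to an approval result and its persistence."""
    if choice == "once":
        return {"approved": True}            # next match prompts again
    if choice == "session":
        session_approvals.add(command)       # no re-prompt until session ends
        return {"approved": True}
    if choice == "always":
        command_allowlist.append(command)    # permanent allowlist entry
        return {"approved": True}
    # deny: the agent must report it cannot proceed
    return {"approved": False, "reason": "denied by user"}

print(record_approval("DROP TABLE users", "deny"))
print(record_approval("SELECT count(*) FROM users", "session"))
print("SELECT count(*) FROM users" in session_approvals)
```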

smart (L3 governance): An auxiliary LLM (Haiku-tier) reviews the flagged command:

  • APPROVE: command is a false positive (e.g., python -c "print('hello')" triggering the -c flag pattern) → auto-approved
  • DENY: command is genuinely dangerous → blocked, no user prompt
  • ESCALATE: uncertain → falls through to manual prompt

Smart mode reduces approval fatigue for agents running complex diagnostic scripts with harmless commands that happen to match patterns.
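The smart-mode routing can be sketched with a stubbed reviewer standing in for the auxiliary LLM. The verdict strings follow the list above; everything else here is illustrative:

```python
def stub_reviewer(command):
    """Stand-in for the Haiku-tier reviewer; real verdicts come from an LLM call."""
    if "print(" in command:
        return "APPROVE"    # e.g. python -c "print('hello')": -c flag false positive
    if "DROP TABLE" in command:
        return "DENY"       # genuinely dangerous
    return "ESCALATE"       # uncertain

def smart_gate(command, prompt_user):
    verdict = stub_reviewer(command)
    if verdict == "APPROVE":
        return {"approved": True, "via": "smart-auto"}
    if verdict == "DENY":
        return {"approved": False, "via": "smart-auto"}
    # ESCALATE falls through to the manual approval prompt
    return prompt_user(command)

print(smart_gate('python -c "print(1)"', lambda c: None))
print(smart_gate("psql -c 'DROP TABLE users'", lambda c: None))
print(smart_gate("kubectl delete pod x",
                 lambda c: {"approved": False, "via": "manual"}))
```

Only the ESCALATE branch ever reaches a human, which is exactly how the mode reduces approval fatigue without weakening the DENY path.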

off: All DANGEROUS_PATTERNS checks are bypassed. Only appropriate for trusted local development environments. Never for production.


5. Two-Layer Safety: SOUL.md + DANGEROUS_PATTERNS

Hermes implements a two-layer safety system. The layers are independent — both must be satisfied for a command to execute.

Layer 1 — Behavioral (SOUL.md NEVER rules):

The SOUL.md file defines the agent's identity, scope, and behavioral constraints. The NEVER DO section encodes domain-specific prohibitions:

```
NEVER execute ALTER TABLE, CREATE INDEX, or any DDL without explicit human approval
NEVER recommend VACUUM FULL during business hours
NEVER mask an ambiguous root cause
```

These rules are part of the Brain's context. When the Brain generates a plan, it applies these rules during reasoning — before emitting a tool call. The behavioral layer operates at the decision level.

Layer 2 — Mechanical (DANGEROUS_PATTERNS in tools/approval.py):

Even if the Brain decides to run a command (violating a SOUL.md rule), tools/approval.py intercepts every terminal command before execution.

The two-layer model is necessary because the behavioral layer (SOUL.md) can be overridden by context — a sufficiently clever framing can sometimes bypass prompt-level constraints. The mechanical layer (DANGEROUS_PATTERNS) cannot be bypassed by context: the regex runs after the command is formed, before it executes, regardless of what the Brain decided.

| Failure Mode | Layer 1 (Behavioral) | Layer 2 (Mechanical) |
|---|---|---|
| Agent misunderstands instructions | NEVER rules create strong baseline resistance | Approval gate catches the result anyway |
| Rare LLM compliance failure | Doesn't help — the rule was "forgotten" | Approval gate fires regardless |
| Novel edge case not in SOUL.md | Not covered | Covered if command matches DANGEROUS_PATTERNS |
| Human needs visibility into correct decisions | Not provided | Approval events create an audit trail |

6. SOUL.md: Agent Identity and Behavioral Constraints

SOUL.md is the identity file for your agent. It defines who the agent is, what it does, what it will never do, and when it escalates. The LLM reads SOUL.md and internalizes it as "who I am" — it is not a per-request instruction; it is a persistent identity layer that shapes every response.

SOUL.md is loaded at agent startup and becomes part of the system prompt. Changes to SOUL.md take effect on the next session.

SOUL.md Structure

```markdown
# Agent Name — Role Title

**Role:** One-line role description
**Domain:** Track A: Database | Track B: FinOps | Track C: Kubernetes | Fleet Coordinator
**Scope:** What this agent is responsible for — and what it explicitly is NOT responsible for

## Identity

[2-3 sentences. First person. Start with: "You are [Name], a [role] agent for [team/org]."]

## Behavior Rules

[Positive rules — what to always do, how to do it, reporting format]
[NEVER rules — hard prohibitions in ALL CAPS]

## Escalation Policy

[Specific, quantified, observable conditions for handing off to a human]
```
DevOps analogy: The CLAUDE.md for a project, applied to an agent identity. Just as CLAUDE.md provides system context for AI generation tasks, SOUL.md provides identity context for an autonomous agent.

SOUL.md vs. Per-Request System Prompts

| | SOUL.md | Per-Request System Prompt |
|---|---|---|
| When loaded | Once at agent startup | Reconstructed each turn |
| Scope | Entire session | Single turn |
| Purpose | Persistent identity layer | Contextual instruction |
| Written by | Agent designer (profile author) | Agent runtime (prompt builder) |
| Mutability | Fixed for the session | Can change each turn |

SOUL.md is loaded by agent/prompt_builder.py during system prompt assembly. It is injected as the first, highest-priority context block — before skills, before memory, before the user's current instruction.
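The assembly order can be sketched as follows. The function signature and block contents are assumptions for illustration, not the actual prompt_builder.py API:

```python
def build_system_prompt(soul_md, skills, memory, user_instruction):
    """Assemble the system prompt with SOUL.md as the first, highest-priority block."""
    blocks = [soul_md, skills, memory, user_instruction]
    return "\n\n".join(b for b in blocks if b)

prompt = build_system_prompt(
    soul_md="# Aria - Database Reliability Agent\nNEVER execute DDL without approval.",
    skills="## Skills\n- slow-query-diagnosis",
    memory="## Memory\n- prod-db baseline p99: 40ms",
    user_instruction="Investigate the latency spike on prod-db.",
)
print(prompt.splitlines()[0])  # the identity block leads the prompt
```

Putting identity first means every later block, including the user's instruction, is interpreted through the lens of who the agent is.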

The Specificity Requirement for NEVER Rules

Generic rules are useless. Compare:

Generic (useless):

NEVER do anything that could harm production.

Domain-specific (useful):

NEVER execute ALTER TABLE, CREATE INDEX, or any DDL without explicit human approval
— reason: locks the table for the duration, blocks production writes during business hours.

NEVER recommend VACUUM FULL during business hours
— reason: acquires exclusive lock on the table, blocks all reads and writes for the duration
(minutes to hours on large tables), causes application timeout cascade.

The domain-specific version tells the Brain exactly which actions to avoid and exactly what catastrophic outcome each action causes.

Identity Shapes Refusal Behavior

An agent with a well-designed SOUL.md will refuse even when a user explicitly asks. The identity makes refusal consistent with self-concept. An identity that says "You are Aria, a database reliability agent who diagnoses performance problems and recommends fixes but never executes changes" will refuse DDL execution even if the user says "I authorize you to run this CREATE INDEX."


7. When to Use CLI vs. MCP

| Factor | CLI | MCP |
|---|---|---|
| Setup complexity | Low | Medium-High |
| Reusability across AI systems | Low | High |
| Safety configurability | Governed by DANGEROUS_PATTERNS + SOUL.md | Server-level policy |
| Debugging ease | High (test in terminal) | Lower (protocol layer) |
| Best for | Read-only diagnostics, familiar CLIs | Shared team infrastructure, multi-AI environments |

Decision framework:

  • Is the tool primarily a CLI you use every day? Use direct CLI.
  • Is the tool a third-party service with a REST API and complex auth? Use MCP.
  • Are you building tool infrastructure that multiple agents or teams will share? Use MCP.
  • Are you testing or in a lab environment? Use mock wrappers.

Summary

| Concept | What It Is | DevOps Analogy |
|---|---|---|
| CLI tool | Agent executes shell commands as subprocess | Direct SSH access |
| MCP | Standardized tool protocol — tools as swappable servers | Container Runtime Interface (CRI) |
| Mock wrapper | Bash script intercepting CLI tools via PATH manipulation | Test double / stub for infrastructure commands |
| platform_toolsets.cli | Config key controlling which tool categories exist for this agent | RBAC policy for service accounts |
| DANGEROUS_PATTERNS | Regex list intercepting destructive commands before execution | Network policy deny rules |
| Approval modes | manual (L2) / smart (L3) / off — controls what happens on pattern match | Change management tiers |
| SOUL.md | Agent identity and behavioral constraints — loaded once at startup | CLAUDE.md for the agent itself |
| Two-layer safety | SOUL.md (behavioral) + DANGEROUS_PATTERNS (mechanical) | Defense in depth |

Context engineering connection: Tool configuration IS context engineering — you are engineering the operational context (what the agent can and cannot do) before the agent ever runs a task.

Next: Reference — Tool Configuration and Safety Setup