Exploratory: Agent Skills Stretch Projects
These are exploratory stretch projects — not required to complete Module 7. They are for participants who finish the main lab early or want to push the skill-writing discipline further.
Project 1: Cross-Domain Skill
Estimated time: 45 minutes • Extends: Module 7 lab (any track) • Prerequisites: Your track's SKILL.md completed and loading in Hermes
What You Will Build
Write a SKILL.md that crosses domain boundaries — for example, a "deployment health check" skill that covers both application metrics AND infrastructure state. Most real incidents involve multiple layers, but most skills are single-domain. A cross-domain skill must handle the coordination explicitly.
Challenge
The challenge is scope management. Cross-domain skills get long fast. You need to decide: which sub-domains are in scope, where to draw the boundary (and escalate to a specialist), and how to structure the decision tree so the agent knows when it has "enough" to diagnose versus when it needs to go deeper.
Steps
- Choose a cross-domain scenario relevant to your environment (examples: "service degradation — check both K8s pod health and RDS connection pool", "cost anomaly — check both EC2 utilization and data transfer charges")
- Map the domain boundary: define which tools belong to each domain and what information must cross the boundary (e.g., an EC2 instance ID that appears in both the K8s node pool and the CloudWatch metrics)
- Write the cross-domain SKILL.md, explicitly handling the boundary: what the agent concludes at the infrastructure layer, and what it passes to the application layer
- Identify the escalation condition: the point at which the agent decides it needs a human specialist rather than another pass between its own sub-domains
- Test with Hermes against simulated data from both domains
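The boundary-handling structure might be sketched like this. The section names, domains, and escalation rule below are illustrative assumptions, not a format prescribed by the lab:

```markdown
# Skill: Deployment Health Check (K8s + RDS)

## Scope
- In scope: pod health (kubectl), RDS connection pool (CloudWatch)
- Out of scope: network ACLs, IAM (escalate to a specialist)

## Domain 1: Infrastructure (K8s)
- Check pod status, restart counts, node pressure
- Conclusion to pass across the boundary: {node_id}, {pod_restart_count}

## Domain 2: Application (RDS)
- Check connection pool saturation against {pod_restart_count}
- If the pool is saturated AND pods are restarting: likely connection leak

## Escalation
- If neither domain explains the symptom after one pass through both: stop and page a human specialist
```

The key design choice is that each domain ends with an explicit "conclusion to pass", so the handoff is data, not a vague instruction to "consider the other layer".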
Expected Deliverable
A SKILL.md covering two domains with an explicit handoff section and clear escalation criteria. The skill should be under 2,000 tokens (context budget discipline).
Project 2: Skill Validation Harness
Estimated time: 30 minutes • Extends: Module 7 lab (any track) • Prerequisites: Your SKILL.md from the lab
What You Will Build
A test harness — a set of 5-8 simulated scenarios with expected agent outputs — that validates your skill against realistic edge cases before deploying to a real environment.
Challenge
Skills fail at edges. The happy path (normal input, expected output) rarely reveals problems. Edge cases reveal ambiguity: missing data, threshold boundary conditions, contradictory signals (high CPU but low network AND low disk — what's the cause?). A validation harness forces you to specify expected behavior before running the agent, which reveals ambiguity in the skill definition itself.
Steps
- Create a file `skill-test-scenarios.md` with 5-8 test cases in this format:
## Scenario 1: Normal Operation
**Inputs:** instance_id=i-0abc123, cpu_avg=45, cpu_peak=60, network_packets=normal, status_checks=ok
**Expected agent action:** Document as normal, no escalation
**Expected report sections:** State=running, CPU=normal, recommendation=monitor

## Scenario 2: CPU Critical with Disk I/O Spike
**Inputs:** instance_id=i-0def456, cpu_avg=92, cpu_peak=98, disk_read_ops=high, network=normal
**Expected agent action:** Identify I/O bound workload, recommend EBS optimization check, escalate P2
**Expected report sections:** Root cause = I/O bound, recommendation = IOPS-optimized volume
- Run each scenario through Hermes with your skill loaded. Compare actual agent output to expected output.
- For each deviation: determine whether the skill is ambiguous (fix the skill) or the expected output was wrong (update the scenario).
- Iterate until all scenarios match.
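The compare step above can be partly automated. This is a minimal sketch, not part of the lab's tooling: `parse_scenarios` and `compare` are invented helper names, and the containment-style comparison is an illustrative assumption. It parses the scenario format shown in step 1 and checks whether each expected phrase appears in the agent's actual output:

```python
import re

def parse_scenarios(text: str) -> list[dict]:
    """Parse '## Scenario N: title' blocks with '**Field:** value' lines."""
    scenarios = []
    for block in re.split(r"^## ", text, flags=re.M)[1:]:
        lines = block.strip().splitlines()
        scenario = {"title": lines[0].strip()}
        for line in lines[1:]:
            m = re.match(r"\*\*(.+?):\*\*\s*(.+)", line)
            if m:
                scenario[m.group(1).lower()] = m.group(2).strip()
        scenarios.append(scenario)
    return scenarios

def compare(expected: str, actual: str) -> bool:
    # Crude containment check: every comma-separated expected phrase must
    # appear somewhere in the agent's output. A real harness would want a
    # more forgiving, structured comparison.
    return all(tok.lower() in actual.lower() for tok in expected.split(", "))

sample = """## Scenario 1: Normal Operation
**Inputs:** instance_id=i-0abc123, cpu_avg=45
**Expected agent action:** Document as normal, no escalation
"""
parsed = parse_scenarios(sample)
print(parsed[0]["title"])
print(compare(parsed[0]["expected agent action"],
              "Agent chose to document as normal; no escalation needed."))
```

Even if you keep the comparison manual, writing the expected phrases precisely enough that a check like this could pass is what exposes ambiguity in the skill.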
Expected Deliverable
`skill-test-scenarios.md` with 5-8 scenarios, actual vs. expected comparison notes, and at least one skill update you made based on a discovered ambiguity.
Project 3: Memory-Augmented Skill
Estimated time: 45 minutes • Extends: Module 7 lab (any track) • Prerequisites: Your SKILL.md from the lab, Hermes memory tool enabled
What You Will Build
Extend your skill to use Hermes's long-term memory tool — storing key findings after each execution and retrieving relevant history at the start of future executions. This turns a stateless diagnostic skill into a skill that accumulates operational intelligence over time.
Challenge
Memory retrieval adds latency and context cost. The challenge is deciding what to store (valuable patterns, not noise) and what to retrieve (relevant history for this specific instance or domain, not all stored memory). A poorly designed memory-augmented skill stores everything and retrieves everything — this bloats the context and reduces response quality.
Steps
- Add two sections to your existing SKILL.md:
## Memory Retrieval (Start of Execution)
- Query long-term memory for: previous findings for {instance_id}
- If found: include prior incident summary in Step 1 context
- If not found: proceed without historical context

## Memory Storage (End of Execution)
- Store: {instance_id}, {timestamp}, {diagnosis_summary}, {action_taken}, {escalation_decision}
- Tag with: domain="ec2-health", instance_id={instance_id}
- Do NOT store: raw metric dumps, full command output (too verbose for recall value)
- Run the skill twice on the same simulated instance. Verify the second run includes the first run's context.
- Test the retrieval quality: does the agent's second-run diagnosis benefit from the first-run context, or is the retrieved memory adding noise?
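Hermes's memory tool API is not specified here, so the store/retrieve discipline from the sections above can be illustrated with an in-memory stand-in. The `MemoryStore` class and its method names are assumptions for the sketch, not the real tool:

```python
import time

class MemoryStore:
    """Illustrative stand-in for a long-term memory tool."""

    def __init__(self):
        self.records = []

    def store(self, instance_id, diagnosis_summary, action_taken,
              escalation_decision, domain="ec2-health"):
        # Store the distilled finding only, never raw metric dumps.
        self.records.append({
            "instance_id": instance_id,
            "timestamp": time.time(),
            "diagnosis_summary": diagnosis_summary,
            "action_taken": action_taken,
            "escalation_decision": escalation_decision,
            "domain": domain,
        })

    def retrieve(self, instance_id, domain="ec2-health", limit=3):
        # Retrieve history for THIS instance and domain only,
        # newest first, capped so retrieval cannot bloat the context.
        hits = [r for r in self.records
                if r["instance_id"] == instance_id and r["domain"] == domain]
        return sorted(hits, key=lambda r: r["timestamp"], reverse=True)[:limit]

mem = MemoryStore()
mem.store("i-0abc123", "CPU credit exhaustion on t3.medium",
          "recommended unlimited mode", "no escalation")
print(len(mem.retrieve("i-0abc123")))   # the second run sees one prior finding
print(len(mem.retrieve("i-0zzz999")))   # an unrelated instance sees none
```

The two design choices worth copying into the SKILL.md are the filter (instance and domain, not all memory) and the cap (`limit`), which together keep retrieval cost bounded.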
Expected Deliverable
An updated SKILL.md with memory sections, plus notes on what you stored vs. what you decided not to store and why.
Which Project Should You Do?
| Your Focus | Recommended Project |
|---|---|
| Breadth — multiple services in your domain | Project 1 (cross-domain) |
| Quality and reliability | Project 2 (validation harness) |
| Stateful agents | Project 3 (memory augmentation) |
| Under 30 minutes available | Project 2 — most focused and highest skill-quality impact |
All three projects extend the skill-writing discipline from the lab. The goal is to experience the edges: where does your SKILL.md fail, and how do you fix it?
Project 4: Compare with kube-troublesim (advanced)
Difficulty: Advanced • Time: 60-90 min • Prerequisites: Completed the main Module 7 lab; KIND cluster running
The kube-troublesim repository (kubeagentix organization) is a nascent collection of Kubernetes failure-mode YAML manifests — similar in spirit to the six baked scenarios in `infrastructure/scenarios/k8s/` that this course ships. As of April 2026, the repo has one commit, no releases, and no README — treat it as a "watch this space" tool, not a stable lab dependency.
What to try
- Clone the kube-troublesim repo and inspect the `set01/` directory
- Apply one of its scenarios to your KIND cluster:

      git clone https://github.com/kubeagentix/kube-troublesim.git
      kubectl apply -f kube-troublesim/set01/01-imagepull-error.yaml

- Compare the failure mode it produces against the equivalent course scenario at `infrastructure/scenarios/k8s/01-image-pull-backoff.yaml`
- Note which manifests overlap with the course's six K8S-02 failure modes and which are different
- Run `sre-k8s-pod-health` against a kube-troublesim scenario — does it diagnose correctly?
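For the manifest comparison step, a plain unified diff is usually enough to spot where the two scenarios diverge. The manifest contents below are invented stand-ins, since neither repo's actual files are reproduced here:

```python
import difflib

# Two minimal pod manifests that differ only in the broken image reference
# (stand-in content, not the real files from either repository).
course = """\
apiVersion: v1
kind: Pod
metadata:
  name: imagepull-demo
spec:
  containers:
  - name: app
    image: nginx:1.25-no-such-tag
"""
troublesim = course.replace("nginx:1.25-no-such-tag", "busybox:bad-tag")

# Keep only the changed lines, dropping the +++/--- file headers.
delta = [line for line in difflib.unified_diff(
             course.splitlines(), troublesim.splitlines(),
             fromfile="course.yaml", tofile="troublesim.yaml", lineterm="")
         if line.startswith(("+", "-"))
         and not line.startswith(("+++", "---"))]
print(delta)  # one removed and one added image line
```

Both manifests reach ImagePullBackOff, so a difference like this one predicts identical pod status but different `kubectl describe` events, which is exactly the kind of observation worth documenting.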
What to document
- Which kube-troublesim scenarios mapped 1:1 to course scenarios
- Which produced different kubectl outputs (and why — e.g., different image, different probe config)
- Whether the skill's six decision branches were sufficient or if any failure mode escaped them
Why this is exploratory and not required
kube-troublesim is at an early stage (1 commit, no releases, no README, no documented KIND compatibility). The course's baked scenarios are version-controlled, reproducible, and require zero external dependencies — they are the reliable lab path. This stretch project is for participants who want to explore alternative chaos-engineering tooling and contribute observations back to the course.