
Quiz: Fleet Orchestration

These questions test your understanding of fleet architecture, delegation patterns, and coordinator design.


Question 1: Fleet Pattern Selection

A team manages a large microservices architecture with 12 separate services. They want to deploy agents that can diagnose issues across all services. Each service has unique technology characteristics. Which fleet pattern is most appropriate?

A) Round-robin — distribute all diagnostic tasks equally across 12 identical agents
B) Skill-based routing — deploy specialist agents per technology domain, with a coordinator that routes to the right specialist based on the incident domain
C) Hierarchical delegation — add another tier of management between the coordinator and specialists
D) Single generalist agent — 12 specialists is too complex to coordinate


Correct answer: B) Skill-based routing — deploy specialist agents per technology domain, with a coordinator that routes to the right specialist based on the incident domain

With 12 services of varied technology, the key insight is that specialization matters. A Postgres specialist will diagnose database issues more reliably than a generalist. A Kubernetes specialist will diagnose pod issues more reliably than a generalist.

Skill-based routing captures this advantage: the coordinator understands which domain each service's issues fall into, routes to the appropriate specialist, and synthesizes the findings. You do not need 12 specialist agents — you likely need 3-5 domain specialists (database, kubernetes, networking, cost) that cover all 12 services.
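This routing logic reduces to a domain lookup. The sketch below is illustrative only — the service names, domain map, and specialist agent names are hypothetical, not from any particular fleet configuration:

```python
# Minimal sketch of skill-based routing: many services collapse onto a
# handful of domain specialists. All names below are hypothetical.
SERVICE_DOMAIN = {
    "orders-db": "database", "users-db": "database",
    "api-gateway": "networking", "ingress": "networking",
    "checkout": "kubernetes", "search": "kubernetes",
    "billing": "cost",
    # ...the remaining services map into the same few domains
}

SPECIALISTS = {
    "database": "db-health-agent",
    "kubernetes": "k8s-health-agent",
    "networking": "net-health-agent",
    "cost": "finops-agent",
}

def route(incident_service):
    """Coordinator picks the specialist for the incident's domain."""
    domain = SERVICE_DOMAIN[incident_service]
    return SPECIALISTS[domain]
```

The point of the sketch: the fan-out is per domain, not per service, which is why 3-5 specialists cover 12 services.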

Why not round-robin? Round-robin assumes homogeneous tasks and homogeneous agents. Twelve specialized services require specialized diagnosis.

Why not hierarchical? With 5 or fewer domain specialists, the overhead of a hierarchical tier is not justified. Hierarchical delegation makes sense at 10+ specialists.

Why not single generalist? A generalist agent handling 12 specialized services will produce lower-quality diagnoses than specialists. The problem is not coordination complexity — it is diagnostic quality.


Question 2: Coordinator Role

A coordinator agent is handling a cross-domain incident. The DB Health specialist responds with "I found the root cause: connection pool exhaustion." The K8s specialist responds with "All pods are healthy, no issues found." The FinOps specialist responds with "No cost anomalies in the last 24 hours." What should the coordinator do next?

A) Adopt the DB specialist's conclusion as the final answer and close the incident
B) Ask the DB specialist to provide remediation steps and execute them autonomously
C) Synthesize: confirm the root cause from DB evidence, note that K8s and FinOps found no related issues, generate cross-domain report with correlation analysis (did the connection pool issue occur in isolation, or is there upstream context?)
D) Escalate to all three on-call teams since each specialist responded with a different finding


Correct answer: C) Synthesize: confirm the root cause from DB evidence, note that K8s and FinOps found no related issues, generate cross-domain report with correlation analysis

The coordinator's job is synthesis, not just delegation. Even when one specialist finds the issue and the others find nothing, the "nothing found" responses are part of the synthesis:

  • "K8s shows healthy pods" means the connection pool issue is not caused by application scaling
  • "FinOps shows no anomalies" means this is not a cost-driven event

Together: "Root cause is connection pool exhaustion in isolation — not associated with pod scaling (K8s normal) or unusual cost patterns (FinOps normal). Likely cause: an application query-pattern change or a pool parameter misconfiguration."

This is more valuable than just reporting the DB finding. The coordinator adds value by confirming what did NOT happen — eliminating alternative explanations.

Why not Option A? Adopting only the DB finding loses the cross-domain context. The absence of issues in other domains is also evidence.

Why not Option B? The coordinator's SOUL.md says: "I never attempt domain-specific diagnosis or remediation myself." Remediation belongs to the specialist + human review chain.

Why not Option D? Finding the root cause in one domain and confirming it occurred in isolation is not a reason to escalate to all three on-call teams.


Question 3: Delegation Quality

A coordinator delegates this task to the K8s specialist: "Check the Kubernetes environment." The specialist returns a generic report covering all pods across all namespaces. What is wrong with the delegation?

A) The coordinator should not delegate to the K8s specialist at all — Kubernetes is a general skill all agents should have
B) The delegation task is too broad — it lacks the incident time window, specific namespace or service, and the specific question being answered. The specialist had no bounded scope and produced an unfocused report.
C) The coordinator should have used round-robin instead of skill-based routing
D) The K8s specialist's SOUL.md is misconfigured — it should reject broad tasks


Correct answer: B) The delegation task is too broad — it lacks the incident time window, specific namespace or service, and the specific question being answered

Good delegation from the coordinator specifies:

  1. Scope: Which namespace, service, or component to inspect
  2. Time window: When the incident started and how long to look back
  3. Context: What the coordinator already knows (from other specialists or the original incident report)
  4. Specific question: What the coordinator needs this specialist to answer

"Check the Kubernetes environment" is the equivalent of waking up your on-call engineer and saying "look at the cluster." They will look at everything, find things to mention, and return a generic report that doesn't answer the question.

A better delegation: "Check pod health in namespace=app for the period 02:00-06:00 UTC April 1. Incident context: API latency increased 300% starting 02:15. Specifically: did any pods in app namespace restart, crash, or become unhealthy around 02:15? Include exit codes for any crashed pods."

The specialist now knows exactly what question to answer and what time period to examine.
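The four elements of a bounded delegation can be captured in a simple structure. This is a sketch only — the field names and `is_bounded` check are illustrative, not the actual Hermes delegation schema:

```python
from dataclasses import dataclass

@dataclass
class Delegation:
    scope: str     # which namespace, service, or component to inspect
    window: str    # incident time window to examine
    context: str   # what the coordinator already knows
    question: str  # the specific question the specialist must answer

    def is_bounded(self):
        # "Check the Kubernetes environment" fails this check:
        # it carries no scope, window, context, or question.
        return all([self.scope, self.window, self.context, self.question])

# The good delegation from the text, expressed structurally:
task = Delegation(
    scope="pod health in namespace=app",
    window="02:00-06:00 UTC April 1",
    context="API latency increased 300% starting 02:15",
    question="Did any pods restart, crash, or become unhealthy around "
             "02:15? Include exit codes for any crashed pods.",
)
```

A coordinator could refuse to dispatch any delegation that fails `is_bounded()`, turning the quality rule into an enforceable gate.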


Question 4: Shared Context

In a fleet with three specialists running in parallel, the DB specialist discovers connection pool exhaustion at 02:15. The K8s specialist is simultaneously analyzing the same time period. Should the K8s specialist have the DB specialist's finding in its context?

A) Never — specialists should work independently to avoid anchoring bias
B) Always — all specialists should see all other specialists' findings in real-time
C) It depends — if the connection pool finding could change what the K8s specialist looks for (e.g., look for pod count increase that caused the connection pressure), include it. If it risks anchoring the K8s specialist away from independent findings, keep it separate.
D) The coordinator should run specialists sequentially, always passing prior findings to the next specialist


Correct answer: C) It depends — include the finding if it changes what the K8s specialist should look for; keep separate if it risks anchoring bias

This is the coordinator-mediated context pattern from the concepts reading. The coordinator decides what information crosses domain boundaries based on whether it is helpful or introduces bias.

Include the DB finding if:

  • "Connection pool exhaustion starting 02:15" — tells K8s specialist to look for pod count increases around 02:10-02:15 that might have created additional connections
  • The causal hypothesis is: pod scaling → more connections → exhaustion. K8s data can confirm or refute this.

Keep separate if:

  • The coordinator wants to know what the K8s specialist finds independently, without the DB finding biasing their analysis
  • The coordinator can compare independent findings afterward to assess whether they converge or diverge

In the Module 12 lab, the coordinator includes relevant cross-domain context in its delegation messages — the "Note:" section in the good delegation example. This is the most common pattern for 2-3 specialist fleets where the coordinator has enough context to know what's useful to share.

Option D (always sequential, always pass findings) loses the parallelism benefit — it takes 3x longer than parallel operation with coordinator-mediated context.
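The parallel-with-mediated-context pattern can be sketched as a fan-out where the coordinator decides per specialist what crosses the domain boundary. The specialist functions are hypothetical placeholders for real agent invocations:

```python
from concurrent.futures import ThreadPoolExecutor

def run_specialist(name, shared_context):
    # Placeholder for a real specialist run; a real agent would receive
    # shared_context in its delegation message (the "Note:" section).
    note = f" (note: {shared_context})" if shared_context else ""
    return f"{name} analyzed 02:00-06:00{note}"

def coordinate(db_finding, anchoring_risk):
    # Coordinator-mediated context: pass the DB finding to the K8s
    # specialist only when it changes what to look for AND the
    # coordinator is not deliberately collecting independent findings.
    k8s_context = None if anchoring_risk else db_finding
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(run_specialist, "k8s-specialist", k8s_context),
            pool.submit(run_specialist, "finops-specialist", None),
        ]
        return [f.result() for f in futures]
```

Both specialists still run in parallel either way — the coordinator's choice affects only what context each one carries, not the fleet's wall-clock time.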


Question 5: Fleet Value Proposition

A team says: "Instead of a three-specialist fleet, we'll use one general-purpose agent and give it all three skill files (DB health, K8s health, FinOps)." What is the primary limitation of this approach compared to the fleet architecture?

A) A single agent cannot use three skill files simultaneously — Hermes has a skill limit of one per session
B) Loading three full skill files into one agent's context window consumes significant context budget, leaves less space for operational data, and the agent lacks tool specialization (DB agent needs psql access; K8s agent needs kubectl; one agent with all three expands the blast radius of each)
C) A generalist agent is always less accurate than a specialist — this is a fundamental property of AI models
D) There is no limitation — the single-agent approach is preferable for small teams


Correct answer: B) Loading three full skill files into one agent's context window consumes significant context budget, leaves less space for operational data, and the agent lacks tool specialization

The single-agent approach has three real trade-offs:

Context budget: Three full SKILL.md files (each 1,200-2,000 tokens for medium complexity) use 3,600-6,000 tokens of context window before the agent has seen any operational data. The fleet architecture loads only the relevant skill into each specialist's context — the K8s specialist only carries the K8s skill, not all three.
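The context-budget arithmetic, as a quick sketch. The 1,600-token figure is an illustrative midpoint of the 1,200-2,000 range from the text, not a measurement:

```python
# Illustrative skill sizes: midpoint of the 1,200-2,000 token range.
SKILL_TOKENS = {"db-health": 1600, "k8s-health": 1600, "finops": 1600}

# Generalist loads all three skills before seeing any operational data.
generalist_overhead = sum(SKILL_TOKENS.values())

# Each fleet specialist carries only its own skill.
specialist_overhead = SKILL_TOKENS["k8s-health"]
```

At the midpoint, the generalist spends 4,800 tokens of fixed overhead — three times what any single specialist carries — which is exactly the 3,600-6,000 token range the text describes.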

Tool specialization: The DB health agent needs psql access; the K8s agent needs kubectl; the FinOps agent needs AWS Cost Explorer. Giving one agent all three tool sets expands the blast radius: an error in the generalist agent's configuration could affect all three infrastructure domains simultaneously. Specialist agents have narrow, domain-appropriate tool access.

Safety boundaries: Giving one agent read access to RDS + kubectl + Cost Explorer + potentially more creates a wider attack surface for prompt injection or misconfiguration.

When the single-agent approach is acceptable: For simple scenarios where the domains rarely overlap, or for agents handling sequential (not parallel) investigations, one agent with selective skill loading can work. But the fleet architecture scales better as complexity grows.

Option C is partially true (specialists often produce better output) but is not the primary architectural reason to use fleets.


Question 6: Dual Apply Path Tradeoff

Your Phase 9 FLEET-01 lab introduced two apply paths: Path A (direct kubectl apply at L4 governance) and Path B (GitOps PR-based apply via gh pr create + apply.sh sync). Your team is debating which path to use for their production incident response pipeline. Which statement best captures the tradeoff?

A) Path A is always faster, so it is always the right choice for production
B) Path B is always safer, so it is always the right choice for production
C) Path A is faster and simpler but has no diff review opportunity; Path B is auditable and review-gated but requires GitOps infrastructure (ArgoCD/Flux/CI) that not every team runs
D) Path A and Path B are functionally equivalent — pick whichever your team prefers


Correct answer: C) Path A is faster and simpler but has no diff review opportunity; Path B is auditable and review-gated but requires GitOps infrastructure that not every team runs

Phase 9's dual path design teaches that there is no universal "best" answer. Path A (direct apply) trades auditability for simplicity — one command, one approval, one apply, via Telegram. Path B (GitOps PR) trades speed for auditability — the human sees the diff in GitHub UI, the PR history is the audit record, and the same flow works across multiple operators without a Telegram bot.

The decision depends on your operational context:

  • Small team, mature Telegram bot, high-trust operators → Path A works fine
  • Larger team, compliance mandate, existing ArgoCD/Flux/CI → Path B is the natural fit
  • Regulated environment with change management requirements → Path B with formal PR review

Phase 9 ships BOTH paths because not every participant has ArgoCD installed — the helm/kubectl fallback script (apply.sh) is Sub-path B2, the only fully implementable Path B mechanism in v1.1. ArgoCD (Sub-path B1) is a v1.2 alternative for teams already running it.

Why not A or B alone? Neither path is universally best. Phase 9's design intentionally teaches the tradeoff by making participants walk both paths and compare them.

Why not D? They are NOT functionally equivalent. Path A commits the change to the cluster directly after Telegram approval. Path B commits to a git branch, requires human PR merge, then syncs to the cluster via apply.sh. The audit trail, review opportunity, and rollback characteristics differ meaningfully.


Question 7: Re-delegation with Governance Escalation

In the Phase 9 FLEET-01 chain, Morgan receives Telegram /approve incident-001 and then re-delegates to the SAME Track C specialist that diagnosed the issue. The re-delegation spawns a child agent with HERMES_LAB_GOVERNANCE=L4 + HERMES_LAB_TRACK=track-c. Why does Morgan re-delegate to the SAME specialist instead of picking a new one or running the apply herself?

A) Morgan does not technically have the terminal toolset in her config.yaml, so she cannot run kubectl
B) The diagnosing specialist has full context from the diagnostic run, so re-delegating to it produces better applied fixes than handing off to a fresh specialist
C) Morgan's SOUL.md NEVER rules prohibit her from calling terminal tools directly AND the diagnosing specialist has diagnostic context that makes it the correct applier
D) Hermes requires that the same agent that opened a delegation chain must close it


Correct answer: C) Morgan's SOUL.md NEVER rules prohibit her from calling terminal tools directly AND the diagnosing specialist has diagnostic context that makes it the correct applier

Two reasons combine here:

Reason 1 — Behavioral prohibition (SOUL.md NEVER rule): Morgan's Phase 9 SOUL.md includes a new rule: "NEVER call terminal tools directly — your role is delegation, not execution." Even though her config.yaml DOES include terminal in her platform_toolsets.cli (the Phase 9 toolset fix required for delegated specialists to inherit it), Morgan is behaviorally prohibited from using terminal herself. The terminal entry in her config is "mechanical capability so children can inherit it" — the SOUL.md NEVER rule is the behavioral prohibition against direct use. This is the belt + suspenders pattern: config (belt) + SOUL.md (suspenders).

Reason 2 — Context retention: The diagnosing specialist has the full diagnostic context (pod name, namespace, exact root cause finding, proposed fix command). Re-delegating to a fresh specialist would lose that context and require a redundant re-diagnosis. SOUL.md's re-delegation rule explicitly says "re-delegate to the SAME specialist that diagnosed the issue."
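The env-var side of the re-delegation can be sketched as plain environment propagation to a child process. This is purely illustrative — it shows only the `HERMES_LAB_GOVERNANCE`/`HERMES_LAB_TRACK` pattern named in the question, not the real Hermes delegation mechanism:

```python
import os
import subprocess
import sys

def spawn_child_specialist(track, governance, cmd):
    """Spawn a child process with the lab governance env vars set.

    Sketch only: a real Hermes delegation goes through delegate_tool.py
    and toolset intersection, not a bare subprocess call.
    """
    env = os.environ.copy()
    env["HERMES_LAB_GOVERNANCE"] = governance  # e.g. "L4" for an apply-capable run
    env["HERMES_LAB_TRACK"] = track            # e.g. "track-c"
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```

The child sees both variables and can gate its own behavior (diagnose-only vs. apply-capable) on the governance level it inherited.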

Why not A alone? Before Phase 9, Morgan did NOT have terminal in her config. The Phase 9 Plan 01 toolset fix ADDED it (otherwise Hermes delegate_tool.py toolset intersection would block children from inheriting terminal). So "Morgan lacks terminal" is no longer mechanically true — the prohibition is behavioral (SOUL.md NEVER rule), not mechanical (config.yaml).

Why not B alone? Context retention is real and valuable but insufficient as the sole reason. Without the NEVER rule, Morgan could theoretically run kubectl herself. The NEVER rule is what enforces the delegation-only pattern structurally.

Why not D? Hermes has no such constraint. Any agent can delegate to any compatible specialist. The "same specialist" pattern comes from Morgan's SOUL.md behavioral rule, not from Hermes delegation mechanics.


Question 8: Productionizing Decision

Your team has built a Phase 9 fleet coordinator and wants to deploy it in production. Your alert volume is ~15 alerts/minute at peak, all routed through a single Morgan webhook receiver. Your team runs three different services with different on-call rotations and wants to keep alert audit logs separate per service for SOC 2 compliance. Which deployment pattern from the Module 12 productionization reference (§7.4 Scaling) should you choose?

A) Single Morgan Deployment with replicas: 1, all services share one agent
B) Single Morgan Deployment with replicas: 2-10 behind an HPA, plus per-team profile variants (fleet-serviceA, fleet-serviceB, fleet-serviceC) sharing the same runtime
C) Three separate Morgan Deployments in three separate namespaces, one per service, with dedicated image builds and dedicated audit log aggregation per team
D) Queue-based pattern with SQS → worker pool pulling from the queue


Correct answer: C) Three separate Morgan Deployments in three separate namespaces, one per service, with dedicated image builds and dedicated audit log aggregation per team

The key constraint is "keep alert audit logs separate per service for SOC 2 compliance." That is a multi-tenant isolation requirement at the regulatory level, not a convenience preference.

The Module 12 reference §7.4 Scaling discusses two isolation models:

  1. Profile-per-team (shared runtime): Each team gets its own fleet profile name but shares a single Hermes runtime and audit log stream
  2. Deployment-per-team (stronger isolation): Each team gets its own Hermes deployment, namespace, and audit log aggregation

The reference explicitly states: "Small shops use (1). Regulated environments use (2)." SOC 2 compliance is a regulated environment with audit separation requirements. Each service gets its own Morgan Deployment in its own namespace, with its own audit log aggregation pipeline.

Why not A? replicas: 1 is a single point of failure with no horizontal scaling headroom for the 15 alerts/minute peak. A shared agent also fails the SOC 2 audit-separation requirement.

Why not B? Option B is the correct pattern for teams WITHOUT regulatory constraints (§7.4 Model 1). It shares a runtime and produces a shared audit log stream across services. That fails the SOC 2 requirement for per-service audit separation.

Why not D? Queue-based is recommended "if your alert volume exceeds ~10 alerts/second" per §7.4. At 15 alerts/minute (0.25/second), you are nowhere near the queue threshold. Trigger-based is the correct pattern for this volume. Adding queue infrastructure is unnecessary complexity here.
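The volume check behind the Option D rejection, sketched numerically (the ~10 alerts/second threshold is the figure the §7.4 reference gives):

```python
# §7.4 guidance: move to a queue-based pattern above ~10 alerts/second.
QUEUE_THRESHOLD_PER_SEC = 10.0

alerts_per_minute = 15
alerts_per_second = alerts_per_minute / 60  # convert peak rate to per-second

# 0.25/s is two orders of magnitude below the threshold, so the
# trigger-based pattern is the right fit; a queue adds no value here.
use_queue = alerts_per_second > QUEUE_THRESHOLD_PER_SEC
```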

The §7.5 Production decision table shows this scenario under "Regulated environment, audit mandate": Deployment-per-team with GitOps sync and SIEM integration for audit log aggregation.