Reference: Cron Config, Webhook Setup, and Slack Integration

Quick-reference for Module 11 — configuring automated triggers and interfaces in Hermes.


1. Hermes Cron Job Configuration

Cron jobs are defined in the agent's config.yaml under the schedules key.

Basic Cron Job

schedules:
  daily_db_health:
    schedule: "0 7 * * *"  # Daily at 07:00 UTC
    task: "Run daily DB health check: review slow queries from last 24 hours, identify top 5 by total_exec_time, flag any queries with execution time > 1000ms average"
    output:
      channel: slack
      target: "#platform-alerts"
      format: markdown
      include_confidence: true
    on_error:
      notify: "#platform-oncall"
      retry: false  # Don't retry on failure; wait for the next scheduled run

Cron Job with Conditional Alert

schedules:
  hourly_cost_check:
    schedule: "0 * * * *"  # Every hour
    task: "Check AWS cost-per-hour against yesterday's baseline. Alert only if current hour exceeds baseline by more than 20%."
    output:
      channel: slack
      target: "#finops-alerts"
      format: markdown
      only_if: "anomaly_detected"  # Only post if agent identifies an anomaly
    on_normal:
      log_only: true  # Log to file, don't post to Slack when normal
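
The 20% rule in the task text above reduces to a single relative comparison. The following is a minimal Python sketch of that arithmetic, assuming both costs are plain floats in the same currency. The function name `exceeds_baseline` and its parameters are hypothetical and not part of Hermes; in the lab the agent makes this judgment itself from the task description.

```python
def exceeds_baseline(current: float, baseline: float, threshold: float = 0.20) -> bool:
    """Return True when `current` exceeds `baseline` by more than `threshold`.

    `threshold` is a fraction, so 0.20 means "more than 20% above baseline".
    """
    if baseline <= 0:
        # No meaningful baseline to compare against. This sketch reports
        # "no anomaly" and leaves missing-data handling to the caller.
        return False
    return (current - baseline) / baseline > threshold
```

For example, a baseline of 10.00 per hour and a current hour of 12.50 is a 25% increase, so the check returns True; 11.00 is a 10% increase and returns False.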

Cron Job Parameters

| Parameter | Values | Description |
|---|---|---|
| schedule | Cron expression | Standard 5-field cron (minute hour day-of-month month weekday) |
| task | String | Task description sent to agent |
| output.channel | slack, email, log, webhook | Where output goes |
| output.target | Channel name, email address, URL | Destination within channel |
| output.format | markdown, json, plain | Output format |
| output.only_if | anomaly_detected, alert_triggered, always | Conditional posting |
| on_error.notify | Slack channel or email | Where to send error notification |
| on_error.retry | true/false | Whether to retry on failure |

Common Cron Expressions

| Schedule | Expression |
|---|---|
| Daily at 07:00 UTC | 0 7 * * * |
| Every hour | 0 * * * * |
| Every 30 minutes | */30 * * * * |
| Weekdays at 08:00 UTC | 0 8 * * 1-5 |
| Monday at 09:00 UTC | 0 9 * * 1 |
| First of month at midnight | 0 0 1 * * |
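
To check an expression before registering it, it can help to test it against a few timestamps. The sketch below is a simplified matcher for the five-field forms used in the table above (`*`, `*/n`, `a-b`, comma lists, and single numbers). It is an illustration only, not the scheduler Hermes uses: the names `expand_field` and `cron_matches` are hypothetical, weekday 7 as an alias for Sunday is not handled, and the day-of-month and weekday fields are combined with AND, whereas standard cron uses OR when both are restricted.

```python
from datetime import datetime

# (min, max) allowed values for the five fields, in order:
# minute, hour, day-of-month, month, weekday (0 = Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, lo: int, hi: int) -> set[int]:
    """Expand one cron field into the set of integer values it allows."""
    values: set[int] = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(base)
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expression: str, when: datetime) -> bool:
    """Return True if `when` (assumed to be in UTC) matches the expression."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {expression!r}")
    minute, hour, dom, month, dow = (
        expand_field(field, lo, hi)
        for field, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        when.minute in minute
        and when.hour in hour
        and when.day in dom
        and when.month in month
        and cron_weekday in dow
    )
```

With this sketch, `cron_matches("0 8 * * 1-5", datetime(2026, 4, 1, 8, 0))` is True because 1 April 2026 is a Wednesday, and the same expression is False for Saturday 4 April 2026.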

2. Webhook Subscription Configuration

Webhooks are configured in Hermes to listen for specific events from external systems.

Basic Webhook

webhooks:
  cloudwatch_alarm:
    path: "/webhooks/cloudwatch"
    method: POST
    validation:
      type: hmac_sha256
      secret_env_var: "CLOUDWATCH_WEBHOOK_SECRET"
    payload_mapping:
      alarm_name: "$.AlarmName"
      metric_name: "$.Trigger.MetricName"
      db_instance: "$.Trigger.Dimensions[?(@.name=='DBInstanceIdentifier')].value"
      state: "$.NewStateValue"
      timestamp: "$.StateChangeTime"
    task_template: |
      ALARM: {alarm_name} is in state {state} as of {timestamp}.
      Investigate {db_instance} for {metric_name} issues.
      Time window: last 30 minutes before {timestamp}.
    route_to_agent: "rds-health-agent"
    output:
      channel: slack
      target: "#db-alerts"
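
HMAC-SHA256 validation generally means recomputing a keyed hash over the raw request body with the shared secret and comparing it to the signature the sender supplied. The sketch below shows that general technique with the Python standard library. It assumes the signature arrives as a hex digest; the header name, encoding, and any prefix Hermes actually expects are not specified in this reference, and the function name `verify_hmac_sha256` is hypothetical.

```python
import hashlib
import hmac


def verify_hmac_sha256(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Return True if `received_signature` is the hex HMAC-SHA256 of `body`.

    `body` must be the raw request bytes, not re-serialized JSON, because any
    change in whitespace or key order changes the digest.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

In a handler, the secret would come from the environment variable named by `secret_env_var`, for example `os.environ["CLOUDWATCH_WEBHOOK_SECRET"].encode()`, and a request that fails the check should be rejected before any payload mapping runs.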

Payload Mapping JSONPath Reference

| JSONPath Expression | Purpose |
|---|---|
| $.AlarmName | CloudWatch alarm name |
| $.Trigger.MetricName | Metric that triggered the alarm |
| $.Trigger.Dimensions[0].value | First dimension value (e.g., instance ID) |
| $.NewStateValue | New alarm state (ALARM, OK, INSUFFICIENT_DATA) |
| $.StateChangeTime | ISO 8601 timestamp of state change |
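
To make the mapping step concrete, the sketch below resolves the kinds of paths shown above against a parsed payload and fills a task template with the results. It is a simplified illustration, not Hermes' implementation: it supports only `.key`, `[index]`, and `[?(@.field=='value')]` steps, the filter returns the first matching element rather than a list as full JSONPath would, and the names `extract` and `render_task` are hypothetical.

```python
import re

# One path step: .key, [index], or [?(@.field=='value')]
_STEP = re.compile(
    r"\.(?P<key>\w+)"
    r"|\[(?P<index>\d+)\]"
    r"|\[\?\(@\.(?P<field>\w+)=='(?P<value>[^']*)'\)\]"
)


def extract(payload, path: str):
    """Resolve a small JSONPath subset; return None if the path is absent."""
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path}")
    current = payload
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at offset {pos}: {path}")
        pos = match.end()
        try:
            if match.group("key") is not None:
                current = current[match.group("key")]
            elif match.group("index") is not None:
                current = current[int(match.group("index"))]
            else:
                # Filter step: take the first element whose field equals value.
                field, value = match.group("field"), match.group("value")
                current = next(i for i in current if i.get(field) == value)
        except (KeyError, IndexError, TypeError, AttributeError, StopIteration):
            return None
    return current


def render_task(payload: dict, mapping: dict[str, str], template: str) -> str:
    """Apply every mapping entry to the payload, then fill the template."""
    fields = {name: extract(payload, path) for name, path in mapping.items()}
    return template.format(**fields)
```

Given a payload whose `Trigger.Dimensions` list contains `{"name": "DBInstanceIdentifier", "value": "db-prod-01"}`, both the filter form and the `[0]` form resolve to `db-prod-01`. The filter form is the more robust choice because it does not depend on the order of the dimensions.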

PagerDuty Webhook

webhooks:
  pagerduty_incident:
    path: "/webhooks/pagerduty"
    method: POST
    validation:
      type: x_pagerduty_signature
      secret_env_var: "PAGERDUTY_WEBHOOK_SECRET"
    payload_mapping:
      incident_id: "$.messages[0].incident.id"
      title: "$.messages[0].incident.title"
      severity: "$.messages[0].incident.urgency"
      created_at: "$.messages[0].incident.created_at"
    task_template: |
      PagerDuty incident {incident_id}: {title}
      Severity: {severity}, Created: {created_at}
      Run cross-domain investigation across all infrastructure domains.
    route_to_agent: "incident-coordinator"
    output:
      channel: pagerduty
      target: "{incident_id}"  # Append findings to the PagerDuty incident

3. Slack Integration Overview

Slack integration in Hermes has two modes:

  1. Outbound: Agent posts findings to Slack channels (configured in cron and webhook output)
  2. Inbound (slash command): Humans invoke the agent via Slack slash command

Outbound Configuration

integrations:
  slack:
    workspace: "your-workspace"
    auth_env_var: "SLACK_BOT_TOKEN"
    default_channel: "#platform-agents"
    message_format: markdown
    include_timestamp: true
    include_agent_name: true

Inbound (Slash Command) — Overview

Slash command integration requires a Slack App with a slash command configured. In the lab this is a demo walkthrough rather than hands-on configuration, because setup requires workspace admin access. The steps are:

  1. Create a Slack App in your workspace (requires admin)
  2. Add slash command: /hermes → POST to https://your-hermes-host/slack/commands
  3. Configure Hermes with the Slack signing secret
  4. Users can then run: /hermes investigate db-prod-01 slow queries

In the lab: The facilitator demonstrates slash command usage on the training workspace. Participants observe the interaction pattern; actual slash command setup requires workspace admin access that most participants do not have in training environments.
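
Step 3 exists so the receiving endpoint can confirm that a slash command request really came from Slack. The sketch below follows Slack's documented v0 signing scheme: the HMAC-SHA256 of `v0:<timestamp>:<raw body>` keyed with the signing secret, compared against the `X-Slack-Signature` header, with the `X-Slack-Request-Timestamp` header checked for staleness. How Hermes performs this check internally is not covered in this reference; the function name `verify_slack_request` and the five-minute window are illustrative choices.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_slack_request(
    signing_secret: str,
    timestamp: str,
    body: bytes,
    signature: str,
    now: Optional[float] = None,
    max_age_seconds: int = 300,
) -> bool:
    """Validate a Slack request using the v0 signature scheme.

    `timestamp` is the X-Slack-Request-Timestamp header value, `signature` is
    the X-Slack-Signature header value, and `body` is the raw request body.
    """
    current_time = time.time() if now is None else now
    try:
        request_time = int(timestamp)
    except ValueError:
        return False
    # Reject stale requests to limit replay attacks.
    if abs(current_time - request_time) > max_age_seconds:
        return False
    base = b"v0:" + timestamp.encode() + b":" + body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The `now` parameter exists only so the staleness check can be exercised in tests without waiting; production callers would omit it.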


4. Output Routing Reference

Where agent output goes is as important as what the agent produces.

| Scenario | Output Routing |
|---|---|
| Scheduled health report | Slack channel (always post, even if no findings) |
| Alert-triggered diagnosis | Back to the alert ticket (PagerDuty comment, CloudWatch annotation) |
| On-call investigation | Direct Slack message to on-call user |
| Weekly trend summary | Email distribution list |
| Approval-required action | Slack with approval buttons (Module 13) |

Structured Output Format for Routing

Agents posting to external channels should use structured output that renders well in the target medium:

## DB Health: db-prod-01 — 2026-04-01 07:00 UTC

**Status:** ELEVATED (requires monitoring)
**Top Finding:** Slow query average exec time increased 40% vs. 7-day baseline

**Evidence:**
- Top query by exec time: `SELECT * FROM orders WHERE...` (avg 450ms, +180ms vs. baseline)
- Connection pool: 45/100 (45%, within normal range)
- CPU: 32% average (normal)

**Recommendation:** Review query plan for orders table query. Consider adding index on `created_at` column.

**Escalation:** None — monitor trend, report again tomorrow at 07:00 UTC.

*Hermes DB Health Agent | Skill: rds-health-v1.2 | 14:23 elapsed*

This format renders correctly in Slack markdown and provides all information needed to act without clicking through to additional context.


5. Trigger Decision Matrix

| Factor | Use Cron | Use Webhook | Use CLI |
|---|---|---|---|
| Human must trigger | No | No | Yes |
| Needs to respond to events | No | Yes | No |
| Scheduled at fixed intervals | Yes | No | No |
| Needs human context in task | No | No | Yes |
| Best for trending data | Yes | No | No |
| Best for incident response | No | Yes | Yes (fallback) |

6. Phase 8: Real Trigger Types — Comparison

Module 11's existing content (Sections 1-4 above) covers Hermes-native cron and webhook patterns with simulated payloads. Phase 8 adds four REAL trigger sources you wire to live infrastructure:

| Trigger Type | Source | When To Use | State Required | Governance Inheritance |
|---|---|---|---|---|
| Hermes cron (Module 11 Steps 2-4) | Internal scheduler | Most agent work: gateway-shared state, fast iteration, audit trail context | Gateway running | HERMES_LAB_GOVERNANCE from gateway env |
| Hermes webhook test (Module 11 Step 7) | hermes webhook test CLI | Lab/development: simulating events without external services | Gateway running | HERMES_LAB_GOVERNANCE from gateway env |
| AlertManager webhook (Phase 8 / TRIG-01) | Real Prometheus + AlertManager on KIND | Event-driven incident response: alerts arrive without polling | KIND cluster + helm release + PrometheusRule applied | HERMES_LAB_GOVERNANCE from gateway env (universal inheritance) |
| K8s CronJob (Phase 8 / TRIG-02) | Kubernetes native CronJob resource | GitOps schedule-in-git, stateless one-shot diagnostics, multi-tenant K8s | KIND cluster + Docker image + Secret | HERMES_LAB_GOVERNANCE set on container env spec |
| GitHub webhook (Phase 8 / TRIG-03) | Real GitHub webhook via smee.io public proxy | PR review automation, push-to-investigate workflows | GitHub repo + PAT + smee.io channel + smee-client running | HERMES_LAB_GOVERNANCE from gateway env |
| Telegram bot (Phase 8 / TRIG-04) | Real Telegram bot via @BotFather | Mobile-first chat ops, on-demand agent invocation from anywhere | Telegram account + bot token + user ID allowlist | HERMES_LAB_GOVERNANCE from gateway process env, per-process not per-message |

Decision tree

  • You want a scheduled health check → Hermes cron (default) OR K8s CronJob (if GitOps + K8s primitives matter more than gateway state)
  • You want event-driven incident response → AlertManager webhook (real metrics-based alerting)
  • You want code-review automation → GitHub webhook (--deliver github_comment posts back automatically)
  • You want on-demand chat ops → Telegram bot (or Slack as production reference — see Section 3)
  • You're prototyping → hermes webhook test with hand-crafted payloads (no external services)

Hermes cron vs K8s CronJob — the honest comparison

| Concern | Hermes cron | K8s CronJob |
|---|---|---|
| Gateway-shared state (skills, history, audit) | Yes (native) | No (stateless container) |
| GitOps schedule in git | No (CLI-managed) | Yes (YAML resource) |
| K8s-native observability (job_metrics, pod logs) | No (Hermes session logs only) | Yes (Prometheus + Loki/Vector) |
| Multi-tenant resource quotas | No (shared gateway) | Yes (namespace + ResourceQuota) |
| Iteration speed | Fast (tweak prompt, re-register) | Slow (rebuild image, kubectl apply) |
| Where the agent runs | Gateway process | K8s pod |
| Image size | n/a (uses host hermes install) | ~700-900MB minimal Dockerfile |

Real-world stance: most agent work uses Hermes cron because state matters. K8s CronJob shines for fire-and-forget diagnostic jobs deployed alongside other K8s primitives via the same GitOps pipeline.


7. Phase 8 New Environment Variables

Phase 8 adds four new environment variables to the lab export block. The full set as of Phase 8 is:

| Env Var | Values | Source | Used For |
|---|---|---|---|
| HERMES_LAB_MODE | mock or live | Phase 1 | Existing |
| HERMES_LAB_SCENARIO | clean, crashloop2, ... | Phase 1 + 6 | Existing |
| HERMES_LAB_GOVERNANCE | L1, L2, L3, or L4 | Phase 7 | Existing; inherited by triggered agents |
| HERMES_LAB_TRACK | track-a, track-b, or track-c | Phase 7 | Existing |
| MOCK_DATA_DIR | path | Phase 1 | Existing |
| PATH (additions) | infrastructure/wrappers:$PATH | Phase 1 | Existing |
| GITHUB_TOKEN | classic PAT with repo scope OR fine-grained PAT with "Pull requests: Read and Write" | Phase 8 (TRIG-03) | GitHub webhook + agent comment posting via gh pr comment |
| TELEGRAM_BOT_TOKEN | bot token from @BotFather | Phase 8 (TRIG-04) | Telegram bot connection; activates the Hermes Telegram adapter |
| TELEGRAM_ALLOWED_USERS | comma-separated Telegram user IDs (from @userinfobot) | Phase 8 (TRIG-04) | Restricts which Telegram users the bot will respond to |
| SMEE_URL | https://smee.io/<channel-id> | Phase 8 (TRIG-03) | Public webhook proxy URL for forwarding GitHub events to local gateway |
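
An allowlist like TELEGRAM_ALLOWED_USERS amounts to a set-membership check on the sender's numeric user ID. The sketch below shows one plausible way to parse the comma-separated value and apply it. How the Hermes Telegram adapter actually parses the variable is not documented here, so the function names `parse_allowed_users` and `is_allowed` are hypothetical, as is the choice to deny everyone when the list is empty.

```python
def parse_allowed_users(raw: str) -> set[int]:
    """Parse a comma-separated list of numeric Telegram user IDs.

    Blank entries (for example from a trailing comma) are ignored; a
    non-numeric entry raises ValueError so a typo fails loudly at startup.
    """
    return {int(part) for part in raw.split(",") if part.strip()}


def is_allowed(user_id: int, raw_allowlist: str) -> bool:
    """Return True only if `user_id` appears in the allowlist.

    An empty allowlist allows nobody in this sketch (deny by default).
    """
    return user_id in parse_allowed_users(raw_allowlist)
```

For example, with the value `"111111, 222222"`, user 111111 is allowed and user 333333 is not.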

How to acquire each

  • GITHUB_TOKEN: https://github.com/settings/tokens → Generate new token (classic) → check repo scope → copy ghp_... value. Time: ~3 min.
  • TELEGRAM_BOT_TOKEN: Open Telegram → search @BotFather → /newbot → choose name + username → copy token. Time: ~2 min.
  • TELEGRAM_ALLOWED_USERS: Open Telegram → search @userinfobot → /start → copy your numeric user ID. Time: ~30 sec.
  • SMEE_URL: Visit https://smee.io/ → click "Start a new channel" → copy URL. Time: ~30 sec.

Storage: put these in ~/.hermes/.env (gitignored) or export inline before each lab session. NEVER commit real tokens — Telegram and GitHub both have secret-scanning that will flag leaked credentials within minutes.
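
Before a lab session it can save time to confirm that all four Phase 8 variables are actually set. The sketch below parses simple `KEY=VALUE` lines, such as those in an env file, and reports which required names are missing or empty. It assumes a plain format with optional `export ` prefixes and optional surrounding quotes; it is not a full dotenv parser, and the function names `parse_env_text` and `missing_vars` are hypothetical.

```python
REQUIRED_PHASE8_VARS = (
    "GITHUB_TOKEN",
    "TELEGRAM_BOT_TOKEN",
    "TELEGRAM_ALLOWED_USERS",
    "SMEE_URL",
)


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop one layer of surrounding double or single quotes, if present.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values: dict[str, str], required=REQUIRED_PHASE8_VARS) -> list[str]:
    """Return the required names that are absent or set to an empty string."""
    return [name for name in required if not values.get(name)]
```

A typical check would read the file with `pathlib.Path.home().joinpath(".hermes", ".env").read_text()`, pass the result through `parse_env_text`, and print whatever `missing_vars` returns. The sketch only reports names, never values, so its output is safe to paste into a chat or ticket.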

Governance inheritance: all four trigger types read HERMES_LAB_GOVERNANCE from their execution environment. The K8s CronJob sets it on the container env spec; the AlertManager, GitHub, and Telegram triggers all inherit it from the gateway process env. This means a scheduled or triggered agent running unattended is governed by the same allowlist as an interactive one — there is no special "scheduled agent" governance bypass.