
Notifications

A failing job that nobody knows about is a worse problem than a failing job. Nomaflow's notification layer routes job events — failures, long-running detections, optional successes — to Slack, email or generic webhooks.

The setup is layered:

  1. Transports (Slack workspace, SMTP server, webhook URLs) are configured once at framework level — they're a property of the install, not the job.
  2. Job alerts blocks decide which events to emit and who to address.
  3. Routing picks a transport for each (job tag, recipient) pair.

This page covers the wiring end-to-end.


Transports — framework-level setup

Open Settings → Notifications. The page lists every configured channel.

Settings · Notifications
Outbound channels for job events.

SLACK
  Webhook URL (🔒 encrypted): https://hooks.slack.com/services/T0…/B0…/x…
  Default channel: #nomaflow-alerts
  ✓ Test ping

EMAIL
  SMTP host · port · 🔒 password: smtp.corp.local · 587 · …
  Default from: nomaflow@corp.local
  ✓ Test mail

WEBHOOK
  URL · headers · 🔒 secret: https://opsgenie.corp.local/webhook
  Auth header: Authorization: …
  ✓ Test POST

Each transport carries:

| Field | Notes |
|---|---|
| URL / host | Slack webhook URL · SMTP host:port · generic webhook URL. |
| Credentials | 🔒 encrypted at rest (Slack URL counts as a secret; SMTP password; webhook bearer / signing secret). |
| Default recipient | The channel / address / endpoint used when a job's alerts block doesn't specify recipients. |
| Test button | Sends a one-line "test from Nomaflow" message to confirm the wiring works before a real failure puts it to the test. |

Slack

The default mapping: Slack receives a one-line message styled with the job's state colour (red for failure, yellow for long-run, green for success).

| Setting | Required | Notes |
|---|---|---|
| Webhook URL | Yes | Get this from Slack admin → Apps → Incoming Webhooks. One webhook can fan out to multiple channels through Slack's own routing. |
| Default channel | No | Overrides the channel baked into the webhook. Use #nomaflow-alerts if you have one. |
| Username | No | The bot name. Defaults to "Nomaflow". |
| Icon emoji | No | Defaults to :gear:. |
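As an illustration of how these settings could map onto a Slack message, here is a minimal sketch. It assumes Slack's incoming-webhook JSON format with a coloured attachment; the function name, the hex colours and the LONG_RUNNING / SUCCESS state names are hypothetical, not taken from Nomaflow. Note that the channel, username and icon overrides are honoured by Slack's legacy incoming webhooks; app-based webhooks ignore them.

```python
# Hypothetical state-to-colour mapping; only FAILED appears in this page's examples.
STATE_COLOURS = {
    "FAILED": "#d32f2f",        # red
    "LONG_RUNNING": "#f9a825",  # yellow
    "SUCCESS": "#2e7d32",       # green
}


def build_slack_payload(state, text, channel=None,
                        username="Nomaflow", icon_emoji=":gear:"):
    """Build the JSON body for a Slack incoming webhook (illustrative only)."""
    payload = {
        "username": username,
        "icon_emoji": icon_emoji,
        # The attachment's colour bar carries the job state.
        "attachments": [{"color": STATE_COLOURS[state], "text": text}],
    }
    if channel:
        # Only set the override when a channel is given; otherwise the
        # channel baked into the webhook applies.
        payload["channel"] = channel
    return payload
```

The payload would then be serialised with json.dumps and POSTed to the webhook URL.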

Email

Standard SMTP. The framework sends a small HTML message with a link back to the Run detail page.

| Setting | Required | Notes |
|---|---|---|
| Host | Yes | SMTP server hostname. |
| Port | Yes | 587 for STARTTLS, 465 for TLS. |
| Username / Password | If the server requires auth. | Password is 🔒 encrypted. |
| Default from | Yes | Address the mail comes from. |
| TLS mode | Defaults to STARTTLS. | Pick TLS for legacy servers that need it. |
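The difference between the two TLS modes can be sketched with Python's standard smtplib. This is an assumption-laden illustration, not Nomaflow's mailer: the function names and the exact message wording are hypothetical, and only the subject format follows the failure example used on this page.

```python
import smtplib
from email.message import EmailMessage
from html import escape


def build_alert_mail(sender, recipient, job_id, state, run_url):
    """Build a small plain-text + HTML alert with a link to the run."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[Nomaflow] {state} {job_id}"
    msg.set_content(f"{job_id} run {state}. Details: {run_url}")
    msg.add_alternative(
        f'<p>{escape(job_id)} run {escape(state)}. '
        f'<a href="{escape(run_url)}">View run</a></p>',
        subtype="html",
    )
    return msg


def send_mail(msg, host, port, tls_mode="STARTTLS", username=None, password=None):
    """Send via STARTTLS (usually port 587) or implicit TLS (usually port 465)."""
    if tls_mode == "TLS":
        client = smtplib.SMTP_SSL(host, port, timeout=10)  # encrypted from the first byte
    else:
        client = smtplib.SMTP(host, port, timeout=10)      # plain connection, upgraded below
    with client:
        if tls_mode == "STARTTLS":
            client.starttls()
        if username:
            client.login(username, password)
        client.send_message(msg)
```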

Generic webhook

For OpsGenie, PagerDuty, Mattermost, your own dispatcher — anything that accepts a JSON POST.

| Setting | Required | Notes |
|---|---|---|
| URL | Yes | The endpoint. |
| Headers | No | Auth headers, content-type override. Header values can be 🔒 encrypted. |
| Body template | No | Override the default body shape. Variables: ${job_id}, ${run_id}, ${state}, ${error}, ${started_at}. |

The default body shape:

{
  "job_id": "reporting-nightly-sync",
  "run_id": "run_a8c4d",
  "state": "FAILED",
  "triggered_by": "cron",
  "started_at": "2026-05-26T02:00:00Z",
  "finished_at": "2026-05-26T02:14:22Z",
  "error": "OperationalError: …",
  "url": "https://liberty.corp.local/nomaflow/runs/run_a8c4d/"
}

The url is the link operators click to reach the Run detail page directly.
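To show how a custom body template with ${...} variables might be filled in, here is a small sketch. The placeholder syntax happens to match Python's string.Template, which is used here purely for illustration; how Nomaflow actually renders templates is not documented on this page, and render_body is a hypothetical helper.

```python
import json
from string import Template


def render_body(template: str, event: dict) -> str:
    """Substitute ${job_id}, ${state}, ... into a JSON body template."""
    # JSON-escape each value so quotes or newlines in the error text
    # cannot break the surrounding JSON string literal.
    escaped = {key: json.dumps(str(value))[1:-1] for key, value in event.items()}
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template).safe_substitute(escaped)


event = {
    "job_id": "reporting-nightly-sync",
    "run_id": "run_a8c4d",
    "state": "FAILED",
    "error": 'OperationalError: could not open "orders"',
    "started_at": "2026-05-26T02:00:00Z",
}
body = render_body('{"message": "${job_id} ${state}: ${error}"}', event)
```

Escaping before substitution matters most for ${error}, which can contain arbitrary text from a traceback.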


Job alerts block

Inside the Job editor's Alerts section:

| Field | Default | Notes |
|---|---|---|
| on_failure | true (when the block exists) | Emit on a FAILED run. The most common setting; leave it on. |
| on_long_run_minutes | none | Emit a warning if the run is still RUNNING after N minutes. The run keeps going: this is a heads-up, not an abort. |
| recipients | [] | Channel-specific identifiers. Empty = use the transport's default recipient. |
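The defaults in the table can be modelled as follows. This is a sketch under assumptions: the AlertsBlock class and helper names are hypothetical, on_success is included because it is described later on this page as off by default, and treating a missing block as "no alerts" is an inference from "true (when the block exists)".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class AlertsBlock:
    on_failure: bool = True                      # on as soon as the block exists
    on_long_run_minutes: Optional[int] = None    # no long-run warning by default
    on_success: bool = False                     # off by default
    recipients: List[str] = field(default_factory=list)  # empty = transport default


def parse_alerts(raw: Optional[dict]) -> Optional[AlertsBlock]:
    """Return None when the job has no alerts block (assumed: no alerts)."""
    if raw is None:
        return None
    return AlertsBlock(**raw)


def is_long_running(alerts: AlertsBlock, started_at: datetime, now: datetime) -> bool:
    """True when a still-running job has exceeded its warning threshold."""
    if alerts.on_long_run_minutes is None:
        return False
    return now - started_at > timedelta(minutes=alerts.on_long_run_minutes)
```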

Recipients per transport

The recipients field is type-aware: a single list can mix recipients for several transports, and each entry's format decides which transport handles it.

| Transport | Recipient format | Example |
|---|---|---|
| Slack | #channel or @user | ["#data-oncall", "@alice"] |
| Email | RFC 5322 address | ["data-team@corp.local"] |
| Webhook | Endpoint id (registered at framework level) | ["opsgenie-data"] |

When a job is fired and an alert is due, the framework iterates over the recipient list. Each recipient matches one transport (by format / by registration); each matched transport gets the alert.
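The matching described above could look roughly like this. It is a sketch only: classify_recipient and group_by_transport are hypothetical names, and the email pattern is a deliberate simplification rather than a full RFC 5322 parser.

```python
import re

# Simplified address check: something@domain.tld, no whitespace.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify_recipient(recipient, webhook_ids):
    """Return the transport name a recipient string maps to."""
    if recipient.startswith(("#", "@")):
        return "slack"          # #channel or @user
    if EMAIL_RE.match(recipient):
        return "email"
    if recipient in webhook_ids:
        return "webhook"        # endpoint id registered at framework level
    raise ValueError(f"recipient {recipient!r} matches no transport")


def group_by_transport(recipients, webhook_ids):
    """Group a mixed recipient list so each transport gets its own batch."""
    grouped = {}
    for recipient in recipients:
        grouped.setdefault(classify_recipient(recipient, webhook_ids), []).append(recipient)
    return grouped
```

Checking the Slack prefix first keeps "@alice" from ever being tested against the email pattern.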

When recipients is empty

The transport's default recipient is used (#nomaflow-alerts for Slack, the SMTP default-from for mail, the webhook's primary URL). This is the right default for most installs: the job's alerts block sets the policy (which events fire), and the transport decides where they land.

Per-transport routing by tag

Some installs want different teams to receive different jobs. The framework's notification routing supports tag rules:

| Job tag | Routes to | Configured in |
|---|---|---|
| team-data | Slack #data-team | Settings → Notifications → Routing rules. |
| team-security | Email security@corp.local | Same. |
| team-platform | PagerDuty webhook | Same. |

A job tagged team-data, etl flows through both the team-data rule and any tag-less default. The rule engine de-duplicates so a single recipient doesn't get the same message twice.
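A minimal sketch of tag routing with de-duplication, assuming rules are simple tag-to-route mappings and that the tag-less default always applies. The data layout, the "pagerduty" endpoint id and the resolve_routes name are hypothetical.

```python
# Hypothetical in-memory form of the routing rules table.
ROUTING_RULES = {
    "team-data": ("slack", "#data-team"),
    "team-security": ("email", "security@corp.local"),
    "team-platform": ("webhook", "pagerduty"),
}
DEFAULT_ROUTE = ("slack", "#nomaflow-alerts")


def resolve_routes(job_tags, rules=ROUTING_RULES, default=DEFAULT_ROUTE):
    """Collect every matching route plus the default, each at most once."""
    candidates = [rules[tag] for tag in job_tags if tag in rules]
    candidates.append(default)  # the tag-less default applies to every job
    seen = set()
    routes = []
    for route in candidates:
        if route not in seen:   # de-duplicate while keeping first-seen order
            seen.add(route)
            routes.append(route)
    return routes
```

For a job tagged team-data and etl, this yields the #data-team route followed by the default route; the etl tag matches no rule and contributes nothing.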


What events emit what

| Event | Triggered by | Default level |
|---|---|---|
| Run failed | on_failure = true (default). | High: pages people. |
| Run long-running | on_long_run_minutes = N set, run still in flight past N. | Medium: warning. |
| Run succeeded | Set on_success = true (off by default; most installs don't want it). | Low. |
| Job re-enabled / disabled | Operator toggled the catalogue card. | Low: informational, off by default. |
| Run cancelled by user | Operator clicked ✕ Cancel. | Low: visible in the run history; alert is opt-in. |

on_success is intentionally off by default. A job that runs hourly succeeds roughly 8,760 times a year (24 × 365) and shouldn't generate that many "OK!" messages. Turn it on for high-value jobs where success itself is news ("the monthly report was delivered").
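The event table can be read as a small decision function. The sketch below assumes the alerts block is available as a plain dict; the event identifiers are hypothetical, and the toggle and cancel events are left out because this page does not name their opt-in settings.

```python
# Hypothetical event identifiers mapped to the default levels from the table.
EVENT_LEVELS = {
    "run_failed": "high",
    "run_long_running": "medium",
    "run_succeeded": "low",
}


def should_emit(event, alerts):
    """Decide whether an event produces an alert for this job's alerts block."""
    if event == "run_failed":
        return alerts.get("on_failure", True)        # on unless explicitly disabled
    if event == "run_long_running":
        return alerts.get("on_long_run_minutes") is not None
    if event == "run_succeeded":
        return alerts.get("on_success", False)       # opt-in
    return False  # other events are opt-in; their settings are not covered here
```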


Anatomy of a failure alert

When a run fails:

  1. The runner writes the FAILED state to the run row.
  2. The notifications layer reads the job's alerts block.
  3. For each matched recipient, it builds a transport-specific message:

| Transport | Message |
|---|---|
| Slack | One-line red message: ❌ reporting-nightly-sync run failed at step copy-orders · OperationalError: connection refused with a "View run" link. |
| Email | Subject: [Nomaflow] FAILED reporting-nightly-sync. Body: same one-liner plus the full traceback as a code block, plus a link. |
| Webhook | JSON POST as described above. |

  4. The HTTP / SMTP call is fire-and-forget. If the upstream is unreachable, the notification fails, but the run's failure is already recorded. Nomaflow doesn't retry notification delivery (a flaky notification path shouldn't fail a job).

The framework log records every notification attempt with its outcome — search there if a recipient reports "I didn't get the page".
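The fire-and-forget delivery described above can be sketched as a wrapper that never raises and always logs an outcome. The logger name and the dispatch function are hypothetical, and the mapping of exceptions to outcome labels (network errors as FAILED_TRANSPORT, everything else as FAILED_DELIVERY) is an assumption, not documented behaviour.

```python
import logging
import urllib.error

log = logging.getLogger("nomaflow.notifications")  # hypothetical logger name


def dispatch(send, recipient):
    """Call send() once, swallow any error, and log the outcome."""
    try:
        send()
    except urllib.error.HTTPError as exc:
        # The endpoint was reached but answered with an error status.
        outcome, detail = "FAILED_DELIVERY", str(exc)
    except OSError as exc:
        # Connection refused, DNS failure, timeout, SMTP connection errors.
        outcome, detail = "FAILED_TRANSPORT", str(exc)
    except Exception as exc:  # anything else: bad payload, bad config
        outcome, detail = "FAILED_DELIVERY", str(exc)
    else:
        outcome, detail = "SENT", ""
    # One attempt, no retry: the run's state is already recorded elsewhere.
    log.info("notification recipient=%s outcome=%s %s", recipient, outcome, detail)
    return outcome
```

HTTPError is caught before OSError because it is a subclass of it; reversing the order would label every HTTP error as a transport failure.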


Sending success alerts conditionally

A common pattern: alert on success only for jobs where success itself matters. Two ways to do it:

| Pattern | How |
|---|---|
| Per-job flag. | Set on_success = true in the job's alerts block. Fires on every successful run. |
| In-step push. | Add an HTTP step at the end of the job that POSTs to the webhook. Fires only when the preceding steps succeed (because steps run in order). Gives you full control over the message body. |

The second pattern is what the Scheduled DB sync recipe uses — the success notification is just another step.
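What the final step's POST might contain is sketched below with Python's standard library. How an HTTP step is declared in a Nomaflow job is not shown on this page, so this only illustrates the request itself; build_success_post, the SUCCESS state value and the message field are hypothetical.

```python
import json
import urllib.request


def build_success_post(webhook_url, job_id, run_id, message):
    """Build a JSON POST announcing a successful run (illustrative body shape)."""
    body = json.dumps({
        "job_id": job_id,
        "run_id": run_id,
        "state": "SUCCESS",   # assumed state name; only FAILED appears on this page
        "message": message,
    }).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_success_post(
    "https://opsgenie.corp.local/webhook",
    "reporting-nightly-sync",
    "run_a8c4d",
    "the monthly report was delivered",
)
# Sending is a single call: urllib.request.urlopen(req, timeout=10)
```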


Quiet hours

Some installs don't want pages at 03:00 for low-priority jobs. Two approaches:

| Pattern | Behaviour |
|---|---|
| Tag-based routing. | A low-priority tag routes to email (no page); a high-priority tag routes to PagerDuty (pages). The setting is per job. |
| Recipient-side rules. | The recipient channel (PagerDuty's own service policies, Slack's notification preferences) handles quiet hours. Nomaflow always sends; the receiver mutes. |

The second approach scales better — Nomaflow has one notification policy (alert on every failure), the receivers decide what to do with it. Adding a "quiet hours" mode to Nomaflow itself would multiply the moving parts.


Common pitfalls

| Mistake | Symptom | Fix |
|---|---|---|
| Webhook URL stored in plain text in jobs.toml. | URL leaks into version control. | Always store at framework level (🔒 encrypted), reference by transport name. |
| Test button green, real alert never lands. | Network blocks production-time traffic that the test allowed. | Check firewall rules; the test uses the same transport, but it runs at config-save time rather than at run time. |
| on_success = true on every job. | Channel is full of green ticks; failures get lost in the noise. | Turn off on_success except where it matters. |
| on_long_run_minutes = 1 on an ETL that always takes 5. | Spurious warnings every night. | Tune to the job's normal runtime + headroom. |
| Multiple recipients on the same channel. | Same message delivered twice. | The de-dup engine should catch this; if not, narrow the recipients list. |

Inspecting notification history

The framework log records every notification dispatch with its outcome (SENT, FAILED_TRANSPORT, FAILED_DELIVERY). Search the log for notification to find delivery issues.

For a long-term audit (six months back: who got paged for what?), some installs add a small notification audit table populated by a Python helper. The framework doesn't ship this by default — most teams find the framework log sufficient.
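One possible shape for such a helper, sketched with SQLite from the standard library. Nothing here ships with the framework: the table name, columns and function names are all hypothetical, and a real install would likely point at its own database.

```python
import sqlite3
from datetime import datetime, timezone


def open_audit_db(path=":memory:"):
    """Open (and if needed create) the hypothetical audit table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS notification_audit (
               sent_at   TEXT NOT NULL,   -- ISO 8601 timestamp, UTC
               job_id    TEXT NOT NULL,
               run_id    TEXT NOT NULL,
               transport TEXT NOT NULL,
               recipient TEXT NOT NULL,
               outcome   TEXT NOT NULL    -- SENT / FAILED_TRANSPORT / FAILED_DELIVERY
           )"""
    )
    return conn


def record_notification(conn, job_id, run_id, transport, recipient, outcome, sent_at=None):
    """Append one dispatch attempt to the audit table."""
    sent_at = sent_at or datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO notification_audit VALUES (?, ?, ?, ?, ?, ?)",
            (sent_at, job_id, run_id, transport, recipient, outcome),
        )


def notified_since(conn, since_iso):
    """Answer 'who got notified for what' from a given timestamp onwards."""
    return conn.execute(
        "SELECT sent_at, job_id, transport, recipient, outcome "
        "FROM notification_audit WHERE sent_at >= ? ORDER BY sent_at",
        (since_iso,),
    ).fetchall()
```

Comparing timestamps as text works here only because every row uses the same ISO 8601 format and time zone.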


What's next