Notifications

A failing job that nobody knows about is a worse problem than a failing job. Nomaflow's notification layer routes job events — failures, long-running detections, optional successes — to Slack, email or generic webhooks.

The setup is layered:

Transports (Slack workspace, SMTP server, webhook URLs) are configured once at framework level — they're a property of the install, not the job.
Job alerts blocks decide which events to emit and who to address.
Routing picks a transport for each (job tag, recipient) pair.

This page covers the wiring end-to-end.

Transports — framework-level setup

Open Settings → Notifications. The page lists every configured channel.

Each transport carries:

Field	Notes
URL / host	Slack webhook URL · SMTP host:port · generic webhook URL.
Credentials	🔒 encrypted at rest (Slack URL counts as a secret; SMTP password; webhook bearer / signing secret).
Default recipient	The channel / address / endpoint used when a job's alerts block doesn't specify recipients.
Test button	Sends a one-line "test from Nomaflow" message — confirms the wiring works before a real failure puts it to the test.

Slack

The default mapping: Slack receives a one-line message styled with the job's state colour (red for failure, yellow for long-run, green for success).

Setting	Required	Notes
Webhook URL	Yes	Get this from Slack admin → Apps → Incoming Webhooks. One webhook can fan out to multiple channels through Slack's own routing.
Default channel	No	Overrides the channel baked into the webhook. Use `#nomaflow-alerts` if you have one.
Username	No	The bot name. Defaults to "Nomaflow".
Icon emoji	No	Defaults to `:gear:`.
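
As a rough illustration, the settings above could map onto a Slack incoming-webhook payload like this. The function name `build_slack_payload`, the state names other than `FAILED`, and the state-to-colour table are assumptions made for the sketch, not Nomaflow's actual code. `channel`, `username` and `icon_emoji` are fields accepted by Slack's legacy incoming webhooks; newer app-based webhooks may ignore them.

```python
import json

# Assumed mapping from run state to Slack's named attachment colours.
# Only FAILED appears in this page; the other state names are hypothetical.
STATE_COLOURS = {"FAILED": "danger", "LONG_RUNNING": "warning", "SUCCESS": "good"}


def build_slack_payload(state, text, channel=None,
                        username="Nomaflow", icon_emoji=":gear:"):
    """Build the JSON body for one Slack incoming-webhook message."""
    payload = {
        "username": username,
        "icon_emoji": icon_emoji,
        "attachments": [
            {"color": STATE_COLOURS.get(state, "warning"), "text": text}
        ],
    }
    # Only send a channel when one is configured; otherwise the webhook's
    # own channel applies.
    if channel:
        payload["channel"] = channel
    return payload


payload = build_slack_payload(
    "FAILED", "reporting-nightly-sync run failed", channel="#nomaflow-alerts"
)
print(json.dumps(payload, indent=2))
```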

Email

Standard SMTP. The framework sends a small HTML message with a link back to the Run detail page.

Setting	Required	Notes
Host	Yes	SMTP server hostname.
Port	Yes	587 for STARTTLS, 465 for TLS.
Username / Password	If the server requires auth.	Password is 🔒 encrypted.
Default from	Yes	Address the mail comes from.
TLS mode	No	Defaults to STARTTLS. Pick TLS for legacy servers that need it.
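
For orientation, here is a minimal sketch of how a failure mail and the two TLS modes could be handled with Python's standard library. The helper names, the sender address and the host in the usage note are hypothetical; the subject format follows the one shown later on this page. It is not Nomaflow's implementation.

```python
import html
import smtplib
from email.message import EmailMessage


def build_failure_email(job_id, error, run_url, sender, recipient):
    """Build a small multipart (plain text + HTML) failure message."""
    msg = EmailMessage()
    msg["Subject"] = f"[Nomaflow] FAILED {job_id}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"{job_id} run failed: {error}\nView run: {run_url}")
    msg.add_alternative(
        f"<p>{html.escape(job_id)} run failed: {html.escape(error)}</p>"
        f'<p><a href="{html.escape(run_url)}">View run</a></p>',
        subtype="html",
    )
    return msg


def open_smtp(host, port, tls_mode="STARTTLS"):
    """Open a connection: implicit TLS (465) or STARTTLS upgrade (587)."""
    if tls_mode == "TLS":
        return smtplib.SMTP_SSL(host, port, timeout=10)
    conn = smtplib.SMTP(host, port, timeout=10)
    conn.starttls()
    return conn


msg = build_failure_email(
    "reporting-nightly-sync",
    "OperationalError: connection refused",
    "https://liberty.corp.local/nomaflow/runs/run_a8c4d/",
    sender="nomaflow@corp.local",        # hypothetical "Default from"
    recipient="data-team@corp.local",
)
print(msg["Subject"])
```

Sending would then look like `with open_smtp("smtp.corp.local", 587) as conn: conn.send_message(msg)`, with a `conn.login(...)` call first if the server requires auth.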

Generic webhook

For OpsGenie, PagerDuty, Mattermost, your own dispatcher — anything that accepts a JSON POST.

Setting	Required	Notes
URL	Yes	The endpoint.
Headers	No	Auth headers, content-type override. Header values can be 🔒 encrypted.
Body template	No	Override the default body shape. Variables: `${job_id}`, `${run_id}`, `${state}`, `${error}`, `${started_at}`.

The default body shape:

{
  "job_id": "reporting-nightly-sync",
  "run_id": "run_a8c4d",
  "state": "FAILED",
  "triggered_by": "cron",
  "started_at": "2026-05-26T02:00:00Z",
  "finished_at": "2026-05-26T02:14:22Z",
  "error": "OperationalError: …",
  "url": "https://liberty.corp.local/nomaflow/runs/run_a8c4d/"
}

The `url` field is the link operators click to reach the Run detail page directly.
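
The `${name}` placeholders in a body template happen to match the syntax of Python's `string.Template`, so a custom template can be tried out locally before saving it. The sketch below assumes plain textual substitution with JSON escaping of each value; how Nomaflow actually renders templates is not specified here, and the `message` and `dedup_key` fields are hypothetical receiver fields.

```python
import json
from string import Template


def render_body(template, event):
    """Fill ${placeholders} in a JSON body template with event values."""
    # Escape each value so it can sit safely inside a JSON string literal
    # (an error message may contain quotes or newlines).
    escaped = {key: json.dumps(str(value))[1:-1] for key, value in event.items()}
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(template).safe_substitute(escaped)


event = {
    "job_id": "reporting-nightly-sync",
    "run_id": "run_a8c4d",
    "state": "FAILED",
    "error": "OperationalError: connection refused",
    "started_at": "2026-05-26T02:00:00Z",
}

# Hypothetical template for a receiver that expects these two fields.
template = '{"message": "${job_id} ${state}: ${error}", "dedup_key": "${run_id}"}'

body = render_body(template, event)
print(json.loads(body)["message"])
```

Parsing the rendered body with `json.loads` is a cheap check that the template still produces valid JSON.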

Job alerts block

Inside the Job editor's Alerts section:

Field	Default	Notes
`on_failure`	`true` (when the block exists)	Emit on a `FAILED` run. The most common setting — leave it on.
`on_long_run_minutes`	none	Emit a warning if the run is still `RUNNING` after N minutes. The run keeps going — this is a heads-up, not an abort.
`on_success`	`false`	Emit on a successful run. Off by default; see the events table below.
`recipients`	`[]`	Channel-specific identifiers. Empty = use the transport's default recipient.

Recipients per transport

The `recipients` field is type-aware: one list can mix identifiers for several transports, and the format of each entry determines which transport handles it.

Transport	Recipient format	Example
Slack	`#channel` or `@user`	`["#data-oncall", "@alice"]`
Email	RFC 5322 address	`["data-team@corp.local"]`
Webhook	Endpoint id (registered at framework level)	`["opsgenie-data"]`

When a job is fired and an alert is due, the framework iterates over the recipient list. Each recipient matches one transport (by format / by registration); each matched transport gets the alert.
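
The matching described above can be pictured roughly as follows. This is a sketch under assumptions: the function names are invented, the set of registered webhook ids is hypothetical, and the real matcher may validate formats more strictly or handle unmatched entries differently.

```python
# Hypothetical endpoint ids registered at framework level.
REGISTERED_WEBHOOKS = {"opsgenie-data"}


def transport_for(recipient, webhook_ids=REGISTERED_WEBHOOKS):
    """Pick the transport for one recipient by format or registration."""
    if recipient.startswith(("#", "@")):
        return "slack"                      # "#channel" or "@user"
    if recipient in webhook_ids:
        return "webhook"                    # registered endpoint id
    local, sep, domain = recipient.partition("@")
    if sep and local and domain:
        return "email"                      # looks like local@domain
    return None                             # no transport matched


def group_by_transport(recipients):
    """Group a mixed recipient list by the transport that handles it."""
    grouped = {}
    for recipient in recipients:
        transport = transport_for(recipient)
        if transport is not None:           # unmatched entries are skipped here
            grouped.setdefault(transport, []).append(recipient)
    return grouped


print(group_by_transport(
    ["#data-oncall", "@alice", "data-team@corp.local", "opsgenie-data"]
))
# {'slack': ['#data-oncall', '@alice'], 'email': ['data-team@corp.local'], 'webhook': ['opsgenie-data']}
```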

When `recipients` is empty

The transport's default recipient is used (`#nomaflow-alerts` for Slack, the SMTP default-from for mail, the webhook's primary URL). This is the right default for most installs: the job sets policy (which events to emit) and the transport decides where the message goes.

Per-transport routing by tag

Some installs want different teams to receive different jobs. The framework's notification routing supports tag rules:

Job tag	Routes to	Configured in
`team-data`	Slack `#data-team`	Settings → Notifications → Routing rules.
`team-security`	Email `security@corp.local`	Same.
`team-platform`	PagerDuty webhook	Same.

A job tagged `team-data` and `etl` flows through both the `team-data` rule and any tag-less default. The rule engine de-duplicates so a single recipient doesn't get the same message twice.
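
To make the rule evaluation concrete, here is a small sketch of tag routing with de-duplication. The rule table mirrors the examples above; the `pagerduty` endpoint id, the tag-less default route and the function name are assumptions, and the real rule engine may de-duplicate on a different key.

```python
# Tag rules as (transport, recipient) pairs, mirroring the table above.
ROUTING_RULES = {
    "team-data": ("slack", "#data-team"),
    "team-security": ("email", "security@corp.local"),
    "team-platform": ("webhook", "pagerduty"),   # hypothetical endpoint id
}

# Assumed tag-less default that applies to every job.
DEFAULT_ROUTE = ("slack", "#nomaflow-alerts")


def resolve_routes(job_tags):
    """Return the de-duplicated routes for a job's tags, default included."""
    routes = [ROUTING_RULES[tag] for tag in job_tags if tag in ROUTING_RULES]
    routes.append(DEFAULT_ROUTE)
    # dict.fromkeys keeps first-seen order and drops repeated routes.
    return list(dict.fromkeys(routes))


print(resolve_routes(["team-data", "etl"]))
# [('slack', '#data-team'), ('slack', '#nomaflow-alerts')]
```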

What events emit what

Event	Triggered by	Default level
Run failed	`on_failure = true` (default).	High — pages people.
Run long-running	`on_long_run_minutes = N` set, run still in flight past N.	Medium — warning.
Run succeeded	Set `on_success = true` (off by default — most installs don't want it).	Low.
Job re-enabled / disabled	Operator toggled the catalogue card.	Low — informational, off by default.
Run cancelled by user	Operator clicked ✕ Cancel.	Low — visible in the run history; alert is opt-in.

`on_success` is intentionally off by default. A job that runs hourly successfully ten thousand times a year shouldn't generate ten thousand "OK!" messages. Turn it on for high-value jobs where success itself is news ("the monthly report was delivered").
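
The three run-level flags can be read as a small decision function. The sketch below is illustrative only: the `Alerts` class, the event names and the `SUCCESS` state name are assumptions (this page only names `FAILED` and `RUNNING`), and a real implementation would also need to make sure the long-run warning fires once per run rather than on every check.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Alerts:
    """A job's alerts block with the defaults described on this page."""
    on_failure: bool = True
    on_long_run_minutes: Optional[int] = None
    on_success: bool = False
    recipients: list = field(default_factory=list)


def due_event(alerts, state, started_at, now):
    """Return the event to emit for a run in this state, or None."""
    if state == "FAILED" and alerts.on_failure:
        return "run_failed"
    if state == "RUNNING" and alerts.on_long_run_minutes is not None:
        # Heads-up only: the run itself keeps going.
        if now - started_at > timedelta(minutes=alerts.on_long_run_minutes):
            return "run_long_running"
    if state == "SUCCESS" and alerts.on_success:
        return "run_succeeded"
    return None


started = datetime(2026, 5, 26, 2, 0, tzinfo=timezone.utc)
now = started + timedelta(minutes=45)
print(due_event(Alerts(on_long_run_minutes=30), "RUNNING", started, now))
# run_long_running
```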

Anatomy of a failure alert

When a run fails:

The runner writes the FAILED state to the run row.
The notifications layer reads the job's alerts block.
For each matched recipient, it builds a transport-specific message:

Transport	Message
Slack	One-line red message: `❌ reporting-nightly-sync run failed at step copy-orders · OperationalError: connection refused`, with a "View run" link.
Email	Subject: `[Nomaflow] FAILED reporting-nightly-sync`. Body: same one-liner plus the full traceback as a code block, plus a link.
Webhook	JSON POST as described above.

The HTTP / SMTP call is fire-and-forget. If the upstream is unreachable, the notification fails — but the run's failure is already recorded. Nomaflow doesn't retry notification delivery (a flaky notification path shouldn't fail a job).

The framework log records every notification attempt with its outcome — search there if a recipient reports "I didn't get the page".
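
The fire-and-forget behaviour amounts to one attempt, one log line, and no exception escaping to the runner. A minimal sketch, assuming a `send` callable that raises `ConnectionError` or `TimeoutError` when the upstream is unreachable and some other exception when the upstream rejects the message. The logger name and the mapping of exceptions to the `FAILED_TRANSPORT` and `FAILED_DELIVERY` outcomes are assumptions, not documented behaviour.

```python
import logging

log = logging.getLogger("nomaflow.notifications")  # hypothetical logger name


def dispatch(send, transport, recipient, message):
    """Attempt one delivery, log the outcome, and never raise or retry."""
    try:
        send(recipient, message)
    except (ConnectionError, TimeoutError) as exc:
        outcome = "FAILED_TRANSPORT"    # assumed: upstream unreachable
        log.warning("notification transport=%s recipient=%s outcome=%s error=%s",
                    transport, recipient, outcome, exc)
    except Exception as exc:            # assumed: upstream rejected the message
        outcome = "FAILED_DELIVERY"
        log.warning("notification transport=%s recipient=%s outcome=%s error=%s",
                    transport, recipient, outcome, exc)
    else:
        outcome = "SENT"
        log.info("notification transport=%s recipient=%s outcome=%s",
                 transport, recipient, outcome)
    return outcome


def unreachable(recipient, message):
    """Stand-in sender that simulates an unreachable upstream."""
    raise ConnectionError("connection refused")


print(dispatch(unreachable, "webhook", "opsgenie-data", "run failed"))
# FAILED_TRANSPORT
```

Because `dispatch` swallows the exception, the run's recorded `FAILED` state is unaffected by a broken notification path.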

Sending success alerts conditionally

A common pattern: alert on success only for jobs where success itself matters. Two ways to do it:

Pattern	How
Per-job flag.	Set `on_success = true` in the job's alerts block. Fires on every successful run.
In-step push.	Add an HTTP step at the end of the job that POSTs to the webhook. Fires only when the preceding steps succeed (because steps run in order). Gives you full control over the message body.

The second pattern is what the Scheduled DB sync recipe uses — the success notification is just another step.
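
If the final step is a custom Python step rather than an HTTP step, the push could look roughly like this. It is a sketch only: the function names, the payload fields beyond `job_id` and `run_id`, and the URL are hypothetical, and how a step receives its job and run identifiers depends on the step API, which is not shown here.

```python
import json
import urllib.request


def build_success_request(webhook_url, job_id, run_id, message):
    """Build the JSON POST for a success notification."""
    body = json.dumps(
        {"job_id": job_id, "run_id": run_id, "message": message}
    ).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def push_success(webhook_url, job_id, run_id, message):
    """Send the notification; call this as the last step of the job."""
    request = build_success_request(webhook_url, job_id, run_id, message)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Building the request does not touch the network, so it is safe to inspect.
request = build_success_request(
    "https://hooks.example.invalid/monthly-report",   # hypothetical URL
    "reporting-nightly-sync",
    "run_a8c4d",
    "the monthly report was delivered",
)
print(request.get_method(), request.full_url)
```

Since the step only runs when the preceding steps succeed, no extra success check is needed inside it.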

Quiet hours

Some installs don't want pages at 03:00 for low-priority jobs. Two approaches:

Pattern	Behaviour
Tag-based routing.	A `low-priority` tag routes to email (no page); a `high-priority` tag routes to PagerDuty (pages). The setting is per job.
Recipient-side rules.	The recipient channel (PagerDuty's own service policies, Slack's notification preferences) handles quiet hours. Nomaflow always sends; the receiver mutes.

The second approach scales better: Nomaflow keeps one notification policy (alert on every failure) and the receivers decide what to do with it. Adding a "quiet hours" mode to Nomaflow itself would multiply the moving parts.

Common pitfalls

Mistake	Symptom	Fix
Webhook URL stored in plain text in `jobs.toml`.	URL leaks into version control.	Always store at framework level (🔒 encrypted), reference by transport name.
Test button green, real alert never lands.	Network blocks production-time traffic that the test allowed.	Check firewall rules. The test exercises the same transport, but only at config-save time, so it does not prove the path is open when a job fails at runtime.
`on_success = true` on every job.	Channel is full of green ticks; failures get lost in the noise.	Turn off `on_success` except where it matters.
`on_long_run_minutes = 1` on an ETL that always takes 5.	Spurious warnings every night.	Tune to the job's normal runtime + headroom.
Multiple recipients on the same channel.	Same message delivered twice.	The de-dup engine should catch this; if not, narrow the recipients list.

Inspecting notification history

The framework log records every notification dispatch with its outcome (`SENT`, `FAILED_TRANSPORT`, `FAILED_DELIVERY`). Search the log for `notification` to find delivery issues.

For a long-term audit (six months back: who got paged for what?), some installs add a small notification audit table populated by a Python helper. The framework doesn't ship this by default — most teams find the framework log sufficient.
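
For teams that do want such a table, a helper along these lines is one possible shape. Everything here is an assumption for illustration: the table name, the columns, the use of SQLite and the function names. Nomaflow does not ship this.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema for a notification audit table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS notification_audit (
    id        INTEGER PRIMARY KEY,
    sent_at   TEXT NOT NULL,   -- ISO 8601, UTC
    job_id    TEXT NOT NULL,
    run_id    TEXT NOT NULL,
    transport TEXT NOT NULL,
    recipient TEXT NOT NULL,
    outcome   TEXT NOT NULL    -- SENT / FAILED_TRANSPORT / FAILED_DELIVERY
)
"""


def record(conn, job_id, run_id, transport, recipient, outcome, sent_at=None):
    """Append one notification attempt to the audit table."""
    sent_at = sent_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO notification_audit "
        "(sent_at, job_id, run_id, transport, recipient, outcome) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (sent_at, job_id, run_id, transport, recipient, outcome),
    )
    conn.commit()


def notified_since(conn, since_iso):
    """Answer 'who got notified for what' from a given ISO date onwards."""
    return conn.execute(
        "SELECT sent_at, job_id, recipient, outcome FROM notification_audit "
        "WHERE sent_at >= ? ORDER BY sent_at",
        (since_iso,),
    ).fetchall()


conn = sqlite3.connect(":memory:")   # use a file or your own database in practice
conn.execute(SCHEMA)
record(conn, "reporting-nightly-sync", "run_a8c4d", "slack", "#data-oncall",
       "SENT", sent_at="2026-05-26T02:14:23+00:00")
print(notified_since(conn, "2026-01-01"))
# [('2026-05-26T02:14:23+00:00', 'reporting-nightly-sync', '#data-oncall', 'SENT')]
```

Storing timestamps as ISO 8601 strings in UTC keeps the date filter a plain string comparison.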

What's next

Administration — restart behaviour and multi-replica notifications.
Recipe — Scheduled DB sync — uses the failure alert path end-to-end.
Custom Python steps — push notifications from within a step.
