Runbook Engine

Runbooks that show up
when it matters.

The best runbook is the one that appears automatically when an incident opens — not the one you have to search Confluence for at 3am. Strake connects your runbooks to your alerts.

Structured runbooks cut MTTR from 67 to 23 minutes. The difference is not better documentation — it is documentation that arrives automatically when the incident opens.

01
The Problem

Docs that collect dust.

// 01
Nobody opens Confluence at 3am

Your runbooks are written, formatted, and published. Nobody reads them during an incident. Engineers go to Slack, ask who knows, and wait. The docs exist — they just don't surface at the right moment.

// 02
Tribal knowledge is a single point of failure

When the engineer who 'just knows' is on vacation, a 20-minute incident becomes a 3-hour one. Every hour of downtime that traces back to missing context is the cost of runbooks that aren't connected to anything.

02
Runbook Structure

Connected to alerts.
Not buried in docs.

Each runbook links to an alert trigger. When the alert fires, the runbook surfaces automatically in the incident — with steps already scoped to the active conditions.

RB-14 postgres-high-connection-count
Triggered 4 min ago (LIVE)
Alert: postgres.conn > 90%
Service: postgres-primary
Connections: 100 / 100
On-call: @rnewton
1
Identify connection source

Run SELECT client_addr, count(*) AS connections FROM pg_stat_activity GROUP BY client_addr ORDER BY connections DESC; to see which host is saturating the pool.

2
Check for long-running queries

Run SELECT pid, now() - query_start as duration, query FROM pg_stat_activity WHERE state = 'active' ORDER BY duration DESC; — anything over 5 minutes should be investigated.

3
Terminate idle connections if needed

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND now() - state_change > interval '10 minutes';

4
Scale connection pool if issue persists

If steps 1-3 do not resolve saturation, increase PgBouncer max_client_conn and restart the pool. Alert the team before making config changes.
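The thresholds in steps 1 to 3 can be sketched as plain functions over rows shaped like pg_stat_activity output. This is a minimal illustration, not Strake's implementation: the Session class and the three function names are hypothetical, and the 5-minute and 10-minute limits are taken from the steps above.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Session:
    # Hypothetical in-memory mirror of a few pg_stat_activity columns.
    pid: int
    client_addr: Optional[str]
    state: str
    query_start: Optional[datetime]
    state_change: Optional[datetime]


def connections_by_host(sessions: List[Session]) -> List[Tuple[Optional[str], int]]:
    """Step 1: count sessions per client address, busiest host first."""
    return Counter(s.client_addr for s in sessions).most_common()


def long_running(sessions: List[Session], now: datetime,
                 limit: timedelta = timedelta(minutes=5)) -> List[Session]:
    """Step 2: active sessions whose query has run longer than `limit`."""
    hits = [
        s for s in sessions
        if s.state == "active"
        and s.query_start is not None
        and now - s.query_start > limit
    ]
    # Longest-running query first, matching ORDER BY duration DESC.
    return sorted(hits, key=lambda s: now - s.query_start, reverse=True)


def idle_candidates(sessions: List[Session], now: datetime,
                    limit: timedelta = timedelta(minutes=10)) -> List[Session]:
    """Step 3: idle sessions whose state has not changed for longer than `limit`."""
    return [
        s for s in sessions
        if s.state == "idle"
        and s.state_change is not None
        and now - s.state_change > limit
    ]
```

Keeping the thresholds as parameters makes it easy to tune them per service without touching the selection logic.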

03
How It Works

Write once.
Surface every time.

// 01
Author runbooks in Strake

Write runbooks with structured steps, command snippets, and decision trees. Link each runbook to an alert rule from PagerDuty or Datadog.

// 02
Alert fires, runbook surfaces

When an incident opens from a linked alert, the runbook surfaces automatically in the incident view — no searching, no Confluence, no asking Slack.
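One way to picture the link between an alert rule and a runbook is a simple lookup that runs when an incident opens. The sketch below is an assumption for illustration only: Runbook, RunbookIndex, and open_incident are hypothetical names, not part of Strake, PagerDuty, or Datadog.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Runbook:
    runbook_id: str
    title: str
    steps: List[str] = field(default_factory=list)


class RunbookIndex:
    """Hypothetical lookup from an alert rule ID to its linked runbook."""

    def __init__(self) -> None:
        self._by_rule: Dict[str, Runbook] = {}

    def link(self, alert_rule_id: str, runbook: Runbook) -> None:
        # Authoring time: attach a runbook to an alert rule.
        self._by_rule[alert_rule_id] = runbook

    def for_alert(self, alert_rule_id: str) -> Optional[Runbook]:
        # Incident time: returns None when no runbook is linked.
        return self._by_rule.get(alert_rule_id)


def open_incident(index: RunbookIndex, alert_rule_id: str, service: str) -> Dict[str, Any]:
    """Build an incident record with the linked runbook already attached."""
    return {
        "service": service,
        "alert_rule_id": alert_rule_id,
        "runbook": index.for_alert(alert_rule_id),
    }
```

An incident from an unlinked alert still opens; it simply carries no runbook, which is a useful signal that one needs to be written.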

// 03
Team works the steps

Engineers check off steps as they go. Progress is visible to the whole team. Action items are created directly from runbook steps when follow-up is needed.
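Step check-off and follow-up items can be modelled with very little state. The following is a rough sketch under stated assumptions: Step, RunbookRun, and their methods are hypothetical names chosen for illustration and do not describe how Strake stores progress.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Step:
    title: str
    done: bool = False


@dataclass
class RunbookRun:
    """Hypothetical record of one team working one runbook during an incident."""
    steps: List[Step]
    action_items: List[str] = field(default_factory=list)

    def _step(self, number: int) -> Step:
        # Step numbers are 1-based, as displayed in the runbook.
        if not 1 <= number <= len(self.steps):
            raise IndexError(f"no step {number}")
        return self.steps[number - 1]

    def check_off(self, number: int) -> None:
        """Mark a step as completed."""
        self._step(number).done = True

    def progress(self) -> Tuple[int, int]:
        """Return (completed steps, total steps) for display to the team."""
        return sum(s.done for s in self.steps), len(self.steps)

    def add_action_item(self, number: int, note: str) -> str:
        """Create a follow-up item tied to a specific step."""
        item = f"Step {number} ({self._step(number).title}): {note}"
        self.action_items.append(item)
        return item
```

Tying each action item to a step number keeps the follow-up traceable to the point in the incident where the gap was found.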

Stop losing 45 minutes
finding the runbook.

Free during private beta. No credit card required.