SRE Intelligence Platform

Know Before You Deploy.
Recover Before It Hurts.

Strake tells you whether it's safe to push right now, based on system health, error budget burn, open incidents, deploy velocity, and dependency changes. When things break anyway, your runbooks are connected to alerts, your team knows what to do, and the knowledge doesn't live only in whoever has been here longest.

Teams are running more infrastructure with fewer SREs than at any point in the last decade. The tools haven't kept up.

strake / deploy-gate
Current conditions:
feat/checkout-v2 → main
payment-service · 47 files changed · prod-us-east-1
Strake Recommendation
Hold Deploy

Error budget at 18% remaining with SLO window closing in 72 hours. 2 active incidents on service dependencies. 3 deploys in the last 2 hours. Risk of cascading failure is elevated.

Error Budget: 18%
System Health: Degraded
Open Incidents: 2 active
Deploy Velocity: 3 / 2hr
Active Runbooks: 2 triggered
RB-14 · postgres-high-connection-count · Live
RB-07 · payment-service-5xx-spike · Monitoring
Updated just now
01
The Problem

Your team shouldn't need a dedicated SRE
to run production reliably.

But without the right infrastructure, every incident becomes a fire drill.

// 01
Only one person knows what to do at 3am

When the engineer who "just knows" is on vacation, incidents that should take 20 minutes take 3 hours. Knowledge concentrated in one person isn't expertise — it's a single point of failure.

// 02
Runbooks that nobody uses

Your runbooks are in Confluence. Your engineers are in the terminal. Nobody opens Confluence at 3am. The best runbook is the one that shows up automatically when an incident opens — not the one you have to search for.

// 03
43% of incidents start with a deploy

Your team spends the first 30 minutes of every incident figuring out if a recent deploy caused it. That context should be automatic. It isn't — unless you build the connection between your deploy history and your incident workflow.

03
Deploy Gate

What Strake reads
before you push.

Every deploy decision is based on five live signals pulled from your existing stack. No new agents. No new dashboards. Strake reads what's already there and tells you what it means right now.

Signal 01 — SLO Budget

How much error budget is left in the current window, and how fast it's burning. Strake flags when a deploy risks exhausting the rest of it before the window closes.
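
As a rough illustration of the check described above, the sketch below projects how long the remaining budget lasts at a constant burn rate and compares that with the time left in the SLO window. The class, the function names, and the burn rate of 0.3% of budget per hour are assumptions made for this example; the 18% remaining and 72-hour figures are the ones shown in the deploy-gate panel on this page.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    remaining_fraction: float    # share of the error budget still unspent, 0.0 to 1.0
    burn_per_hour: float         # budget fraction consumed per hour (recent average)
    hours_left_in_window: float  # time until the SLO window closes


def hours_until_exhausted(status: BudgetStatus) -> float:
    """Project how long the remaining budget lasts at the current burn rate."""
    if status.burn_per_hour <= 0:
        return float("inf")
    return status.remaining_fraction / status.burn_per_hour


def budget_at_risk(status: BudgetStatus) -> bool:
    """True when the budget would run out before the SLO window closes."""
    return hours_until_exhausted(status) < status.hours_left_in_window


# 18% remaining and 72 hours left come from the page; the burn rate is assumed.
status = BudgetStatus(remaining_fraction=0.18, burn_per_hour=0.003, hours_left_in_window=72)
at_risk = budget_at_risk(status)
```

A linear projection is the simplest possible model. A production check would likely look at burn over several time windows rather than a single average.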

Signal 02 — Active Incidents

Open incidents across the service and its direct dependencies. Deploying into an active incident almost always makes things worse and makes the root cause harder to find.

Signal 03 — Change Velocity

How many deploys have gone out in the last few hours. High velocity makes root cause isolation nearly impossible when something breaks.

Signal 04 — System Health

Current health of the target service and its dependency graph — latency, error rates, resource saturation. The baseline you're deploying into matters.

Signal 05 — Dependency Changes

Diffs lockfiles between builds. Flags new packages, major version bumps, and suspicious publish timing.
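
One way to picture how the five signals above could combine into a single verdict is the sketch below. It is not Strake's actual policy: the `Signals` class, the `gate` function, and both thresholds are assumptions chosen for illustration. The sample values are the payment-service and api-gateway figures shown on this page.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Signals:
    budget_remaining_pct: float  # Signal 01: error budget left in the window
    open_incidents: int          # Signal 02: service plus direct dependencies
    deploys_last_2h: int         # Signal 03: recent change velocity
    health: str                  # Signal 04: "Nominal", "Elevated", "Degraded", "Critical"
    dependency_flags: int        # Signal 05: flagged lockfile changes


# Illustrative thresholds only; both numbers are assumptions.
MIN_BUDGET_PCT = 20.0
MAX_DEPLOYS_2H = 2


def gate(s: Signals) -> tuple[str, list[str]]:
    """Return ("GO" or "HOLD", reasons). Any single tripped signal holds the deploy."""
    reasons: list[str] = []
    if s.budget_remaining_pct < MIN_BUDGET_PCT:
        reasons.append(f"error budget at {s.budget_remaining_pct:g}%")
    if s.open_incidents > 0:
        reasons.append(f"{s.open_incidents} open incident(s)")
    if s.deploys_last_2h > MAX_DEPLOYS_2H:
        reasons.append(f"{s.deploys_last_2h} deploys in the last 2 hours")
    if s.health != "Nominal":
        reasons.append(f"system health is {s.health}")
    if s.dependency_flags > 0:
        reasons.append(f"{s.dependency_flags} flagged dependency change(s)")
    return ("HOLD" if reasons else "GO", reasons)


payment_service = Signals(18, 2, 3, "Degraded", 1)
api_gateway = Signals(84, 0, 1, "Nominal", 0)
payment_verdict, payment_reasons = gate(payment_service)
gateway_verdict, gateway_reasons = gate(api_gateway)
```

Returning the reasons alongside the verdict keeps the decision explainable, which matters when an engineer wants to know why a deploy was held.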

strake / deploy-gate · all services · updated 12s ago

Service           Budget  Incidents  Velocity  Health    Deps  Gate
payment-service   18%     2          3/2hr     Degraded  1 ⚠   HOLD
api-gateway       84%     0          1/2hr     Nominal   -     GO
checkout-svc      41%     1          2/2hr     Elevated  2 ⚠   HOLD
user-service      92%     0          0/2hr     Nominal   -     GO
notification-svc  6%      1          5/2hr     Critical  3 ⚠   HOLD
search-indexer    77%     0          1/2hr     Nominal   -     GO

Services monitored: 6 · Currently blocked: 3 · Clear to deploy: 3
05
Supply Chain Defense

Your CI pulls
what it's told.
Strake asks
what changed.

Supply chain attacks exploit the gap between a malicious publish and your next CI build. That window is measured in minutes — not days. CVE databases lag behind by design: a vulnerability has to be discovered, reported, catalogued, and published before your scanner knows it exists.

Strake diffs your lockfiles between the last known-good deploy and the current one. Every new package, every version bump, every maintainer change — flagged before it reaches production. No CVE required.

// The attack window scanners miss

The axios npm compromise was live for 3 hours before it was pulled. The LiteLLM attack affected 500,000 machines in 40 minutes. Traditional scanners check CVE databases — Strake checks what actually changed.

DEPENDENCY ANALYSIS · live
package-lock.json · 3a7f2c18b4e9d2 · 3 changes detected

CRITICAL · axios 2.1.0 → 2.1.1
Published 47 minutes ago
Maintainer changed since last release

WARNING · lodash 4.17.21 → 5.0.0
Major version bump
No code changes in this PR reference this package

OK · react-query 5.62.0 → 5.62.1
Patch release · well-established package

Gate Verdict: HOLD · 1 critical · 1 warning · 1 clean
Resolve critical before deploy
// 5.1
Lockfile Drift

Detects changes in package-lock.json, yarn.lock, go.sum, and requirements.txt between your last known-good deploy and this one. Every change is accounted for.
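
To make the idea of lockfile drift concrete, here is a minimal sketch that compares two parsed `package-lock.json` files (lockfile version 2 or 3, which list installed packages under a top-level `packages` map). The function names and the change labels are assumptions for this example, not Strake's implementation. The axios and lodash versions mirror the illustrative panel on this page, and `example-new-dep` is a made-up package name.

```python
from __future__ import annotations


def locked_versions(lock: dict) -> dict[str, str]:
    """Map each install path in a parsed package-lock.json (v2/v3) to its version."""
    versions = {}
    for path, meta in lock.get("packages", {}).items():
        if path and "version" in meta:  # the "" key is the root project itself
            versions[path] = meta["version"]
    return versions


def major(version: str) -> int | None:
    """Leading numeric component of a version string, or None if it is not numeric."""
    head = version.split(".", 1)[0]
    return int(head) if head.isdigit() else None


def diff_locks(old: dict, new: dict) -> list[tuple[str, str, str | None, str | None]]:
    """Return (kind, path, old_version, new_version) for every entry that differs."""
    before, after = locked_versions(old), locked_versions(new)
    changes = []
    for path in sorted(before.keys() | after.keys()):
        was, now = before.get(path), after.get(path)
        if was == now:
            continue
        if was is None:
            kind = "added"
        elif now is None:
            kind = "removed"
        elif major(was) != major(now):
            kind = "major"
        else:
            kind = "changed"
        changes.append((kind, path, was, now))
    return changes


known_good = {"packages": {
    "": {"name": "app"},
    "node_modules/axios": {"version": "2.1.0"},
    "node_modules/lodash": {"version": "4.17.21"},
}}
current = {"packages": {
    "": {"name": "app"},
    "node_modules/axios": {"version": "2.1.1"},
    "node_modules/lodash": {"version": "5.0.0"},
    "node_modules/example-new-dep": {"version": "1.0.0"},  # hypothetical package
}}
drift = diff_locks(known_good, current)
```

In practice the two dictionaries would come from `json.load` on the lockfile at the last known-good deploy and at the current commit. Entries are keyed by install path so that nested copies of the same package at different versions are compared separately.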

// 5.2
Suspicious Timing (Phase 2)

Flags dependencies published within hours of your build. The attack window most scanners miss entirely — before a CVE exists, before anyone has noticed.
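
The timing check can be sketched as a simple comparison between a release's publish time and the build time. Everything named here is an assumption for illustration: the 24-hour window (the page only says "within hours"), the function name, and the sample dates. How the publish timestamp is obtained from a package registry is left out.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold; the page only says "within hours of your build".
SUSPICIOUS_WINDOW = timedelta(hours=24)


def published_recently(published_at: datetime, build_time: datetime,
                       window: timedelta = SUSPICIOUS_WINDOW) -> bool:
    """True when the release was published less than `window` before the build.

    Both datetimes must be timezone-aware so the subtraction is well defined.
    """
    return build_time - published_at < window


# Arbitrary sample build time; the 47-minute gap mirrors the panel on this page.
build = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
fresh_release = build - timedelta(minutes=47)
old_release = build - timedelta(days=90)

fresh_flagged = published_recently(fresh_release, build)
old_flagged = published_recently(old_release, build)
```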

// 5.3
Zero CVE Required

Doesn't wait for a vulnerability to be reported. Catches supply chain compromises at deploy time — not days later when the CVE database catches up.

RB-14 · postgres-high-connection-count · LIVE
Alert: postgres.conn > 90%
Service: postgres-primary
Connections: 100 / 100
On-call: @rnewton

Steps: 4 / 6 complete
01 · Verify alert is not spurious · 02:14:18
✓ Confirmed: connections at 100/100, p99 latency 1840ms
02 · Check for long-running queries · 02:14:31
✓ Found 14 queries > 30s, all from payment-service v2.1.7
03 · Verify pgBouncer pool configuration · 02:14:47
✓ pool_size: 100 · max_client_conn: 100 · pool_mode: transaction
04 · Correlate with recent deploys · 02:15:02
✓ payment-service v2.1.7 deployed 02:13:44; N+1 query pattern introduced
05 · Resize pool or initiate rollback · In progress
Decision required: pool resize (faster) vs rollback v2.1.7 (safer). See notes for tradeoffs.
06 · Verify recovery and close incident
Monitor error rate for 10 min · update incident record · add postmortem note

This runbook has run 7 times · last updated 2 days ago by @rnewton
04
Runbook Engine

The Notion page
from 2021 is not
a runbook.

Strake connects runbooks directly to the alerts that trigger them. When PagerDuty fires, the right runbook opens — not a search bar, not a Notion space, not a Slack message asking who knows what to do.

Steps are tracked. What the engineer found, what they decided. Every time the runbook runs, the record gets richer. The next engineer who gets paged starts from that, not from zero.
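
A minimal sketch of the two ideas above: routing an alert to its runbook, and keeping a record of what each run found. All class and function names, and the routing table itself, are hypothetical. Only the runbook ID, its name, the alert condition, and the step text come from this page. No real PagerDuty payload is modelled.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StepResult:
    step: str
    finding: str
    recorded_at: datetime


@dataclass
class Runbook:
    runbook_id: str
    name: str
    steps: list[str]
    runs: list[list[StepResult]] = field(default_factory=list)

    def start_run(self) -> list[StepResult]:
        """Open a new run so its findings are stored alongside earlier runs."""
        run: list[StepResult] = []
        self.runs.append(run)
        return run


def record(run: list[StepResult], step: str, finding: str) -> None:
    """Append what the engineer found at this step, with a UTC timestamp."""
    run.append(StepResult(step, finding, datetime.now(timezone.utc)))


rb14 = Runbook(
    "RB-14",
    "postgres-high-connection-count",
    ["Verify alert is not spurious", "Check for long-running queries"],
)

# Hypothetical routing table keyed by alert condition.
ROUTES = {"postgres.conn > 90%": rb14}


def runbook_for_alert(alert: str) -> Runbook | None:
    """Look up the runbook attached to an alert, or None if nothing is routed."""
    return ROUTES.get(alert)


opened = runbook_for_alert("postgres.conn > 90%")
run = opened.start_run()
record(run, opened.steps[0], "Confirmed: connections at 100/100")
```

Because each run is appended to `runs`, the next engineer paged for the same alert can read earlier findings before starting.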

RB-14 · Incident History

Date    Resolution                  Time
Jan 15  Pool resize · resolved      22min
Dec 28  Rollback v2.0.4 · resolved  41min
Dec 09  Query kill + pool flush     18min
Nov 22  Escalated · manual DBA      94min
// What this means

The Nov 22 incident took 94 minutes. The last three averaged 27 minutes. That's the runbook getting smarter — and it's the clearest signal of what Strake actually does.
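
The arithmetic behind that comparison can be checked directly from the incident history figures on this page. The variable names are only for illustration.

```python
from statistics import mean

# Resolution times in minutes, from the RB-14 incident history.
first_incident = 94              # Nov 22
last_three = [22, 41, 18]        # Jan 15, Dec 28, Dec 09

average_recent = mean(last_three)             # (22 + 41 + 18) / 3
improvement = first_incident / average_recent  # how many times faster
```

The three most recent incidents sum to 81 minutes, which gives the 27-minute average, so recent resolutions are roughly 3.5 times faster than the first.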

Built for the stack you already run
Datadog
PagerDuty
GitHub Actions
Prometheus
Grafana
Kubernetes
Opsgenie
Slack

+ CloudWatch · Terraform · Loki · Confluence · Notion · GCP Cloud Run · AWS ECS · and more

The runbook for our database failover lived in a Confluence page that hadn't been opened in 14 months. Found that out at 3am on a Sunday.

Senior SRE · Series B fintech

We had a deploy gate. It was a Slack message: "anyone know if it's okay to push right now?" Someone always said yes. Then we'd find out.

Staff Engineer · Infrastructure team

The tribal knowledge problem is real. Three incidents in the last year that came down to one person not knowing what another person knew.

VP Engineering · 80-person startup
Free during private beta

Get early access.

Strake is in private beta. We're working directly with early teams to shape the product. No credit card. No commitment. Just a conversation about deploy safety.

Built for the engineer
who gets the page.

Strake is in private beta with a small cohort of senior engineers and SRE leads. If you're on-call and your MTTR isn't where it needs to be, we want to talk.

Free to Use
No credit card required
Get Started Free · Book a Demo

No sales script. No demo theater.
A real conversation with the team.