
Incident Response Plans

TL;DR
  • The right custom agent handles each incident type automatically, with no human routing at 3 AM
  • Filter by severity, service, title, and type to match precisely the incidents you care about
  • Turn any plan on or off with one click — pause routing during maintenance without deleting
  • See all plans, statuses, and custom agent mappings in a unified grid

The problem: one playbook for every fire

Not every incident is the same. A P1 database corruption requires deep log analysis and immediate action. A P3 performance degradation needs a quick metrics check. A deployment rollback needs source code context and deployment history.

Yet most automation treats all incidents identically — same investigation steps, same tools, same urgency. Your on-call engineer ends up being the router, deciding which runbook to follow, which dashboards to check, and how urgently to respond. At 3 AM, that decision-making overhead directly increases your MTTR.

How response plans work

Response plans connect incident filters to custom agents. When an incident arrives, your agent evaluates it against active response plans and routes it to the right custom agent automatically.

[Figure: Agent Canvas showing incident trigger nodes connected by edges to a custom agent]

Each response plan has two parts:

| Part | What it controls | Example |
|---|---|---|
| Incident filter | Which incidents to match | P1 + P2 incidents on api-gateway service |
| Custom agent handler | How to respond | Use api-expert custom agent in Review mode |

Filter criteria

| Criteria | What it filters | Example |
|---|---|---|
| Severity / Priority | One or more severity levels | P1 + P2 (multi-select) |
| Impacted service | Which service is affected | api-gateway, payment-service |
| Incident type | Classification | Default, Major, Security |
| Title contains | Keyword match in incident title | "CPU spike", "Out of memory" |

You can select multiple severity levels in a single plan — your agent matches incidents at any of the selected levels.
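
As a rough illustration of the filter criteria above, the following Python sketch models a filter and checks an incident against it. The class names, field names, and matching rules are assumptions made for illustration, not the product's actual data model. In particular, it assumes that criteria combine with AND, that an unset criterion matches everything, and that the title match is case-insensitive.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Incident:
    severity: str        # e.g. "P1"
    service: str         # e.g. "api-gateway"
    incident_type: str   # e.g. "Default", "Major", "Security"
    title: str


@dataclass(frozen=True)
class IncidentFilter:
    # Hypothetical model: an empty set or None means "do not filter on this criterion".
    severities: FrozenSet[str] = frozenset()
    services: FrozenSet[str] = frozenset()
    incident_type: Optional[str] = None
    title_contains: Optional[str] = None

    def matches(self, incident: Incident) -> bool:
        # Assumption: every configured criterion must match (logical AND).
        if self.severities and incident.severity not in self.severities:
            return False
        if self.services and incident.service not in self.services:
            return False
        if self.incident_type and incident.incident_type != self.incident_type:
            return False
        if self.title_contains and self.title_contains.lower() not in incident.title.lower():
            return False
        return True


# Multi-select severity: the filter matches incidents at either P1 or P2.
api_filter = IncidentFilter(
    severities=frozenset({"P1", "P2"}),
    services=frozenset({"api-gateway"}),
)
incident = Incident("P2", "api-gateway", "Default", "CPU spike on gateway nodes")
print(api_filter.matches(incident))  # True
```

A P3 incident on the same service would not match this filter, because P3 is not among the selected severity levels.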

Custom agent configuration

Each plan specifies how your agent responds:

| Setting | Options | Default |
|---|---|---|
| Response custom agent | Any configured custom agent | Pre-selected when creating from graph |
| Agent autonomy level | Autonomous, Review | Autonomous |
  • Autonomous — Your agent analyzes incidents and independently performs mitigation or resource modifications with the required permissions.
  • Review — Your agent diagnoses incidents, then mitigates or modifies resources only after its proposed actions are reviewed and approved.
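
One way to picture the difference between the two autonomy levels is as a gate in front of the agent's proposed actions. The Python sketch below is a conceptual model only. The enum, the function name, and the per-action approval callback are hypothetical, and the real approval flow may work differently.

```python
from enum import Enum
from typing import Callable, Iterable, List


class AutonomyLevel(Enum):
    AUTONOMOUS = "Autonomous"
    REVIEW = "Review"


def actions_to_execute(
    level: AutonomyLevel,
    proposed: Iterable[str],
    approve: Callable[[str], bool],
) -> List[str]:
    """Return the proposed actions that would run under the given level."""
    if level is AutonomyLevel.AUTONOMOUS:
        # Autonomous: proposed actions run without waiting for a reviewer.
        return list(proposed)
    # Review: only actions a reviewer has approved are run (assumed per-action approval).
    return [action for action in proposed if approve(action)]


proposed = ["restart api-gateway pods", "scale out api-gateway"]
approved_by_reviewer = {"restart api-gateway pods"}


def is_approved(action: str) -> bool:
    return action in approved_by_reviewer


print(actions_to_execute(AutonomyLevel.AUTONOMOUS, proposed, is_approved))
print(actions_to_execute(AutonomyLevel.REVIEW, proposed, is_approved))
```

In both modes the agent still diagnoses the incident. The level only changes whether mitigation waits for approval.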

Autonomous mode acknowledgement

When you select Autonomous, an info icon (ℹ️) appears next to the option. Click it to review the Autonomous mode acknowledgement — a dialog explaining what autonomous operation means, including agent boundaries, AI model limitations, your responsibilities for scoping access and reviewing outcomes, and liability terms.

Tip: Start with Review mode for new response plans to validate your agent's investigation behavior before granting full autonomy. You can switch to Autonomous after you're confident in the agent's tool selection and investigation patterns.

What makes this different

Unlike static alert rules, response plans route to specialized agents. Each plan can point to a different custom agent with different tools and expertise — so database incidents get a database expert and API incidents get a deployment-aware investigator.

Unlike manual runbook selection, your agent makes the routing decision automatically. The right expertise matches the right problem without human judgment at 3 AM.

Unlike one-size-fits-all automation, response plans let you tune the response per incident type. Use Autonomous mode for P1 outages and Review mode for lower-severity alerts, so the response matches the severity of the problem.

Before and after

| | Before | After |
|---|---|---|
| Incident routing | Human decides which playbook to follow | Agent matches incident to specialized response plan |
| Tool selection | Engineer opens relevant dashboards manually | Right custom agent with right tools handles it |
| Investigation depth | Same approach for P1 and P4 | Autonomous for critical, review for low-severity |
| Pausing a plan | Delete the plan, recreate later | Click Turn off (configuration preserved) |
| Plan visibility | Navigate between multiple pages | One grid shows plans, statuses, and custom agent mappings |

How to create a response plan

You can create and manage response plans in two places:

| Path | Best for |
|---|---|
| Builder → Incident response plans | Managing all plans in a grid with filtering, search, and one-click enable/disable |
| Builder → Agent Canvas (canvas) | Visualizing which triggers route to which custom agents |

From either path, click New incident response plan (or the + button on a custom agent node in the canvas) to open the create wizard.

Watch out for the default quickstart plan

When you first connect an incident platform, a default quickstart response plan is created automatically. If you create your own plans, delete the quickstart plan from Builder → Incident response plans. Overlapping plans can cause incidents to be routed to the wrong custom agent or processed twice.
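
To see why overlapping plans are a problem, it can help to check whether two filters could ever match the same incident. The Python sketch below is a conservative, hypothetical check and not a feature of the product. It assumes an empty criterion means "match anything", and it models the quickstart plan as a catch-all with no filters, which the page above does not confirm.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Plan:
    name: str
    severities: FrozenSet[str] = frozenset()  # empty = any severity (assumption)
    services: FrozenSet[str] = frozenset()    # empty = any service (assumption)
    incident_type: Optional[str] = None       # None = any type (assumption)


def _sets_can_overlap(a: FrozenSet[str], b: FrozenSet[str]) -> bool:
    # An unset criterion matches everything, so it overlaps with any other value.
    return not a or not b or bool(a & b)


def may_overlap(p: Plan, q: Plan) -> bool:
    """True if some incident could match both plans' filters."""
    types_overlap = (
        p.incident_type is None
        or q.incident_type is None
        or p.incident_type == q.incident_type
    )
    return (
        _sets_can_overlap(p.severities, q.severities)
        and _sets_can_overlap(p.services, q.services)
        and types_overlap
    )


def overlapping_pairs(plans: List[Plan]) -> List[Tuple[str, str]]:
    return [(p.name, q.name) for p, q in combinations(plans, 2) if may_overlap(p, q)]


plans = [
    Plan("quickstart"),  # hypothetical catch-all default plan
    Plan("api-high-sev", frozenset({"P1", "P2"}), frozenset({"api-gateway"})),
    Plan("db-critical", frozenset({"P1"}), frozenset({"postgres-primary"})),
]
print(overlapping_pairs(plans))
# [('quickstart', 'api-high-sev'), ('quickstart', 'db-critical')]
```

Under these assumptions the two custom plans never collide with each other, because they target different services, but each one collides with the catch-all plan.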

Enable and disable plans

You can turn any response plan on or off without deleting it. This is useful during maintenance windows, testing, or when you want to temporarily stop routing certain incident types.

  1. Navigate to Builder → Incident response plans
  2. Select the plan by clicking its checkbox
  3. Click Turn off in the toolbar — a confirmation dialog appears
  4. Click Yes to disable the plan

The plan's status changes to Off and the scanner stops matching incidents against it. Your filter configuration is preserved.

To re-enable, select the plan and click Turn on — it takes effect immediately with no confirmation needed.

You can also toggle plans from Builder → Agent Canvas → Table view → Incident response plans tab, which provides the same controls in the unified grid.
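
The on/off behavior described in this section can be modeled as a status flag that sits alongside the filter configuration instead of replacing it. The Python sketch below is illustrative only. The `ResponsePlan` class and the helper functions are hypothetical names, not part of any product API.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, List


@dataclass(frozen=True)
class ResponsePlan:
    name: str
    severities: FrozenSet[str]
    service: str
    enabled: bool = True  # shown as the On / Off status


def turn_off(plan: ResponsePlan) -> ResponsePlan:
    # Only the status changes; the filter fields are copied unchanged.
    return replace(plan, enabled=False)


def turn_on(plan: ResponsePlan) -> ResponsePlan:
    return replace(plan, enabled=True)


def active_plans(plans: List[ResponsePlan]) -> List[ResponsePlan]:
    # Disabled plans are skipped when incidents are matched.
    return [plan for plan in plans if plan.enabled]


plan = ResponsePlan("api-high-sev", frozenset({"P1", "P2"}), "api-gateway")
paused = turn_off(plan)

print(active_plans([paused]))     # []
print(turn_on(paused) == plan)    # True
```

Because turning a plan off changes only its status, turning it back on restores exactly the plan you had before the maintenance window.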

Unified grid view

The Table view in the Agent Canvas shows all your response plans alongside custom agents, scheduled tasks, and tools. Switch to the Incident response plans tab to see:

| Column | What it shows |
|---|---|
| Response plan name | Plan identifier |
| Status | On (green) or Off (red) badge |
| Custom agent name | Which custom agent handles matched incidents |
| Severity | Severity levels the plan filters on |
| Incident type | Type classification |
| Impacted service | Service filter |
| Title contains | Keyword filter |

Use the Status filter to quickly find disabled plans, and the search box to find plans by name.

Example: Routing database vs. API incidents

Your team runs two services: api-gateway and postgres-primary. API incidents typically involve deployment rollbacks and need source code context. Database incidents require deep log analysis with Kusto queries.

You create two response plans:

| Trigger | Filter | Custom agent | Mode |
|---|---|---|---|
| api-high-sev | P1 + P2 on api-gateway | DeploymentAnalyzer | Review |
| db-critical | P1 on postgres-primary | DatabaseExpert | Autonomous |
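
To make the routing outcome concrete, here is a small Python sketch that encodes the two plans from this example and looks up which one handles a given incident. The data structure and the `route` function are hypothetical and use a first-match rule as a simplifying assumption. They only mirror the plans above and do not represent the agent's real matching logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ResponsePlan:
    name: str
    severities: FrozenSet[str]
    service: str
    custom_agent: str
    mode: str  # "Autonomous" or "Review"


PLANS: List[ResponsePlan] = [
    ResponsePlan("api-high-sev", frozenset({"P1", "P2"}), "api-gateway",
                 "DeploymentAnalyzer", "Review"),
    ResponsePlan("db-critical", frozenset({"P1"}), "postgres-primary",
                 "DatabaseExpert", "Autonomous"),
]


def route(severity: str, service: str,
          plans: List[ResponsePlan] = PLANS) -> Optional[ResponsePlan]:
    """Return the first plan whose filter matches, or None if no plan matches."""
    for plan in plans:
        if severity in plan.severities and service == plan.service:
            return plan
    return None


for severity, service in [("P2", "api-gateway"),
                          ("P1", "postgres-primary"),
                          ("P2", "postgres-primary")]:
    plan = route(severity, service)
    if plan is None:
        print(f"{severity} on {service}: no matching plan")
    else:
        print(f"{severity} on {service}: {plan.custom_agent} ({plan.mode})")
```

With these two plans, a P2 incident on postgres-primary matches neither filter, so it would need its own plan if you want it handled automatically.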

Get started

| Resource | What you'll learn |
|---|---|
| Set up an incident trigger → | Configure response plans to automate incident handling |

| Capability | What it adds |
|---|---|
| Incident Response → | Broader incident automation capability |
| Root Cause Analysis → | Hypothesis-driven investigation |
| Custom agents → | Create specialized agents that work together |