Step 4: Automate Incident Response

Connect your incident platform and let the agent handle alerts automatically — from detection to diagnosis to fix, without you typing a single message.

What you'll accomplish

  • Connect Azure Monitor as your incident platform
  • Create a response plan with severity filters and autonomy settings
  • See the agent handle a real alert end-to-end — from acknowledgment to code fix and PR

Prerequisites

Requirement               | Details
Completed Steps 1–3       | Create agent, Team onboarding, First investigation
Azure resources connected | At least one Azure subscription with resources the agent can monitor

Connect Azure Monitor

  1. In the left sidebar, go to Builder → Incident platform.
  2. Click the Incident platform dropdown and select Azure Monitor.
  3. The Quickstart response plan toggle is on by default. Turn it off — you will create your own response plan next.
  4. Click Save.

Wait for the connection to complete. The status changes to "Azure Monitor connected. Your next step is to set up incident response plans."

[Figure: Azure Monitor connected with green checkmark status]

Checkpoint: The incident platform page shows a green checkmark with "Azure Monitor connected."

Other platforms

You can also connect PagerDuty or ServiceNow from the same dropdown.


Create an incident response plan

An incident response plan tells the agent which incidents to pick up and how much autonomy it has. The steps below are for Azure Monitor — PagerDuty and ServiceNow response plans use different filter fields based on their own incident metadata (priority, category, assignment group, etc.).

  1. Go to Builder → Incident response plans in the left sidebar.
  2. Click New incident response plan.
  3. Step 1 — Set up incident filters:
    • Enter a name (e.g., all-incidents).
    • Select severity levels — choose All severity to catch everything during setup.
    • Optionally add a title filter to narrow scope.
  4. Click Next.
[Figure: Response plan creation form with name and severity fields]
  5. Step 2 — Preview filter results: Review matching past incidents from your incident platform (empty if no incidents exist yet). Click Next.
  6. Step 3 — Save response plan:
    • Choose how much control the agent has:
      • Autonomous (Default) — the agent investigates and acts independently, including code fixes and container restarts.
      • Review — the agent diagnoses but waits for your approval before acting.
    • Click Save.
[Figure: Response plan autonomy options showing Review and Autonomous modes]

Checkpoint: Your response plan appears in the list with status On and the autonomy level you selected.
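
The filter step can be pictured as a simple predicate over incoming incidents. The following is a minimal sketch, not the product's actual matching logic: the `ResponsePlan` and `Incident` classes, their field names, and the case-insensitive substring match on titles are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Incident:
    """Hypothetical, simplified view of an incoming alert."""
    title: str
    severity: str  # e.g. "Sev3"


@dataclass
class ResponsePlan:
    """Hypothetical model of the filters chosen in Step 1 of the wizard."""
    name: str
    severities: Optional[Set[str]] = None  # None stands in for "All severity"
    title_contains: Optional[str] = None   # optional title filter


def matches(plan: ResponsePlan, incident: Incident) -> bool:
    """Return True if the incident passes both the severity and title filters."""
    if plan.severities is not None and incident.severity not in plan.severities:
        return False
    if plan.title_contains and plan.title_contains.lower() not in incident.title.lower():
        return False
    return True


# A catch-all plan (as suggested for setup) and a narrower one.
all_incidents = ResponsePlan(name="all-incidents")
http_errors = ResponsePlan(
    name="http-errors", severities={"Sev2", "Sev3"}, title_contains="HTTP 5xx"
)
alert = Incident(title="HTTP 5xx errors on container app", severity="Sev3")
```

With these definitions, `matches(all_incidents, alert)` and `matches(http_errors, alert)` are both true, while an incident with a different title or severity would only pass the catch-all plan. Starting with the catch-all plan and narrowing later mirrors the advice to choose All severity during setup.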


What happens when an alert fires

When Azure Monitor fires an alert that matches your response plan, the agent investigates automatically. What the agent does depends on the context you've given it — runbooks, code repositories, Azure resources, and prior investigations all shape the depth and actions of the investigation.

Example: HTTP 500 errors on a container app

In this example, the agent has a runbook for handling HTTP 500 errors, a connected code repository, and Azure resource access.

[Figure: Incidents page showing one completed Sev3 alert with green Completed status]

The agent builds a plan from your runbook. Rather than following a generic troubleshooting sequence, the agent reads the HTTP 500 runbook uploaded during onboarding and follows your team's procedures — checking upstream dependencies first, then connection pool, then recent deployments.

[Figure: Agent showing investigation plan for HTTP 5xx alert with 6 numbered steps]

It recalls prior knowledge. If the agent investigated a similar issue before, it recognizes the pattern and skips discovery — combining your runbook procedures with what it learned from previous investigations.

It takes action. In Review mode, the agent asks for your approval before each action. In Autonomous mode, it acts independently. In this example, the agent:

  • Read the source code and identified the root cause
  • Edited the code to fix the bug
  • Restarted the container to mitigate the alert
  • Committed the fix and pushed it to a new branch
  • Created a GitHub Issue for tracking
  • Verified the service was healthy after the fix
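
The difference between the two modes can be sketched as an approval gate placed in front of each action. This is an illustrative model only: the `Autonomy` enum, the `run_actions` function, the action strings, and the choice to skip a declined action are assumptions, not the agent's real implementation.

```python
from enum import Enum
from typing import Callable, List


class Autonomy(Enum):
    AUTONOMOUS = "autonomous"  # act without asking
    REVIEW = "review"          # ask before each action


def run_actions(
    actions: List[str], mode: Autonomy, approve: Callable[[str], bool]
) -> List[str]:
    """Return the actions that were carried out.

    In REVIEW mode, `approve` is called once per action and a declined
    action is skipped (an assumption made for this sketch).
    In AUTONOMOUS mode, `approve` is never consulted.
    """
    executed = []
    for action in actions:
        if mode is Autonomy.REVIEW and not approve(action):
            continue
        executed.append(action)
    return executed


# Hypothetical action names loosely based on the example above.
ACTIONS = [
    "edit code",
    "restart container",
    "push fix to new branch",
    "create GitHub Issue",
]

# A reviewer who approves everything except the container restart.
approved = run_actions(ACTIONS, Autonomy.REVIEW, lambda a: a != "restart container")
```

Here `approved` contains every action except the restart, whereas the same call with `Autonomy.AUTONOMOUS` would return the full list regardless of the callback.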

It delivers a remediation summary. The agent produces a structured report with everything the team needs to follow up:

[Figure: Remediation Summary table showing alert, immediate mitigation, permanent fix, root cause, status, and tracking]

Item                 | What the agent reports
Alert                | Which alert fired, severity, affected resource
Immediate mitigation | What was done to restore service right now
Permanent fix        | Code changes made and branch pushed
Root cause           | Specific code bug or configuration issue with file references
Status               | Current health of the affected resource
Tracking             | GitHub Issue number
Next steps           | Merge PR and redeploy
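
If you want to pass the summary to other tooling, its seven items map naturally onto a small record type. The sketch below is one possible shape, assuming you copy the fields out yourself: the `RemediationSummary` class, the `render` helper, and all placeholder values are hypothetical and do not reflect an export format the agent provides.

```python
from dataclasses import dataclass, fields


@dataclass
class RemediationSummary:
    """Hypothetical record mirroring the items in the remediation summary."""
    alert: str
    immediate_mitigation: str
    permanent_fix: str
    root_cause: str
    status: str
    tracking: str
    next_steps: str


def render(summary: RemediationSummary) -> str:
    """Render the summary as one 'Label: value' line per item, in table order."""
    lines = []
    for f in fields(summary):
        label = f.name.replace("_", " ").capitalize()
        lines.append(f"{label}: {getattr(summary, f.name)}")
    return "\n".join(lines)


# Placeholder values for illustration only; a real summary comes from the agent.
example = RemediationSummary(
    alert="<alert name, severity, affected resource>",
    immediate_mitigation="<what was done to restore service>",
    permanent_fix="<code changes and branch>",
    root_cause="<bug or configuration issue with file references>",
    status="<current health of the resource>",
    tracking="<GitHub Issue number>",
    next_steps="Merge PR and redeploy",
)

report = render(example)
```

Keeping the field order identical to the table means `render` produces the items in the same sequence a reader sees in the agent's report.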

Your results will vary based on the context your agent has. An agent with more runbooks, connected repositories, and prior investigations will produce deeper, more targeted responses.


Capability              | What it adds
Incident Response Plans | Configure severity filters and custom response instructions
PagerDuty Incidents     | Ingest and investigate PagerDuty incidents
Memory                  | How prior investigations improve future ones
Monitor Agent Usage     | Track ongoing incidents and agent activity

Next step

→ Step 5: Automate workflows
