Execute Mitigations
- Ask your agent to fix an issue — it proposes, you approve, it executes
- Full audit trail: who triggered it, what changed, whether it worked
- Choose your level of trust: Review mode (approve each action) or Autonomous mode (agent handles it)
The problem: diagnosis without action wastes time
You've identified the issue. Now what? You navigate to the Azure portal, find the right blade, confirm the resource, click through confirmation dialogs, wait for the operation to complete, then verify it worked. The investigation took five minutes; the fix takes another ten.
This friction exists across your operational workflows:
- Daily operations: Scaling resources for expected load, restarting services during maintenance windows
- Compliance checks: Hardening security settings across dozens of storage accounts
- On-call response: Executing well-known fixes quickly so engineers can get back to sleep
- Proactive optimization: Adjusting SKUs based on usage patterns before problems occur
How your agent closes the loop
When your agent identifies an issue, it doesn't stop at telling you what's wrong. It proposes a specific remediation action and, depending on your run mode, either waits for your approval or executes immediately.
The agent follows a consistent pattern: diagnose → identify action → check permissions → execute (or propose) → verify the fix worked. Every action is logged with who triggered it, what changed, why, and whether it succeeded.
What makes this different from scripts
Scripts are rigid — they run the same action regardless of context. Your agent reasons about the situation first. It considers what it found during investigation, what it remembers from past incidents, and what your skills and knowledge base recommend. The same symptom might lead to a restart in one case and a scale-up in another, because the agent adapts based on evidence.
Run modes give you graduated trust. Start in Review mode where the agent proposes and you approve. Move to Autonomous when you're confident in the pattern. Use ReadOnly for monitoring-only agents that never take action.
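The way run modes gate commands can be sketched as a small decision function. This is a minimal illustration, not the agent's actual implementation: the `RunMode`, `Decision`, and `decide` names are hypothetical, and the sketch assumes that ReadOnly mode refuses write commands outright while still allowing reads.

```python
from enum import Enum


class RunMode(Enum):
    READ_ONLY = "ReadOnly"
    REVIEW = "Review"
    AUTONOMOUS = "Autonomous"


class Decision(Enum):
    EXECUTE = "execute"
    AWAIT_APPROVAL = "await_approval"
    REFUSE = "refuse"


def decide(mode: RunMode, is_write: bool) -> Decision:
    """Map a run mode and a command type to what happens next."""
    if not is_write:
        # Read commands run immediately, with no approval needed.
        return Decision.EXECUTE
    if mode is RunMode.AUTONOMOUS:
        return Decision.EXECUTE
    if mode is RunMode.REVIEW:
        return Decision.AWAIT_APPROVAL
    # ReadOnly: assumed here to refuse write commands outright.
    return Decision.REFUSE


if __name__ == "__main__":
    for mode in RunMode:
        print(mode.value, "->", decide(mode, is_write=True).value)
```

Keeping the decision separate from execution means the same proposed action can be replayed under a stricter or looser mode without changing how it was diagnosed.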
What your agent can do
Your agent executes Azure actions through Azure CLI commands: in general, if you can run it in az, your agent can run it too, apart from the operations blocked by the safety guardrails. This includes managing resources of any type, modifying configurations, and creating resources.
| Command type | What it enables |
|---|---|
| Read commands | Query any Azure resource — az webapp list, az containerapp show, az vm list, az network vnet show. Run immediately, no approval needed. |
| Write commands | Modify any Azure resource — az webapp restart, az containerapp update, az vm resize, az role assignment create. Requires approval in Review mode. |
Beyond the safety guardrails, the agent's actions are constrained by the permissions assigned to its managed identity. Grant Contributor on a resource group, and your agent can manage everything in that group. Grant a custom role with specific actions, and your agent is limited to those actions.
Safety guardrails
The agent enforces safety constraints at the command level:
- Delete operations blocked: delete and remove commands are never executed. The agent returns an error directing users to the Azure Portal for deletions.
- Key Vault commands blocked: all az keyvault commands are blocked to prevent credential exposure.
- Management locks respected: before modifying any resource, the agent checks for Azure management locks. Resources with ReadOnly locks cannot be modified.
- Subscription validation: subscription IDs in commands are validated for correct GUID format before execution.
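The command-level checks above can be approximated with a short pre-flight function. This is a sketch under stated assumptions rather than the agent's real logic: `check_command` is a hypothetical helper, the exact matching rules the agent uses are not documented here (this version matches whole command words only), and the management lock check is left out because it needs a live call to Azure.

```python
import re
import shlex

GUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
SCOPE_SUB_RE = re.compile(r"/subscriptions/([^/]+)")


def check_command(command: str) -> list[str]:
    """Return the guardrail violations found in an az command (empty if none)."""
    tokens = shlex.split(command)
    if tokens and tokens[0] == "az":
        tokens = tokens[1:]

    # Command words are everything before the first option flag.
    words = []
    for token in tokens:
        if token.startswith("-"):
            break
        words.append(token)

    problems = []
    if any(word in ("delete", "remove") for word in words):
        problems.append("delete and remove commands are blocked")
    if words and words[0] == "keyvault":
        problems.append("az keyvault commands are blocked")

    # Collect subscription IDs from --subscription and from resource IDs.
    candidates = []
    for i, token in enumerate(tokens):
        if token == "--subscription" and i + 1 < len(tokens):
            candidates.append(tokens[i + 1])
        elif token.startswith("--subscription="):
            candidates.append(token.split("=", 1)[1])
        candidates.extend(SCOPE_SUB_RE.findall(token))
    for value in candidates:
        if not GUID_RE.match(value):
            problems.append(f"subscription ID is not a valid GUID: {value}")
    return problems


if __name__ == "__main__":
    print(check_command("az webapp restart --name prod-api --resource-group rg-prod"))
    print(check_command("az group delete --name rg-prod"))
```

Returning a list of violations rather than raising on the first one lets a caller report every reason a command was rejected in a single message.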
Before and after
| | Before | After |
|---|---|---|
| Fix execution | Navigate to Azure Portal, find resource, click through blades | Ask agent, approve, done |
| Verification | Manually check if fix worked | Agent verifies and reports result |
| Audit | Hope someone documented what they did | Full audit trail in Application Insights |
| Knowledge | One engineer knows the fix | Agent applies learned patterns consistently |
Permission requirements
By default, agents have Reader access and cannot take actions. You explicitly grant write permissions by assigning roles to your agent's managed identity.
| Scope | What the agent can act on | Recommended for |
|---|---|---|
| Resource | A single resource only | Maximum restriction, start here |
| Resource Group | All resources in one group | Production workloads |
| Subscription | Any resource in the subscription | Development and testing only |
The agent checks Azure management locks before modifying any resource. Resources with ReadOnly locks cannot be modified, regardless of permissions or run mode. Delete and remove operations are blocked entirely — use the Azure Portal for deletions.
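To make the three scopes concrete, the sketch below builds the Azure scope string for each level and the matching az role assignment create arguments. The helper names, the principal ID, the subscription ID, and the resource names are all hypothetical placeholders; only the scope path format and the --assignee, --role, and --scope flags come from the Azure CLI itself.

```python
import shlex
from typing import Optional


def build_scope(
    subscription_id: str,
    resource_group: Optional[str] = None,
    resource_path: Optional[str] = None,
) -> str:
    """Build a scope string at subscription, resource group, or resource level."""
    scope = f"/subscriptions/{subscription_id}"
    if resource_path and not resource_group:
        raise ValueError("a resource scope also needs a resource group")
    if resource_group:
        scope += f"/resourceGroups/{resource_group}"
    if resource_path:
        # resource_path looks like "Microsoft.Web/sites/prod-api"
        scope += f"/providers/{resource_path}"
    return scope


def role_assignment_args(principal_id: str, role: str, scope: str) -> list[str]:
    """Arguments for granting a role to the agent's managed identity."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_id,
        "--role", role,
        "--scope", scope,
    ]


if __name__ == "__main__":
    sub = "00000000-0000-0000-0000-000000000000"  # placeholder subscription ID
    principal = "11111111-1111-1111-1111-111111111111"  # placeholder identity
    narrow = build_scope(sub, "rg-prod", "Microsoft.Web/sites/prod-api")
    print(shlex.join(role_assignment_args(principal, "Website Contributor", narrow)))
```

Starting with the resource-level scope and widening only when needed follows the recommendation in the table above.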
Alternative response paths
Direct mitigations aren't the only option. Many teams prefer to route findings to work items or ticketing systems instead of executing actions directly — especially when human review is required or change management processes apply.
| Response path | How it works | Best for |
|---|---|---|
| Direct mitigation | Agent executes restart, scale, or hardening | Trusted patterns, non-production |
| Create work item | Agent creates GitHub Issue or ADO work item | Human-in-the-loop, change management |
| Send notification | Agent posts to Teams or sends email | Awareness without action |
| Trigger workflow | Agent dispatches GitHub Actions or Logic Apps | CI/CD integration, multi-step processes |
Configure work item creation and notifications through connectors. For example, connect a GitHub MCP server to let your agent create issues, or connect Azure DevOps to create work items automatically.
See Send Notifications and Workflow Automation for chaining these response types together.
Example: Incident-triggered mitigation
3:47 AM — PagerDuty fires an alert: "High memory on prod-api"
Your agent, running in Review mode, investigates and prepares the fix before you even open the incident:
- Acknowledges the incident. PagerDuty shows "Acknowledged by SRE Agent".
- Investigates automatically:
  - Queries App Insights: memory at 94%, trending up over 2 hours
  - Checks deployment history: no recent deploys
  - Recalls from memory: "Last time this happened, restart resolved it"
- Proposes a fix. Posts to the incident thread:

      Memory at 94% on prod-api (App Service).
      Recommended action: Restart the App Service.
      Evidence:
      - Memory climbing since 1:30 AM
      - No recent deployments
      - Past incident: restart resolved similar issue on 2026-01-15
      [Approve] [Deny]

- You approve (or, in Autonomous mode, the agent executes immediately).
- Agent executes and verifies:

      ✓ Restarted prod-api
      ✓ Memory now at 42%
      ✓ Incident resolved
What happened: You clicked Approve and the agent handled investigation, action, and verification.
Audit trail
Every mitigation action is recorded with full context:
| Field | What's captured |
|---|---|
| Identity | Which agent and managed identity |
| Action | Exact operation performed |
| Timestamp | When it executed |
| Trigger | The diagnosis or condition that led to the action |
| Result | Success or failure, with post-action verification |
Query the audit trail in Application Insights via Monitor → Logs in the agent portal. Every az command is logged as an AgentAzCliExecution custom event. See Audit Agent Actions.
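A query for these events might look like the one built below. This is a hedged sketch: it assumes the events land in the standard Application Insights customEvents table, `build_audit_query` is a hypothetical helper, and no custom dimension names are projected because they are not documented here.

```python
def build_audit_query(hours: int = 24, limit: int = 50) -> str:
    """Build a KQL query listing recent AgentAzCliExecution events."""
    if hours <= 0 or limit <= 0:
        raise ValueError("hours and limit must be positive")
    return "\n".join(
        [
            "customEvents",
            '| where name == "AgentAzCliExecution"',
            f"| where timestamp > ago({hours}h)",
            "| order by timestamp desc",
            f"| take {limit}",
        ]
    )


if __name__ == "__main__":
    # Paste the printed query into the Logs view.
    print(build_audit_query(hours=12, limit=20))
```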
Get started
Mitigations work out of the box with the built-in Azure CLI tool. Control how much autonomy your agent has through Run Modes.
| Resource | What you'll learn |
|---|---|
| Set Up a Response Plan → | Configure response plans that include automated mitigations |
| Run Modes → | Configure ReadOnly, Review, or Autonomous execution levels |
Related capabilities
| Capability | What it adds |
|---|---|
| Run Modes → | Control the level of autonomy for each action |
| Scheduled Tasks → | Schedule health checks that trigger mitigations automatically |
| Workflow Automation → | Chain mitigations with notifications and ticket creation |
| Audit Agent Actions → | Review and query the full action history |
| Permissions → | Understand agent permission model |