Agent Hooks
- Enforce quality gates on agent responses before they reach users
- Audit and control tool usage with custom bash or Python scripts
- Block dangerous operations automatically with policy enforcement hooks
- Prevent early task completion by validating agent output meets your criteria
The problem
Your agent executes tasks autonomously: investigating incidents, running tools, and generating responses. But autonomy without oversight creates risk:
- Incomplete responses: The agent says "done" before addressing everything you asked for
- Unaudited tool usage: You have no visibility into which tools the agent calls or what results it gets
- No policy enforcement: Dangerous operations (destructive commands, unauthorized changes) proceed unchecked
- Quality gaps: Responses miss critical information because there's no validation step
You need a way to intercept agent behavior at key moments, without slowing it down or removing its autonomy entirely.
How agent hooks work
Hooks are custom checkpoints you attach to specific agent events. When an event fires, your hook evaluates the situation and decides whether to allow or block the action.
New thread starts → Start hook fires → Inject context
Agent about to call a tool → PreToolUse hook checks the call → Allow, deny, ask, or override policy
Agent used a tool → PostToolUse hook checks result → Allow, block, or inject context
Agent about to stop → Stop hook evaluates response → Allow or reject
Two levels of hooks
Hooks operate at two levels:
| Level | Where to configure | Who can configure | Scope |
|---|---|---|---|
| Agent level | Builder → Hooks in the portal | SRE Agent Administrator | Applies to the entire agent, all threads and all custom agents |
| Custom agent level | Agent Canvas → Custom agent → Manage Hooks, or via the REST API v2 | SRE Agent Administrator | Applies only when that specific custom agent runs |
Both levels can coexist. If an agent-level hook and a custom-agent-level hook both match the same event, both run.
Four hook events are supported:
| Event | Triggers when | You can | Configure via |
|---|---|---|---|
| Start | A new thread begins (first message) | Inject context, filter by thread source | API / YAML |
| PreToolUse | Agent is about to execute a tool | Allow, deny, or ask; inject context; override tool access policies | API / YAML |
| PostToolUse | A tool finishes executing | Audit usage, block results, inject additional context | Portal, API / YAML |
| Stop | Agent is about to return a final response | Validate completeness, reject and force the agent to continue | Portal, API / YAML |
Start and PreToolUse hooks are configured via the REST API v2 or YAML. The portal UI currently supports PostToolUse and Stop hooks.
Two execution types
You can implement hooks using either an LLM or a shell script:
| Type | How it works | Best for |
|---|---|---|
| Prompt | An LLM evaluates your prompt and returns a JSON decision | Nuanced validation ("Is this response complete?") |
| Command | A bash or Python script runs in a sandboxed environment | Deterministic checks, policy enforcement, auditing |
Prompt hooks are powerful for subjective evaluation, such as checking whether a response addresses all user concerns or verifying that an investigation was thorough enough. They use the $ARGUMENTS placeholder to receive the full hook context. If $ARGUMENTS is not present in the prompt, the context is appended automatically. Prompt hooks also receive ReadFile and GrepSearch tools when a conversation transcript is available, allowing the LLM to reason about the full conversation history.
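The placeholder behavior described above can be sketched in a few lines of Python. This is only an illustration of the documented rule (substitute $ARGUMENTS if present, otherwise append the context); the function name `render_prompt` and the exact JSON formatting of the context are assumptions, not the platform's implementation.

```python
import json

PLACEHOLDER = "$ARGUMENTS"


def render_prompt(prompt: str, context: dict) -> str:
    """Return the prompt text with the hook context substituted or appended."""
    payload = json.dumps(context, indent=2)
    if PLACEHOLDER in prompt:
        # The author chose where the context goes.
        return prompt.replace(PLACEHOLDER, payload)
    # No placeholder: the context is appended after the prompt text.
    return f"{prompt.rstrip()}\n\n{payload}"
```

Placing $ARGUMENTS explicitly is useful when the instructions that follow the context (for example, the required response format) should be the last thing the model reads.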
Command hooks are better for deterministic checks: validating that a response contains required markers, blocking dangerous commands, or logging tool usage to an external system.
Hooks complement run mode safety controls and tool access policies. Run modes control what the agent can do, policies control which tools it can use, and hooks control how well it does the work and what happens with the results.
Before and after
| | Before | After |
|---|---|---|
| Response quality | Agent stops when it thinks it's done | Your Stop hook validates completeness before the response reaches users |
| Tool visibility | No audit trail of tool execution | PostToolUse hooks log and verify every tool call |
| Policy enforcement | Dangerous commands execute unchecked | Scripts block rm -rf, sudo, and other risky patterns automatically |
| Quality assurance | Prompt engineering is your only lever | LLM-based hooks evaluate nuance; scripts enforce deterministic rules |
How to configure hooks
The easiest way to create hooks is through the portal UI:
- Agent-level hooks: Go to Builder → Hooks → click Create hook
- Custom-agent-level hooks: Go to Agent Canvas → click a custom agent → Manage Hooks
See the Create Hooks via Portal tutorial for step-by-step instructions.
Hooks can also be configured via the REST API v2 using PUT /api/v2/extendedAgent/agents/{agentName}. The YAML format below shows the full configuration schema. See the API tutorial for details.
Note: The Agent Canvas YAML tab displays v1 format and does not show hooks. Use the Hooks page under Builder to view and manage hooks.
api_version: azuresre.ai/v2
kind: ExtendedAgent
metadata:
  name: my_hooked_agent
spec:
  instructions: |
    You are a helpful assistant.
  handoffDescription: ""
  enableVanillaMode: true
  hooks:
    Stop:
      - type: prompt
        prompt: |
          Check if the response ends with "Task complete."
          $ARGUMENTS
          Respond with:
          - {"ok": true} if it does
          - {"ok": false, "reason": "End your response with 'Task complete.'"} if not
        timeout: 30
    PostToolUse:
      - type: command
        matcher: "RunShellCommand|RunInTerminal|RunAzCliWriteCommands"
        timeout: 30
        failMode: block
        script: |
          #!/usr/bin/env python3
          import sys, json, re
          context = json.load(sys.stdin)
          command = context.get('tool_input', {}).get('command', '')
          dangerous = [r'\brm\s+-rf\b', r'\bsudo\b', r'\bchmod\s+777\b']
          for pattern in dangerous:
              if re.search(pattern, command):
                  print(json.dumps({"decision": "block", "reason": f"Blocked: {pattern}"}))
                  sys.exit(0)
          print(json.dumps({"decision": "allow"}))
Hook response format
Hooks must output JSON. Two formats are supported:
Simple format (recommended for prompt hooks):
{"ok": true}
{"ok": false, "reason": "Please include more details."}
Expanded format (recommended for command hooks):
{"decision": "allow"}
{"decision": "block", "reason": "Dangerous command detected."}
{"decision": "allow", "hookSpecificOutput": {"additionalContext": "Tool audit logged."}}
Command hooks can also use exit codes instead of JSON output:
| Exit code | Behavior |
|---|---|
| 0 with no output | Allow (no objection) |
| 0 with JSON | Parse JSON for decision |
| 2 | Always block; stderr becomes the reason |
| Other | Uses failMode setting (allow or block) |
For Stop hooks, a rejection without a reason is treated as approval — the agent stops normally. Always provide a reason field when rejecting.
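As a rough mental model, the exit-code table and the Stop rule above can be expressed as a small decision function. This is a sketch under stated assumptions, not the platform's code: the names `interpret`, `parse_decision`, and `apply_stop_rule` are hypothetical, a missing decision key is assumed to mean allow, and the reason text used for a failMode block is invented for illustration. Malformed JSON is not handled here.

```python
import json
from typing import Optional, Tuple

Result = Tuple[bool, Optional[str]]  # (allowed, reason)


def parse_decision(stdout: str) -> Result:
    """Map the simple or expanded JSON format to (allowed, reason)."""
    data = json.loads(stdout)
    if "ok" in data:
        allowed = bool(data["ok"])  # simple format
    else:
        # Expanded format. Assumption: a missing "decision" counts as allow.
        allowed = data.get("decision", "allow") != "block"
    return allowed, data.get("reason")


def interpret(exit_code: int, stdout: str, stderr: str,
              fail_mode: str = "allow") -> Result:
    """Apply the exit-code table to one command hook run."""
    if exit_code == 0:
        if not stdout.strip():
            return True, None  # exit 0 with no output: no objection
        return parse_decision(stdout)
    if exit_code == 2:
        return False, stderr.strip() or None  # stderr becomes the reason
    if fail_mode == "block":
        # Reason wording is made up for this sketch.
        return False, f"hook failed with exit code {exit_code}"
    return True, None


def apply_stop_rule(result: Result) -> Result:
    """Stop hooks only: a rejection with no reason is treated as approval."""
    allowed, reason = result
    if not allowed and not reason:
        return True, None
    return allowed, reason
```

For example, `apply_stop_rule(interpret(2, "", ""))` yields an approval, which is why a Stop hook that exits with code 2 should always write a reason to stderr.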
You can define multiple hooks for the same event. For PostToolUse, each hook with a matching matcher pattern runs independently. If multiple hooks provide additionalContext, the last hook's context is injected into the conversation.
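The "last context wins" rule can be illustrated with a short sketch. It assumes each hook is modeled as a Python callable returning a response dictionary, and it reads "last hook" as the last hook that actually supplied additionalContext; both are assumptions made for illustration, and blocking decisions are ignored here.

```python
from typing import Callable, List, Optional

Hook = Callable[[dict], dict]  # takes the hook context, returns a response dict


def collect_additional_context(hooks: List[Hook], context: dict) -> Optional[str]:
    """Run every matching hook and keep the additionalContext set last."""
    injected = None
    for hook in hooks:
        response = hook(context)
        extra = response.get("hookSpecificOutput", {}).get("additionalContext")
        if extra:
            injected = extra  # a later hook replaces earlier context
    return injected
```

If several hooks need to contribute context, one practical option is to combine their logic into a single hook so nothing is overwritten.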
Configuration reference
| Option | Type | Default | Description |
|---|---|---|---|
| type | string | prompt | prompt or command |
| prompt | string | (none) | LLM prompt text (required for prompt hooks). Use $ARGUMENTS for context injection |
| command | string | (none) | Inline shell command (for command hooks, mutually exclusive with script) |
| script | string | (none) | Multi-line script (for command hooks, mutually exclusive with command) |
| matcher | string | (none) | Regex pattern for runtime tool names (required for PreToolUse and PostToolUse hooks). * matches all tools. Patterns are anchored as ^(pattern)$ and matched case-sensitively. Use actual runtime tool names (e.g., RunShellCommand, RunAzCliWriteCommands); see tool access policies for the full list. Empty or null matches nothing. |
| timeout | int | 30 | Execution timeout in seconds (must be positive; values above 300 are flagged during CLI validation) |
| failMode | string | allow | How to handle hook errors: allow (tool proceeds) or block (tool is denied). Use block for security-critical hooks, because allow means a hook crash silently removes the guardrail. |
| model | string | ReasoningFast | Model for prompt hooks (scenario name or deployment name) |
| maxRejections | int | 3 (agent default) | Max rejections before forcing stop. Range: 1–25. Applies to prompt-type Stop hooks only; command-type Stop hooks have no implicit limit. When multiple prompt hooks specify different values, the maximum is used. |
| sources | list | (none) | Thread source filter for Start hooks. Valid values: Conversation, Alert, Incident, ScheduledTask, Teams, HttpTrigger, Playground, and others. When omitted, the hook fires for all thread types. |
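
The matcher rules in the table (anchoring as ^(pattern)$, case-sensitive matching, * as a wildcard, empty or null matching nothing) can be approximated in Python to test a pattern before deploying it. This is a sketch only: `matcher_matches` is a hypothetical helper, and the platform's regex engine may differ from Python's `re` module in edge cases.

```python
import re
from typing import Optional


def matcher_matches(matcher: Optional[str], tool_name: str) -> bool:
    """Approximate the documented matcher semantics for one tool name."""
    if not matcher:
        return False  # empty or null matches nothing
    if matcher == "*":
        return True   # wildcard matches every tool
    # The grouping matters: without it, "^A|B$" would match any name
    # that merely starts with A or ends with B.
    return re.match(f"^({matcher})$", tool_name) is not None
```

With the matcher `RunShellCommand|RunInTerminal|RunAzCliWriteCommands`, this returns True for `RunInTerminal` and False for `runinterminal` or `RunInTerminalX`.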
Hook context schema
Hooks receive structured JSON context about the current event. Prompt hooks receive it via the $ARGUMENTS placeholder in the prompt text. Command hooks receive it as JSON on stdin.
For both hook types, the execution_summary field contains a file path to the conversation transcript (not inline content). For prompt hooks, the LLM receives ReadFile and GrepSearch tools to access this file. For command hooks, the file is available at the specified path in the sandbox.
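Because execution_summary is a path rather than inline content, a command hook has to open the file itself. A minimal sketch, assuming the transcript is UTF-8 text; the helper name `transcript_mentions` is hypothetical:

```python
from pathlib import Path


def transcript_mentions(context: dict, needle: str) -> bool:
    """Return True if the transcript referenced by the hook context contains needle."""
    path = context.get("execution_summary")
    if not path:
        return False  # no transcript path in this context
    transcript = Path(path)
    if not transcript.is_file():
        return False  # path present but file not available
    return needle in transcript.read_text(encoding="utf-8", errors="replace")
```

Inside a real command hook, the context would come from `json.load(sys.stdin)` and the result would feed into the JSON decision printed on stdout.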
Common fields
{
"hook_event_name": "Stop",
"agent_name": "my_agent",
"current_turn": 5,
"max_turns": 50,
"execution_summary": "/path/to/transcript.txt"
}
Stop hook fields
{
"final_output": "Here is my response...",
"stop_hook_active": false,
"stop_rejection_count": 0
}
PostToolUse hook fields
{
"tool_name": "ExecutePythonCode",
"tool_input": { "code": "print(2+2)" },
"tool_result": "4",
"tool_succeeded": true
}
PreToolUse hook fields
{
"tool_name": "RunShellCommand",
"tool_input": { "command": "kubectl apply -f deploy.yaml" },
"tool_description": "Execute a shell command in the sandbox",
"agent_mode": "Autonomous",
"is_write_action": true,
"requires_approval": false,
"requires_browser_connection": false,
"call_id": "call_abc123"
}
PreToolUse hooks can return a permissionDecision in the response:
| Decision | Response format | Effect |
|---|---|---|
| No objection | {"ok": true} | Hook has no opinion on this tool. Evaluation continues to tool access policy rules and default approval checks. Does not bypass any policies. |
| Allow (policy override) | {"ok": true, "hookSpecificOutput": {"permissionDecision": "allow"}} | Tool executes immediately, skipping policy evaluation and default approval. Can override even a global deny. Only user-defined hooks (not system hooks) can trigger this. Every override is audit-logged. Restrict hook authoring to trusted administrators. |
| Deny | {"ok": false, "reason": "..."} | Tool blocked. Reason fed to agent. |
| Ask | {"hookSpecificOutput": {"permissionDecision": "ask", "permissionReason": "..."}} | Execution suspends for user confirmation. |
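
To make the decision table concrete, here is a sketch of the logic a PreToolUse command hook might apply. The policy itself (denying `kubectl delete`, asking for confirmation on write actions) is a hypothetical example, not a recommended or built-in rule; only the response shapes come from the table above. The sketch deliberately never returns the policy-override allow.

```python
import re

# Hypothetical policy for illustration only.
PROTECTED = re.compile(r"\bkubectl\s+delete\b")


def pre_tool_use(context: dict) -> dict:
    """Return a PreToolUse response for the given hook context."""
    command = context.get("tool_input", {}).get("command", "")
    if PROTECTED.search(command):
        # Deny: the reason is fed back to the agent.
        return {"ok": False, "reason": "kubectl delete is not permitted by policy."}
    if context.get("is_write_action"):
        # Ask: execution suspends until a user confirms.
        return {"hookSpecificOutput": {
            "permissionDecision": "ask",
            "permissionReason": "Write action: confirm before running.",
        }}
    # No objection: policy rules and default approval checks still apply.
    return {"ok": True}
```

In a deployed hook script, the context would be read with `json.load(sys.stdin)` and the returned dictionary printed with `json.dumps`.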
Start hook fields
{
"start_message": "Investigate the high CPU alert on prod-web-01",
"thread_source": "Alert"
}
Start hooks are non-blocking — they can inject additional context but cannot prevent the thread from starting. Use the sources field on the hook definition to filter by thread type (e.g., only fire for Alert or Incident threads).
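A Start hook that adds context for alert-driven threads might look like the sketch below. It assumes Start hooks inject context through the same hookSpecificOutput.additionalContext field shown in the expanded response format; the runbook wording and the function name `start_hook` are invented for illustration.

```python
def start_hook(context: dict) -> dict:
    """Inject a short note for threads opened from alerts or incidents."""
    source = context.get("thread_source", "")
    if source in ("Alert", "Incident"):
        note = (f"Thread opened from {source}. "
                "Check the on-call runbook before making changes.")
        return {"decision": "allow",
                "hookSpecificOutput": {"additionalContext": note}}
    # Start hooks cannot block, so every other source simply passes through.
    return {"decision": "allow"}
```

Setting sources on the hook definition achieves the same filtering without any code, so a check like this is mainly useful when the injected text depends on the source.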
Hook execution order
When multiple hooks exist (agent-level, global, system), they execute in this order:
1. System global hooks: Non-bypassable safety checks (read-only guard, browser connection requirements). These are global hooks with system provenance; they cannot be configured or disabled by users.
2. Agent-specific hooks: Hooks configured on the agent via portal, API, or YAML
3. User global hooks: Hooks configured at the SRE Agent instance level via the global hooks API
Within each tier, hooks run sequentially. A deny from any hook short-circuits the chain — remaining hooks are skipped. An allow is tracked but does not short-circuit — later hooks can still deny.
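The short-circuit behavior can be sketched as a loop over tiers. This models each hook as a callable returning "allow", "deny", or None (no opinion), which is a simplification chosen for illustration; `run_chain` is a hypothetical name, not a platform API.

```python
from typing import Callable, List, Optional, Tuple

Hook = Callable[[dict], Optional[str]]  # "allow", "deny", or None


def run_chain(tiers: List[List[Hook]], context: dict) -> Tuple[str, int]:
    """Evaluate tiers in order; return (outcome, number_of_hooks_run)."""
    outcome = "no_objection"
    ran = 0
    for tier in tiers:          # system global, agent-specific, user global
        for hook in tier:       # hooks run sequentially within a tier
            ran += 1
            decision = hook(context)
            if decision == "deny":
                return "deny", ran  # remaining hooks are skipped
            if decision == "allow":
                outcome = "allow"   # tracked, but later hooks can still deny
    return outcome, ran
```

One consequence of this ordering is that an allow from an agent-specific hook never prevents a later user global hook from denying the same call.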
Model tiers
Prompt hooks use an AI model to evaluate agent behavior. You can select which model tier the hook uses, balancing evaluation quality against cost and latency.
| Tier | Best for | Trade-off |
|---|---|---|
| Reasoning | Complex policy enforcement — multi-step validation, nuanced compliance checks | Highest quality, higher cost and latency |
| Fast Reasoning (default) | Most hooks — response validation, audit checks, safety enforcement | Good reasoning with low latency |
| General Purpose | Simple format checks, basic compliance validation | Balanced accuracy, cost, and speed |
| Fast | Lightweight checks — presence validation, format verification | Lowest cost, fastest response |
| Long Context | Hooks that process large outputs — full document analysis, extensive tool results | Handles larger input, higher cost |
Hooks default to Fast Reasoning because they run on every agent response or tool call — low latency matters. Use Reasoning only for hooks that enforce complex policies where accuracy is critical.
Limits
| Limit | Value |
|---|---|
| Script size | 64 KB maximum |
| Timeout | 1–300 seconds |
| Max rejections (prompt Stop hooks) | 1–25 (default: 3) |
| Supported script shebangs | #!/bin/bash, #!/usr/bin/env python3 |
| Script execution environment | Sandboxed code interpreter |
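
The limits above, together with the configuration reference, can be checked locally before submitting a hook. The following is a sketch of such a pre-flight check, not the platform's validator: `validate_hook` is a hypothetical helper, "64 KB" is assumed to mean 65,536 bytes, and requiring a command hook to have either command or script is an assumption (the reference only states they are mutually exclusive).

```python
MAX_SCRIPT_BYTES = 64 * 1024  # assumes "64 KB" means 65,536 bytes


def validate_hook(hook: dict, event: str) -> list:
    """Return a list of problems; an empty list means no issue was found."""
    errors = []
    kind = hook.get("type", "prompt")
    if kind not in ("prompt", "command"):
        errors.append("type must be prompt or command")
    if kind == "prompt" and not hook.get("prompt"):
        errors.append("prompt hooks require a prompt")
    if kind == "command":
        has_command = bool(hook.get("command"))
        has_script = bool(hook.get("script"))
        if has_command and has_script:
            errors.append("command and script are mutually exclusive")
        if not has_command and not has_script:
            # Assumption: one of the two must be present.
            errors.append("command hooks need a command or a script")
        if len((hook.get("script") or "").encode("utf-8")) > MAX_SCRIPT_BYTES:
            errors.append("script exceeds 64 KB")
    if event in ("PreToolUse", "PostToolUse") and not hook.get("matcher"):
        errors.append("matcher is required for PreToolUse and PostToolUse hooks")
    if not 1 <= hook.get("timeout", 30) <= 300:
        errors.append("timeout must be between 1 and 300 seconds")
    if "maxRejections" in hook and not 1 <= hook["maxRejections"] <= 25:
        errors.append("maxRejections must be between 1 and 25")
    if hook.get("failMode", "allow") not in ("allow", "block"):
        errors.append("failMode must be allow or block")
    return errors
```

Running a check like this in CI catches mistakes such as a missing matcher, which would otherwise leave a tool hook that never fires.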
Example: Audit all tool usage
hooks:
  PostToolUse:
    - type: command
      matcher: "*"
      timeout: 30
      failMode: allow
      script: |
        #!/usr/bin/env python3
        import sys, json
        context = json.load(sys.stdin)
        tool_name = context.get('tool_name', 'unknown')
        print(f"Tool used: {tool_name}", file=sys.stderr)
        output = {
            "decision": "allow",
            "hookSpecificOutput": {
                "additionalContext": f"[AUDIT] Tool '{tool_name}' was executed."
            }
        }
        print(json.dumps(output))
The additionalContext field is injected as a user message into the conversation, giving the agent visibility into the audit trail.
Example: Require completion marker
hooks:
  Stop:
    - type: command
      timeout: 30
      failMode: allow
      script: |
        #!/bin/bash
        CONTEXT=$(cat)
        FINAL_OUTPUT=$(echo "$CONTEXT" | jq -r '.final_output // empty')
        if [[ "$FINAL_OUTPUT" == *"Task complete."* ]]; then
          exit 0
        else
          echo "Please end your response with 'Task complete.'" >&2
          exit 2
        fi
Best practices
- Always provide a reason when rejecting: Rejections without reasons are treated as approvals
- Use appropriate timeouts: Long-running hooks slow down agent execution
- Handle errors gracefully: Use failMode: allow for non-critical hooks (logging, enrichment). Use failMode: block for security-critical hooks (policy enforcement, destructive command blocking) so the guardrail stays active even if the hook script fails or times out
- Be specific with matchers: Overly broad PostToolUse matchers can cause performance issues
- Test hooks thoroughly: Hooks that always reject can cause loops (mitigated by maxRejections)
- Log to stderr: Use stderr for debugging output; stdout is parsed as the hook result
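To see why an always-rejecting Stop hook is bounded by maxRejections, the loop can be simulated locally. This is a simplified model, not the agent's actual control flow: `generate` stands in for the agent producing a response from optional feedback, `stop_hook` returns an (ok, reason) pair, and whether the hook is evaluated once more after the limit is reached is an assumption.

```python
def run_until_accepted(generate, stop_hook, max_rejections=3):
    """Return (final_output, rejection_count) for a simulated Stop-hook loop."""
    rejections = 0
    feedback = None
    while True:
        output = generate(feedback)
        ok, reason = stop_hook(output)
        if ok or not reason:
            return output, rejections  # approved (or rejected without a reason)
        if rejections >= max_rejections:
            return output, rejections  # limit reached: the agent is forced to stop
        rejections += 1
        feedback = reason              # the reason is fed back to the agent
```

With a hook that requires the "Task complete." marker and an agent that adds it after feedback, the loop ends after one rejection; with a hook that never accepts, it ends after max_rejections rejections.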
Get started
Here's what a Stop hook looks like in action: the agent initially responds with just "4", but the hook rejects the response because the completion marker is missing. The agent then continues and adds the marker.
| Resource | What you'll learn |
|---|---|
| Create and Manage Hooks (Portal) → | Create hooks visually in the portal UI — no API calls needed |
| Configure Agent Hooks (API) → | Set up hooks using the REST API v2 and YAML |
Related capabilities
| Capability | How it relates |
|---|---|
| Run Modes → | Hooks complement run mode safety controls — modes control what, hooks control how well |
| Tool Access Policies → | PreToolUse hooks evaluate before tool access policies; an explicit permissionDecision of allow overrides policy rules |
| Python Tools → | Create custom tools that hooks can audit and validate |