WWT Agentic Architecture, Governance & Microsoft Tooling
Executive Summary
World Wide Technology (WWT) is working with customers on agentic architecture, and this post highlights one deployment pattern: a fleet of up to 50 Microsoft Copilot agents across Sales, Finance, IT & Security, and HR & Legal. It describes the target architecture, orchestration model, memory strategy, governance framework, and the specific Microsoft tooling that underpins each layer.
The architecture follows a hub-and-spoke pattern: a single Meta-Orchestrator classifies intent and dispatches to four Cluster Coordinators, each managing a group of specialized leaf agents. The registry in Microsoft Dataverse serves as the single source of truth for every agent in the fleet — routing decisions, governance triggers, and observability all read from it.
At a glance
| Agents | Clusters | Action-capable | SLA target |
| --- | --- | --- | --- |
| ~50 total | 4 (Sales, Finance, IT/Sec, HR) | ≤ 20% of fleet | < 3 s p95 latency |
Architecture overview
The platform is organized into five layers. Traffic enters through user-facing channels, passes through the Meta-Orchestrator, fans out to Cluster Coordinators, reaches leaf agents, and terminates at shared back-end services.
Layer 1 — User Channels
Agents are surfaced in Microsoft Teams (personal and shared channels), SharePoint Embedded, custom web chat widgets, and direct REST API calls. Azure API Management (APIM) is the single ingress point for all API-based traffic.
What this does: APIM enforces rate limits, injects authentication headers, and exposes the circuit-breaker policy that automatically backs off failed agents.
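To make the circuit-breaker idea concrete, here is a minimal Python sketch of the logic. It is not APIM policy syntax. The class name and window size are assumptions made for illustration; the 20% trip threshold on 5xx responses is the figure this post gives for the APIM policy in the tooling map.

```python
from collections import deque


class CircuitBreaker:
    """Opens when the share of 5xx responses in a sliding window exceeds a ratio."""

    def __init__(self, window=20, trip_ratio=0.20):
        # Each entry is True for a 5xx response, False otherwise.
        self.results = deque(maxlen=window)
        self.trip_ratio = trip_ratio

    def record(self, status_code):
        self.results.append(status_code >= 500)

    @property
    def is_open(self):
        if not self.results:
            return False
        return sum(self.results) / len(self.results) > self.trip_ratio


breaker = CircuitBreaker(window=10)
for code in [200, 200, 500, 200, 503, 200, 502, 200, 200, 200]:
    breaker.record(code)
print(breaker.is_open)  # 3 failures in 10 calls is 30%, above the 20% threshold
```

While the breaker is open, callers would skip the failing agent and fall back to the next routing candidate.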
Layer 2 — Meta-Orchestrator Agent
A Copilot Studio agent performs two-phase intent classification. Phase 1 uses a lightweight classifier (BM25 + semantic re-ranking via Azure AI Search) to narrow intent to one of four clusters. Phase 2 uses confidence scores within that cluster to select the best leaf agent, or escalates to a human if no agent crosses the 0.80 threshold.
What this does: Keeps routing logic out of individual agents. If a leaf agent is deprecated, only the registry entry changes — the orchestrator auto-discovers the replacement.
Layer 3 — Cluster Coordinators
Each cluster has a dedicated coordinator agent that holds cluster-level context (e.g., CRM account hierarchy for Sales, chart of accounts for Finance). Coordinators handle conflict resolution when two leaf agents return contradictory answers within the same cluster.
Layer 4 — Leaf Agents (~50 total)
| Cluster | Agents | Action-capable | Representative agents |
| --- | --- | --- | --- |
| Sales | ~12 | ~3 | CRM query, pipeline forecast, customer 360 |
| IT & Security | ~14 | ~3 | Helpdesk triage, patch status, threat intel |
| Finance | ~12 | ~2 | Invoice agent, budget variance, AP/AR |
| HR & Legal | ~12 | ~2 | Policy lookup, onboarding guide, contract review |
Presenter note: The 20% action-capable cap is a governance guardrail, not a technical limit. Agents that write data require a dedicated Entra App Registration and a mandatory human-in-the-loop step for high-stakes operations.
Layer 5 — Shared Services
All agents share a common back-end: Microsoft Graph (user/email/calendar data), Dataverse (registry and long-term memory), Azure AI Search (intent index), Application Insights (telemetry), Azure Monitor (SLA alerts), and Microsoft Sentinel (security events).
What this does: Agents never call external APIs directly. Every external dependency is wrapped as an MCP connector so tool signatures are typed, audited, and swappable.
Orchestration Logic
The Meta-Orchestrator runs a two-phase dispatch loop on every user turn.
Phase 1 — Intent Classification
The raw utterance is vectorized and matched against the intent-utterances index in Azure AI Search. The index contains every intent_pattern.example_utterances entry from the agent registry, with fields for agent_id, cluster_id, intent_label, and confidence_threshold.
* A BM25 keyword pass short-lists candidate agents.
* Semantic re-ranking (Azure AI Foundry embedding model) scores each candidate.
* If the top score is below 0.60, the orchestrator responds with a clarifying question rather than guessing.
What this does: Separates the routing model from individual agent prompts. Adding a new agent automatically extends coverage — you register the agent and its utterances appear in the index within minutes.
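The Phase 1 decision can be sketched in a few lines of Python. This is an illustration only: it does not call Azure AI Search, the Candidate type and classify function are hypothetical names, and it assumes the re-ranking scores have already been normalised to a 0 to 1 range so they can be compared with the 0.60 cut-off above.

```python
from dataclasses import dataclass
from typing import Optional

CLARIFY_BELOW = 0.60  # from the text: below this, ask a clarifying question


@dataclass
class Candidate:
    agent_id: str
    cluster_id: str
    intent_label: str
    score: float  # re-ranking score, assumed normalised to 0..1


def classify(candidates: list[Candidate]) -> tuple[str, Optional[Candidate]]:
    """Return ("clarify", None) or ("route", best_candidate)."""
    if not candidates:
        return ("clarify", None)
    best = max(candidates, key=lambda c: c.score)
    if best.score < CLARIFY_BELOW:
        return ("clarify", None)
    return ("route", best)
```

A "route" result only means the utterance is clear enough to hand to a cluster; the stricter 0.80 check happens in Phase 2.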
Phase 2 — Cluster Dispatch & Escalation
The orchestrator passes the top candidate and the session context to the appropriate Cluster Coordinator. The coordinator may fan out to multiple leaf agents in parallel (up to 3) when the intent is ambiguous within a cluster. Results are merged with a recency + authority weighting.
Escalation rules (in priority order):
* Confidence < 0.80 after re-ranking → human handoff queue.
* Agent status is degraded or suspended → skip, try next candidate.
* Latency SLA (3 s p95) exceeded → return partial response + async follow-up.
* Conflicting answers from ≥ 2 leaf agents → coordinator adjudicates; logs conflict event to App Insights.
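The priority order of the escalation rules maps naturally onto a chain of early returns. The sketch below is a simplification under stated assumptions: the function and return labels are hypothetical, and the latency rule is checked per request, whereas the SLA itself is defined as a p95 over many requests.

```python
def escalation_action(confidence: float, agent_status: str,
                      elapsed_ms: int, distinct_answers: int) -> str:
    """Return the first escalation rule that applies, in priority order."""
    if confidence < 0.80:
        return "human_handoff"
    if agent_status in {"degraded", "suspended"}:
        return "try_next_candidate"
    if elapsed_ms > 3000:
        return "partial_response_with_async_followup"
    if distinct_answers >= 2:
        return "coordinator_adjudicates"
    return "return_answer"


print(escalation_action(0.91, "active", 1200, 1))
```

Because the checks run top to bottom, a low-confidence result always goes to the human queue even if the agent is also degraded or slow.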
Prompt Injection Defense
Because agents retrieve content from uncontrolled sources (email, SharePoint, external APIs), the orchestrator applies a sanitization step before forwarding retrieved content to leaf agents. Instruction-like fragments in retrieved content are wrapped in a trust boundary marker, and the leaf agent system prompt explicitly instructs the model to treat that zone as data, not instructions.
Presenter note: This is a critical talking point for security stakeholders. The platform cannot eliminate prompt injection risk entirely, but it dramatically reduces blast radius by ensuring no retrieved content is ever treated as a trusted instruction source.
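As a rough illustration of the trust boundary, the Python sketch below wraps retrieved content in marker lines and flags instruction-like text. The marker strings and the pattern list are hypothetical; this post does not specify either. The sketch also wraps the whole retrieved passage rather than individual fragments, which is a simplification.

```python
import re

# Hypothetical markers; the real marker format is not specified in this post.
OPEN, CLOSE = "<<UNTRUSTED_DATA>>", "<<END_UNTRUSTED_DATA>>"

# A deliberately tiny, illustrative pattern list. A real filter would need far more.
SUSPICIOUS = re.compile(
    r"(ignore (all )?(previous|prior) instructions|you are now|system prompt)",
    re.IGNORECASE,
)


def wrap_retrieved(content: str) -> str:
    """Wrap retrieved text so the leaf agent can treat it as data only."""
    # Remove marker look-alikes so retrieved text cannot close the zone early.
    cleaned = content.replace(OPEN, "").replace(CLOSE, "")
    return f"{OPEN}\n{cleaned}\n{CLOSE}"


def looks_like_instruction(content: str) -> bool:
    """Flag content for logging; wrapping happens regardless of this result."""
    return bool(SUSPICIOUS.search(content))
```

The leaf agent's system prompt would then state that anything between the two markers is data and must never be followed as an instruction.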
Memory Model
The platform uses three distinct memory tiers. Each tier has a different scope, persistence mechanism, and access control boundary.
| Tier | Scope | Storage | Details |
| --- | --- | --- | --- |
| 1: Organizational | All agents, all users | Dataverse + AI Search | Read-only product catalogue, policy documents, org chart. Refreshed nightly via Power Automate. Agents surface this as grounding context. |
| 2: Session | Single conversation | Copilot Studio topic variables | User identity, prior turns, active intent, partial draft. Scoped to the session and never shared across users. Flushed on session end. |
| 3: Long-term user | Per-user, cross-session | Dataverse (row-level security) | Approved preferences, saved searches, prior case references. Explicit opt-in required. Accessible only to agents in the user's assigned cluster. |
Shared vs. Isolated Memory
Organizational memory (Tier 1) is read-only and shared across the entire fleet. Session memory (Tier 2) is strictly isolated per conversation. Long-term user memory (Tier 3) is per-user and accessible only within the cluster the user belongs to — an HR agent cannot read a Sales agent's memory for the same user.
What this does: Row-level security in Dataverse ensures memory isolation is enforced at the data layer, not just in the agent prompt. Even if a prompt is compromised, the agent's access token cannot read another user's records.
Conflict Resolution
When two leaf agents return different facts (e.g. two Finance agents disagree on a budget figure), the Cluster Coordinator uses a simple authority stack: data retrieved directly from a system of record (Graph, Dataverse) outranks data retrieved from a SharePoint document, which outranks data retrieved from email. Ties are broken by recency.
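The authority stack reduces to a sort key: source rank first, recency second. In this Python sketch the Answer type, the source labels, and the use of a retrieval timestamp as the measure of recency are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Higher number wins: system of record > SharePoint document > email.
AUTHORITY = {"system_of_record": 3, "sharepoint": 2, "email": 1}


@dataclass
class Answer:
    agent_id: str
    value: str
    source: str  # one of the AUTHORITY keys
    retrieved_at: datetime


def adjudicate(answers: list[Answer]) -> Answer:
    """Pick the answer with the highest authority, breaking ties by recency."""
    return max(answers, key=lambda a: (AUTHORITY[a.source], a.retrieved_at))
```

For example, a budget figure read from Dataverse last week would still outrank a different figure found in an email sent this morning.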
Governance Framework
Governance is treated as a catalog problem first and a security problem second. An agent that cannot be found, described, or attributed cannot be governed. The registry solves discovery; Entra ID, Purview, and Power Automate handle enforcement.
Agent Lifecycle States
| Status | Meaning |
| --- | --- |
| draft | Agent is registered but not yet serving production traffic. Can be tested in Copilot Studio sandbox. |
| active | Serving production traffic. Owner has signed off. Review cadence clock is running. |
| degraded | Automatically set when a latency or token SLA breach is detected. Agent continues to serve but is deprioritised during routing. |
| suspended | Automatically set on a security incident. Agent is immediately removed from routing. Owner and AI Platform team are notified. |
| deprecated | Terminal state: the agent ID is permanently retired. A replacement must use a new agent_id. |
Presenter note: "deprecated is terminal" is a deliberate choice. It prevents accidental reactivation of retired agents and gives auditors a clean history.
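One way to enforce the terminal state is a transition table that gives deprecated no outgoing edges. In the Python sketch below, only that rule comes from this post; every other edge in the table is an assumption about which moves a team might allow.

```python
# Allowed status transitions. Only "deprecated is terminal" is taken from the
# text; the remaining edges are illustrative assumptions.
ALLOWED = {
    "draft": {"active", "deprecated"},
    "active": {"degraded", "suspended", "deprecated"},
    "degraded": {"active", "suspended", "deprecated"},
    "suspended": {"active", "deprecated"},
    "deprecated": set(),  # terminal: no way back
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not permitted."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


print(transition("active", "degraded"))
```

Any attempt to move a deprecated agent back to active raises an error instead of silently reactivating it.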
Review Cadence
Every agent has a review_cadence_days field in the registry (default: 90 days). Power Automate Flow 1 queries Dataverse daily and sends an Adaptive Card in Teams to the agent owner when a review is overdue. If no action is taken within 7 days, the agent is automatically degraded and the AI Platform team is cc'd.
Required Registry Fields
* owner_upn — a named individual, never a team alias or shared mailbox.
* service_principal_id — a dedicated Entra App Registration per agent.
* deprecation_date — maximum 2 years from registration.
* data_scopes — at least one, with the approved_by field populated.
* intent_patterns — at least one, with ≥ 3 example utterances.
What this does: These five fields together guarantee that every agent has an accountable owner, a bounded data footprint, and a known expiry. CI validation blocks any registry PR that is missing them.
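A CI check for the five fields could look roughly like the Python sketch below. The function name and error strings are hypothetical, "2 years" is approximated as 730 days, and the rule that owner_upn must be a person rather than a team alias cannot be verified from the string alone, so the sketch only checks that a UPN-shaped value is present.

```python
from datetime import date, timedelta

MAX_LIFETIME = timedelta(days=730)  # "2 years", approximated as 730 days


def validate_record(rec: dict, registered_on: date) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = []
    if "@" not in rec.get("owner_upn", ""):
        errors.append("owner_upn: must be an individual's UPN")
    if not rec.get("service_principal_id"):
        errors.append("service_principal_id: required")
    dep = rec.get("deprecation_date")
    if not dep or date.fromisoformat(dep) > registered_on + MAX_LIFETIME:
        errors.append("deprecation_date: required, at most 2 years out")
    scopes = rec.get("data_scopes") or []
    if not scopes or any(not s.get("approved_by") for s in scopes):
        errors.append("data_scopes: need at least one, each with approved_by")
    intents = rec.get("intent_patterns") or []
    if not intents or any(len(i.get("example_utterances", [])) < 3 for i in intents):
        errors.append("intent_patterns: need at least one, each with >= 3 utterances")
    return errors
```

A pipeline step would fail the PR whenever the returned list is non-empty and print the messages for the author.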
Confidence & Action Thresholds
Two thresholds govern agent behaviour. The confidence_threshold (minimum 0.80 for action-capable agents) determines whether an agent acts autonomously. The action_requires_confirmation flag on individual intents requires human approval before executing write operations above a configurable risk level.
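How the two thresholds interact can be sketched as follows. The function name, the return labels, and the 0 to 1 risk scale are hypothetical; this post only states that a confidence threshold gates autonomy and that a confirmation flag applies above a configurable risk level.

```python
def decide_action(confidence: float, confidence_threshold: float,
                  action_requires_confirmation: bool,
                  risk: float, risk_limit: float) -> str:
    """Decide whether a write operation runs, waits for a human, or escalates."""
    if confidence < confidence_threshold:
        return "escalate"  # not confident enough to act at all
    if action_requires_confirmation and risk > risk_limit:
        return "await_human_approval"
    return "execute"


print(decide_action(0.92, 0.80, True, risk=0.7, risk_limit=0.5))
```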
Registry Data Model
The registry is a set of JSON documents stored in source control (this repo) and mirrored to Dataverse and Azure AI Search by the register-agent.py script. Dataverse is the runtime source of truth; source control is the change log.
AgentRecord — Core Entity
| Field | Type | Purpose |
| --- | --- | --- |
| agent_id | string (UUID) | Immutable primary key. Deprecated agents never reuse IDs. |
| display_name | string | Human-readable name shown in Teams cards and governance dashboards. |
| cluster_id | enum | One of sales, it-security, finance, hr-legal. Determines routing and memory scope. |
| owner_upn | email | Named individual accountable for the agent. Required; no team aliases. |
| service_principal_id | GUID | Entra App Registration. Scopes define what the agent can call. |
| capability_mode | enum | One of read-only, action-capable. Action-capable agents require AI Platform team PR approval. |
| status | enum | One of draft, active, degraded, suspended, deprecated. |
| sla_response_ms | number | p95 latency target in milliseconds. Default: 3000 ms. |
| token_sla_per_session | number | Maximum tokens consumed per session. Azure Monitor fires an alert on breach. |
| deprecation_date | ISO date | Maximum 2 years from registration. Deprecation Sweep flow triggers warnings at 30 and 7 days out. |
| review_cadence_days | number | Days between mandatory owner reviews. Default: 90. |
| data_scopes[] | array | Each entry includes scope_name, classification, approved_by, and approved_at. |
| intent_patterns[] | array | Each entry includes intent_label, confidence_threshold, example_utterances[], and optional escalation_agent_id. |
What this does: The schema is enforced by JSON Schema validation in CI and by register-agent.py before any write to Dataverse. Invalid records are rejected before they can affect production routing.
Microsoft Tooling Map
Every layer of the platform maps to a specific Microsoft product. The table below shows which tool owns which responsibility and what you configure there.
| Layer / Concern | Microsoft Tool | What you configure / own here |
| --- | --- | --- |
| Agent authoring | Copilot Studio | Agent YAML definitions, topics, trigger phrases, and actions. Import from this repo's /agents folder. |
| Foundational model | Azure AI Foundry | GPT-4o deployment, embedding model for semantic search, prompt flow for batch evaluation. |
| Identity & permissions | Microsoft Entra ID / PIM | One App Registration per agent. PIM provides just-in-time elevation for action-capable agents. |
| Data access | Microsoft Graph | User profiles, mail search, calendar, Teams channels. Accessed via MCP graph-connector. |
| Registry & memory | Dataverse | AgentRecord entity (registry), long-term user memory table, session audit log. |
| Data governance | Microsoft Purview | Sensitivity labels on data_scopes. Agents cannot access MIP-protected content above their clearance. |
| API gateway | Azure API Management | Rate limiting, circuit-breaker policy (auto-trip on 5xx rate > 20%), mTLS for agent-to-agent calls. |
| Intent routing | Azure AI Search | intent-utterances index. Populated by register-agent.py. Semantic ranking requires Standard tier. |
| Observability | Application Insights | Custom AuditEvent telemetry: agent_id, session_id, prompt_tokens, completion_tokens, latency_ms. |
| SLA alerting | Azure Monitor | Alert rules for p95 latency breach, token SLA breach, anomaly detection on error rate. |
| Security events | Microsoft Sentinel | SIEM ingestion of AuditEvent stream. Analytics rules fire Power Automate Flow 2 on security incidents. |
| Governance automation | Power Automate | 3 flows: review reminder, auto-degradation on SLA breach, deprecation sweep. |
| Governance UI | Power Apps | Optional: model-driven app over the AgentRecord entity for non-technical governance stakeholders. |
| MCP connectors | Azure Container Apps | Hosts MCP servers for Graph and Dataverse. Agents call tools, never raw APIs. |
| CI / CD | GitHub Actions | validate-registry.yml on every PR; deploy-infra.yml pushes Bicep on merge to main. |
Power Automate Governance Flows
Three Power Automate flows keep the registry accurate and ensure agents are held accountable. All flows read from and write to the Dataverse AgentRecord entity.
Flow 1 — Review Reminder
| Step | Detail |
| --- | --- |
| Trigger | Recurrence: runs daily at 06:00 UTC. |
| Query | Dataverse: list all active agents where last_reviewed_at is more than review_cadence_days ago. |
| Condition | For each overdue agent: calculate days_overdue. Set urgency style (default / warning / attention) based on 0–14 / 15–21 / 22+ days overdue. |
| Notify | Post Adaptive Card to the owner's Teams chat. Card shows agent name, cluster, data scopes, deprecation date, and days overdue. Three action buttons: Mark Reviewed, Request Extension, Deprecate Now. |
| Timeout | 7-day wait step. If owner takes no action: auto-degrade the agent in Dataverse and notify the AI Platform team. |
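The Query and Condition steps boil down to a date subtraction and a three-way bucket. The Python sketch below mirrors the thresholds in the table; the function names are hypothetical, and in the real flow this logic would live in Power Automate expressions rather than Python.

```python
from datetime import date


def days_overdue(last_reviewed_at: date, review_cadence_days: int, today: date) -> int:
    """Days past the review deadline; zero or negative means not overdue."""
    return (today - last_reviewed_at).days - review_cadence_days


def urgency_style(overdue: int) -> str:
    """Map days overdue to the card style: 0-14, 15-21, 22+."""
    if overdue >= 22:
        return "attention"
    if overdue >= 15:
        return "warning"
    return "default"


overdue = days_overdue(date(2024, 1, 1), 90, date(2024, 4, 20))
print(overdue, urgency_style(overdue))
```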
Flow 2 — Auto-Degradation
| Alert type | Action |
| --- | --- |
| latency_breach | Set status = degraded. Post warning card to owner. Agent continues serving but is deprioritised in routing. |
| anomaly_flag | Set status = degraded. Require owner to acknowledge within 24 hours or trigger suspension. |
| security_incident | Set status = suspended immediately. Remove agent from all routing. Notify owner + AI Platform team + Security team. Create incident ticket. |
Trigger: HTTP webhook called by an Azure Monitor alert rule. The alert payload includes alert_type, agent_id, and metric_value.
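The branching in Flow 2 is essentially a lookup from alert_type to a set of actions. The Python sketch below models that lookup. The payload keys alert_type and agent_id come from the trigger description above; the PLAYBOOK structure, its key names, and plan_response are hypothetical.

```python
# Planned response per alert type, mirroring the Flow 2 actions.
PLAYBOOK = {
    "latency_breach": {"status": "degraded", "notify": ["owner"]},
    "anomaly_flag": {"status": "degraded", "notify": ["owner"], "ack_within_hours": 24},
    "security_incident": {
        "status": "suspended",
        "notify": ["owner", "ai_platform_team", "security_team"],
        "remove_from_routing": True,
        "create_ticket": True,
    },
}


def plan_response(payload: dict) -> dict:
    """Turn an alert payload into the registry update and notifications to send."""
    alert_type = payload["alert_type"]
    if alert_type not in PLAYBOOK:
        raise ValueError(f"unknown alert_type: {alert_type}")
    return {"agent_id": payload["agent_id"], **PLAYBOOK[alert_type]}


print(plan_response({"alert_type": "latency_breach", "agent_id": "a-1", "metric_value": 4200}))
```

Rejecting unknown alert types keeps a malformed webhook call from silently changing an agent's status.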
Flow 3 — Deprecation Sweep
| Stage | Action |
| --- | --- |
| T-30 days | Send deprecation warning to owner. Include replacement agent suggestion if one is registered. |
| T-7 days | Final warning. Escalate to cluster lead if owner is unresponsive. |
| T-0 (deprecation day) | Set status = deprecated. Remove all intent patterns from AI Search index. Archive Dataverse record. Send shutdown confirmation. |
Trigger: Weekly schedule plus on-demand invocation from the Deprecate Now button in the review reminder card.
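Stage selection for the sweep can be sketched as a comparison on days remaining. The function below is hypothetical. It uses ranges rather than exact day matches, an assumption made so that a weekly run cannot skip over a stage; a real flow would also need to record which warnings were already sent to avoid repeats.

```python
from datetime import date
from typing import Optional


def sweep_stage(deprecation_date: date, today: date) -> Optional[str]:
    """Return the sweep stage for an agent, or None if no action is due yet."""
    days_left = (deprecation_date - today).days
    if days_left <= 0:
        return "deprecate"
    if days_left <= 7:
        return "final_warning"
    if days_left <= 30:
        return "warning"
    return None


print(sweep_stage(date(2025, 6, 30), date(2025, 6, 5)))
```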
Token SLA & Observability
Every agent invocation logs a custom AuditEvent to Application Insights containing: agent_id, session_id, user_upn (hashed), intent_label, latency_ms, prompt_tokens, completion_tokens, and total_tokens.
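The shape of an AuditEvent payload can be sketched as a plain dictionary. The field names follow the list above. The builder function is hypothetical, and the choice of an unsalted SHA-256 over the lowercased UPN is an assumption, since this post does not say how the UPN is hashed; a keyed hash may be preferable in practice. Sending the event to Application Insights is left out.

```python
import hashlib


def build_audit_event(agent_id: str, session_id: str, user_upn: str,
                      intent_label: str, latency_ms: int,
                      prompt_tokens: int, completion_tokens: int) -> dict:
    """Assemble the custom dimensions for one AuditEvent."""
    # Assumption: SHA-256 of the lowercased UPN, so the raw address is never logged.
    hashed_upn = hashlib.sha256(user_upn.lower().encode("utf-8")).hexdigest()
    return {
        "agent_id": agent_id,
        "session_id": session_id,
        "user_upn": hashed_upn,
        "intent_label": intent_label,
        "latency_ms": latency_ms,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```

Lowercasing before hashing keeps the same user from appearing under two different hashes when the UPN casing varies.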
SLA Monitoring
| Alert | Condition |
| --- | --- |
| Latency SLA breach | p95 latency_ms > agent.sla_response_ms over a 5-minute rolling window. |
| Token SLA breach | Session cumulative total_tokens > agent.token_sla_per_session. |
| Error rate anomaly | Dynamic threshold detection on 5xx rate over a 1-hour window. |
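To show what the first two conditions compute, here is a small Python sketch. It uses the nearest-rank method for p95, which is one of several common percentile definitions and may not match the estimate Azure Monitor produces. The function names are hypothetical, and the input is assumed to be the latencies already collected for one 5-minute window.

```python
def p95(values: list[float]) -> float:
    """95th percentile by the nearest-rank method."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def latency_breached(window_latencies_ms: list[float], sla_response_ms: int = 3000) -> bool:
    """True when the window's p95 latency exceeds the agent's target."""
    return p95(window_latencies_ms) > sla_response_ms


def token_breached(session_total_tokens: int, token_sla_per_session: int) -> bool:
    """True when a session's cumulative tokens exceed the agent's budget."""
    return session_total_tokens > token_sla_per_session
```

With 20 samples, a single slow outlier does not trip the latency alert under this definition, but two do.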
Key KQL Queries
Paste these into Application Insights → Logs to get immediate visibility:
Token usage by agent (last 7 days):
    customEvents
    | where name == "AuditEvent"
    | where timestamp > ago(7d)
    | extend agent_id = tostring(customDimensions.agent_id)
    | extend total_tokens = toint(customDimensions.total_tokens)
    | summarize total = sum(total_tokens), avg = avg(total_tokens) by agent_id, bin(timestamp, 1d)
    | order by total desc
Sessions approaching token SLA:
    customEvents
    | where name == "AuditEvent"
    | extend session_id = tostring(customDimensions.session_id)
    | extend agent_id = tostring(customDimensions.agent_id)
    | extend total_tokens = toint(customDimensions.total_tokens)
    | summarize session_tokens = sum(total_tokens) by session_id, agent_id
    | where session_tokens > 5000
    | order by session_tokens desc
Implementation Roadmap
The recommended sequencing delivers governance infrastructure before agents, so that every agent deployed from day one is governed from day one.
| Phase | Timeline | Deliverables |
| --- | --- | --- |
| 1 | Weeks 1–3 | Deploy Azure infrastructure (Bicep). Set up Dataverse schema. Configure APIM. CI/CD pipelines live. |
| 2 | Weeks 4–6 | Register orchestrator + 4 cluster coordinators. Import Power Automate governance flows. Deploy MCP connectors for Graph and Dataverse. |
| 3 | Weeks 7–12 | Pilot cluster (IT & Security recommended for its high volume and well-defined intents). Register ~14 leaf agents. Establish token and latency baselines. |
| 4 | Months 4–6 | Roll out remaining 3 clusters. Enable long-term user memory (per-cluster opt-in). Onboard governance stakeholders to Power Apps dashboard. |
| 5 | Ongoing | Quarterly registry audits. Annual deprecation sweep review. Continuous model evaluation via Azure AI Foundry prompt flow. |