Claude's Zero Trust Framework for AI Agents: The Three-Tier Maturity Model Every Australian Enterprise Should Know

Anthropic shipped a Zero Trust framework for AI agents on 27 May 2026, and the timing matters for Australian enterprises that are already in pilot. The frontier-model release cadence has compressed the window between vulnerability disclosure and active exploitation from months to hours. For an APRA-regulated business deploying Claude in a customer-facing agent, that compression hits twice: the underlying infrastructure is exposed to AI-accelerated offence, and the agents themselves introduce a new autonomy surface that traditional access controls were never designed for.

The Zero Trust framework is the first cohesive answer from a frontier lab on how to operate agents safely at scale. It is built around a three-tier maturity model: Foundation, Advanced, and Optimized. Each tier is a coherent set of controls that a security team can implement, audit, and defend in a review against APRA CPS 230, the Privacy Act 1988, and (for AUSTRAC reporting entities) the AML/CTF Act.

This post walks through what each tier covers, where Australian enterprises typically sit today, and the order in which to actually implement the controls.

Why traditional access controls fail for agents

The classic security model assumes a human is at the keyboard. A user logs in, is authenticated against an identity provider, and receives a session token that scopes what they can do. When the user finishes, the session ends.

Agents break every part of that model. A Claude-based agent operating against a banking back office runs continuously, holds context across hundreds of decisions, and selects its own tools to achieve a goal a human has only loosely specified. The cryptographic identity that authorised the agent at session start has, by step 200, almost no relationship to the action being taken. A reviewer cannot tell whether action 200 was a sensible next step or a prompt-injected detour into a corner of the business the agent was never meant to touch.

Anthropic frames this as three new attack surfaces unique to agentic systems:

Prompt injection: an attacker plants instructions in a document, an email, or a webpage the agent will read, and the agent acts on them as if they came from the operator.
Tool poisoning: an attacker compromises an MCP server or third-party API the agent depends on, then uses that channel to coerce the agent into misusing its existing permissions.
Memory poisoning: an attacker contaminates the agent's persistent context, so future sessions inherit the malicious state and behave incorrectly without any new attack surface.

All three surfaces share one property: legitimate authentication does not prevent them. The agent is who it says it is. The infrastructure is doing what it was told to do. The damage happens inside the policy boundary, which is exactly the failure mode that CPS 230, finalised in 2023 and in force since 1 July 2025, was written to address: operational risk that arises from how a critical operation is executed, not from who is authorised to execute it.

The three tiers, mapped to Australian reality

Foundation

Foundation is the minimum viable control set. If you are running a Claude agent in production without these, you cannot defensibly tell an APRA prudential review what the agent did last Thursday. The tier has four controls:

A cryptographically rooted agent identity that is distinct from any human identity, with no service accounts borrowed from the data warehouse.
Per-task permission scoping: the agent gets the minimum tool surface needed for the current goal, not the union of every tool it might ever need.
A complete, append-only audit log of every tool invocation, prompt, and response, retained for the period your CPS 234 information security policy requires (most Australian banks default to seven years).
A break-glass kill switch the operations team can pull without involving the model vendor.
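
One way to make an audit log tamper-evident is to chain each entry to the hash of the one before it. The sketch below is an illustration only, not something the framework prescribes: the `AuditLog` class, its field names, and the in-memory list are all assumptions, and a production log would sit on write-once storage.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class AuditLog:
    """Hypothetical in-memory, hash-chained log of agent tool calls."""
    entries: list = field(default_factory=list)

    def append(self, agent_id: str, tool: str, prompt: str, response: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        record = {
            "seq": len(self.entries),
            "agent_id": agent_id,
            "tool": tool,
            "prompt": prompt,
            "response": response,
            "prev_hash": prev_hash,
        }
        # The hash covers every field above, including the link to the previous entry.
        record["hash"] = _digest(record)
        self.entries.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for record in self.entries:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev_hash"] != prev_hash or record["hash"] != _digest(body):
                return False
            prev_hash = record["hash"]
        return True
```

Calling `verify()` after every batch of writes, or before handing the log to a reviewer, shows whether an earlier entry has been altered. The chain detects tampering; it does not prevent it, which is why the storage layer still matters.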

A typical mid-market Australian deployment, say a 200-staff superannuation fund running a Claude agent for member correspondence, can stand up Foundation in four to six weeks if the identity provider work is already done. The work usually costs $45,000 to $80,000 of consulting plus an internal sponsor's time.

Advanced

Advanced adds the controls that survive contact with a determined adversary. This is the tier most Australian enterprises should be planning toward over the next two quarters. It adds five controls:

Sandboxed tool execution: the agent's writes go to an isolated environment that is reviewed before promotion to production data.
Structured input validation on every external document the agent ingests, with a classifier that flags likely prompt-injection patterns before the model sees them.
Output validation that compares the agent's proposed action against the policy library before the action is executed.
Tool-level rate limits that are scoped per task, not per session, so a poisoned tool cannot drain a quota across an entire day.
Continuous adversarial testing of the agent against a maintained library of injection payloads, with regressions blocking deployment.
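
As a rough illustration of output validation, the sketch below runs a proposed action through a small policy library before anything executes. Every name here is hypothetical (`ProposedAction`, the two rules, the 1,000 AUD cap, the `example.com.au` domain); a real policy library would be derived from the organisation's own obligations.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ProposedAction:
    """What the agent wants to do, captured before execution."""
    tool: str
    amount_aud: float = 0.0
    recipient_domain: str = ""


# A rule returns a violation message, or None when the action passes.
Rule = Callable[[ProposedAction], Optional[str]]


def payment_cap(action: ProposedAction) -> Optional[str]:
    # Assumed limit, for illustration only.
    if action.tool == "payments.transfer" and action.amount_aud > 1_000:
        return "transfer exceeds the 1,000 AUD cap"
    return None


def approved_email_domain(action: ProposedAction) -> Optional[str]:
    # Assumed approved domain, for illustration only.
    if action.tool == "email.send" and action.recipient_domain != "example.com.au":
        return "email recipient is outside the approved domain"
    return None


POLICY_LIBRARY: list[Rule] = [payment_cap, approved_email_domain]


def validate(action: ProposedAction, rules: list[Rule] = POLICY_LIBRARY) -> list[str]:
    """Return every violation; an empty list means the action may proceed."""
    violations = []
    for rule in rules:
        message = rule(action)
        if message is not None:
            violations.append(message)
    return violations
```

The caller executes the action only when `validate` returns an empty list, and writes any violations to the audit log. Returning all violations rather than stopping at the first makes triage easier when a single action breaks several rules.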

For an AUSTRAC reporting entity that uses Claude to draft Suspicious Matter Reports, Advanced is the threshold where the agent's outputs can be relied on without a second human reviewing every draft. The cost is roughly $120,000 to $250,000 for the initial implementation, then 15 to 20 percent of that each year as maintenance to keep the test library current.

Optimized

Optimized is the frontier. Almost no Australian enterprise sits here today, and most should not target it until they have eighteen months of Advanced-tier operating history. It adds four controls:

Cryptographically attested memory: the agent's persistent context is signed and verified on read, so an attacker who reaches the memory store cannot silently alter it.
Agentic SOAR: a security automation layer that watches the agent's behaviour in real time and quarantines sessions that drift from established baselines, at machine speed rather than human-review speed.
Multi-agent containment: when the primary agent calls a secondary, the call crosses a trust boundary with its own authentication and audit.
Compliance-as-code: the CPS 230, Privacy Act, and AML/CTF obligations are encoded as machine-checkable rules the agent's output validator runs against.
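
The verify-on-read idea behind attested memory can be sketched with a keyed hash. This example swaps in HMAC-SHA256 from the Python standard library, which is a symmetric message authentication code rather than a true digital signature; the function names and record layout are assumptions for illustration, and key management is out of scope.

```python
import hashlib
import hmac
import json


class MemoryTamperedError(Exception):
    """Raised when stored memory does not match its authentication tag."""


def seal(memory: dict, key: bytes) -> dict:
    """Serialise the memory and attach an HMAC-SHA256 tag."""
    payload = json.dumps(memory, sort_keys=True)
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def open_sealed(record: dict, key: bytes) -> dict:
    """Verify the tag before any stored content is handed back to the agent."""
    expected = hmac.new(key, record["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(expected, record["tag"]):
        raise MemoryTamperedError("memory record failed verification")
    return json.loads(record["payload"])
```

With this shape, an attacker who can write to the memory store but does not hold the key cannot alter a record without the next read failing. A deployment that needs third-party verifiability would use asymmetric signatures instead.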

A large Australian bank or insurer that has run Advanced for a year, and is now expanding the agent's mandate from drafting member correspondence to actually executing trades against the member's portfolio, is the natural Optimized candidate. The implementation is a $1,200,000 to $2,500,000 programme over twelve to eighteen months, sized around the bank's existing security operations centre.

How to sequence the work

Most Australian enterprises currently sit between Foundation and Advanced, with the audit log present but the per-task permission scoping incomplete and the adversarial testing not yet started. The right sequence is rarely the order Anthropic lists the controls in.

Start with the audit log. Without a complete log, every later control is harder to validate and harder to defend in a CPS 234 review. Once the log is trustworthy, work on per-task permission scoping. This is where most accidents happen: an agent with a broader permission set than it actually needs eventually does something with that excess permission, and the incident is then catalogued as an agent failure rather than a permission-scoping failure.
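
Per-task permission scoping can be as simple as a deny-by-default lookup that runs before every tool call. The sketch below is a minimal illustration under assumed names: the task names, tool names, and `authorise` function are hypothetical and do not come from the framework.

```python
# Hypothetical mapping of each task to the only tools it may call.
TASK_SCOPES = {
    "draft_member_reply": frozenset({"crm.read_member", "email.draft"}),
    "update_address": frozenset({"crm.read_member", "crm.update_address"}),
}


class ToolNotPermitted(PermissionError):
    """Raised when a tool call falls outside the current task's scope."""


def authorise(task: str, tool: str) -> None:
    """Deny by default: unknown tasks get an empty scope."""
    allowed = TASK_SCOPES.get(task, frozenset())
    if tool not in allowed:
        raise ToolNotPermitted(f"{tool!r} is outside the scope of task {task!r}")
```

The point of the design is that the agent never holds the union of both scopes. A reply-drafting task that tries to call `crm.update_address` fails at the gate, and the refusal is itself an event worth logging.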

The third move is input validation on the highest-risk ingestion channel. For most Australian businesses this is inbound email, because the email channel is the easiest one for an external attacker to inject into. A simple classifier that flags prompt-injection patterns before the agent reads the message is a low-cost, high-value control.
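
A first version of that inbound-email check might look like the sketch below. It is a regular-expression heuristic standing in for a trained classifier, and the pattern list is an illustrative assumption rather than a maintained payload library; expect both false positives and misses.

```python
import re

# Illustrative patterns only; a real deployment would maintain and test a larger set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"disregard (the |your )?(system|previous) (prompt|instructions)", re.IGNORECASE),
    re.compile(r"reveal (the |your )?system prompt", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]


def flag_injection(text: str) -> list[str]:
    """Return the patterns that matched; an empty list means nothing was flagged."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]


def should_quarantine(email_body: str) -> bool:
    """Hold the message for human review instead of passing it to the agent."""
    return bool(flag_injection(email_body))
```

Quarantining for review, rather than silently dropping the message, keeps a record of what was attempted and lets the team tune the patterns against real traffic.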

Output validation and continuous adversarial testing come after. The reason is sequencing risk: if you start adversarial testing before the audit log and input validation are solid, the tests will surface failures you cannot then triage, and the security team will lose confidence in the agent before it has a chance to mature.

What this means for your next agent rollout

The Zero Trust framework is not a checklist to apply after the agent is in production. It is the operating model that should shape the rollout from the first design conversation. If your team is currently scoping a Claude agent and the design is silent on which of the Foundation controls are in place, that is the conversation worth having before any code is written.

The Australian-specific risk is regulatory. APRA prudential reviews have already started asking for evidence of agent-specific controls in CPS 234 audits, and the questions are going to get harder. An enterprise that can show Foundation today and a credible plan toward Advanced over the next two quarters will be in a defensible position. An enterprise that cannot is on borrowed time.

If you want a working session on where your agent stack actually sits against Foundation, Advanced, and Optimized, and what the next four control improvements should be, book a brainstorm with the Automata AI team and we will walk through the framework against your specific deployment.
