You asked Claude Code to add a feature to the payments service. It did. It also refactored two utility functions it deemed inconsistent with the new approach, updated a shared config constant, and modified a test fixture you had not touched in eighteen months. The PR review took three hours. A regression hit production the following Tuesday.
That is not an unusual story.
Claude Code plan mode changes the sequence. Before Claude touches a single file, it produces a written plan: what it intends to change, which files are involved, and what risks it has identified. The engineer reviews the plan. If it looks right, they approve it and Claude executes. If it has drifted from the intent, they push back before a single line of code is written.
For a 30-engineer Sydney team where 5% of PRs get reverted due to unintended scope creep, the annual rework cost sits around $90,000. That is roughly 18 days of senior engineer time at $120/hr fully loaded, plus coordination overhead and the morale cost of fixing something that should not have broken. Plan mode discipline brings that revert rate below 1% in most teams that run it consistently.
The right policy: change shape, not engineer preference
The wrong policy is 'use plan mode when you feel like it'. Different engineers have different threat models, different familiarity with the codebase, and different tolerance for surprise. That makes plan mode optional in name but effectively absent in practice.
The right policy requires plan mode based on the shape of the change. That distinction matters because confidence is invisible to a reviewer and to the post-incident log.
The change touches more than three files. Multi-file changes compound in ways that are hard to reason about in real time. A plan makes the dependency graph visible before execution.
A database migration or schema change is involved. Migrations are the most irreversible changes in most systems. A plan that does not account for rollback should be rejected before Claude runs.
The change affects shared infrastructure or authentication. One misconfigured auth middleware can take down services that do not look related. Platform team sign-off belongs in the plan.
The engineer is new to this area of the codebase. Not a judgement. A risk signal. Writing the plan forces articulation of what they believe is true, which surfaces misunderstandings before they become production incidents.
The path is regulated: financial, health, or government. Under APRA CPS 230, material operational changes require documented change management. A plan mode record is lightweight evidence that the change was deliberate and scoped.
Most plan mode failures are not Claude problems. They are specification problems: the engineer handed Claude a vague instruction and approved whatever came back.
What a good plan contains
A good plan is short enough to review in five minutes and specific enough to hold Claude accountable. Three sections cover most cases.
Intent. What the change accomplishes in plain English. One paragraph. No hedging.
Approach. Which files change, what new structures get introduced, what gets removed. Specific enough that a second engineer could execute the plan without Claude.
Risks and mitigations. What could break, how the change rolls back, what tests cover the risk. If there is no rollback path, that belongs in the plan too.
A plan that says 'I will figure this out as I go' is not a plan. Reject it. A plan that says 'I will modify these four files to add pagination to the invoice API, the rollback is a feature flag already in place, and the change is covered by the existing integration test suite' is something you can reason about before Claude writes a single line.

Review discipline that actually holds
Plan mode only generates value if reviews actually happen. The failure mode is rubber-stamping: engineers approve plans without reading them, which gives the team the friction of plan mode without the safety. Good review policy assigns reviewers by change path, not by availability.
Regulated paths get reviewed by a senior engineer and a domain owner before Claude runs.
Shared infrastructure changes get reviewed by the platform team.
Customer-facing logic gets reviewed by a product engineer who can test the intent against expected behaviour.
A reviewer is asking three questions: is the intent what we actually want, is the approach the simplest one that works, and is the risk section honest about what could go wrong. A plan where the risk section reads 'minimal' for a schema migration is not passing that third check.
If you are starting from scratch and want a clear picture of where your highest-risk AI-assisted change exposure sits, the AI Readiness Assessment is typically the right starting point before writing any internal policy.
When plan mode costs more than it saves
Plan mode adds overhead. It takes an extra five to fifteen minutes per change, more for complex changes where Claude asks clarifying questions before drafting. For a team shipping ten changes a day, that overhead is real. The answer is not to skip plan mode on high-risk changes. The answer is to stop applying it where the risk is not.
Single-file changes with no downstream consumers. A plan here adds friction and no signal.
Exploratory spikes or throwaway branches. If the code is not going to production, the plan is not serving production.
Trivial edits: copy changes, comment updates, isolated style fixes. Plan mode for a button label change is how you create a team culture where plan mode becomes synonymous with bureaucracy.
The teams that get this wrong apply plan mode everywhere, watch adoption collapse within a month, then abandon it entirely. Plan mode is not a blanket safety net. It is a targeted intervention for high-risk changes, and treating it otherwise defeats the point.
Building the tooling layer
A team running plan mode across ten or more engineers benefits from three pieces of tooling: a /plan slash command that drops any engineer into plan mode without configuration overhead, a PR hook that requires a linked plan file as part of the submission checklist, and a drift report that tracks how often the final PR deviates from the approved plan. Drift patterns tell you which types of changes are systematically underplanned. Our AI Automation Services include this workflow instrumentation as part of the standard engagement.
Most Australian engineering teams that commit to plan mode discipline ship the policy in under a quarter and see PR rework drop within the first month. Not because Claude gets better. Because the review conversation moves from 'why did this break' to 'we saw this coming'. If you want to work through what adoption looks like for your specific stack, book a workflow conversation and we can map it out.



