Most AI coding agents share a single failure mode. They finish a task without confirming that the finish line was actually crossed. The agent reports success, the tests were never run, and a broken change reaches production anyway. For Australian engineering teams putting Claude Code into real delivery work, that distance between "I think it works" and "here is the proof it works" decides whether you have a teammate or a liability.
Why agentic coding fails quietly
The problem is rarely the model's raw ability. It is the absence of a verification step the agent cannot skip. Left to its own judgement, an agent optimises for looking finished rather than for being correct. A recent set of patterns shared through the AI Builder Club community addresses exactly this. The patterns come from Fable's internal build system and ship as a workflow plugin called FableCodex. That plugin targets Codex rather than Claude Code, but the habits behind it map directly onto how Claude Code already operates, and you can adopt them today without installing anything. The five rules below are the adaptation we use with Claude Code on client work.
The five rules, mapped to Claude Code
Each rule turns a vague intention into a concrete artefact the agent has to produce. Artefacts are checkable. Intentions are not.
Rule 1: Write the goal before you start
Before any action, write the goal to a file: what you are trying to achieve, the constraints, and the success criteria. In Claude Code, begin each session by creating a GOAL.md and asking the model to re-read it at every step. This one file stops the scope drift that turns a tidy two-hour task into a sprawling afternoon of half-finished edits. When the agent wanders, the goal file pulls it back.
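As a rough illustration, the goal file can be generated from the three parts the rule names. This is a minimal sketch, not a Claude Code feature: the function names, the Markdown layout, and the checkbox style are our own assumptions, and only the GOAL.md filename comes from the rule above.

```python
from pathlib import Path


def render_goal(goal: str, constraints: list[str], success_criteria: list[str]) -> str:
    """Build the goal file text: goal, constraints, success criteria (layout is an assumption)."""
    lines = ["# Goal", goal.strip(), "", "## Constraints"]
    lines += [f"- {item}" for item in constraints]
    lines += ["", "## Success criteria"]
    # Checkboxes give the agent something concrete to tick off at the end.
    lines += [f"- [ ] {item}" for item in success_criteria]
    return "\n".join(lines) + "\n"


def write_goal(path: Path, goal: str, constraints: list[str], success_criteria: list[str]) -> Path:
    """Write the goal file, refusing one that has no success criteria."""
    if not success_criteria:
        raise ValueError("a goal without success criteria cannot be verified")
    path.write_text(render_goal(goal, constraints, success_criteria), encoding="utf-8")
    return path
```

Calling write_goal(Path("GOAL.md"), ...) at the start of a session gives the model a fixed file to re-read. The refusal on empty success criteria is deliberate: a goal with nothing to check cannot anchor the verification step later.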
Rule 2: Document findings before you act
Before touching a file, the agent should write down what it found: what the code does now, what is relevant to the change, and what might break. A Claude Code hook can require a short findings note before any edit is allowed. The discipline is the same one good senior engineers apply on instinct. Read the room before you rearrange the furniture.
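One way to make the findings note checkable is a small gate script that a hook could run before an edit. The sketch below is an assumption on our part: the three section headings mirror the three questions in the rule, the FINDINGS.md filename in the usage note is hypothetical, and the wiring into a Claude Code hook depends on your own configuration and is not shown.

```python
from pathlib import Path

# Hypothetical headings, one per question the rule asks the agent to answer.
REQUIRED_HEADINGS = ("Current behaviour", "Relevant to the change", "Risks")


def missing_sections(findings_text: str) -> list[str]:
    """Return the required headings that do not appear in the findings note."""
    lowered = findings_text.lower()
    return [heading for heading in REQUIRED_HEADINGS if heading.lower() not in lowered]


def edit_allowed(findings_path: Path) -> tuple[bool, str]:
    """Allow an edit only when the findings note exists and covers every section."""
    if not findings_path.is_file():
        return False, f"{findings_path} not found: write findings before editing"
    missing = missing_sections(findings_path.read_text(encoding="utf-8"))
    if missing:
        return False, "findings note is missing: " + ", ".join(missing)
    return True, "findings note present"
```

A hook script could call edit_allowed(Path("FINDINGS.md")) and block the edit when the first element is False, passing the message back to the agent so it knows what to write.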
Rule 3: Classify the task before you code
Ask the agent to label the work first: bug fix, feature, refactor, or documentation. Each type needs a different proof of correctness. A bug fix needs a regression test that fails before the change and passes after. A refactor needs evidence that behaviour did not change. A feature needs a test of the new path. Classification tells the agent which kind of verification to run, so it stops guessing.
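The mapping from task type to required proof is small enough to write down as data. The following is a sketch under stated assumptions: the names are ours, the documentation entry is a guess (the rule does not say what proves a documentation change), and the refactor example in brackets is one possible form of evidence rather than the only one.

```python
from enum import Enum


class TaskType(Enum):
    BUG_FIX = "bug fix"
    FEATURE = "feature"
    REFACTOR = "refactor"
    DOCUMENTATION = "documentation"


REQUIRED_PROOF = {
    TaskType.BUG_FIX: "regression test that fails before the change and passes after",
    TaskType.FEATURE: "test of the new path",
    TaskType.REFACTOR: "evidence that behaviour did not change (for example, the existing suite passing unmodified)",
    # Assumption: the rule names no proof for documentation work.
    TaskType.DOCUMENTATION: "successful docs build or link check",
}


def proof_for(label: str) -> str:
    """Look up the proof a task label demands; an unknown label raises ValueError."""
    task_type = TaskType(label.strip().lower())
    return REQUIRED_PROOF[task_type]
```

Rejecting unknown labels matters more than the table itself: an agent that cannot classify the task should stop and ask rather than pick a verification at random.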
Rule 4: Verify with evidence, not opinion
This is the rule that closes the gap. The agent must show test output, lint results, a type check, or a screenshot: the actual proof, not a sentence claiming the change is correct. Configure a pre-commit hook that refuses a commit unless the output of a real test run accompanies it. "I believe this passes" does not get to count as passing.
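To show what "evidence, not opinion" can look like in practice, here is a minimal sketch of a gate that runs each check and keeps the real output and exit code. The Evidence class and the function names are hypothetical, and the commands you pass in are whatever your project already uses for tests, lint, and type checks.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    command: list[str]
    exit_code: int
    output: str

    @property
    def passed(self) -> bool:
        return self.exit_code == 0


def collect_evidence(command: list[str]) -> Evidence:
    """Run one check and keep its actual output and exit code as the proof."""
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError as exc:
        # A check that cannot start is a failure, not a pass.
        return Evidence(command, 127, str(exc))
    return Evidence(command, result.returncode, result.stdout + result.stderr)


def gate(checks: list[list[str]]) -> int:
    """Return 0 only if every check exited cleanly; print the evidence either way."""
    status = 0
    for command in checks:
        evidence = collect_evidence(command)
        print(f"$ {' '.join(command)}\n{evidence.output}".rstrip())
        if not evidence.passed:
            status = 1
    return status
```

A pre-commit script could end with sys.exit(gate([[sys.executable, "-m", "pytest", "-q"]])), assuming pytest is your test runner. Because the gate reads the exit code rather than the agent's summary, a claim of success cannot substitute for a run.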
Rule 5: Read before you write
Before editing a file, the agent reads it in full and summarises what it currently does. Most agentic coding errors start in the moment a model edits a file it only half understood. The read step is cheap. The bug it prevents is not.
The five-step pattern in one loop
Strung together, the rules form a single loop the agent runs on every task:
1. Write the goal. Constraints and success criteria, in a file the model re-reads each step.
2. Read and document findings. Understand the code before changing it.
3. Classify the task. Pick the verification the task type demands.
4. Do the work. Make the change once the first three steps are done.
5. Verify with evidence. Tests, lint, type checks, screenshots, not assertions.
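The ordering of the loop can itself be enforced. The sketch below tracks which steps are complete and refuses to start a step early. The step names and both functions are our own shorthand for illustration, not part of Claude Code or FableCodex.

```python
from typing import Optional

# Shorthand names for the five steps, in the order the loop runs them.
STEPS = ("goal", "findings", "classification", "work", "evidence")


def next_step(completed: set[str]) -> Optional[str]:
    """Return the first step not yet done, or None when the loop is complete."""
    for step in STEPS:
        if step not in completed:
            return step
    return None


def may_start(step: str, completed: set[str]) -> bool:
    """A step may start only when every earlier step is complete."""
    earlier_steps = STEPS[: STEPS.index(step)]
    return all(earlier in completed for earlier in earlier_steps)
```

For example, may_start("work", {"goal", "findings"}) is False because the task has not been classified yet, which is exactly the shortcut the loop exists to prevent.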
None of this is exotic. It is the checklist a careful engineer already follows. The value of writing it down as rules, and backing those rules with hooks, is that the checks run on every task, where a tired human skips a step on a Friday afternoon.
Does this slow Claude Code down?
The obvious objection is speed. Does forcing the agent to write a goal file, document findings, and produce test output make it slower? Per task, slightly. Across a fortnight, no. The minutes you spend on the loop are far cheaper than the hours you lose chasing a silent failure through a staging environment two days later. Slower per step but faster to a change you can actually ship: that is the trade every experienced team makes.
What it is worth to an Australian team
The economics are simple. A single faulty merge that reaches production can cost an Australian SaaS business around $45,000 once you add incident response, customer rework, and the engineering hours lost to firefighting. A mid-sized Sydney team shipping dozens of agent-assisted changes a week does not need many of those to wipe out the time Claude Code saved them. Put the verification loop in place and the failure rate drops, because every change arrives with proof attached. The teams we work with treat the loop as non-negotiable before Claude Code touches anything near production, and the payback on a few hours of setup is usually counted in the tens of thousands of dollars across a single quarter.
If you want help wiring these patterns into your team's Claude Code setup, book a brainstorm with Automata AI, and we will map the verification loop to your stack.
Patterns surfaced via the AI Builder Club community in June 2026, drawn from Fable's build system and its FableCodex plugin. FableCodex targets Codex; the Claude Code mapping above is Automata AI's adaptation of community-reported patterns, not an official integration.



