Vibe Coding vs Agentic Engineering: Karpathy's Distinction, Applied to AU Engineering Teams

At Sequoia Ascent earlier this year, Andrej Karpathy drew a line that most engineering teams are still pretending does not exist. He separated vibe coding (loose, conversational, prompt-driven code generation) from agentic engineering (structured, governed, repeatable workflows in which AI agents operate with defined scope, audit trails, and documented failure modes). Both are real. Both are useful. But they belong in different parts of your organisation.

For Australian engineering leaders rolling out Claude, this distinction is not academic. It determines your governance posture, your security exposure, and whether your AI investment produces $45,000 worth of shipped features or $45,000 worth of accumulating technical debt.

What Karpathy Actually Said

Karpathy's framing was precise. Vibe coding is the experience of talking to an AI, accepting its suggestions, and shipping something that mostly works without deeply reading the output. You describe the vibe of what you want; the model fills the implementation. It is fast, fun, and genuinely useful for low-stakes prototyping.

Agentic engineering is different in kind, not degree. The AI does not just autocomplete: it plans, calls tools, reads and writes files, queries APIs, loops until a goal condition is met, and potentially hands off to other agents. The engineer's job shifts from writing code to specifying goals, designing evaluation criteria, and reviewing agent behaviour.

The error most AU CTOs make is treating these as a spectrum rather than two separate modes. They trial vibe coding on an internal tool, see a $30K productivity gain, then assume the same approach will work on a customer-facing data pipeline. It will not, and the failure mode is usually silent data corruption or a missed compliance requirement, not a broken build.

Where Vibe Coding Earns Its Keep

Vibe coding is genuinely excellent for a specific set of jobs. The rule of thumb: if a human would throw the output away within 90 days anyway, vibe coding is the right tool.

Throwaway prototypes and internal demos. A working prototype that takes two hours instead of two weeks changes how product conversations happen. The prototype gets replaced; the conversation changes direction. That is the point.
One-off data transformation scripts. Migrating a legacy CSV format, reshaping an export from an old SaaS tool, parsing a supplier's idiosyncratic JSON. These are run once, reviewed once, and discarded. Vibe coding is ideal.
Developer productivity tooling for internal use. Slack bots, internal dashboards, test data generators. If the blast radius of a bug is one team and the fix is a ten-minute redeploy, the governance overhead of full agentic engineering is not justified.
Spike solutions and architecture exploration. You want to know if a third-party API will support your use case before committing to a design. Vibe code the spike, read the output carefully, and throw it away before writing the real implementation.

Australian teams using Claude via Claude.ai or the API are already capturing this value. At a mid-sized Melbourne software consultancy we worked with, developers cut their prototyping cycle from three days to four hours on average, a saving worth roughly $120K annually across a twelve-person engineering team.

Where Agentic Engineering Is the Only Defensible Choice

Once you move into production code, regulated data, or multi-step processes where errors compound, the informal approach of vibe coding becomes a liability. APRA-regulated institutions, AUSTRAC reporting pipelines, and Privacy Act-covered data stores all require something closer to agentic engineering: defined inputs and outputs, explicit tool permissions, structured logging, and human approval gates at appropriate steps.

Claude's agentic capabilities (tool use, computer use, multi-agent orchestration) are engineered for this mode. You give Claude a specific goal, constrained tools, and a clear stopping condition. The agent executes, the surrounding harness logs every step, and the run returns a structured result your team can audit. If something goes wrong, you have a trace.

Agentic engineering looks different from vibe coding in practice:

Scope is specified, not implied. The agent has access to exactly the tools it needs, and no more. A document processing agent reads from a nominated S3 bucket and writes to a nominated output path. It does not have general file system access.
Evaluation criteria are written before the agent runs. If you cannot describe what a correct output looks like before the agent starts, you are not ready to deploy it. This is the discipline that separates engineering from experimentation.
Human-in-the-loop gates are explicit, not accidental. For high-stakes decisions (approving a contract, publishing a customer communication, submitting a regulatory filing), the agent prepares a draft and pauses for human sign-off. This is not a limitation; it is the design.
Failure modes are documented and handled. What does the agent do when an API call fails? When the input data is malformed? When a downstream service is unavailable? These are engineering questions, not prompting questions.
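
The four properties above can be sketched in a few dozen lines. This is a minimal illustration rather than a reference implementation: `ScopedAgentRun`, `ToolNotAllowed`, and the tool names in the usage note are hypothetical, and the sketch uses only the Python standard library instead of any particular agent framework. It shows an explicit tool allow-list, a human approval gate that rejects by default, bounded retries on failure, and an audit log entry for every attempted call.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside its declared scope."""


@dataclass
class ScopedAgentRun:
    # Hypothetical harness: the only tools the agent may call, by name.
    allowed_tools: Dict[str, Callable[..., object]]
    # Tools whose calls must be signed off by a person first.
    needs_approval: frozenset = frozenset()
    # Approval callback; the default rejects everything (fail closed).
    approver: Callable[[str, dict], bool] = lambda name, args: False
    max_retries: int = 2
    audit_log: List[dict] = field(default_factory=list)

    def _record(self, **event) -> None:
        event["ts"] = time.time()
        self.audit_log.append(event)

    def call(self, name: str, /, **args):
        # Scope is specified, not implied.
        if name not in self.allowed_tools:
            self._record(tool=name, args=args, status="denied")
            raise ToolNotAllowed(name)
        # Human-in-the-loop gate is explicit.
        if name in self.needs_approval and not self.approver(name, args):
            self._record(tool=name, args=args, status="rejected_by_human")
            return None
        # Failure modes are handled: bounded retries, every attempt logged.
        for attempt in range(1, self.max_retries + 2):
            try:
                result = self.allowed_tools[name](**args)
            except Exception as exc:
                self._record(tool=name, args=args, status="error",
                             attempt=attempt, error=repr(exc))
            else:
                self._record(tool=name, args=args, status="ok", attempt=attempt)
                return result
        raise RuntimeError(f"{name} failed after {self.max_retries + 1} attempts")
```

A run built with `allowed_tools={"read_doc": ..., "publish": ...}` and `needs_approval=frozenset({"publish"})` can read documents freely, has every `publish` call held until the approver returns `True`, and raises `ToolNotAllowed` for anything else. In a real deployment the audit log would go to durable storage rather than an in-memory list.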

How AU CTOs Should Structure a Claude Rollout

Most AU engineering teams will benefit from running both modes simultaneously while keeping them structurally separate. Here is the rough shape of what works.

Start with a vibe coding sandbox. Give every engineer a Claude licence and three months to build whatever internal tools feel useful. No approval required, no production access, no customer data. The goal is fluency. Teams that skip this step and go straight to agentic engineering often build overly rigid workflows because the engineers have not yet developed intuition for what Claude does well.

Identify the three highest-value agentic candidates. After the sandbox phase, look at where engineers are spending time on repetitive, multi-step work that touches production systems. Document processing, compliance checking, customer onboarding workflows, and data quality pipelines are common candidates in Australian financial services and professional services firms.

Treat the first agentic project as infrastructure. The patterns you establish (how agents are scoped, how tool permissions are managed, how outputs are logged, how human review gates are implemented) will be reused across every subsequent agent. A Sydney-based fintech we worked with spent $1.2M on a first agentic implementation that felt expensive until they realised the second and third agents cost 20% of the first because the patterns were already in place.
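
The arithmetic behind that example is worth making explicit. The sketch below assumes the second and third agents each cost 20% of the first; the source figure could also be read as a combined cost, in which case the totals would be lower. `rollout_cost` is a hypothetical helper, and it uses integer arithmetic to avoid floating-point rounding on dollar amounts.

```python
def rollout_cost(first_agent_cost: int, follow_on_percent: int,
                 follow_on_agents: int) -> dict:
    """Total and average cost when later agents reuse the first agent's patterns."""
    follow_on_cost = first_agent_cost * follow_on_percent // 100
    total = first_agent_cost + follow_on_cost * follow_on_agents
    agents = 1 + follow_on_agents
    return {
        "follow_on_cost_each": follow_on_cost,
        "total": total,
        "average_per_agent": total // agents,
    }


# Assumption: each of the two follow-on agents costs 20% of the $1.2M first agent.
figures = rollout_cost(1_200_000, 20, 2)
# figures == {"follow_on_cost_each": 240000, "total": 1680000,
#             "average_per_agent": 560000}
```

Under that assumption, three agents cost $1.68M in total, or $560K each on average, which is less than half the cost of the first agent alone.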

Write your governance framework before your third agent. You do not need a full AI governance policy before your first agent. You do need one before your third. By that point you will have enough operational experience to write rules that reflect reality rather than anxiety.

The Integration Question

One question every CTO asks: can vibe-coded outputs be promoted into agentic systems? The short answer is yes, but not directly. Vibe-coded scripts need to go through a proper engineering review, not because the code is necessarily wrong, but because it was written without the structured thinking that agentic engineering requires. The review is where you add the error handling, the logging, the scope constraints, and the evaluation criteria that make agentic operation safe.
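
As a rough illustration of what that review adds, the sketch below wraps a stand-in vibe-coded transformation with per-row error handling, logging, and a simple output check. Every name here (`vibe_coded_reshape`, `promoted_reshape`, the `id` and `email` fields) is hypothetical, and the email check is deliberately crude; real evaluation criteria would come from your own specification.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger("promoted_script")

Row = Dict[str, str]


def vibe_coded_reshape(rows: List[Row]) -> List[Row]:
    # Stand-in for a script written quickly in the sandbox (hypothetical).
    # It assumes every row has "id" and "email" and fails on anything else.
    return [{"customer_id": r["id"], "email": r["email"].lower()} for r in rows]


def promoted_reshape(
    rows: List[Row],
    reshape: Callable[[List[Row]], List[Row]] = vibe_coded_reshape,
) -> Tuple[List[Row], List[Row]]:
    """Run the original logic row by row, separating good output from rejects."""
    good: List[Row] = []
    rejected: List[Row] = []
    for index, row in enumerate(rows):
        try:
            out = reshape([row])[0]
        except (KeyError, AttributeError) as exc:
            # Error handling: a malformed row is logged and set aside,
            # not allowed to abort or silently corrupt the batch.
            logger.warning("row %d rejected: %r", index, exc)
            rejected.append(row)
            continue
        # Evaluation criterion (illustrative only): output must look like an email.
        if "@" not in out["email"]:
            logger.warning("row %d rejected: invalid email", index)
            rejected.append(row)
            continue
        good.append(out)
    logger.info("reshaped %d rows, rejected %d", len(good), len(rejected))
    return good, rejected
```

The original function is untouched; the promoted version decides what happens when it fails and leaves a record of every rejected row for a reviewer.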

Think of it as the difference between a sketch and a working drawing. The sketch communicates intent fast. The working drawing is what you actually build from. Claude helps with both, but they serve different purposes and the sketch is not the first draft of the working drawing. It is a different document entirely.

What This Means for Your Team Right Now

If you are an AU engineering leader trying to make sense of the current moment, Karpathy's framing gives you a useful diagnostic. Look at your current Claude usage. If most of it is conversational and exploratory (developers asking Claude to help with specific functions or explain a codebase), you are in the vibe coding zone. That is fine, and likely valuable, but you are not yet getting the compound returns that agentic engineering delivers.

If you want to move toward agentic engineering, the right first step is not buying more tooling. It is writing down, precisely, what you want an agent to do: its goal, its inputs, its tools, its outputs, and what a correct result looks like. If that document takes less than an hour to write, the task is probably simple enough that vibe coding is sufficient. If it takes a day, you are describing something worth engineering properly.
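
One way to keep that document honest is to give it a fixed shape. The sketch below is an assumption about how such a specification might be structured, not a prescribed format: `AgentSpec` and its field names are hypothetical and simply mirror the five items listed above.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AgentSpec:
    """Hypothetical written specification for a single agent."""
    goal: str = ""
    inputs: List[str] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    correct_result: str = ""  # what a correct output looks like, in words

    def missing(self) -> List[str]:
        # Names of the sections that are still empty.
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def ready(self) -> bool:
        # The agent is not ready to build until every section is filled in.
        return not self.missing()


draft = AgentSpec(goal="Summarise supplier invoices", tools=["read_invoice"])
# draft.missing() == ["inputs", "outputs", "correct_result"]
```

A draft that still has an empty `correct_result` is a signal that the evaluation criteria have not been written yet, which is the point at which to stop and write them.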

We work with Australian engineering teams in both modes, helping organisations build vibe coding fluency quickly and design agentic architectures that can pass APRA and Privacy Act scrutiny. If you want to think through where your team sits and what the right next move is, book a session: no sales pitch, just a structured conversation.
