GPU Bills, Engineers, and Hidden Costs of Open Source AI

Open source AI is marketed as the budget option, and the sticker price really is zero. Download the weights, stand up a server, and you are running a capable model without paying a vendor a cent. The running cost is another story. Australian businesses that self-host tend to learn quickly that the licence fee was never the expensive part.

This guide walks through the full cost stack of self-hosted open source AI for an Australian SMB or mid-market team: the lines everyone budgets for, the lines that stay hidden until launch day, and how the total compares with a managed model like Claude.

The costs everyone budgets for

These show up in the first draft of any self-hosting budget, so most teams plan for them reasonably well.

GPU rental or purchase. A single A100-class card rents for roughly $2,500 to $4,500 a month in Australian cloud regions, and serious workloads need more than one.
Cloud storage and bandwidth for model weights, logs, and embeddings, which grow faster than most teams expect.
Monitoring and alerting tools so someone knows when inference falls over at 2am.
A staging environment, because testing model changes in production is how outages happen.

If this were the whole list, the case for open source would be simple. It is not.

The costs that arrive after you commit

The second category lands after the infrastructure is live and the team is committed, which is exactly why it hurts. These lines rarely appear in the pilot budget.

Engineer salaries to keep the stack healthy day to day, including on-call cover and leave.
Downtime and lost work when a node fails mid-task and the team waits while it is rebuilt.
Security and compliance work under Australian law, which is ongoing rather than one-off.
Re-testing and re-integration every time the model or serving framework updates, which in the open source world is every few weeks.

Each of these is manageable on its own. Together they form a standing operational tax that has to be paid every month the system stays up, whether or not anyone is getting value from it that month.

The salary line that dominates the budget

A single skilled AI infrastructure engineer in Sydney costs around $180,000 a year fully loaded, and the market for them is tight. One hire can dwarf every compute line in your plan combined. Most self-hosted stacks of any real size need at least one such person, and a stack that matters to the business needs cover for when that person is on leave, falls sick, or resigns.

That last point deserves emphasis. A self-hosted model maintained by one engineer is a key-person risk, not a capability. If they leave, the business is left running production infrastructure nobody understands. This is the cost open source rarely advertises, and it is the one that most often turns a cheap plan into an expensive one.
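To put the salary line next to the compute line, here is a rough sketch using only the figures quoted in this guide. It assumes the card is rented all year with no discounts, and the function names are ours, for illustration only.

```python
# Figures quoted in this guide; treat them as rough planning numbers.
ENGINEER_FULLY_LOADED_PER_YEAR = 180_000
GPU_RENT_PER_MONTH = (2_500, 4_500)  # low and high for one A100-class card


def annual_gpu_cost(monthly_rent: float, cards: int = 1) -> float:
    """Annualised rental for `cards` GPUs, assuming they run all year."""
    return monthly_rent * 12 * cards


def cards_matching_salary(monthly_rent: float) -> float:
    """How many always-on cards cost the same per year as one engineer."""
    return ENGINEER_FULLY_LOADED_PER_YEAR / annual_gpu_cost(monthly_rent)


for label, rent in zip(("low", "high"), GPU_RENT_PER_MONTH):
    print(
        f"{label} rent: one card is ${annual_gpu_cost(rent):,.0f} a year, "
        f"so one salary equals about {cards_matching_salary(rent):.1f} cards"
    )
```

On these numbers, one engineer costs as much as running somewhere between three and six cards around the clock, which is more hardware than many pilots ever rent.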

Compliance is a running cost, not a checkbox

Self-hosting moves the full compliance burden onto your team. Under the Privacy Act 1988, you are responsible for how personal information flows through your model, where it is stored, and who can access it. If you are an APRA-regulated entity, such as a bank, insurer, or superannuation fund, the information security requirements of Prudential Standard CPS 234 apply to the infrastructure you now run. Under the Notifiable Data Breaches scheme, a misconfigured inference server is not just an engineering problem but a potential reportable incident.

None of this is a reason to avoid open source. It is a reason to price the audits, penetration testing, access reviews, and documentation as recurring work. Budget $20,000 to $40,000 a year for a modest setup, more if you are regulated.

What the same work costs on Claude

For comparison, a 20-person Australian team running on a managed Claude plan spends in the order of $18,000 to $25,000 a year, with no GPUs, no serving stack, and no specialist hire. API workloads scale with usage, so a pilot can start at a few hundred dollars a month and grow only if it earns its keep. Infrastructure security, uptime, and model updates sit with the vendor rather than with your on-call roster, and the vendor's security posture is something we review with clients.

Put plainly: the managed option usually costs less than the salary line of the self-hosted option before the self-hosted option has bought a single GPU hour.
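To see how wide that gap is, the figures quoted in this guide can simply be added up. The sketch below is a floor rather than a full budget: it assumes one engineer, one always-on A100-class card, and the modest compliance range, and it leaves out storage, monitoring, staging, downtime, and leave cover. The `Range` helper and the variable names are ours, for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A low-to-high annual cost estimate in dollars."""
    low: float
    high: float

    def __add__(self, other: "Range") -> "Range":
        return Range(self.low + other.low, self.high + other.high)


# Annual figures as quoted in this guide.
SALARY = Range(180_000, 180_000)         # one fully loaded engineer
COMPLIANCE = Range(20_000, 40_000)       # modest, unregulated setup
ONE_GPU = Range(2_500 * 12, 4_500 * 12)  # one card rented all year
MANAGED = Range(18_000, 25_000)          # 20-person managed plan

# Assumption: a floor of one engineer, one card, and modest compliance.
self_hosted = SALARY + COMPLIANCE + ONE_GPU

# Most favourable case for self-hosting: its low end against managed's high end.
smallest_gap = self_hosted.low - MANAGED.high

print(f"Self-hosted floor: ${self_hosted.low:,.0f} to ${self_hosted.high:,.0f} a year")
print(f"Managed plan:      ${MANAGED.low:,.0f} to ${MANAGED.high:,.0f} a year")
print(f"Smallest gap:      ${smallest_gap:,.0f} a year")
```

Even in the case most favourable to self-hosting, the floor sits roughly $205,000 a year above the managed plan, and the salary alone is about seven times the top of the managed range.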

When self-hosting still makes sense

There are real cases where open source earns its place, and an honest cost review should name them.

Hard data residency requirements where no managed offering meets the obligation.
A narrow, stable, high-volume task where per-token economics genuinely beat a managed model at your measured volume.
An existing platform team that already runs GPU infrastructure for other workloads.
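The high-volume case in the list above comes down to a break-even calculation. The sketch below shows the shape of that calculation only. Every input is a hypothetical placeholder rather than a quote from any vendor or client, so substitute your own measured volume and prices before drawing conclusions.

```python
from typing import Optional


def break_even_tokens_per_month(
    fixed_monthly_cost: float,
    managed_price_per_million: float,
    self_hosted_marginal_per_million: float,
) -> Optional[float]:
    """Monthly token volume at which self-hosting starts to cost less.

    Returns None when self-hosting has no per-token saving, because in
    that case the fixed costs are never recovered at any volume.
    """
    saving_per_million = managed_price_per_million - self_hosted_marginal_per_million
    if saving_per_million <= 0:
        return None
    return fixed_monthly_cost / saving_per_million * 1_000_000


# Hypothetical inputs, for illustration only.
threshold = break_even_tokens_per_month(
    fixed_monthly_cost=20_000,             # people, compliance, base compute
    managed_price_per_million=10.0,        # blended managed price per 1M tokens
    self_hosted_marginal_per_million=2.0,  # extra compute per 1M tokens
)
measured_volume = 400_000_000  # hypothetical measured tokens per month

if threshold is None or measured_volume < threshold:
    print("Managed is cheaper at this volume")
else:
    print("Self-hosting is cheaper at this volume")
```

The point of the exercise is the comparison against measured volume, not projected volume. A break-even threshold that your real traffic does not already exceed is an argument for staying managed.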

A fuller cost checklist before you sign anything

Use this before approving any self-hosted plan, so the hidden lines surface while you can still change course.

Compute, including idle hours and peak headroom, not just the average load.
People, including recruitment time, on-call, leave cover, and the cost of replacing them.
Compliance, security testing, and Privacy Act obligations as annual recurring lines.
Upgrade and re-testing effort for every model and framework release you adopt.
The managed-model benchmark: what the same work costs on Claude at your real volume.
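One way to make the checklist hard to skip is to treat it as a required set of budget lines and flag anything left blank. This is a minimal sketch: the line names and the sample pilot budget are hypothetical, and it treats a zero or missing amount as "not yet priced".

```python
# The five checklist lines above, keyed by short names of our own choosing.
REQUIRED_LINES = {
    "compute": "Compute, including idle hours and peak headroom",
    "people": "People, including recruitment, on-call, and leave cover",
    "compliance": "Compliance, security testing, and Privacy Act obligations",
    "upgrades": "Upgrade and re-testing effort per release",
    "managed_benchmark": "The same work priced on a managed model",
}


def missing_lines(budget: dict) -> list:
    """Return the checklist lines with no positive annual amount."""
    return [name for name in REQUIRED_LINES if budget.get(name, 0) <= 0]


# Hypothetical pilot budget that only priced the obvious line.
pilot_budget = {"compute": 30_000, "compliance": 0}

for name in missing_lines(pilot_budget):
    print(f"Not yet priced: {REQUIRED_LINES[name]}")
```

A budget that passes this check is not necessarily right, but one that fails it is missing a cost that will show up later.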

For most Australian SMBs, running this checklist turns a tempting zero-dollar story into a real budget, and that budget usually points to a managed model as the cheaper and calmer choice. We run this comparison for clients with their actual workloads and report back plainly, including the cases where open source wins. If you want the numbers for your business, book a brainstorm session with us.
