
RAG With Open Source Models: An Australian Setup Guide

June 2026 · 5 min read · Technical


Retrieval-augmented generation, or RAG, lets an AI model answer questions from your own documents instead of guessing from its training data. For most Australian businesses it is the cheapest way to make any model genuinely useful on company knowledge, whether that model is open source or a managed one like Claude.

This guide maps the pieces of a working RAG setup, the decisions that matter for an Australian build, and the realistic costs. It is written for a technical owner or team lead who wants a grounded plan rather than a vendor pitch.

What RAG actually does

When someone asks a question, a RAG system first searches your document store for the most relevant passages, then hands those passages to the model along with the question. The model answers from the retrieved text rather than from memory. Done well, that means answers grounded in your contracts, policies and procedures, with far fewer invented facts.
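
The two steps can be sketched in a few lines of Python. This is a minimal illustration, not a production retriever: it assumes a toy word-overlap score in place of real vector or keyword search, and the function names and sample policy passages are made up for the example. The call to the model itself is left out.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question: str, passage: str) -> int:
    """Toy relevance score: number of word occurrences shared by both texts."""
    q = Counter(tokenize(question))
    p = Counter(tokenize(passage))
    return sum(min(q[word], p[word]) for word in q)


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Step 1: return the k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    return ranked[:k]


def build_prompt(question: str, retrieved: list[str]) -> str:
    """Step 2: hand the retrieved passages to the model with the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(retrieved))
    return (
        "Answer using only the passages below. "
        "If they do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical passages, invented for this example.
passages = [
    "Annual leave requests must be lodged 14 days in advance.",
    "Expense claims over $500 require manager approval.",
    "The office is closed on public holidays.",
]
question = "How far in advance must leave requests be lodged?"
top = retrieve(question, passages, k=1)
prompt = build_prompt(question, top)
```

The prompt built here is what you would send to whichever answering model you choose. The instruction to answer only from the passages is what keeps the model grounded.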

The quality of a RAG system is decided mostly by the retrieval step, not the model. A modest model fed exactly the right passages beats a frontier model fed the wrong ones, which is why the build order matters so much.

The core pieces

A RAG build has four standard components, and each one affects answer quality.

  • A document store, with files chunked into passages and indexed sensibly

  • A retriever that finds the chunks relevant to each question, usually a mix of vector and keyword search

  • A model that writes the answer using only the retrieved chunks

  • An evaluation loop that measures answer quality against real questions from your team
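
One common way to mix vector and keyword results is reciprocal rank fusion, which merges ranked lists using only each chunk's position in them. The sketch below assumes the two searches have already run and returned chunk ids; the ids are hypothetical, and k = 60 is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk ids into one fused ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes more for chunks it ranks near the top.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from a vector search and a keyword search.
vector_hits = ["c3", "c1", "c7"]
keyword_hits = ["c1", "c9", "c3"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Chunks that both searches rank well rise to the top, so here "c1" and "c3" come ahead of the chunks found by only one search.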

Teams often rush the chunking and skip the evaluation set, then blame the model when answers disappoint. Spend the effort where the value sits: chunking that follows the natural sections of your documents, a retriever tuned on your data, and a set of 50 to 100 real questions with known answers to test against before anyone calls the system finished.
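
Chunking along natural sections can be as simple as splitting at headings. The sketch below assumes your documents have been converted to text with "#"-style heading lines, which will not be true of every source; the Chunk type and the sample policy are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: headings are lines that start with one to six "#" characters.
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


@dataclass
class Chunk:
    heading: str
    text: str


def chunk_by_section(document: str) -> list[Chunk]:
    """Split a document into one chunk per heading, keeping the heading as metadata."""
    sections: list[tuple[str, list[str]]] = [("Preamble", [])]
    for line in document.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading starts a new section.
            sections.append((match.group(1).strip(), []))
        else:
            sections[-1][1].append(line)
    chunks = []
    for heading, body_lines in sections:
        body = "\n".join(body_lines).strip()
        if body:  # skip sections with no text under them
            chunks.append(Chunk(heading, body))
    return chunks


# Hypothetical policy text, invented for this example.
policy = """# Leave
Requests need 14 days notice.

# Expenses
Claims over $500 need approval.
"""
chunks = chunk_by_section(policy)
```

Keeping the heading with each chunk gives the retriever and the model context that an arbitrary fixed-size split would throw away. Very long sections may still need a secondary split.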

Where open source models fit

Open source models in 2026 are genuinely capable, and at the embedding and retrieval layer they are often the right choice. Embedding models are small, run cheaply on your own infrastructure, and swapping one for another later is a contained change rather than a rebuild.

The answering model is a separate decision. Self-hosting a large open model means provisioning GPUs, patching, monitoring and capacity planning, and those costs land on your team every month. For a business with strong infrastructure skills and high volume, that can pay off. For most Australian SMBs, a managed model costs less once engineering time is counted honestly. We build the retrieval layer to be portable, then default to Claude for the answering step because it follows instructions reliably and sticks to the retrieved passages instead of improvising around them.

  • Use open source embeddings where data must stay on your own infrastructure

  • Keep the retrieval layer portable so the answering model can change later

  • Choose the answering model on accuracy against your evaluation set, not on a leaderboard
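
The last two points can be sketched together: a small interface that any answering model can sit behind, and a function that scores a model against your evaluation set. Everything named here is hypothetical, including the stand-in model and the three sample questions, and the substring check is a crude scoring rule that a real harness would replace with something stricter.

```python
from typing import Callable, Protocol


class Answerer(Protocol):
    """Anything with this method can be swapped in as the answering model."""

    def answer(self, question: str, passages: list[str]) -> str: ...


class EchoFirstPassage:
    """Stand-in model for testing: returns the first retrieved passage verbatim."""

    def answer(self, question: str, passages: list[str]) -> str:
        return passages[0] if passages else "I don't know."


def accuracy(
    model: Answerer,
    eval_set: list[dict[str, str]],
    retrieve: Callable[[str], list[str]],
) -> float:
    """Fraction of questions whose reply contains the expected answer text."""
    correct = 0
    for case in eval_set:
        passages = retrieve(case["question"])
        reply = model.answer(case["question"], passages)
        if case["expected"].lower() in reply.lower():
            correct += 1
    return correct / len(eval_set)


def fake_retrieve(question: str) -> list[str]:
    """Hypothetical retriever: looks up passages by a keyword in the question."""
    store = {
        "leave": "Leave requests need 14 days notice.",
        "expense": "Claims over $500 need approval.",
    }
    return [text for key, text in store.items() if key in question.lower()]


# Hypothetical evaluation cases; a real set would hold 50 to 100.
eval_set = [
    {"question": "How much notice for leave?", "expected": "14 days"},
    {"question": "When does an expense need approval?", "expected": "over $500"},
    {"question": "Who approves travel?", "expected": "the CFO"},
]
result = accuracy(EchoFirstPassage(), eval_set, fake_retrieve)
```

Because the retriever and the evaluation set stay fixed, comparing a managed model with a self-hosted one means writing one more class with an answer method and running the same function again.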

Doing it well in Australia

Local context shapes a RAG build in ways a generic tutorial will miss.

  • Keep the document store and vector index in an Australian region when contracts or client data are involved

  • Treat personal information in line with the Privacy Act, including retrieval logs, which often reproduce sensitive passages verbatim

  • Control which documents each user can retrieve; access rules belong in the retriever itself, not just the front end

  • Keep an audit trail of what was retrieved for each answer; APRA-regulated firms will want this for any customer-facing use

That last point catches teams out. The model's answer is only half the record. If a dispute arises, you want to be able to show which passages the system retrieved and why it retrieved them.
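
A minimal sketch of both ideas, access rules applied inside the retriever and an audit record of what was retrieved. It assumes each chunk carries the groups allowed to read it and uses a toy term-count score; the field names, group names and sample chunks are hypothetical.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    allowed_groups: frozenset[str]


def retrieve_for_user(
    query_terms: list[str],
    user_groups: frozenset[str],
    index: list[Chunk],
    k: int = 3,
) -> list[tuple[int, Chunk]]:
    """Filter by permission first, then rank only what the user may see."""
    visible = [c for c in index if c.allowed_groups & user_groups]
    scored = [(sum(t in c.text.lower() for t in query_terms), c) for c in visible]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [(score, chunk) for score, chunk in ranked[:k] if score > 0]


def audit_record(user_id: str, question: str, hits: list[tuple[int, Chunk]]) -> str:
    """One JSON line per answer: who asked, what was retrieved, and each score."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "question": question,
            # Chunk ids rather than passage text, so the log itself does not
            # reproduce sensitive passages.
            "retrieved": [{"chunk_id": c.chunk_id, "score": s} for s, c in hits],
        }
    )


# Hypothetical index and user, invented for this example.
index = [
    Chunk("hr-1", "Salary bands for senior staff", frozenset({"hr"})),
    Chunk("ops-1", "Staff onboarding checklist", frozenset({"hr", "ops"})),
]
hits = retrieve_for_user(["staff"], frozenset({"ops"}), index)
record = json.loads(audit_record("u42", "What is in staff onboarding?", hits))
```

The operations user never sees the HR-only chunk because it is removed before ranking, not hidden afterwards in the interface. The score stored with each chunk id goes some way towards recording why a passage was retrieved.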

What it costs and what it returns

A capable RAG build for an Australian SMB typically lands between $20,000 and $45,000. That covers chunking and indexing the document set, retriever tuning, an evaluation harness and a working interface. Running costs after that are modest at typical SMB volumes: a few hundred dollars a month for hosting and model calls.

The return shows up as time. If a 20-person Sydney team saves two hours per person each week by asking the knowledge base instead of interrupting a colleague, that is close to 2,000 hours a year. Valued at about $30 an hour, a conservative figure for professional staff, that is worth roughly $60,000 a year, so a serious build recovers its cost well inside the first year.
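
The payback claim can be checked with a few lines of arithmetic. The team size, hours saved, $60,000 figure and build range come from the text above; the 48 working weeks is an assumption, and running costs are left out to keep the sketch simple.

```python
TEAM_SIZE = 20
HOURS_SAVED_PER_PERSON_PER_WEEK = 2
WORKING_WEEKS_PER_YEAR = 48  # assumption: allows for leave and public holidays
ANNUAL_VALUE = 60_000  # dollars per year, from the estimate above

hours_per_year = TEAM_SIZE * HOURS_SAVED_PER_PERSON_PER_WEEK * WORKING_WEEKS_PER_YEAR

# The hourly rate the $60,000 figure implies under these assumptions.
implied_hourly_rate = ANNUAL_VALUE / hours_per_year


def payback_months(build_cost: float, annual_value: float = ANNUAL_VALUE) -> float:
    """Months of time savings needed to cover the build cost (running costs excluded)."""
    return build_cost / (annual_value / 12)


low = payback_months(20_000)   # cheapest build in the quoted range
high = payback_months(45_000)  # most expensive build in the quoted range
```

Under these assumptions the savings come to 1,920 hours a year at an implied $31.25 an hour, and the build pays for itself in four to nine months. Adding a few hundred dollars a month in running costs stretches that slightly but keeps it inside the first year.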

Common mistakes to avoid

The same few failures account for most disappointing RAG projects, and all of them are avoidable with a careful start.

  • Choosing the model first and treating retrieval as plumbing

  • Chunking documents at arbitrary sizes instead of natural sections

  • Skipping the evaluation set, so quality is judged by anecdote

  • Ignoring permissions, so any staff member can retrieve any document

  • Leaving the index stale while the source documents change weekly

  • Self-hosting on principle when a managed model is cheaper in practice
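
The stale-index failure is one of the easier ones to guard against. One simple approach, sketched here with hypothetical file names and contents, is to store a content hash for every indexed document and compare it with the current source on a schedule.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(
    sources: dict[str, str], indexed_hashes: dict[str, str]
) -> dict[str, list[str]]:
    """Compare current documents with the hashes recorded at last indexing."""
    # New or changed documents: no stored hash, or the stored hash differs.
    to_index = sorted(
        doc_id
        for doc_id, text in sources.items()
        if indexed_hashes.get(doc_id) != content_hash(text)
    )
    # Documents still in the index but no longer in the source.
    to_delete = sorted(doc_id for doc_id in indexed_hashes if doc_id not in sources)
    return {"index": to_index, "delete": to_delete}


# Hypothetical state: one document changed, one added, one retired.
indexed = {
    "leave.md": content_hash("v1 leave policy"),
    "old.md": content_hash("retired"),
}
sources = {
    "leave.md": "v2 leave policy",
    "expenses.md": "new expenses policy",
}
plan = plan_reindex(sources, indexed)
```

Only the documents in the plan need re-chunking and re-embedding, which keeps a weekly or nightly refresh cheap.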

Key takeaways

  • Retrieval quality, not model brand, decides RAG quality

  • Open source fits best at the embedding and infrastructure layers; pick the answering model on measured accuracy

  • Australian builds need data residency, Privacy Act handling and per-user access rules from day one

  • Budget $20,000 to $45,000 for a serious build and measure the hours it gives back

Talk to a Claude specialist

Automata AI is a Sydney-based consultancy that builds retrieval first and picks the model second, with Claude as the default for grounded, reliable answers on Australian business data. If you are weighing up a RAG build, book a short brainstorm and we will map the fastest path to a system your team actually trusts.
