What Happens When AI Gets Bank Reconciliation Wrong?

AI bank reconciliation errors are real — here's exactly how they happen, who's responsible, and why AI-assisted coding is more auditable than manual.

Daniel Hu
Compliance lead · 24 May 2026 · 7 min read
Last reviewed against current ATO guidance: 23 May 2026. Always confirm current thresholds, rates, and dates at ato.gov.au.

AI gets bank reconciliation wrong when it assigns an incorrect account code, GST treatment, or categorisation to a transaction. That happens because the transaction is ambiguous, because the training data didn't cover a similar pattern, or because the rules governing that transaction changed and the model hasn't been updated to reflect it. It is not a hypothetical risk. It happens. The more important questions are what systems exist to catch it, who bears responsibility when it slips through, and whether AI-assisted reconciliation is actually riskier than doing it by hand.

This post answers those questions directly.


How AI Bank Reconciliation Actually Works — Three Layers

Most AI reconciliation tools, including Reconlink, do not rely on a single model. They use a layered approach that mirrors how an experienced bookkeeper escalates uncertainty.

Layer 1: Rules

The first pass is deterministic. Hard rules handle transactions that are unambiguous: a direct debit from a known payroll provider maps to wages; a payment to a known utility maps to utilities. Rules are fast, consistent, and correct — until an edge case falls outside their scope.

Rules fail when the transaction description is non-standard, when a known payee changes the nature of what they're billing for, or when a new vendor appears that no rule covers. That's where layer two takes over.
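As a rough illustration of the rules layer, the sketch below matches a bank description against a small list of payee keywords. It is not Reconlink's implementation: the `Rule` type, the `apply_rules` function, the payee names, and the account codes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    payee_keyword: str  # substring expected in the bank description
    account_code: str   # ledger account assigned on a match


# Hypothetical rule set; payee names and account codes are made up.
RULES = (
    Rule("ACME PAYROLL", "wages"),
    Rule("CITY POWER", "utilities"),
)


def apply_rules(description: str, rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the account code of the first matching rule, or None."""
    normalised = description.upper()
    for rule in rules:
        if rule.payee_keyword in normalised:
            return rule.account_code
    return None  # no rule matched: hand the transaction to the next layer
```

A return value of `None` is the signal that the transaction falls outside the rules and needs the classifier.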

Layer 2: Machine Learning Classification

When no rule matches, a trained ML classifier takes the transaction and predicts the most likely account code based on patterns learned from historical data. This is where the bulk of AI bank reconciliation happens.

A classifier's prediction can need attention in two ways. The first is false confidence: the model assigns a high confidence score to the wrong category because the transaction superficially resembles a different one in its training data. The second is correct uncertainty: the model assigns a low confidence score to a transaction it genuinely cannot categorise well. The second case is healthy, because the doubtful prediction is flagged rather than posted. That's the system working as intended. The first case is the one that requires active mitigation.
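One way to look for false confidence is to compare confidence scores with the codes a human later confirmed. The sketch below assumes you have a history of already-reviewed transactions as `(confidence, predicted_code, reviewed_code)` tuples. The function name, the tuple layout, and the sample data are illustrative assumptions, not a Reconlink export format.

```python
def auto_post_error_rate(history, threshold):
    """Summarise what would have auto-posted at a given threshold.

    history: iterable of (confidence, predicted_code, reviewed_code).
    Returns (share_auto_posted, error_rate_among_auto_posted).
    """
    history = list(history)
    auto = [(pred, actual) for conf, pred, actual in history if conf >= threshold]
    if not history or not auto:
        return 0.0, 0.0
    wrong = sum(1 for pred, actual in auto if pred != actual)
    return len(auto) / len(history), wrong / len(auto)


# Made-up sample of already-reviewed transactions.
sample = [
    (0.99, "wages", "wages"),
    (0.97, "utilities", "utilities"),
    (0.96, "office", "equipment"),  # false confidence: high score, wrong code
    (0.60, "travel", "travel"),     # correct uncertainty: low score, reviewed
]

share, err = auto_post_error_rate(sample, 0.95)
print(round(share, 2), round(err, 2))  # 0.75 0.33
```

In this toy sample, raising the threshold to 0.98 would exclude the falsely confident item, at the cost of sending more transactions to review.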

Layer 3: LLM-Assisted Review

For transactions that are structurally unusual — descriptive but not matching known patterns, complex split-coding candidates, or transactions with potential regulatory nuance — a large language model can be used to reason about the context. This layer adds interpretive capability that pure classification lacks. But it also introduces its own failure mode: LLMs can generate confident-sounding rationales for incorrect conclusions. This is why LLM outputs are never posted automatically. They feed into the human review queue.
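The escalation order across the three layers can be sketched as a single function. This is a simplification under stated assumptions: it auto-posts rule matches, sends every low-confidence item to the LLM layer (the text above reserves that layer for structurally unusual transactions), and treats a score equal to the threshold as postable. The `Suggestion` type and the three callables passed in are hypothetical stand-ins for the real layers.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Suggestion:
    account_code: str
    source: str       # "rule", "ml", or "llm"
    auto_post: bool   # False means the item goes to the human review queue
    rationale: str = ""


def categorise(
    description: str,
    rule_lookup: Callable[[str], Optional[str]],
    ml_classify: Callable[[str], Tuple[str, float]],
    llm_explain: Callable[[str], Tuple[str, str]],
    posting_threshold: float = 0.95,
) -> Suggestion:
    # Layer 1: deterministic rules.
    code = rule_lookup(description)
    if code is not None:
        return Suggestion(code, "rule", auto_post=True)

    # Layer 2: ML classifier returns (code, confidence).
    code, confidence = ml_classify(description)
    if confidence >= posting_threshold:
        return Suggestion(code, "ml", auto_post=True)

    # Layer 3: LLM returns (code, rationale); its output is never auto-posted.
    code, rationale = llm_explain(description)
    return Suggestion(code, "llm", auto_post=False, rationale=rationale)
```

The point of the design is the last branch: whatever the LLM concludes, `auto_post` stays `False`, so a confident-sounding rationale still has to pass a human.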


What the Human Review Step Looks Like in Practice

When Reconlink's confidence scoring places a transaction below the posting threshold, it moves to a review queue visible to the assigned bookkeeper or team lead. The reviewer sees:

  • The transaction description and amount
  • The AI's suggested categorisation and confidence score
  • The reasoning the AI used (where the LLM layer was involved)
  • Any similar transactions previously coded by the firm and how they were handled

The reviewer approves, overrides, or escalates. If they override, their correction is recorded and — depending on settings — can be used to improve future suggestions for similar transactions.
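The three reviewer actions can be modelled as a small record that keeps the AI's suggestion next to the reviewer's final code. This is a minimal sketch with hypothetical names (`Action`, `ReviewDecision`, `review`). It does not describe Reconlink's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Action(Enum):
    APPROVE = "approve"
    OVERRIDE = "override"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class ReviewDecision:
    transaction_id: str
    reviewer: str
    action: Action
    suggested_code: str        # what the AI proposed
    final_code: Optional[str]  # None while the item is escalated
    decided_at: datetime


def review(transaction_id, reviewer, suggested_code, action, override_code=None):
    """Build a decision record; an override must carry a replacement code."""
    if action is Action.OVERRIDE:
        if not override_code:
            raise ValueError("an override needs a replacement code")
        final = override_code
    elif action is Action.APPROVE:
        final = suggested_code
    else:  # Action.ESCALATE: no final code yet
        final = None
    return ReviewDecision(
        transaction_id, reviewer, action, suggested_code, final,
        datetime.now(timezone.utc),
    )
```

Keeping `suggested_code` on the record even after an override is what lets corrections be fed back into future suggestions.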

This is not a rubber-stamp step. A trained bookkeeper reviewing a queue of flagged transactions is exactly the control you want on ambiguous items. The AI handles the high-volume, high-confidence work. The human handles the edge cases.


The Liability Question: Who Is Responsible When AI Miscodes a GST Transaction?

This is the question we hear most often from bookkeepers and CA firms considering AI reconciliation tools, and it deserves a direct answer.

The bookkeeper or firm remains professionally responsible for the accuracy of a client's accounts. An AI tool does not change this. The Tax Agent Services Act 2009 (Cth) and the obligations of registered BAS agents and tax agents apply to the person or firm lodging or advising on the accounts — not to the software used to prepare them.

This means two things. First, any AI reconciliation tool used in professional practice must be used with appropriate review. Posting AI-generated codings without human oversight is not a defensible practice. Second, the existence of a human review step — and the audit trail that accompanies it — is your professional protection.

If a GST transaction is miscoded and it reaches the BAS, the responsibility lies with the agent who certified that BAS. The question then is whether the agent followed a reasonable process. Using AI with documented review workflows and complete audit logs is a reasonable process. Relying on memory and manual spreadsheets with no trail is often less defensible than it appears.

The practical implication: use AI reconciliation with human review on, keep your audit logs, and document your workflows. That is a stronger position — professionally and with the ATO — than manual coding with no audit trail.


AI Bank Reconciliation Accuracy: The Confidence Scoring System

Reconlink assigns a confidence score to every AI-generated categorisation. Scores are expressed as a percentage representing the model's certainty about the suggested code.

The firm's administrator can set a posting threshold. For example, transactions above 95% confidence post automatically, and transactions below that threshold route to review. Conservative practices can set this threshold higher, which increases review volume but reduces the risk of auto-posted errors. Higher-volume practices with well-maintained rule sets can lower the threshold once they've validated the model's accuracy on their own transaction history.

Transactions that score very low are not silently ignored — they appear in the review queue with a flag indicating high uncertainty. A transaction the AI cannot confidently categorise does not disappear; it waits for a human.
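The threshold logic described in this section can be sketched as a small routing function. The 0.95 default mirrors the example above. The 0.50 cutoff for the high-uncertainty flag, the return labels, and the choice to auto-post a score exactly at the threshold are assumptions made for illustration.

```python
def route(
    confidence: float,
    posting_threshold: float = 0.95,
    high_uncertainty_below: float = 0.50,  # assumed cutoff, not a documented value
) -> str:
    """Decide where a scored transaction goes next."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= posting_threshold:
        return "auto_post"
    if confidence < high_uncertainty_below:
        # Still queued for a human, just marked as highly uncertain.
        return "review_high_uncertainty"
    return "review"
```

Every branch returns a destination, so there is no path where a low score causes a transaction to be dropped.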


The Audit Trail: Every Decision Is Traceable

The audit trail is not just a compliance feature. It is the mechanism that makes AI reconciliation more defensible than manual reconciliation.

Every transaction processed by Reconlink carries a full history: the original bank feed entry, the AI's suggestion and confidence score, any human review action taken, the identity of the reviewer, and the timestamp of each event. If a transaction is overridden, the override is logged against the original AI suggestion — you can see both.

This means that if the ATO asks how a particular transaction was coded, you can produce a complete record: what the AI suggested, what the human decided, and when. Compare this to a manual coding workflow where the only record is the final journal entry. The AI workflow produces more evidence, not less.
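The kind of per-transaction history described here can be sketched as an append-only event log. The event kinds, field names, and the `AuditLog` class are hypothetical. The sketch only shows how the bank feed entry, the AI suggestion, and the review action can sit side by side with an actor and a timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class AuditEvent:
    transaction_id: str
    kind: str      # e.g. "bank_feed", "ai_suggestion", "review"
    detail: Dict   # event-specific payload
    actor: str     # "system" or the reviewer's identity
    at: datetime   # UTC timestamp of the event


class AuditLog:
    """Append-only: events are added, never edited or removed."""

    def __init__(self) -> None:
        self._events: List[AuditEvent] = []

    def record(self, transaction_id, kind, detail, actor="system") -> AuditEvent:
        event = AuditEvent(
            transaction_id, kind, dict(detail), actor, datetime.now(timezone.utc)
        )
        self._events.append(event)
        return event

    def history(self, transaction_id) -> List[AuditEvent]:
        """All events for one transaction, in the order they were recorded."""
        return [e for e in self._events if e.transaction_id == transaction_id]


log = AuditLog()
log.record("t1", "bank_feed", {"description": "STATIONERY CO 123", "amount": -249.00})
log.record("t1", "ai_suggestion", {"code": "office", "confidence": 0.71})
log.record("t1", "review", {"action": "override", "code": "equipment"}, actor="dana")
```

Calling `log.history("t1")` returns the original entry, the suggestion, and the override together, which is the record you would produce if asked how the transaction was coded.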

Audit logs in Reconlink are retained for a minimum of seven years, consistent with ATO record-keeping requirements.


Why AI-Assisted Coding Is More Auditable Than Unchecked Manual Coding

Manual reconciliation has an auditing problem that often goes unacknowledged: when a staff member codes a transaction incorrectly, there is frequently no record of the reasoning, no confidence score, and no systematic review trigger. The error is as invisible as the correct coding next to it.

AI reconciliation with a human review layer adds structure to a process that is otherwise largely invisible. Every AI suggestion is documented. Low-confidence items are surfaced rather than guessed at. Reviewer decisions are attributed and timestamped. Patterns in errors can be identified and corrected at the rule level rather than discovered at year-end.

The risk with AI is not that it creates new categories of error that manual processes avoid. The risk is that volume and speed can amplify errors if review controls are not properly configured. Properly configured — with appropriate confidence thresholds, active review queues, and trained reviewers — AI reconciliation produces a more complete and reviewable record than the manual equivalent.


Frequently Asked Questions

What happens if the AI categorises a transaction incorrectly and it's already been posted?

Any posted transaction can be corrected in the normal way — through the reconciliation interface or your accounting platform. The correction is logged in the audit trail. Reconlink does not prevent corrections; it records them. If a systematic error is found (for example, a rule that was misconfigured), the affected transactions can be identified via the audit log and corrected in bulk.
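Finding the transactions touched by a misconfigured rule amounts to a filter over the audit log. The sketch below assumes each log event is a dictionary with `kind`, `transaction_id`, and, for auto-posted items, a `rule_id`. Those field names and the function are hypothetical, and the actual bulk-correction step is left out.

```python
def affected_by_rule(events, rule_id):
    """Return ids of transactions posted by rule_id with no correction logged."""
    posted = []
    corrected = set()
    for event in events:
        if event["kind"] == "auto_post" and event.get("rule_id") == rule_id:
            posted.append(event["transaction_id"])
        elif event["kind"] == "correction":
            corrected.add(event["transaction_id"])
    return [tid for tid in posted if tid not in corrected]


# Made-up log extract.
events = [
    {"kind": "auto_post", "transaction_id": "t1", "rule_id": "r7"},
    {"kind": "auto_post", "transaction_id": "t2", "rule_id": "r7"},
    {"kind": "auto_post", "transaction_id": "t3", "rule_id": "r2"},
    {"kind": "correction", "transaction_id": "t1"},
]

print(affected_by_rule(events, "r7"))  # ['t2']
```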

Does using AI reconciliation affect my obligations as a registered BAS agent?

No. Your obligations under the Tax Agent Services Act 2009 (Cth) and the TPB's Code of Professional Conduct apply regardless of the tools you use. Using AI does not create additional obligations, but it does not reduce existing ones either. You remain responsible for the accuracy of work you produce. AI is a tool that assists your work; it does not replace your professional judgement or liability.

Can I see what the AI was "thinking" when it made a suggestion?

For transactions processed by the LLM layer, Reconlink displays the reasoning the model used alongside the suggestion. For ML-layer categorisations, the confidence score and the closest matching historical transactions are shown. You will not see a black-box result with no explanation.

How does Reconlink handle transactions it has never seen before — new payees, unusual descriptions?

New payees and unusual transaction descriptions fall through the rules layer and are assessed by the ML classifier. If confidence is low, they route to the human review queue. A bookkeeper reviews the transaction and codes it manually. That manual coding is then available as a data point for future similar transactions from the same payee.

Is it safe to use AI reconciliation for clients with complex GST situations — partial input tax credits, mixed-use assets, and so on?

Complex GST situations benefit from higher confidence thresholds and active review. Reconlink does not automatically post transactions involving certain flagged GST codes without confirmation. For clients with complex structures, we recommend configuring the platform to route all GST-coded transactions to review until you have validated the model's accuracy on that client's transaction history. Your firm's rules can be configured per client.
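A per-client configuration of this kind might look like the sketch below. The `ClientConfig` fields, the GST code labels, and the `needs_review` function are hypothetical illustrations of the recommendation above, not Reconlink's settings schema.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ClientConfig:
    posting_threshold: float = 0.95
    review_all_gst: bool = False  # route every GST-coded item to review
    always_review_gst_codes: FrozenSet[str] = frozenset()  # flagged codes


def needs_review(confidence: float, gst_code: Optional[str], config: ClientConfig) -> bool:
    """True if the transaction must wait for a human before posting."""
    if config.review_all_gst and gst_code is not None:
        return True
    if gst_code in config.always_review_gst_codes:
        return True
    return confidence < config.posting_threshold


# A client with complex GST: review everything GST-coded until validated.
complex_client = ClientConfig(review_all_gst=True)
# A simpler client: only a flagged code is held back, whatever the score.
simple_client = ClientConfig(always_review_gst_codes=frozenset({"PARTIAL_CREDIT"}))
```

Once the model's accuracy has been validated on a client's history, switching `review_all_gst` off for that client restores threshold-based posting without affecting other clients.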


Summary

AI bank reconciliation errors are real, they occur at specific points in a layered system, and they are mitigated — not eliminated — by confidence scoring and human review. The professional responsibility for reconciled accounts stays with the registered agent. The audit trail produced by a well-configured AI system is more complete and reviewable than the record most manual workflows produce.

The right question is not whether to avoid AI because it makes mistakes. Manual processes make mistakes too, silently and without audit trails. The right question is whether the review controls are properly configured and whether your team is using them.


For more on how AI fits into Australian bookkeeping practice, see our post on AI bookkeeping in Australia. Questions about Reconlink's accuracy controls or data handling? Contact us at security@reconlink.com.au.
