TL;DR
In 2026, AI in financial reporting moved from a footnote in your SOX walkthrough to its own line of audit inquiry. Three changes drove this: the PCAOB’s amended AS 2201 and AS 2101 take effect for fiscal years beginning on or after December 15, 2026; Big Four firms now train audit staff specifically to scrutinize AI-touched controls; and continuous control monitoring is replacing point-in-time evidence.
External auditors are asking 12 recurring questions about AI in financial reporting workflows:
- Where in your financial reporting process does AI make or influence a decision?
- Walk me through how this AI-touched control operates, in plain language.
- How is the AI’s decision logic version-controlled?
- Show me the audit trail for a specific decision.
- How do you know the AI is doing what your documentation says it does?
- What happens when the AI is uncertain or wrong?
- Who has access to change the AI, and how is that access reviewed?
- How is AI-generated evidence itself verified?
- How do you handle changes to the underlying model?
- What is the boundary between AI judgment and human judgment in this process?
- How would you detect if the AI started behaving differently?
- If we identified a deficiency in this AI control, what is your remediation path?
The question underneath all 12 is whether your AI behaves like a control (governed, deterministic, evidenced, version-controlled) or like a tool. Deterministic, neurosymbolic AI is auditable because it was built to behave like a control. Probabilistic AI is harder to audit because it was built to behave like a tool. In 2026, that distinction is the audit.
A year ago, “we use AI for that” was a footnote in a SOX walkthrough. In 2026, it is the walkthrough.
Three things changed at once. The PCAOB’s amended AS 2201 and AS 2101 take effect for audits of fiscal years beginning on or after December 15, 2026, formalizing a top-down, risk-based approach to integrated audits. The EU AI Act moved from headlines to enforcement. And Big Four firms started training audit staff specifically to scrutinize AI-generated evidence, AI-touched controls, and AI-driven exception resolutions.
For finance and IT leaders, this means one thing. If an AI agent or automation platform touches any process within scope of internal controls over financial reporting (ICFR), your auditor is going to ask about it. Probably in detail. Probably with follow-up questions you have not prepared for.
This post walks through the 12 questions external auditors are actually asking about AI automation in 2026, what they expect to see, and the kind of evidence that satisfies a Big Four senior manager on the first pass instead of the third. If you are also weighing how the underlying platform choice shapes those answers, our deep dive on finance automation in 2026 (Kognitos vs. traditional RPA) is a useful companion read.
A note up front. This is not legal or audit advice. The specifics of your control environment, your auditor, and your industry will shape the questions you face. But the 12 categories below are what we see consistently across customer audit cycles, and they map directly to what PCAOB-aligned firms are now trained to test.
Why this matters more in 2026 than it did in 2025
Before the questions, the context. Three shifts are reshaping how AI automation gets audited:
1. AS 2201’s expanded benchmarking provision. The amended standard says that for fully automated application controls, if the ITGCs over the underlying system (particularly change management and access) are effective and the control logic has not changed, an auditor can conclude the control remains effective without repeating prior-year operating effectiveness testing. This is a gift to deterministic, version-controlled, English-as-code automation. It is a problem for probabilistic AI whose “logic” is implicit in model weights that change without a change log.
2. The death of the point-in-time screenshot. Auditors in 2026 are trained to spot AI-manipulated images. The standard for application-level evidence is shifting toward continuous control monitoring (CCM) with NTP-synced timestamps, DOM snapshots, user attribution, and verifiable execution logs. Screenshots in a SharePoint folder will not survive a 2026 audit cycle.
3. AI as a SOX-relevant access risk. Boards have spent years asking whether people have too much access to financial systems. In 2026, the harder question is whether AI agents do, and whether you can prove it to regulators, auditors, and your board. This is exactly the architectural question we explored in AI Governance Is Not a Checklist. It’s an Architectural Choice.
With that as background, here are the 12 questions.
The 12 questions your SOX auditor will ask
1. “Where in your financial reporting process does AI make or influence a decision?”
This is the scoping question, and it is the question most companies get wrong. Auditors are not asking where you have “deployed AI.” They are asking where AI participates in a process within ICFR scope. That includes AI that summarizes, classifies, recommends, routes, drafts, reconciles, or flags — not just AI that approves or pays.
What satisfies them. A current, dated AI inventory mapped to your process risk taxonomy. For each AI touchpoint: the system, the model or platform, the type of decision it influences, the financial assertion it relates to (existence, completeness, valuation, rights and obligations, presentation and disclosure), and the human control, if any, that follows it. If you cannot produce this in under 30 minutes, your auditor will assume the inventory is incomplete.
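For illustration, a single inventory entry might look like the sketch below. The field names and values are hypothetical, not a prescribed schema; the point is that every AI touchpoint carries the same minimum set of attributes and can be produced on demand.

```python
# One entry in a hypothetical AI inventory; field names are illustrative,
# not a prescribed schema.
ai_inventory_entry = {
    "process": "AP invoice processing",
    "platform": "Kognitos",  # or the model/vendor in use
    "decision_influenced": "3-way match pass/fail and exception routing",
    "financial_assertions": ["existence", "completeness", "valuation"],
    "icfr_control_id": "AP-07",  # hypothetical control reference
    "downstream_human_control": "AP supervisor approves exceptions over $10K",
    "last_reviewed": "2026-01-15",
}
```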
Kognitos angle. Every Kognitos automation maps to a named English-language process with a documented purpose, a defined data scope, and an explicit list of systems it touches. The inventory exists by construction, not by quarterly hunt-and-gather.
2. “Walk me through how this AI-touched control operates, in plain language.”
PCAOB walkthroughs are designed to confirm that the auditor understands the control as it actually operates, not as it is documented. AI-touched controls fail walkthroughs most often when the operator cannot explain the AI’s reasoning in plain language, or when the system’s behavior in the walkthrough does not match the documentation.
What satisfies them. A walkthrough script that traces a real transaction from initiation to recording, with each AI step described in plain English and tied to a specific log entry. Auditors do not want “the model recommended this.” They want “the system applied the 3-way match policy that says ‘an invoice matches a PO when the vendor and PO number agree exactly and the totals agree within a 2% tolerance,’ and recorded this transaction as a match because [specific values].”
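To make the shape of that answer concrete, here is a minimal sketch of a deterministic 3-way match check that produces its own plain-language record. The tolerance comes from the example above; the function and field names are illustrative, not any platform's actual API.

```python
def three_way_match(invoice: dict, po: dict, tolerance: float = 0.02) -> dict:
    """Apply the stated 3-way match policy and record the reasoning in plain language.

    Policy (from the example above): an invoice matches a PO when the vendor
    and PO number agree exactly and the totals agree within a 2% tolerance.
    """
    vendor_ok = invoice["vendor"] == po["vendor"]
    po_number_ok = invoice["po_number"] == po["po_number"]
    total_ok = abs(invoice["total"] - po["total"]) <= tolerance * po["total"]
    return {
        "matched": vendor_ok and po_number_ok and total_ok,
        "reasoning": (
            f"vendor match={vendor_ok}, PO number match={po_number_ok}, "
            f"invoice total {invoice['total']} vs PO total {po['total']} "
            f"within {tolerance:.0%} tolerance={total_ok}"
        ),
    }
```

The same check runs the same way on every transaction, and the reasoning string is the walkthrough answer.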
Kognitos angle. English-as-code is built for this question. The policy that runs the automation is the same English the auditor reads in the walkthrough. There is no translation layer between what the AI is documented to do and what the AI does; they are the same English sentence.
3. “How is the AI’s decision logic version-controlled?”
This is the AS 2201 benchmarking question in disguise. If your auditor can establish that (a) the underlying ITGCs are effective and (b) the AI’s decision logic has not changed since prior-year testing, they can rely on prior-year operating effectiveness conclusions. If they cannot establish (b), every change to the model or to the prompt re-opens operating effectiveness testing.
What satisfies them. A version control system for the AI’s decision logic with dated entries, change descriptions, the requestor, the approver, and a link to the change ticket. Critically, this needs to cover not just code changes but prompt changes, model upgrades, fine-tuning events, and any change to the data the system uses to reason.
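A minimal sketch of one such change record follows, with hypothetical field names. The essential property is that prompt changes, model upgrades, and data changes get the same treatment as code changes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class LogicChangeRecord:
    """One decision-logic change-log entry (field names illustrative)."""
    changed_at: datetime
    requestor: str
    approver: str
    change_ticket: str   # ID in the ticketing system
    change_type: str     # "policy edit", "prompt change", "model upgrade", "data change"
    description: str     # plain-language diff of what changed

record = LogicChangeRecord(
    changed_at=datetime.now(timezone.utc),
    requestor="a.chen",
    approver="s.okafor",
    change_ticket="CHG-4182",
    change_type="policy edit",
    description="Raised 3-way match total tolerance from 1% to 2%",
)
```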
Kognitos angle. Every change to a Kognitos automation is logged with timestamp, author, approver, and a plain-English diff of what changed. Model upgrades are explicit events with their own audit trail.
4. “Show me the audit trail for a specific decision.”
This is the test of last resort. The auditor picks a transaction, often one they think will be hard, and asks you to reproduce exactly how the AI arrived at its conclusion. They want timestamps. They want inputs. They want the specific rule applied. They want the resulting action. They want it all linked to the user or agent that took it.
What satisfies them. A full execution log per transaction that includes: timestamp, triggering event, inputs received, the specific rule or policy invoked, the AI’s reasoning expressed in plain language, the action taken, the system of record updated, and the human (if any) who reviewed or approved the result. If your AI platform’s audit trail says “Decision: APPROVED. Confidence: 94%,” you do not have an audit trail. You have a guess with a number attached.
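Here is an illustrative per-transaction log record covering those fields. The schema is hypothetical; what matters is that every field an auditor will ask for is captured at execution time, not reconstructed later.

```python
import json
from datetime import datetime, timezone

# Illustrative per-transaction execution log record; the schema is
# hypothetical, not any platform's actual format.
execution_log_entry = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "triggering_event": "invoice INV-20931 received via AP inbox",
    "inputs": {"invoice_total": 10_150.00, "po_total": 10_000.00, "po_number": "PO-5521"},
    "policy_invoked": "AP-07 three-way match, 2% total tolerance",
    "reasoning": "vendor and PO number agree; totals differ by 1.5%, within tolerance",
    "action_taken": "invoice posted as matched",
    "system_of_record": "ERP journal entry JE-88412",
    "reviewed_by": None,  # below the human-review threshold in this hypothetical
}
print(json.dumps(execution_log_entry, indent=2))
```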
Kognitos angle. This is the central design promise. Every Kognitos decision is a deterministic execution of a stated English policy, logged with the inputs, the rule, the reasoning, and the action, end to end. We do not output confidence scores in place of explanations.
5. “How do you know the AI is doing what your documentation says it does?”
This is the testing-of-design question. AS 2201 requires that the design of a control be evaluated, not just its operation. Auditors want to know how you confirm that the AI’s actual behavior matches its documented intent.
What satisfies them. Documented test cases that cover both expected behavior and known edge cases, run on a defined cadence (typically quarterly for SOX-relevant controls), with results retained as evidence. The strongest evidence package includes both positive tests (the AI did the right thing on this case) and negative tests (the AI correctly refused or escalated this case). Probabilistic AI systems have a harder time providing the negative case, because their “refusal” is itself probabilistic.
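A minimal sketch of one positive and one negative test case is below; the cases and thresholds are hypothetical, and the function is a condensed restatement of the question-2 sketch so the block stands alone.

```python
# Hypothetical positive and negative test cases, run on a defined cadence
# and retained as audit evidence.
def three_way_match(invoice: dict, po: dict, tolerance: float = 0.02) -> bool:
    # Condensed restatement of the question-2 sketch.
    return (invoice["vendor"] == po["vendor"]
            and invoice["po_number"] == po["po_number"]
            and abs(invoice["total"] - po["total"]) <= tolerance * po["total"])

def test_match_within_tolerance():
    # Positive case: totals differ by 1.5%, inside the 2% tolerance.
    assert three_way_match(
        {"vendor": "Acme", "po_number": "PO-1", "total": 10_150.0},
        {"vendor": "Acme", "po_number": "PO-1", "total": 10_000.0},
    )

def test_refuses_vendor_mismatch():
    # Negative case: the match must fail deterministically, every run.
    assert not three_way_match(
        {"vendor": "Acme Corp", "po_number": "PO-1", "total": 10_000.0},
        {"vendor": "Acme", "po_number": "PO-1", "total": 10_000.0},
    )
```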
Kognitos angle. Kognitos automations are auto-tested before deployment with simulated scenarios and edge cases. Test results are versioned alongside the automation itself.
6. “What happens when the AI is uncertain or wrong?”
This is the exception handling question, and it is where the gap between “agentic AI” marketing and SOX-defensible automation becomes most visible. Auditors want to see that the system has a defined behavior in failure cases, not an undefined one.
What satisfies them. A documented exception taxonomy with: the conditions that trigger an exception, the automated handling of each condition (escalate, retry, reject, route), the human review path if escalation is required, the SLA for human resolution, and the evidence retained for each exception. Bonus points if the system can explain the exception in plain language to the human reviewer rather than just flagging it.
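An exception taxonomy can be as simple as a table mapping trigger conditions to handling, routing, SLA, and evidence. A hypothetical sketch:

```python
# Hypothetical exception taxonomy: trigger condition -> handling, routing,
# SLA, and retained evidence. Names and values are illustrative.
EXCEPTION_TAXONOMY = {
    "total_outside_tolerance": {
        "handling": "escalate",
        "route_to": "AP supervisor queue",
        "sla_hours": 24,
        "evidence": "execution log + supervisor disposition note",
    },
    "duplicate_invoice_number": {
        "handling": "reject",
        "route_to": "vendor master team",
        "sla_hours": 48,
        "evidence": "execution log + rejection notice to vendor",
    },
    "source_system_timeout": {
        "handling": "retry",
        "max_retries": 3,
        "then_escalate_to": "IT operations queue",
        "evidence": "retry log with timestamps",
    },
}
```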
Kognitos angle. Kognitos’s Resolution Agent explains exceptions in plain English, including what went wrong, why, and what options exist for resolution. The interaction log is itself audit evidence.
7. “Who has access to change the AI, and how is that access reviewed?”
Standard ITGC access management, applied to AI. Auditors will ask about provisioning, deprovisioning, periodic access reviews, privileged access governance, and segregation of duties for everyone who can change an AI’s decision logic, prompts, training data, integrations, or deployment status. This includes humans and any other AI agents with administrative permissions.
What satisfies them. Access lists with role-based justification, dated quarterly access reviews, evidence of timely deprovisioning when roles change, and a documented separation between development, testing, and production environments. AI agents with administrative access should be treated as privileged users for audit purposes.
Kognitos angle. Kognitos supports access controls aligned with SOC 2 Type II, ISO 27001, GDPR, and HIPAA (see our Trust portal), with role-based permissions and quarterly access review workflows that produce review evidence by default.
8. “How is AI-generated evidence itself verified?”
This is the 2026 question. As AI-generated screenshots, AI-summarized reports, and AI-drafted memos enter audit packages, auditors are asking whether the evidence itself is trustworthy. Big Four firms are specifically training staff to scrutinize AI-generated evidence quality and independence.
What satisfies them. Evidence with NTP-synced timestamps, DOM snapshots when applicable, clear attribution of the user or automated agent that produced it, and a chain of custody from production to the audit file. AI-summarized evidence should be paired with the underlying raw evidence the AI summarized.
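One generic pattern behind “tamper-evident” is a hash chain: each evidence record commits to the hash of the record before it, so editing any record invalidates every hash after it. A minimal sketch follows; it illustrates the pattern, not any specific platform's implementation.

```python
import hashlib
import json

GENESIS = "0" * 64

def chain_evidence(records: list[dict]) -> list[dict]:
    """Link evidence records so that editing any one invalidates every later hash."""
    prev_hash, chained = GENESIS, []
    for record in records:
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        chained.append({**record, "prev_hash": prev_hash, "hash": digest})
        prev_hash = digest
    return chained

def verify_chain(chained: list[dict]) -> bool:
    """Recompute every hash; any after-the-fact edit breaks the chain."""
    prev_hash = GENESIS
    for entry in chained:
        body = json.dumps(
            {k: v for k, v in entry.items() if k not in ("prev_hash", "hash")},
            sort_keys=True,
        )
        if entry["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256((prev_hash + body).encode()).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```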
Kognitos angle. Kognitos’s execution logs are timestamped, attributed, and cryptographically verifiable as part of the platform’s tamper-evident audit log. The summary an auditor reads is generated from the same underlying log they can drill into.
9. “How do you handle changes to the underlying model?”
The most underestimated question. If your AI runs on a third-party model (OpenAI, Anthropic, Google, Meta), the model itself can change without your change management process being involved. An auditor will ask how you detect and govern that.
What satisfies them. A documented model governance policy that covers: which models are approved for SOX-relevant processes, how model version pinning is enforced, how model upgrades are tested before promotion to production, how model deprecation is handled, and how the company stays informed of provider-side changes. “We use the latest version of GPT” is not a satisfactory answer.
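A minimal sketch of what fail-closed version pinning can look like in practice; the model identifiers and automation names are hypothetical:

```python
# Hypothetical model-pinning registry for SOX-relevant automations.
APPROVED_MODELS = {
    "ap-three-way-match": "vendor-model-2026-01-15",   # tested and promoted via change mgmt
    "revenue-memo-drafting": "vendor-model-2025-11-02",
}

def check_model_version(automation: str, runtime_model: str) -> None:
    """Fail closed when the runtime model differs from the pinned, tested version."""
    pinned = APPROVED_MODELS.get(automation)
    if pinned is None:
        raise RuntimeError(f"{automation} has no approved model; not cleared for production")
    if runtime_model != pinned:
        raise RuntimeError(
            f"{automation}: runtime model {runtime_model!r} differs from pinned "
            f"{pinned!r}; route the upgrade through change management first"
        )
```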
Kognitos angle. Kognitos pins model versions per automation, tests behavior before promoting model upgrades to production, and maintains documented behavior baselines so model drift is detectable.
10. “What’s the boundary between AI judgment and human judgment in this process?”
This is the segregation-of-duties question reframed for AI. Auditors are increasingly asking where the AI’s authority ends and the human’s begins, and whether that boundary is enforced by system controls or only by policy.
What satisfies them. A documented decision-authority matrix that names: which decisions the AI may take autonomously, which require human review before commit, which require human approval before action, and which require multi-party human approval. The matrix should be enforced in the system itself, not just in a SharePoint document. “The AI is supposed to escalate over $10K” is policy. “The AI cannot post a journal entry over $10K without supervisor approval” is a control.
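To illustrate the difference between policy and control, here is a hypothetical sketch of a decision-authority matrix enforced in code, where the system refuses the action rather than relying on the AI to remember to escalate:

```python
# Hypothetical decision-authority matrix, enforced before the action commits.
AUTHORITY_MATRIX = [
    (0, 10_000, "autonomous"),                 # AI may post without review
    (10_000, 100_000, "supervisor_approval"),  # one human approver required
    (100_000, float("inf"), "dual_approval"),  # two human approvers required
]

def required_authority(amount: float) -> str:
    """Look up the authority band for a given amount."""
    for low, high, authority in AUTHORITY_MATRIX:
        if low <= amount < high:
            return authority
    raise ValueError(f"no authority band covers {amount}")

def post_journal_entry(amount: float, approvals: list[str]) -> str:
    """Refuse to commit unless the matrix's approval requirement is met."""
    needed = required_authority(amount)
    if needed == "supervisor_approval" and len(approvals) < 1:
        raise PermissionError("journal entry over $10K requires supervisor approval")
    if needed == "dual_approval" and len(approvals) < 2:
        raise PermissionError("journal entry over $100K requires two approvers")
    return f"posted with authority level: {needed}"
```

The enforcement lives in post_journal_entry, not in a procedure manual: the $10K example from above is a control here, not a policy.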
Kognitos angle. Decision-authority boundaries in Kognitos are expressed in the same English-as-code policy that drives the rest of the automation. The boundary is part of the program, not a note in a procedure manual.
11. “How would you detect if the AI started behaving differently?”
The drift detection question. Auditors know that even deterministic systems can drift if their inputs change, if model versions change silently, or if a hidden prompt is altered. They want to know how you would know.
What satisfies them. A continuous control monitoring (CCM) program that tracks: AI decision distributions over time, exception rates, escalation rates, and a defined set of canary transactions whose behavior should not change. The program should alert when distributions move outside defined tolerances. Reactive detection (“we noticed in Q3 that the AP automation started flagging more invoices”) is not enough.
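A minimal sketch of the two mechanisms named above, a tolerance check on exception rates and a canary suite, with hypothetical thresholds:

```python
# Minimal drift-check sketch (thresholds and names hypothetical).
def check_drift(baseline_rate: float, exceptions: list[bool],
                tolerance: float = 0.05) -> dict:
    """Alert when the period's exception rate leaves the documented tolerance band.

    exceptions: one boolean per decision, True where the automation
    raised an exception.
    """
    current_rate = sum(exceptions) / len(exceptions)
    return {
        "baseline_exception_rate": baseline_rate,
        "current_exception_rate": round(current_rate, 4),
        "drift_alert": abs(current_rate - baseline_rate) > tolerance,
    }

# Canary transactions: fixed inputs whose documented disposition must never change.
def run_canaries(decide, canaries) -> bool:
    """decide: the production decision function; canaries: (input, expected) pairs."""
    return all(decide(case) == expected for case, expected in canaries)
```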
Kognitos angle. The Kognitos Consumption Dashboard tracks decision distributions, exception rates, and resolution patterns in real time, with alerts on drift outside defined tolerances. This is the same telemetry our HAL Auto-Monitor uses to flag behavioral changes before they become control deficiencies.
12. “If we identified a deficiency in this AI control, what’s your remediation path?”
The endgame question. Auditors want to know that, if they identify a deficiency, you can actually fix it. This is harder than it sounds for probabilistic AI, because fixing a behavior often requires retraining, which itself introduces new behavior changes.
What satisfies them. A documented remediation playbook that includes: how you would isolate the affected transactions, how you would correct the AI’s behavior, how you would test the correction, how you would document the change in your change management system, and how you would communicate the deficiency to management and (if material) to the audit committee. The playbook should be specific enough that a new SOX manager could execute it.
Kognitos angle. Deterministic English-as-code changes are precise, testable, and reversible. A deficiency in a Kognitos automation is fixed by editing the English policy, testing the change, and deploying it through documented change management. No retraining, no model drift, no probabilistic side effects.
What this list does not tell you
A few honest caveats.
This list is not exhaustive. Your auditor will have firm-specific testing approaches, industry-specific concerns, and entity-specific risk assessments that drive additional questions. Treat this as the floor, not the ceiling. If you operate in banking, financial services, or insurance, expect additional scrutiny around model risk management (SR 11-7) layered on top of these 12.
This list is auditor-side, not regulator-side. The EU AI Act, SEC cyber disclosure rules, and emerging state-level AI legislation add separate questions that your general counsel and CISO will own. Many overlap with SOX. Some do not. Our broader take on compliance automation covers where these regimes converge.
This list assumes AI is in scope. The first conversation with your auditor should be whether and how AI is in scope at all. Many AI use cases (for example, AI used only for internal productivity, not touching financial data or controls) are out of scope. Confirm scope first. Build evidence second.
The harder question underneath the 12 questions
If you read the 12 questions carefully, they all point to the same underlying request. The auditor wants to know whether your AI behaves like a control or like a tool.
A control is governed, deterministic, evidenced, version-controlled, testable, monitored, and remediable. A tool is whatever you set up last quarter and hope still works.
| Audit dimension | Probabilistic AI (LLM-as-control) | Deterministic, neurosymbolic AI |
|---|---|---|
| Walkthrough explanation | “The model recommended this” | Plain-English policy the auditor reads directly |
| Decision logic | Implicit in model weights | Explicit, inspectable, version-controlled |
| Audit trail per decision | Confidence score, no reasoning | Inputs, rule invoked, reasoning, action, attribution |
| AS 2201 benchmarking | Difficult — logic changes with each model version | Eligible — pinned model, stable English policy |
| Negative test cases | Refusal is itself probabilistic | Deterministic escalation paths, versioned tests |
| Change management | Silent model upgrades break the change log | Pinned model versions, English-language diffs |
| Remediation path | Retrain — introduces new behavior changes | Edit English policy, test, deploy through change mgmt |
| SOX walkthrough outcome | Risk of repeat findings & expanded testing | Benchmarkable, prior-year reliance possible |
The reason probabilistic AI is hard to audit is not that it’s bad technology. It’s that it was built to be a tool. The reason deterministic, neurosymbolic AI is auditable is that it was built from the start to behave like a control. The English policy is the documentation. The execution log is the evidence. The version control is the change history. The deterministic execution is the operating effectiveness.
That alignment between how the system is built and how an auditor evaluates it is not a feature. It’s an architecture choice. And in 2026, it’s the choice that separates the AI you can put in front of an auditor from the AI you have to explain away.
How Kognitos helps finance teams stay audit-ready
Kognitos is a deterministic, neurosymbolic AI platform built, in effect, to answer the 12 questions above. Our customers in finance and accounting, banking, insurance, and healthcare deploy Kognitos specifically because every automation:
- Is described in plain English the auditor can read
- Executes deterministically against that English policy
- Logs every decision with full reasoning and attribution
- Version-controls every change to the logic
- Tests itself against documented edge cases before production
- Pins model versions and detects drift
- Produces tamper-evident audit evidence by default
- Maps cleanly to AS 2201’s expanded benchmarking provisions
If you are preparing for a 2026 audit cycle and want to see what the 12 answers look like in production, see how TTX uses Kognitos for finance & accounting automation or book a working session with a Kognitos solutions engineer on your highest-risk process.
