Episode 92 — Plan an AI audit: scope, criteria, stakeholders, and timing choices (Domain 3A)
In this episode, we shift from operating and securing A I systems to the discipline of auditing them, which is really about asking structured questions and proving the answers with evidence. Many brand-new learners assume an audit is a surprise inspection where someone tries to catch mistakes, but a well-run audit is closer to a careful investigation that helps an organization understand what is true, what is risky, and what needs to improve. Planning is where most audits either succeed or fall apart, because the plan determines what you will look at, what you will ignore, who you will talk to, and what you will consider acceptable. Artificial Intelligence (A I) environments make planning more important because the systems involve models, data, prompts, integrations, and vendors, and you can waste a lot of time if you do not decide what matters most. By the end of this lesson, you should be able to explain how to set audit scope, choose criteria, identify stakeholders, and make smart timing choices that produce a useful result.
Before we continue, a quick note: this audio course is a companion to our two course books. The first book focuses on the exam and provides detailed guidance on how best to pass it. The second book is a Kindle-only eBook containing 1,000 flashcards that you can use on your mobile device or Kindle. Check them both out at Cyber Author dot me, in the Bare Metal Study Guides Series.
An audit plan starts with a clear purpose, because without a purpose you cannot judge what is in scope or what evidence matters. The purpose might be to evaluate whether an A I assistant protects sensitive data, whether a model used for decisions is governed appropriately, or whether vendor controls meet the organization’s requirements. For beginners, it helps to see purpose as the audit’s north star, because every decision you make later should align with it. If the purpose is to assess data protection, you will focus heavily on data flows, access boundaries, and retention practices. If the purpose is to assess safety and misuse risk, you will focus on monitoring, guardrails, and incident response readiness. If the purpose is to assess governance, you will focus on approvals, change management, documentation, and accountability. Planning is not about trying to cover everything an A I system could ever do, because that is impossible in one audit, and it is also not about picking a narrow topic that avoids uncomfortable findings. It is about selecting a purpose that matches real business risk and then building a plan that can actually be executed.
Scope is the next decision, and scope is simply the boundary that says what the audit includes and what it excludes. Beginners often treat scope as a technical diagram of systems, but scope is really a decision about risk focus. A I scope can be defined by product, by business process, by model, by data source, by environment, or by time period, and each choice changes the kind of evidence you will gather. A useful scope statement names the system or process being audited, describes the major components included, and clarifies what is intentionally excluded so stakeholders do not assume coverage that is not there. Scope also states which environments you are examining, such as development, testing, and production, because controls that exist in production may not exist in development, and incidents often begin where controls are weaker. If the A I system relies on vendors, scope should clarify whether the audit includes vendor controls directly, the organization’s vendor management process, or both. A good plan makes scope narrow enough to be feasible and wide enough to be meaningful.
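If it helps to see that in a more structured form, here is a minimal sketch of a scope statement captured as notes in Python. The field names, the example system, and the inclusions and exclusions are purely illustrative assumptions, not a format required by any framework.

```python
# Minimal sketch of an audit scope statement kept as structured notes.
# Every field name and value here is a hypothetical example, not a template
# required by any framework.
audit_scope = {
    "system": "internal A I assistant for customer support",
    "purpose": "evaluate protection of sensitive customer data",
    "environments": ["production"],
    "included": [
        "prompt interface and system prompt templates",
        "retrieval connectors to the knowledge base",
        "access controls, service accounts, and logging",
    ],
    "excluded": [
        "vendor model training pipeline (covered by vendor assurance instead)",
        "development sandboxes that hold only synthetic data",
    ],
}

# Reading the statement back makes the exclusions explicit, so stakeholders
# do not assume coverage that is not there.
for field, value in audit_scope.items():
    print(field, ":", value)
```

The value of writing scope down this way is not the notation; it is that the exclusions become just as visible as the inclusions.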
A practical way to teach yourself scope is to think in terms of boundaries where influence and impact occur. A I systems have influence points where inputs can steer behavior, data can reshape outputs, and configuration changes can alter what the model is allowed to do. They also have impact points where outputs affect decisions, where tools trigger actions, or where sensitive data might be exposed. When you plan scope, you want to include the influence points that are most likely to be abused and the impact points that would cause the most harm if something goes wrong. That might mean including the prompt interface, the retrieval connectors, and the model configuration layer, even if the underlying servers are not the main risk. It might also mean including the logging and monitoring systems, because without them you cannot validate whether controls are working. Beginners sometimes scope only what they can easily see, like an application screen or a model endpoint, and they miss the hidden control planes like prompt templates, system instructions, and service accounts. A strong scope decision deliberately reaches into those hidden areas because that is where evidence of control strength usually lives.
Criteria are the next major planning choice, and criteria are the standards you will use to judge whether something is acceptable. In everyday language, criteria answer the question: compared to what? Without criteria, an audit becomes a collection of opinions, and opinions do not hold up well when decisions are contested. Criteria can come from internal policies, external regulations, contracts, risk management expectations, or industry frameworks, and often you will use a mix. For beginners, the easiest way to understand criteria is to imagine a grading rubric, because you need to know what you are grading against before you can score anything. If the organization has an A I policy that requires access reviews and monitoring, those requirements become criteria. If the system handles personal data, privacy requirements may become criteria. If a vendor contract promises certain protections, those promises become criteria. A good audit plan names the criteria sources and explains why they apply to the system being audited, so the results feel grounded rather than arbitrary.
Criteria for A I audits also need to be specific enough to test, which is where many plans get weak. If a criterion is stated as "security is adequate," you cannot prove or disprove it, and you will end up debating wording instead of examining evidence. Strong criteria read more like expectations for behavior, such as access is least privilege, changes are reviewed, sensitive data is protected, and incidents can be detected and contained quickly. Then the audit plan identifies what evidence would demonstrate that behavior, such as access logs, change records, policy enforcement settings, monitoring alerts, and incident response procedures that have been exercised. A I adds complexity because some risks are behavioral, such as whether prompts can bypass guardrails, so criteria must cover not only configuration and documentation but also testing and observation of system behavior. That does not mean the auditor becomes an engineer, but it does mean the auditor plans for evidence that reflects reality, not just written intentions. If the criteria cannot be tested, the audit will struggle to produce defensible conclusions.
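To make the idea of testable criteria concrete, here is a small sketch that maps a few example criteria to the evidence that might demonstrate them. The criterion wording and the evidence types are illustrative assumptions, not an authoritative checklist.

```python
# Small sketch mapping testable criteria to the evidence that would
# demonstrate them. The criterion wording and evidence types are illustrative
# assumptions, not an authoritative checklist.
criteria_to_evidence = {
    "access to model configuration follows least privilege": [
        "current permission listing for the configuration repository",
        "most recent access review records",
    ],
    "changes to prompts and connectors are reviewed before release": [
        "change tickets with documented approvals",
        "version history of prompt templates",
    ],
    "misuse attempts can be detected and contained quickly": [
        "monitoring alert definitions and sample alerts",
        "records showing the incident response procedure was exercised",
    ],
}

# A criterion with no realistic evidence source is a warning sign that it
# cannot actually be tested.
for criterion, evidence in criteria_to_evidence.items():
    print(criterion)
    for item in evidence:
        print("  evidence:", item)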
Stakeholders are the people who can explain the system, provide evidence, and act on findings, and identifying them early is one of the most practical parts of audit planning. In A I environments, stakeholders often include more groups than beginners expect, because the system may involve engineering, data science, security operations, privacy, legal, procurement, and the business owners who rely on the outputs. Stakeholders are not just interview targets; they are also the owners of controls and the owners of risk decisions. A plan should identify who owns the model behavior, who owns the data sources, who owns monitoring, who owns identity, and who owns vendor relationships, because each of those areas may have different evidence and different decision authority. If the plan does not identify the right stakeholders, the audit may collect incomplete evidence or misunderstand key design choices. It is also important to identify who will receive the results, because the communication style and level of detail may differ for technical teams versus business leadership. Good planning prevents the common failure where findings are delivered to people who cannot fix them.
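A simple way to capture this during planning is an ownership map. The sketch below is illustrative only; the control areas, owner titles, and recipient groups are hypothetical placeholders, not a prescribed organizational model.

```python
# Illustrative stakeholder and ownership map for the audit plan. The control
# areas, owner titles, and recipient groups are hypothetical placeholders.
control_owners = {
    "model behavior and prompts": "A I platform engineering lead",
    "data sources and retrieval": "data governance manager",
    "monitoring and alerting": "security operations",
    "identity and access": "identity and access management team",
    "vendor relationship": "procurement and vendor management",
}

report_recipients = {
    "technical detail": ["platform engineering", "security operations"],
    "risk and business summary": ["business owner", "risk committee"],
}

# Every control area should have a named owner who can supply evidence and
# act on findings; an unowned area is itself a planning gap worth noting.
for area, owner in control_owners.items():
    print(area, "->", owner)
for audience, groups in report_recipients.items():
    print(audience, "->", ", ".join(groups))
```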
Stakeholder planning also includes how you will collaborate without losing independence, which is an important concept for beginners. An audit should be fair and evidence-based, not hostile, and that fairness often improves the quality of information you receive. At the same time, the auditor’s job is not to accept explanations without evidence, and planning helps set that expectation early. When you engage stakeholders, you clarify what evidence you will need, how confidentiality will be handled, and what the timeline looks like, so people can prepare rather than scramble. A I systems are often evolving quickly, so stakeholders may be nervous that the audit will slow them down or misinterpret experimental features. A good plan addresses this by scoping carefully, focusing on risk, and agreeing on what version of the system is being examined. When stakeholders understand the goal is to reduce risk and improve trust, they are more likely to provide clear evidence and meaningful context. That cooperation does not remove scrutiny, but it makes scrutiny more accurate.
Timing choices are the fourth big planning area, and timing is not only about calendar dates, but about where the system is in its lifecycle. Auditing an A I system before it goes live is different from auditing it after it has been running for months, because the evidence available and the risks present are different. Pre-launch audits may focus more on design controls, approvals, and readiness, while post-launch audits can examine real usage patterns, monitoring alerts, and incident handling records. Timing also affects how disruptive the audit will be, so planning should align audit activities with business realities, such as avoiding peak periods when the team is releasing major features. Another timing choice is whether to audit during a model upgrade or a connector rollout, because those moments can reveal how change management and testing are handled. A plan should also consider the stability of the system, because if a system is changing daily, you need to define a snapshot period so evidence remains consistent. Good timing choices improve audit accuracy and reduce operational friction.
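One way to make the snapshot idea concrete is to record the agreed evidence window and check everything you collect against it. The short sketch below uses invented dates and evidence items and is only meant to show the habit, not a required tool.

```python
# Sketch of a snapshot-period check: record the agreed evidence window and
# flag anything collected outside it. Dates and items are invented examples.
from datetime import date

snapshot_start = date(2025, 7, 1)
snapshot_end = date(2025, 9, 30)

evidence_items = [
    {"name": "prompt template change record", "collected": date(2025, 8, 14)},
    {"name": "access review export", "collected": date(2025, 10, 2)},
]

for item in evidence_items:
    in_window = snapshot_start <= item["collected"] <= snapshot_end
    status = "in snapshot" if in_window else "outside snapshot, flag for review"
    print(item["name"], "->", status)
```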
Timing also matters because A I risk can change quickly when new data sources or capabilities are added. A model that is safe as a standalone chat system might become much higher risk when it gains access to internal documents or the ability to trigger actions through tools. When you plan, you decide whether the audit will evaluate the current configuration only, or whether it will also evaluate the process for approving and controlling future changes. That decision affects evidence needs, because evaluating the current state requires configuration records and logs, while evaluating the change process requires change tickets, approvals, and testing documentation. Beginners sometimes think auditing is only about the present, but strong audits also examine how the organization keeps the system safe as it evolves. This is especially important in A I because the technology can drift, and drift creates risk when controls do not keep up. A timing-aware plan may include reviewing recent changes and using them as examples to test whether governance and controls function in practice. That gives the audit depth without trying to predict every future feature.
Once scope, criteria, stakeholders, and timing are set, the plan needs a clear evidence strategy, because evidence is what turns auditing into something more than discussion. Evidence can include documents, system configurations, logs, approvals, contracts, monitoring records, and interview statements, but the plan should prioritize objective evidence over opinions. Interviews are valuable for understanding intent and process, yet interviews must be validated through artifacts that show what actually happens. In A I audits, evidence often includes model version records, prompt configuration histories, retrieval source lists, access permissions, and telemetry that shows how users interact with the system. Evidence might also include incident records or user-reported issues, which demonstrate how the organization responds when something goes wrong. A well-planned audit also anticipates that some evidence may be sensitive, so it defines how evidence will be handled, stored, and minimized to reduce exposure. Beginners should remember that asking for evidence is not an accusation; it is the normal method of proving claims. A plan that identifies evidence upfront reduces confusion and speeds execution.
A useful beginner habit is to plan for traceability, meaning you can connect each finding back to a criterion and a piece of evidence. Traceability is what makes an audit result defensible when someone asks why you reached a conclusion. If you say access controls are weak, you should be able to point to the criterion that defined expected access control behavior and the evidence that showed excessive permissions or missing reviews. If you say monitoring is incomplete, you should be able to point to the criterion that required detection of misuse and the evidence that logs or alerts were missing for key behaviors. In A I, traceability also helps avoid vague debates about whether the model is safe, because you can ground the discussion in observable facts like configuration settings, change records, and actual usage telemetry. Planning for traceability also helps with remediation, because stakeholders can see exactly what needs to be improved and why it matters. When you plan well, you are not just preparing to criticize; you are preparing to help the organization make targeted improvements that reduce risk.
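If you want a picture of what traceability looks like in practice, here is a minimal sketch of a finding record that links one conclusion back to the criterion it was judged against and the evidence that supports it. The structure and the example wording are invented for illustration, not drawn from any particular audit methodology.

```python
# Minimal sketch of a traceability record linking one finding back to the
# criterion it was judged against and the evidence that supports it. The
# structure and the example wording are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Finding:
    statement: str                     # what the audit concluded
    criterion: str                     # the standard used to judge it
    evidence: list = field(default_factory=list)  # supporting artifacts

finding = Finding(
    statement="Access to the retrieval connector is broader than needed",
    criterion="Access follows least privilege and is reviewed quarterly",
    evidence=[
        "permission export showing service accounts with unused write access",
        "no access review records for the last two quarters",
    ],
)

# Every finding should be able to answer: compared to what, and proven by what.
print(finding.statement)
print("  judged against:", finding.criterion)
for item in finding.evidence:
    print("  supported by:", item)
```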
Another planning decision that beginners often underestimate is how to handle boundaries with vendors and shared responsibility. If an A I service is hosted by a vendor, the organization may not have direct access to internal vendor logs, infrastructure controls, or model training pipelines. The audit plan should address this by defining what evidence can be obtained from the vendor, what assurances come from contracts, and what compensating controls exist on the customer side. For example, if you cannot inspect vendor internal access controls, you may evaluate the customer’s data minimization strategy, their credential scoping, and their monitoring of usage patterns. You may also evaluate the vendor management process, such as how the organization reviews vendor security evidence and how it responds to vendor changes. Planning for vendor boundaries prevents the common audit failure where the auditor requests evidence that cannot be provided, then misinterprets the absence as negligence rather than as a visibility limitation. A good plan turns visibility limits into explicit scope decisions and evidence strategies. That keeps the audit fair and focused on real risk reduction.
As we wrap up, remember that planning an A I audit is not paperwork that happens before the real work, because the plan is what turns the real work into something coherent and valuable. Scope defines what you will examine and keeps the audit feasible while still meaningful. Criteria define how you will judge what you find so the results are based on standards rather than impressions. Stakeholders ensure you reach the people who can explain the system, provide evidence, and fix issues, while preserving the auditor’s independence and clarity. Timing choices align the audit with the system lifecycle so the evidence reflects reality and the audit can influence risk at the right moment. When those four planning elements are done well, the audit is far more likely to produce findings that are defensible, actionable, and connected to real business risk. That is the heart of Domain 3A, and it is the bridge from understanding A I controls to evaluating them in a disciplined, evidence-driven way.