Episode 59 — Retest and document fixes so AI vulnerabilities stay closed (Task 7)
When an Artificial Intelligence (A I) system behaves in a way that surprises people, the root cause often traces back to design decisions that were made quietly, early, and then never challenged again. Design decisions are not only about how the model is built; they are about what the system is trying to accomplish, what it assumes about the world, and how it will decide whether it is doing a good job. For brand-new learners, this can feel abstract at first, because objectives and assumptions sound like paperwork words. The reality is that objectives and assumptions are the steering wheel of the project, and success criteria are the speedometer and fuel gauge that tell you whether you are driving safely. If the steering is pointed toward the wrong destination, even a powerful model will cause harm. If the gauges are wrong, the organization will believe everything is fine until the system fails in public.
Before we continue, a quick note: this audio course is a companion to our two course books. The first book focuses on the exam and provides detailed guidance on how best to pass it. The second book is a Kindle-only eBook that contains 1,000 flashcards you can use on your mobile device or Kindle. Check them both out at Cyber Author dot me, in the Bare Metal Study Guides Series.
A practical way to think about auditing design is to treat it like reading the rules of a game before you judge whether the game is being played fairly. If the rules are unclear, the players will interpret them differently, and disagreements will turn into conflict. In an A I solution, objectives are the rules that define what the system is for, and what outcomes it is allowed to optimize. Assumptions are the invisible rulebook pages that explain what the designers believe about the data, the users, and the environment. Success criteria are the scoreboard, and if the scoreboard measures the wrong thing, teams will win on paper while losing in real life. Auditing design is about making those rules visible so the organization can evaluate whether they are ethical, compliant, and realistic. This is why design auditing matters even before you look at training data or model performance, because design decisions determine what counts as success and what gets ignored.
Start with objectives, because an objective is not the same as a vague business goal like improve efficiency or reduce cost. An objective should describe what the A I system will produce, how that output will be used, and what good outcomes should result for the organization and for the people affected. If the objective is to route support tickets, the outcome should involve correct routing that improves resolution, not just faster routing that sends people to the wrong place. If the objective is to flag unusual activity, the outcome should involve accurate early warning that reduces harm, not a flood of false alarms that trains staff to ignore alerts. Auditing objectives means asking whether the objective is specific enough to create boundaries, and whether those boundaries prevent function creep. A model that begins as a helper can quietly become a gatekeeper if the objective never clearly states how the output should and should not be used.
Objectives also carry ethical weight, because the chosen objective can shape who benefits and who is burdened. If a system’s objective is to reduce labor cost, the easiest path might be automation that shifts errors onto users who must fight for corrections. If the objective is to maximize throughput, the system might prioritize speed over accuracy, and the people harmed will be those whose cases are complicated or atypical. Auditing objectives includes asking whose experience is included in the definition of success and whose experience is treated as a rounding error. It also includes checking whether the objective is aligned with legal and policy constraints, such as non-discrimination expectations or privacy limits, because an objective that implicitly encourages profiling can clash with obligations. A strong design objective accounts for impact, defining acceptable tradeoffs rather than pretending tradeoffs do not exist. If an objective can only be achieved by violating privacy expectations or creating unfair outcomes, the objective itself needs redesign.
Once objectives are clear, the next design element to audit is assumptions, because assumptions are the quiet statements that the team often believes without realizing it. A common assumption is that the training data represents the real world, when in fact it may represent only what was measured, what was recorded, or what was historically enforced. Another assumption is that users will interpret model outputs correctly, even though people often over-trust confident language and under-trust uncertainty. Another assumption is that the environment will stay stable, even though behavior patterns drift and new contexts appear. Assumptions can also be about fairness, such as believing a certain variable is neutral when it actually acts as a proxy for sensitive traits. A beginner-friendly audit approach is to ask what must be true for the system to work as intended and then challenge whether those conditions are reliably true in real life. If the system requires perfect data quality, perfect user behavior, and a stable environment, it is designed for a fantasy world, not the real one.
Assumptions about data deserve special scrutiny because they influence both model behavior and compliance risk. A design might assume that missing data is random and harmless, but missing data often clusters in certain groups because of access differences, language barriers, or uneven documentation. A design might assume that labels represent truth, when labels may reflect unequal attention, unequal reporting, or subjective judgment. A design might assume that historical outcomes are fair to use as targets, when those outcomes may encode past discrimination or resource allocation patterns. Auditing these assumptions means asking what the data actually measures and what it fails to measure, and whether those gaps could create systematic error for certain people. It also means checking whether the design includes controls to detect when data assumptions stop holding, such as monitoring for distribution shifts or rising missingness. When assumptions are documented honestly, they can be tested and monitored. When assumptions are hidden, they turn into surprises later.
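To make this concrete, here is a minimal sketch of what monitoring those two data assumptions might look like in code, assuming a pandas DataFrame and hypothetical column names such as group and income; a real pipeline would use the organization's own features, groups, and alerting thresholds.

```python
# A sketch of monitoring two data assumptions: that missingness is random rather
# than clustered in certain groups, and that live data still resembles the data
# the design assumed. Column names and the example values are hypothetical.

import pandas as pd

def missingness_by_group(df: pd.DataFrame, group_col: str, feature: str) -> pd.Series:
    """Share of missing values in `feature`, broken out by group."""
    return df.groupby(group_col)[feature].apply(lambda s: s.isna().mean())

def relative_mean_shift(reference: pd.Series, current: pd.Series) -> float:
    """Crude drift signal: relative change in the feature mean versus reference data."""
    ref_mean = reference.mean()
    return abs(current.mean() - ref_mean) / (abs(ref_mean) + 1e-9)

# Missingness that clusters in group "b" would contradict a missing-at-random assumption.
reference = pd.DataFrame({"group": ["a", "a", "b", "b"], "income": [50.0, 60.0, 55.0, 65.0]})
current = pd.DataFrame({"group": ["a", "a", "b", "b"], "income": [52.0, 61.0, None, None]})

print(missingness_by_group(current, "group", "income"))
print(f"Relative mean shift: {relative_mean_shift(reference['income'], current['income']):.2f}")
```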
Another design area that hides assumptions is the decision threshold or decision rule, meaning how model outputs become actions. Even if the model provides a score, the design must decide what score triggers intervention, escalation, denial, or approval. A common assumption is that a single threshold is fair for everyone, but different groups may have different baseline patterns, which can make a fixed threshold unfair or ineffective. Another assumption is that the costs of false positives and false negatives are symmetric, when they often are not. In a fraud setting, a false positive might block a legitimate user and create hardship, while a false negative might allow a loss that is absorbed by the organization. In a safety setting, a false negative might create serious harm, which changes the ethical balance. Auditing design decisions here means asking how thresholds were chosen, whose harm was considered, and whether the system includes a safe path for contesting or reviewing decisions. If thresholds are chosen purely for convenience, the system can become a machine that distributes harm unevenly.
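As one way to make threshold choices deliberate, here is a small sketch that selects a threshold by minimizing expected cost when a false negative is judged several times more harmful than a false positive; the scores, labels, and cost figures are invented for illustration, and in practice they would come from validation data and an impact assessment of who bears each kind of error.

```python
# A sketch of choosing a decision threshold from explicit, asymmetric error costs
# rather than convenience. Scores, labels, and cost figures are invented.

import numpy as np

def expected_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of intervening on every score at or above `threshold`."""
    flagged = scores >= threshold
    false_positives = np.sum(flagged & (labels == 0))
    false_negatives = np.sum(~flagged & (labels == 1))
    return cost_fp * false_positives + cost_fn * false_negatives

def pick_threshold(scores, labels, cost_fp, cost_fn):
    """Scan candidate thresholds and return the one with the lowest expected cost."""
    candidates = np.linspace(0.0, 1.0, 101)
    costs = [expected_cost(scores, labels, t, cost_fp, cost_fn) for t in candidates]
    return candidates[int(np.argmin(costs))]

scores = np.array([0.10, 0.40, 0.35, 0.80, 0.65, 0.90])
labels = np.array([0, 0, 1, 1, 0, 1])
# Here a missed case is judged five times as harmful as a false alarm.
print(pick_threshold(scores, labels, cost_fp=1.0, cost_fn=5.0))
```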
Now focus on success criteria, because this is where many organizations unintentionally teach their teams to chase the wrong result. Success criteria are the measurable conditions that define whether the system is acceptable to deploy and acceptable to keep running. If the only success criterion is average accuracy, the system can still fail for important subgroups and still be declared successful. If the success criterion is reduced time per task, the system can increase errors and still be celebrated. A good design includes success criteria that reflect both performance and safety, meaning the system must achieve useful outcomes while staying within ethical and compliance boundaries. This is where Key Performance Indicators (K P I s) and Key Risk Indicators (K R I s) become practical, because they translate design intent into ongoing signals. Auditing success criteria means checking whether they match the objective, whether they are sensitive to uneven harm, and whether they are tied to actions when the measured values deteriorate. Criteria that do not drive decisions are not real criteria.
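Here is a small sketch of what subgroup-aware acceptance criteria might look like, where the system has to clear both an overall accuracy bar and a worst-group bar before it counts as passing; the group labels and threshold values are hypothetical stand-ins for whatever criteria the organization actually approved.

```python
# A sketch of acceptance criteria that look past average accuracy: the system must
# clear an overall bar and a worst-group bar. Group labels and thresholds are hypothetical.

import numpy as np

def evaluate_against_criteria(y_true, y_pred, groups, min_overall=0.90, min_worst_group=0.85):
    """Return the measurements and whether the pre-agreed criteria are met."""
    overall = float(np.mean(y_true == y_pred))
    per_group = {
        g: float(np.mean(y_true[groups == g] == y_pred[groups == g]))
        for g in np.unique(groups)
    }
    worst = min(per_group.values())
    return {
        "overall": overall,
        "per_group": per_group,
        "worst_group": worst,
        "passed": overall >= min_overall and worst >= min_worst_group,
    }

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])
groups = np.array(["a", "a", "a", "b", "b", "b", "a", "b"])
print(evaluate_against_criteria(y_true, y_pred, groups))
```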
Success criteria should also include a definition of acceptable failure, because no A I system is perfect, and pretending otherwise sets the organization up for denial when problems appear. Acceptable failure definitions include how often errors can occur, what types of errors are tolerable, and what must happen when errors are detected. In higher-impact use cases, acceptable failure thresholds should be tighter, and safeguards should be stronger, such as requiring human review for uncertain cases or providing appeal mechanisms for affected people. Auditing this aspect of design means asking whether the organization has defined what it will do when the system is wrong, not just what it hopes will happen. It also means checking whether the success criteria include user experience measures that detect harm, such as complaint rates, correction rates, or escalation rates, because those can reveal failure even when technical metrics look fine. A system that is technically accurate but socially harmful is still failing. Design success criteria should make that visible.
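As a simple illustration of turning a user-experience signal into a key risk indicator, here is a sketch that flags a complaint rate rising well above its baseline so a pre-agreed action can kick in; the baseline, tolerance multiplier, and weekly figures are invented.

```python
# A sketch of a user-experience signal treated as a key risk indicator: a complaint
# rate rising well above baseline triggers a pre-agreed response even when technical
# metrics look fine. The baseline, tolerance, and weekly figures are invented.

def complaint_rate_alert(complaints: int, decisions: int, baseline_rate: float, tolerance: float = 1.5):
    """Flag when the observed complaint rate exceeds the baseline by more than `tolerance` times."""
    rate = complaints / max(decisions, 1)
    return {"rate": rate, "breach": rate > baseline_rate * tolerance}

# Example week: 42 complaints out of 3,000 automated decisions against a 0.8 percent baseline.
status = complaint_rate_alert(complaints=42, decisions=3000, baseline_rate=0.008)
if status["breach"]:
    print(f"KRI breach: complaint rate {status['rate']:.1%}; escalate per the agreed playbook")
```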
A strong audit also checks whether success criteria were defined before the team saw performance results, because criteria created after results can become a way to justify deployment. When teams fall in love with a model, they may lower the bar until the model clears it, which is a governance failure disguised as flexibility. Auditing should look for evidence that the organization set acceptance criteria early, based on purpose and impact, and then measured the model against those criteria. If criteria changed, the audit should ask why, who approved the change, and whether the change increased risk. This is not about punishing iteration; it is about preventing the organization from slowly normalizing lower safety and fairness standards. A healthy program allows learning, but it keeps core safeguards stable unless there is a clear, documented reason to adjust them. If success criteria are constantly moving, it becomes impossible to prove that governance is real.
Design auditing should also examine the human role around the model, because objectives and success criteria often assume human review will catch problems. Humans can be a safety net, but only if the workflow makes meaningful review possible. If staff are overloaded, they may rubber-stamp model outputs, turning an advisory system into an automated decision engine. If the interface presents the model output as authoritative, users may stop thinking critically and may feel punished for disagreeing. Auditing design decisions here means asking what humans are expected to do, how much time they have, what training they receive, and what happens when they override the model. It also means asking whether the organization measures whether human review is actually happening, because assumed review that does not occur is one of the most common silent failures. Design should include mechanisms that support healthy skepticism, such as highlighting uncertainty or requiring justification for high-impact actions. Without that, the system can push risk onto humans while pretending humans are in control.
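One way to measure whether human review is real is to track how often reviewers ever disagree with the model. Here is a minimal sketch of that idea, using hypothetical field names and a hypothetical one percent override floor.

```python
# A sketch of checking whether assumed human review is actually happening.
# Field names and the one percent override floor are hypothetical.

from collections import defaultdict

def review_health(decisions, min_cases=50, override_floor=0.01):
    """decisions: iterable of dicts with 'reviewer', 'model_action', 'final_action' keys."""
    totals, overrides = defaultdict(int), defaultdict(int)
    for d in decisions:
        totals[d["reviewer"]] += 1
        if d["final_action"] != d["model_action"]:
            overrides[d["reviewer"]] += 1
    report = {}
    for reviewer, cases in totals.items():
        rate = overrides[reviewer] / cases
        # A near-zero override rate on a large caseload is a prompt for a conversation,
        # not proof of misconduct.
        report[reviewer] = {
            "cases": cases,
            "override_rate": rate,
            "possible_rubber_stamp": cases >= min_cases and rate < override_floor,
        }
    return report

# Sixty identical approvals with no overrides would be flagged for follow-up.
log = [{"reviewer": "casey", "model_action": "deny", "final_action": "deny"}] * 60
print(review_health(log))
```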
Another important design topic is scope boundaries, because a model that works well in one context can fail badly in another. Design should specify where the system is allowed to be used, what types of inputs it expects, and what its outputs should not be used for. For example, an internal summarization tool might be safe for generic documents but unsafe for personal records if it can leak sensitive information. A triage model might be appropriate for low-impact sorting but inappropriate for denying access without review. Auditing scope includes checking whether the objective and success criteria are tied to that scope, and whether the organization has controls to prevent expansion without review. Scope drift is a governance problem because it changes the risk level without changing safeguards. A beginner-friendly test is to ask what would happen if someone used the system for a different decision than intended and whether the design prevents that misuse or detects it quickly. A design that assumes perfect use is a design that will fail in real environments.
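Scope can be enforced in software as well as in policy. Here is a small sketch of a scope check at the point of use, built around an invented allow list of approved purposes and input types; a real deployment would send out-of-scope attempts to monitoring rather than printing them.

```python
# A sketch of enforcing a declared scope at the point of use instead of trusting
# perfect behavior. The approved purposes and input types are invented examples.

ALLOWED_PURPOSES = {"ticket_routing"}       # what the system was approved to do
ALLOWED_INPUT_TYPES = {"support_ticket"}    # what inputs the design expects

def check_scope(request: dict) -> bool:
    """Serve only requests inside the approved scope; surface everything else."""
    in_scope = (request.get("purpose") in ALLOWED_PURPOSES
                and request.get("input_type") in ALLOWED_INPUT_TYPES)
    if not in_scope:
        # A real system would route this to monitoring, not standard output.
        print(f"Out-of-scope use attempted: {request.get('purpose')!r} / {request.get('input_type')!r}")
    return in_scope

# Using the router to screen job applicants would be detected, not silently served.
check_scope({"purpose": "hiring_decision", "input_type": "resume"})
```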
Design decisions should also consider explainability and contestability in proportion to impact, not because every system must explain itself in deep technical detail, but because people need a way to understand and challenge decisions that affect them. Explainability in practical terms means the organization can describe what factors generally influence outcomes, what limitations exist, and what steps a person can take when they believe the system is wrong. Contestability means there is a path for human review and correction, especially when the system affects opportunities or imposes burden. Auditing here means asking what explanation is provided, to whom, and in what language, and whether it is consistent with what the system actually does. It also means checking whether the organization measures the effectiveness of these mechanisms, such as how often appeals occur and whether they lead to corrections. Design that lacks contestability can turn small model errors into serious harm because people have no route to recovery. Responsible design treats recovery as part of the system, not as an exception.
Privacy and security assumptions are also embedded in design decisions, especially when the system uses personal data or stores prompts and outputs. The design should specify what data is collected, what data is excluded, how long data is retained, and who can access it. It should also specify whether data can be used to improve the model and under what permissions and restrictions. Auditing these design choices means checking whether they align with consent, purpose limits, and minimization, and whether they are enforceable through controls rather than relying on informal promises. It also means checking whether the design anticipates misuse, such as users entering sensitive information into prompts or attempting to extract private content from the model. Security-oriented design choices might include access restrictions, logging, and monitoring for abuse patterns. If privacy and security are treated as external add-ons rather than design requirements, the system is likely to accumulate hidden risk. End-to-end compliance depends on these early decisions being explicit and enforced.
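To illustrate the difference between an informal promise and an enforceable control, here is a minimal sketch of minimization and retention expressed as code, using hypothetical field names and a hypothetical thirty-day retention window.

```python
# A sketch of minimization and retention as enforceable controls. Approved fields
# and the thirty-day window are hypothetical placeholders for the real policy.

from datetime import datetime, timedelta, timezone

APPROVED_FIELDS = {"prompt", "response", "timestamp"}   # minimization: nothing else is stored
RETENTION = timedelta(days=30)

def minimize(record: dict) -> dict:
    """Keep only the fields the design approved for storage."""
    return {key: value for key, value in record.items() if key in APPROVED_FIELDS}

def purge_expired(records: list, now: datetime | None = None) -> list:
    """Drop any stored record that is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    return [r for r in records if now - r["timestamp"] <= RETENTION]

raw = {"prompt": "reset my password", "response": "Here are the steps.",
       "email": "user@example.com", "timestamp": datetime.now(timezone.utc)}
stored = [minimize(raw)]        # the email address is never written to storage
stored = purge_expired(stored)  # records past the retention window would be dropped here
```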
A valuable way to audit design is to look for the evidence trail that shows objectives, assumptions, and success criteria were documented, reviewed, and approved at the right times. Evidence might include a design document that states the purpose and boundaries, a risk assessment that identifies foreseeable harms, and an approval record that ties acceptance criteria to governance decisions. It should also include documentation of key assumptions, such as data representativeness, label meaning, and human review expectations, along with plans to monitor when those assumptions stop holding. If assumptions are not documented, they cannot be tested. If success criteria are not documented, they can be rewritten later to justify outcomes. Auditing evidence is not about generating paperwork; it is about proving that responsible thinking happened before deployment and continues after deployment. A mature organization can point to specific decisions that changed design, such as removing a risky feature, narrowing scope, or adding an appeal path, because that is what ethics and governance look like in action.
To close, auditing A I design decisions is the craft of making the invisible visible so the organization can govern its systems with honesty and control. Objectives should be specific enough to set boundaries, aligned with ethical and legal constraints, and defined in terms of outcomes that matter to people, not just to efficiency. Assumptions should be surfaced and challenged, especially assumptions about data quality, representativeness, label truth, human review, stability of the environment, and the meaning of model outputs. Success criteria should reflect both performance and risk, be defined early, be sensitive to subgroup harm, and be tied to actions when metrics move in the wrong direction. Good design also includes scope limits, meaningful human roles, contestability, and privacy and security controls that are built in rather than bolted on. When you audit design with this mindset, you prevent a common failure where a model is technically impressive but ethically fragile, because it was aimed at the wrong goal, built on unspoken assumptions, and judged by the wrong scoreboard.