How many of your agency’s AI systems qualify for high-impact AI classification under OMB M-25-21? Not the number you reported in last year’s use case inventory under M-24-10. The number that actually qualifies today, under the single standard the Office of Management and Budget (OMB) put in place with OMB M-25-21 on April 3, 2025.
Most agencies have not answered that question with precision. They migrated their existing M-24-10 risk classifications forward, relabeled them, and called it compliant. That approach misses what M-25-21 actually changed. The new framework does not refine the old categories. It replaces them with a materially different analytical structure, and the classification process is where agencies either build durable governance or inherit the same accountability gaps they have been carrying for years.
The classification decision triggers every downstream requirement: pre-deployment testing, an AI impact assessment, ongoing monitoring, operator training and assessment, human oversight and intervention mechanisms, remedies or appeals processes, and end-user and public feedback, plus the public inventory entry that auditors, oversight bodies, and OMB will review. Get the classification wrong, and every governance structure built on top of it is misaligned. The §5 definition and the §4(a) principal-basis threshold are the foundation. Both require closer analysis than most agencies have applied.
M-25-21 implements EO 14179 (“Removing Barriers to American Leadership in Artificial Intelligence,” January 23, 2025). EO 14148 (January 20, 2025) first rescinded the Biden-era EO 14110; EO 14179 followed three days later as the affirmative replacement that directed OMB to revise M-24-10 and M-24-18 within 60 days. M-25-21 is the product of that direction.
Classify an AI system as high-impact by applying M-25-21’s §5 definition: does the system’s output serve as a “principal basis for decisions or actions” with legal, material, binding, or significant effect on any of the six enumerated categories? An outcome-proximity analysis (examining how close the AI output sits to the consequential action) is a practical lens for applying that “principal basis” threshold. That framing is an analytical tool, not a term in the memorandum. The AI governance board (CFO Act agencies) conducts the analysis under M-25-21 §3(a)(ii); the CAIO oversees the determination process and tracks high-impact use cases centrally under §3(a)(i)(F).
What “Principal Basis” Means for High-Impact AI Classification
The operative threshold in M-25-21 §4(a) and §5 is “principal basis for decisions or actions that have a legal, material, binding, or significant effect on rights or safety.” That phrase carries more analytical weight than most agencies are giving it. An AI system does not become high-impact simply because it operates in a sensitive domain. It becomes high-impact because its outputs, in the normal course of operations, serve as the driving basis for consequential decisions affecting the people or resources in that domain, whether or not a human nominally reviews those outputs before the action is taken (M-25-21 §4(a)).
Applying the Principal-Basis Threshold
A practitioner applying M-25-21’s §4(a) threshold can use an outcome-proximity analysis as a working heuristic: how close is the AI output to the final decision or action? That framing is an analytical aid for operationalizing the “principal basis” requirement. It does not appear in the memorandum text. An AI system that screens benefits eligibility and produces a determination a caseworker accepts without independent review meets the §4(a) threshold: the AI output is, functionally, the principal basis for the decision. An AI system that summarizes regulations for an analyst who then exercises independent judgment does not: the AI output is one input among several, not the principal basis.
This analysis requires examining actual operational behavior, not intended design. A system designed as a decision-support tool often functions as a decisioning tool in practice. Case volume, staffing ratios, and organizational culture can all compress the intended human check into a perfunctory review. Classification based on design intent, without examining operational reality, produces systematically incorrect results.
Material Versus Informational Effects
Not every AI system that touches a regulated domain qualifies as high-impact. M-25-21 §5 draws the line at outputs that serve as a “principal basis” for decisions with legal, material, binding, or significant effect. This excludes purely informational applications. An AI chatbot that answers questions about federal program eligibility criteria is informational. An AI system that determines whether a specific applicant meets those criteria and produces a determination that drives the agency’s access decision is a candidate for high-impact classification. The distinction is whether the system’s output characterizes the world in general or drives consequential outcomes for specific individuals.
That line is meaningful in practice because informational AI systems operate under M-25-21’s streamlined framework. They still require governance, but the documentation, testing, and oversight obligations are calibrated to a lower risk profile. Misclassifying a system that meets the §4(a) principal-basis threshold as informational does not reduce the compliance obligation. It just means the obligation goes unmet.
The audit fix. For each AI system under review, document three facts: what the system’s output is, who receives that output, and what typically happens between output delivery and the consequential action. If the answer to the third question is “nothing substantial,” treat the system as a high-impact classification candidate and proceed through the full §5 category analysis.
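The three facts above can be captured in a small record so the triage rule is applied the same way to every system. The sketch below is illustrative Python, not anything M-25-21 prescribes: `TriageRecord`, `InterveningReview`, and the example system are hypothetical names chosen for this illustration. A flagged system is only a candidate; the §5 category analysis still decides the classification.

```python
from dataclasses import dataclass
from enum import Enum


class InterveningReview(Enum):
    """What typically happens between output delivery and the consequential action."""
    NONE = "none"                # the output is acted on directly
    PERFUNCTORY = "perfunctory"  # a human signs off without independent analysis
    INDEPENDENT = "independent"  # a human weighs the output against other evidence


@dataclass(frozen=True)
class TriageRecord:
    system_name: str
    output_description: str                # fact 1: what the system's output is
    output_recipient: str                  # fact 2: who receives that output
    intervening_review: InterveningReview  # fact 3: what happens before the action


def is_high_impact_candidate(record: TriageRecord) -> bool:
    """Flag the system for full category analysis unless review is independent."""
    return record.intervening_review is not InterveningReview.INDEPENDENT


# Hypothetical example system, for illustration only.
record = TriageRecord(
    system_name="benefits-eligibility-screener",
    output_description="eligible / not eligible determination",
    output_recipient="caseworker",
    intervening_review=InterveningReview.PERFUNCTORY,
)
print(is_high_impact_candidate(record))
```

Treating perfunctory review the same as no review is a deliberate choice in this sketch. It reflects the point that classification should follow operational behavior rather than design intent.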
The Six High-Impact Categories Under M-25-21 §5
M-25-21 §5 enumerates six categories of consequential federal activity where AI outputs can meet the high-impact threshold. A system qualifies if its output serves as a principal basis for decisions or actions with the specified effect on any one of the six. A single category match triggers high-impact classification; multiple matches do not increase the compliance tier.
For governance design, practitioners commonly group these six categories into four analytical domains: individual rights, government services access, personal safety, and sensitive federal resources. That grouping is a practitioner scaffold for structuring classification reviews and governance programs; it is not terminology from the memorandum. The authoritative definition is the six-category enumeration in §5.
Category 1: Civil Rights, Civil Liberties, or Privacy
The first §5 category covers AI systems whose outputs serve as a principal basis for decisions or actions affecting an individual or entity’s civil rights, civil liberties, or privacy. This category applies regardless of whether the person is a federal employee, contractor, benefit applicant, or member of the public interacting with a federal program.
Concrete examples include systems that make or inform determinations about law enforcement stops, surveillance authorization, background check outcomes, employment eligibility, or immigration status. The common thread is that the AI output drives a decision affecting a legal status or interest that the individual has a recognized right to contest or protect. NIST AI RMF (NIST AI 100-1) recommends governance structures proportionate to harm severity for systems in this category; the framework remains voluntary and does not independently require specific controls.
Category 2: Access to Education, Housing, Insurance, Credit, Employment, and Other Programs
The second §5 category covers AI systems whose output serves as a principal basis for decisions affecting whether individuals receive federal benefits, program participation, or government-administered support. Benefits eligibility screening, grant application scoring, healthcare program enrollment determinations, and social services case management fall into this category when the AI output materially drives the access decision.
This category carries particular weight because denial of access often produces cascading harm. A wrongful benefits denial does not simply delay a payment. It may prevent medical care, housing stability, or other outcomes that compound over time. Classification decisions here should weigh not just the probability of error but the severity of the downstream consequence if an error occurs.
Category 3: Access to Critical Government Resources or Services
The third §5 category covers AI systems affecting access to critical government resources or services more broadly. This includes government facilities access control, agency IT system access, and program infrastructure beyond the benefit programs captured in category two. M-25-21’s Section 6 list presumes access controls to government facilities and adjudication of critical federal services to be high-impact.
Category 4: Human Health and Safety
The fourth §5 category covers AI systems whose outputs serve as a principal basis for decisions or actions affecting human health and safety. Emergency response dispatch systems, clinical decision-support tools used in federal healthcare facilities, and food safety mechanisms all qualify when AI outputs drive safety-critical decisions.
The health and safety category often involves the highest-consequence scenarios in an agency’s AI portfolio. An incorrect output does not produce a document error or an administrative delay. It produces a physical outcome. NIST AI RMF recommends rigorous pre-deployment testing for safety-critical AI, including adversarial testing, failure mode analysis, and performance thresholds designed to trigger human escalation. Federal agencies adopting AI RMF operationalize these recommendations as program requirements through agency policy; the framework itself does not mandate them.
Category 5: Critical Infrastructure or Public Safety
The fifth §5 category covers AI systems affecting critical infrastructure or public safety functions, including traffic control systems, emergency services dispatch, fire and life safety systems, and physical infrastructure monitoring. M-25-21’s Section 6 list presumes safety-critical functions of critical infrastructure to be high-impact.
Category 6: Strategic Assets or Resources
The sixth §5 category covers AI systems that manage, protect, or make decisions about strategic federal assets and resources, including high-value property and information marked as sensitive or classified by the federal government. This category includes AI systems in cybersecurity operations centers, classified information management, national security analysis, and critical infrastructure protection.
This category is the one most likely to be underclassified. Agencies sometimes treat cybersecurity AI tools as purely technical systems rather than governance-relevant AI, and route them outside the classification process entirely. M-25-21 §5 does not support that interpretation. Any AI system making principal-basis decisions about federal resource protection qualifies for the §5 category analysis.
The audit fix. Run each AI system through M-25-21’s §5 six-category sequence in writing. Document which categories apply, which do not, and the specific reason. A system that clears all six without triggering any does not require high-impact governance under M-25-21. A system that triggers one category requires the full §4(b) governance framework. Keep this analysis in the system’s official record for governance board review and CAIO oversight per §3(a)(i)(F).
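One way to keep the written six-category analysis complete is to refuse a result until every category has a recorded finding and reason. The following Python is a minimal sketch under that assumption. The `Category`, `Finding`, and `is_high_impact` names are hypothetical, and the enum labels are shorthand for the §5 categories as described in this article, not quotations from the memorandum.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Category(Enum):
    """Shorthand labels for the six categories, in the order discussed above."""
    CIVIL_RIGHTS_LIBERTIES_PRIVACY = 1
    PROGRAM_ACCESS = 2
    CRITICAL_GOVERNMENT_RESOURCES_OR_SERVICES = 3
    HUMAN_HEALTH_AND_SAFETY = 4
    CRITICAL_INFRASTRUCTURE_OR_PUBLIC_SAFETY = 5
    STRATEGIC_ASSETS_OR_RESOURCES = 6


@dataclass(frozen=True)
class Finding:
    applies: bool
    reason: str  # the specific written reason the category does or does not apply


def is_high_impact(findings: Dict[Category, Finding]) -> bool:
    """Return True if any category applies; reject incomplete written analyses."""
    for category in Category:
        finding = findings.get(category)
        if finding is None:
            raise ValueError(f"No finding recorded for {category.name}")
        if not finding.reason.strip():
            raise ValueError(f"No written reason recorded for {category.name}")
    # A single category match is enough; additional matches change nothing.
    return any(finding.applies for finding in findings.values())
```

Raising an error on a missing or empty reason, rather than defaulting to "does not apply," keeps an unexamined category from silently clearing a system.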
M-25-21’s classification structure is not a simplified version of the M-24-10 framework. It is a more operationally precise one. M-24-10’s bifurcated categories invited definitional disputes that delayed governance action. M-25-21’s §4(a) “principal basis” threshold and §5 six-category definition route every system through a single analytical test. An outcome-proximity analysis (how close the AI output sits to the consequential action) is a practitioner’s working framework for applying that test, not a term in the memorandum. The point is speed to accurate classification, not reduced accountability.
Governance Requirements for High-Impact AI Systems
Classification as high-impact triggers M-25-21’s seven minimum risk management practices under §4(b). Agencies must document implementation of these practices within 365 days of the April 3, 2025 issuance (by April 3, 2026) and be prepared to report to OMB as part of periodic accountability reviews or the annual AI use case inventory (M-25-21 §4(a)(i)). For new systems, these practices apply before deployment. For existing systems already in production, the 365-day documentation window applies. These are baseline obligations under OMB’s executive-branch authority. Agencies that deploy high-impact AI without them carry compliance risk visible to OMB, IG review, and congressional oversight. Where a practice cannot be met, M-25-21 §4(a)(ii) requires CAIO-approved waiver documentation (non-delegable), not silent omission.
1. Pre-Deployment Testing
High-impact AI systems require pre-deployment testing that develops test plans and risk mitigation measures reflecting expected real-world outcomes (M-25-21 §4(b)(i)). The testing record must demonstrate that the system performs as intended across the population of inputs it will encounter in operation, including edge cases and adversarial inputs where relevant. Where the agency does not have access to underlying AI source code, models, or data, the agency must use alternative test methodologies such as querying the AI service and observing outputs or providing evaluation data to the vendor and obtaining results.
NIST AI RMF (NIST AI 100-1) provides methodology for structuring pre-deployment testing plans, recommending documentation of the test population, performance metrics, acceptable thresholds, and test results. A testing record that documents only that testing occurred, without recording what was tested, how it was measured, and what the results were, does not satisfy M-25-21 §4(b)(i).
2. AI Impact Assessment
High-impact AI systems require a completed AI impact assessment before deployment, updated periodically throughout the system’s lifecycle (M-25-21 §4(b)(ii)). The assessment must address at minimum: the intended purpose and expected benefit with specific metrics; data quality and fitness for purpose; potential impacts on privacy, civil rights, and civil liberties; reassessment schedules; related costs analysis; results of independent review by a reviewer not involved in development; and risk acceptance with a signature from the accepting official. The independent reviewer’s comments must be included in the documentation and shared with the risk-accepting official when the determination is made.
This is the practice most commonly missing from agency high-impact AI portfolios. A system with pre-deployment testing records but no completed AI impact assessment does not satisfy §4(b). The assessment is a distinct deliverable, not a component of testing.
3. Ongoing Monitoring
High-impact AI systems require ongoing performance monitoring after deployment (M-25-21 §4(b)(iii)). The monitoring obligation does not expire when a system passes pre-deployment testing. AI system performance can degrade as operational data distributions shift, as the affected population changes, or as the system encounters inputs outside its training distribution. Monitoring must be designed to detect unforeseen circumstances, changes to the AI system after deployment, or changes to the context of use. Agencies must implement appropriate mitigations and, where possible, develop processes enabling traceability and transparency.
Effective monitoring defines the metrics being tracked, the frequency of assessment, the threshold that triggers review, and the escalation path when performance falls below threshold. Monitoring logs should feed directly into the governance review cycle. A system with pre-deployment testing records but no post-deployment monitoring plan has satisfied one of seven requirements.
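The four elements named above (metric, frequency, threshold, escalation path) map naturally onto a small data structure. The Python below is a sketch only. The metric names, threshold values, and contacts are invented placeholders, and treating a missing observation as a breach is an assumption of this example rather than a requirement stated in M-25-21.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MonitoringRule:
    metric: str                # what is being tracked
    review_every_days: int     # assessment frequency (recorded, not enforced here)
    minimum_acceptable: float  # threshold that triggers review
    escalate_to: str           # who is notified when the threshold is breached


def find_breaches(
    rules: List[MonitoringRule], observed: Dict[str, float]
) -> List[Tuple[str, str]]:
    """Return (metric, escalation contact) for each rule that needs review.

    A metric with no observation is treated as a breach, on the assumption
    that missing monitoring data is itself a finding worth escalating.
    """
    breaches = []
    for rule in rules:
        value = observed.get(rule.metric)
        if value is None or value < rule.minimum_acceptable:
            breaches.append((rule.metric, rule.escalate_to))
    return breaches


# Hypothetical plan: metric names, thresholds, and contacts are placeholders.
plan = [
    MonitoringRule("determination_accuracy", 30, 0.95, "AI governance board"),
    MonitoringRule("caseworker_override_rate", 30, 0.02, "program office lead"),
]
latest = {"determination_accuracy": 0.91, "caseworker_override_rate": 0.05}
print(find_breaches(plan, latest))
```

In this example only the accuracy metric falls below its threshold. The override-rate rule is written as a minimum on purpose: a rate near zero can indicate that human review has become perfunctory.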
4. Human Training and Assessment
M-25-21 §4(b)(iv) requires agencies to ensure sufficient and periodic training, assessment, and oversight for operators of AI to interpret and act on the AI’s output and manage associated risks. Training must be conducted on a periodic basis, as determined by the agency, and must be specific to the AI system or service being operated and how it is being used. An agency that deploys a high-impact system without documenting that operators are trained to interpret its outputs appropriately has a governance gap regardless of how sophisticated the system is.
5. Human Oversight, Intervention, and Accountability
M-25-21 §4(b)(v) requires agencies to ensure human oversight, intervention, and accountability suitable for high-impact use cases. Where practicable and consistent with existing agency practices, agencies must ensure the AI has an appropriate fail-safe that minimizes the risk of significant harm. Human oversight that exists only on paper does not satisfy this requirement. The governance board’s review of a system’s human oversight mechanism should include evidence of actual reviews and intervention records, not just a description of the review process.
Decision chain documentation, describing how the AI output connects to the downstream action, who receives it, and what authority that person holds, supports the accountability record for this requirement. It is a sub-component of §4(b)(v)’s oversight and accountability obligation, not a separate enumerated practice.
6. Remedies or Appeals
M-25-21 §4(b)(vi) requires agencies to ensure that individuals affected by AI-enabled decisions have access to a timely human review and a chance to appeal any negative impacts. Where an agency already has an appeals or human review process in place (such as appeals of adverse actions), it may extend or adapt that process to cover AI-driven decisions, consistent with applicable law. The remedy process must be designed to avoid placing unnecessary burdens on the individual.
7. End-User and Public Feedback
M-25-21 §4(b)(vii) requires agencies to provide an option for end users and the public to submit feedback on the AI use case, where appropriate, in the design, development, and use of the AI, and to use such feedback to inform agency decision-making regarding the system. This obligation goes beyond post-deployment monitoring: it creates a structured channel for the people affected by high-impact AI to provide input into how the system is governed.
The audit fix. Build a governance package for each high-impact AI system containing all seven §4(b) deliverables: the pre-deployment testing record; the completed AI impact assessment with independent reviewer comments; the ongoing monitoring plan with current performance data; documentation of operator training and periodic assessment; the human oversight mechanism description with intervention logs; the remedies or appeals process documentation; and the end-user and public feedback channel. If any of the seven documents does not exist or has not been updated in the past 12 months, treat that system as carrying a material governance gap. Bring it to the AI governance board for remediation before the April 3, 2026 documentation deadline.
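The package check described above reduces to two questions per deliverable: does it exist, and was it updated in the past 12 months? A minimal Python sketch follows. The deliverable keys and function names are hypothetical labels for the seven items listed in this article, and "12 months" is interpreted here as the same calendar day one year earlier.

```python
from datetime import date
from typing import Dict, List, Optional

# Hypothetical keys for the seven deliverables listed in the audit fix.
SEVEN_DELIVERABLES = (
    "pre_deployment_testing_record",
    "ai_impact_assessment",
    "ongoing_monitoring_plan",
    "operator_training_documentation",
    "human_oversight_intervention_logs",
    "remedies_or_appeals_process",
    "end_user_public_feedback_channel",
)


def one_year_before(day: date) -> date:
    """Same calendar day in the previous year (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - 1)
    except ValueError:
        return day.replace(year=day.year - 1, day=28)


def governance_gaps(
    last_updated: Dict[str, Optional[date]], as_of: date
) -> List[str]:
    """List deliverables that are missing or not updated in the past 12 months."""
    cutoff = one_year_before(as_of)
    gaps = []
    for name in SEVEN_DELIVERABLES:
        updated = last_updated.get(name)
        if updated is None:
            gaps.append(f"{name}: missing")
        elif updated < cutoff:
            gaps.append(f"{name}: stale")
    return gaps
```

A non-empty result is the signal to bring the system to the AI governance board. An empty list only confirms that the documents exist and are recent, not that their content is adequate.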
Migrating from M-24-10 to M-25-21 Classification
Agencies that classified AI systems under M-24-10’s rights-impacting and safety-impacting categories need to migrate those classifications to the M-25-21 framework. The migration is not purely administrative. The §4(a) “principal basis” threshold and §5 six-category definition will produce different results for some systems, and identifying those differences is where the compliance risk lives. But reclassification is only one of several net-new obligations M-25-21 introduces. Agencies must also:
- Submit a Compliance Plan within 180 days of issuance and every two years thereafter through 2036 (M-25-21 §3(b)(ii))
- Participate in the CAIO Council convened by OMB within 90 days of issuance (M-25-21 §3(c)(i))
- Designate a CAIO at the required grade level within 60 days (M-25-21 §3(a)(i))
- For CFO Act agencies, convene a governance board with mandatory cross-functional representation within 90 days (M-25-21 §3(a)(ii))
A migration that only reclassifies systems, without producing the Compliance Plan and establishing CAIO Council participation, leaves the agency-level obligations listed above unmet.
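The day counts cited in this article can be turned into calendar dates with simple date arithmetic. The Python sketch below assumes deadlines are counted in calendar days from the April 3, 2025 issuance date; that counting convention is an assumption of this example, and the dictionary labels are shorthand, so agencies should confirm dates against the memorandum itself.

```python
from datetime import date, timedelta

ISSUANCE = date(2025, 4, 3)  # M-25-21 issuance date as stated in this article

# Day counts as cited in this article; calendar-day counting is an assumption.
DEADLINE_DAYS = {
    "Designate CAIO (§3(a)(i))": 60,
    "Convene AI governance board, CFO Act agencies (§3(a)(ii))": 90,
    "CAIO Council convened by OMB (§3(c)(i))": 90,
    "Submit Compliance Plan (§3(b)(ii))": 180,
    "Document §4(b) practices for high-impact systems": 365,
}

deadlines = {
    item: ISSUANCE + timedelta(days=days) for item, days in DEADLINE_DAYS.items()
}

# Print the obligations in due-date order.
for item, due in sorted(deadlines.items(), key=lambda pair: pair[1]):
    print(f"{due.isoformat()}  {item}")
```

Under this assumption the 365-day entry lands on April 3, 2026, which matches the documentation deadline cited elsewhere in this article.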
Systems That May Change Classification Status
Three categories of systems warrant particular attention in the migration analysis. The first is systems that were classified as rights-impacting or safety-impacting under M-24-10 but whose outputs are primarily informational. These systems may qualify for M-25-21’s streamlined framework if the §4(a) principal-basis analysis confirms they do not serve as the driving basis for a decision with the requisite legal, material, binding, or significant effect. The governance overhead reduction is real, but the reclassification must be documented.
The second is systems that were not classified as rights-impacting or safety-impacting under M-24-10 but that operate in the strategic assets and resources category under M-25-21 §5. M-24-10’s framework did not give that category equivalent weight. AI systems managing cybersecurity operations, classified data, or critical infrastructure that escaped M-24-10’s categories should go through a fresh §5 analysis.
The third is systems that have materially changed in capability or operational scope since their M-24-10 classification. An AI tool that was a narrow decision-support tool in 2024 and has since expanded to cover a broader population or a broader decision set needs to be reclassified under current operational facts, not the facts at original deployment. M-25-21 §5’s definition of “significant modification” triggers reassessment requirements for any update that meaningfully alters the AI’s impact.
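The three migration categories above can be expressed as simple flags for prioritizing which legacy systems the board reviews first. This Python sketch is illustrative only: the `LegacySystem` fields and flag wording are hypothetical, and a flag signals that a review is warranted, not what its outcome will be.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LegacySystem:
    name: str
    m24_10_label: Optional[str]  # "rights-impacting", "safety-impacting", or None
    primarily_informational: bool
    touches_strategic_assets: bool
    changed_since_classification: bool


def migration_flags(system: LegacySystem) -> List[str]:
    """Return the reasons a legacy system needs priority attention in migration."""
    flags = []
    if system.m24_10_label is not None and system.primarily_informational:
        flags.append("may leave high-impact status; document the principal-basis analysis")
    if system.m24_10_label is None and system.touches_strategic_assets:
        flags.append("needs a fresh category analysis for strategic assets or resources")
    if system.changed_since_classification:
        flags.append("reclassify on current operational facts")
    return flags
```

A system can carry more than one flag, for example a previously unclassified cybersecurity tool whose scope has also expanded since its original review.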
The Governance Board’s Role in Migration
For CFO Act agencies, the AI governance board conducts the classification analysis (M-25-21 §3(a)(ii)). In the migration context, that means the board must formally review each existing system, apply the M-25-21 §5 framework and §4(a) principal-basis threshold, and produce a documented classification determination. The CAIO oversees the determination process, tracks high-impact use cases centrally, and is responsible for providing determinations to OMB upon request (M-25-21 §3(a)(i)(F)). Systems that carried over from M-24-10 without a formal board review under the new framework have not been classified under M-25-21. They have been assumed classified, which is a different and weaker position.
Boards that produce thorough written analyses with explicit application of the §5 six-category definition and the §4(a) principal-basis threshold create a defensible record. Boards that produce thin approvals do not. If a system is later found misclassified, the inquiry will start with the classification record.
The audit fix. Schedule a formal M-24-10 migration review session with the AI governance board. For each system previously classified under M-24-10, the board should produce a written analysis applying the M-25-21 §5 six-category definition and §4(a) principal-basis threshold. The CAIO should review and approve or reject each determination under §3(a)(i)(F). Complete this review before the public AI use case inventory submission. In parallel, submit the agency Compliance Plan within 180 days of April 3, 2025 per §3(b)(ii), and confirm CAIO Council participation is established. The inventory reflects current classification status; the board review and Compliance Plan together create the full governance record supporting it.
| Classification Factor | High-Impact Indicator | Non-High-Impact Indicator | Evidence to Collect |
|---|---|---|---|
| M-25-21 §4(a) principal-basis threshold | AI output serves as a principal basis for the consequential decision; human review is perfunctory or absent | AI output is one input among several; human exercises independent judgment before the action is taken | Workflow documentation, staffing ratios, review time logs, operational behavior assessment |
| §5 Category 1: Civil rights, civil liberties, privacy | Output drives a decision affecting legal standing, civil liberties, or protected personal information | Output provides general legal information without individual determination | Description of affected population, legal authority applied, privacy impact assessment |
| §5 Category 2: Access to education, housing, insurance, credit, employment, and other programs | Output drives whether an individual receives federal benefits or program access | Output provides program information without eligibility determination | Decision authority documentation, program participation records |
| §5 Categories 3-4: Critical government resources/services; human health and safety | Output drives access decisions for critical services or influences physical safety outcomes in real time | Output informs safety research or administrative planning without operational use | Operational context, integration with safety-critical systems, service access records |
| §5 Categories 5-6: Critical infrastructure/public safety; strategic assets or resources | Output affects protection, access, or management of critical infrastructure or classified/high-value federal assets | Output supports administrative functions with no direct resource or infrastructure access | Asset classification level, data access permissions, system integrations, sensitivity markings |
| Governance documentation required | High-impact: all seven §4(b) practices documented by April 3, 2026 | Non-high-impact: standard inventory entry with use case description and basic risk documentation | Seven §4(b) documents for high-impact; use case inventory record for non-high-impact |
| Classification authority | AI governance board (CFO Act agencies) conducts analysis; CAIO oversees determination process and tracks centrally per §3(a)(i)(F) | Same classification authority applies; same documentation standards | Board meeting minutes, CAIO oversight record, dated determination, central tracking entry |
High-impact AI classification under M-25-21 is not a paperwork exercise. It is a risk allocation decision. Get the classification right, and the governance structure the agency builds is proportionate to actual risk. Get it wrong, and the agency either over-governs low-risk systems while under-governing high-risk ones, or carries material compliance gaps into the public inventory that oversight bodies will find. The M-25-21 §5 definition and §4(a) “principal basis” threshold are the analytical tools. An outcome-proximity analysis (how close the AI output sits to the consequential action) is a practical way to apply those tools. It is not a named test in the memorandum. The AI governance board (CFO Act agencies) and the CAIO are the accountable parties. Both need to produce written records that hold up under examination, not approvals that read like they were signed without analysis.
Frequently Asked Questions
What is the high-impact AI classification standard under M-25-21?
M-25-21 §5 defines high-impact AI as systems whose output “serves as a principal basis for decisions or actions with legal, material, binding, or significant effect” on any of six enumerated categories: civil rights, civil liberties, or privacy; access to education, housing, insurance, credit, employment, and other programs; access to critical government resources or services; human health and safety; critical infrastructure or public safety; or strategic assets or resources. The AI governance board (CFO Act agencies) conducts the classification analysis under §3(a)(ii); the CAIO oversees the determination process and tracks classifications centrally under §3(a)(i)(F). High-impact systems require all seven §4(b) minimum risk management practices, documented by April 3, 2026.
How does M-25-21 differ from M-24-10 on AI classification?
M-24-10 used two separate categories: rights-impacting AI and safety-impacting AI. M-25-21 replaces both with a single standard: high-impact AI, defined by the §4(a) “principal basis” threshold against §5’s six enumerated categories. Some systems that were classified under M-24-10 will reclassify under M-25-21, in either direction. M-25-21 also introduces net-new obligations beyond reclassification: Compliance Plans (§3(b)(ii)), CAIO Council participation (§3(c)(i)), a CFO Act AI governance board requirement (§3(a)(ii)), and seven minimum risk management practices (§4(b)) that are more comprehensive than M-24-10’s category-specific checklists.
Who has authority to classify an AI system as high-impact?
For CFO Act agencies, the AI governance board conducts the classification review and produces the written determination under M-25-21 §3(a)(ii). The CAIO oversees the determination process, tracks high-impact use cases centrally, and provides determinations to OMB upon request under §3(a)(i)(F). The CAIO also holds authority to waive one or more §4(b) minimum practices for a specific system after making a written, context-specific risk assessment, a responsibility that cannot be delegated (M-25-21 §4(a)(ii)).
What are the seven governance requirements for high-impact AI systems?
M-25-21 §4(b) enumerates seven: (1) pre-deployment testing with documented risk mitigation plans; (2) an AI impact assessment covering intended purpose, data quality, civil rights impacts, independent review, and risk acceptance; (3) ongoing monitoring for performance and adverse impacts; (4) human training and periodic assessment for AI operators; (5) human oversight, intervention, and accountability mechanisms with fail-safe provisions where practicable; (6) consistent remedies or appeals for individuals affected by AI-enabled decisions; and (7) end-user and public feedback mechanisms with results incorporated into agency decision-making. All seven must be documented by April 3, 2026.
Does a non-high-impact AI system require any governance under M-25-21?
Non-high-impact AI systems still require governance under M-25-21, scaled to the lower risk profile. At minimum, non-high-impact systems must appear in the annual AI use case inventory with a documented classification basis. The AI governance board (CFO Act agencies) must still conduct the classification review. The full seven §4(b) minimum practices do not automatically apply to non-high-impact systems at the same depth, but agencies should apply risk-proportionate governance and document it.
How does the principal-basis threshold work in practice?
M-25-21 §4(a) asks whether the AI output serves as the “principal basis” for a decision or action with legal, material, binding, or significant effect. An outcome-proximity analysis (examining how close the AI output sits to the consequential action) is a practitioner heuristic for applying that threshold. A system whose output a human decision-maker accepts without substantive independent review likely meets the §4(a) threshold regardless of how the system was designed. Agencies should examine actual operational behavior, not intended design, because case volume, staffing, and organizational culture regularly compress intended human review into a perfunctory step. This analytical framing is not a named test in M-25-21.
How does high-impact AI classification feed into the public AI use case inventory?
The annual AI use case inventory required under M-25-21 §3(b)(v) reflects each system’s current classification status under §5. High-impact systems must appear with their §4(b) governance documentation current. The inventory is a public accountability document; OMB and oversight bodies use it to assess agency compliance. Classification analyses conducted under the §5 six-category definition and overseen by the CAIO per §3(a)(i)(F) become the official basis for each inventory entry.
What role does NIST AI RMF play in M-25-21 classification?
NIST AI RMF (NIST AI 100-1, January 2023) is a voluntary framework. M-25-21 does not mandate it by name. Federal AI program leads commonly adopt AI RMF’s MAP function to structure context assessments for §5 category analysis and the MANAGE function to operationalize §4(b)’s ongoing monitoring and oversight requirements. The framework provides useful methodology, but compliance with M-25-21 does not require an AI RMF adoption decision. Agencies should document whichever risk assessment methodology they use to satisfy the §4(b) practices.