5 HIPAA AI Violations Auditors Find (And How to Fix Them)

How many AI tools process protected health information (PHI) in your organization right now? Not the ones your compliance team approved. All of them. The AI scribe your physicians adopted six months before anyone signed a Business Associate Agreement (BAA). The ChatGPT prompts your billing team uses to draft appeals. The ambient listening tool your telehealth provider embedded without notifying you.

Industry research indicates the majority of workers across sectors use unapproved AI tools on the job, and healthcare is no exception. HHS Office for Civil Rights (OCR) has made risk analysis failures a recurring focus across recent enforcement settlements, citing the control in the majority of Resolution Agreements. The gap between “we think we have 47 approved applications” and “network traffic shows 112 AI-related API connections” defines the enforcement exposure most healthcare organizations carry without knowing it.

Five specific HIPAA AI violations appear in nearly every healthcare audit. Each carries penalties up to $2,190,294 per violation category per year (2026 inflation-adjusted figures effective January 28, 2026). Each originates from the same structural failure: AI adoption outpacing governance infrastructure.

The five most common HIPAA AI violations auditors identify are: missing Business Associate Agreements with shadow AI tools (the disclosure prohibition is at 164.502(e); BAA contracts standard at 164.308(b)(1)); improper de-identification creating re-identification risk (164.514(b)); data integrity failures from AI hallucinations (164.312(c)(1)); broken subcontractor BAA chains (164.308(b)(2)); and absent audit logging for AI interactions (164.312(b)).

Editor’s Note (March 2026): The HIPAA Security Rule is under active revision. HHS published a Notice of Proposed Rulemaking (NPRM) on January 6, 2025 (90 Fed. Reg. 898), proposing mandatory controls for all systems processing electronic protected health information (ePHI), including AI systems, with required encryption, multi-factor authentication, and annual penetration testing. The final rule is expected mid-2026. This article reflects current enforceable requirements and will be updated when the final rule publishes. For the full proposed rule analysis, see HIPAA Security Rule 2026: What the Proposed Overhaul Changes.

Violation 1: Missing BAAs with Shadow AI Tools

The most frequent HIPAA AI violation starts with invisible adoption. Clinical staff, billing teams, and operations personnel deploy AI tools to summarize notes, draft communications, and reconcile records. None of these tools have a signed Business Associate Agreement. Use of these tools to process PHI without a BAA is an impermissible disclosure under 164.502(e); the underlying Business Associate contracts standard is at 164.308(b)(1).

The Scale of Shadow AI in Healthcare

Healthcare is no exception to the cross-sector pattern of unapproved AI use noted above; clinical documentation burdens and staffing shortages drive the adoption. Critically, business associate breaches accounted for the majority of healthcare records exposed in 2025, outpacing direct covered entity exposures and making business associates the largest single category of healthcare data exposure.

Which AI Tools Sign a BAA (And Which Do Not)

AI Platform	BAA Tier	Key Restriction
OpenAI (ChatGPT)	Enterprise / API	Free and Plus: no BAA, data trains models
Anthropic (Claude for Healthcare)	Enterprise (launched January 2026)	Consumer Claude.ai and Pro: no BAA
Google (Gemini)	Workspace Business Standard+ / Enterprise	Consumer Gemini: no BAA
Microsoft (Copilot)	Azure OpenAI Service	Consumer Copilot: not HIPAA-eligible
Otter.ai	Enterprise (contact vendor to confirm BAA availability)	Free and Pro: no BAA; verify current BAA terms with vendor before authorizing PHI use

The pattern across every vendor is identical: free and standard tiers come without a signed BAA, do not disable model training on input data, and do not provide the encryption, access controls, or audit logging HIPAA requires. The enterprise tier, at 3x to 10x the cost, includes these safeguards. Employees choosing the free tier are choosing the non-compliant tier. What I see consistently is that nobody chooses the non-compliant tier on purpose. The free version is the default, and the protected health information follows the path of least resistance straight onto it.

The audit fix. Conduct a shadow AI discovery within 14 days. (1) Survey employees anonymously: “Name every AI tool you use for work tasks.” (2) Analyze firewall and DNS logs for connections to AI-related domains (api.openai.com, api.anthropic.com, generativelanguage.googleapis.com). (3) Cross-reference discovered tools against your BAA registry. (4) For each tool without a BAA: upgrade to an enterprise tier with a signed BAA, or block access at the network perimeter. Document the discovery process and results as HIPAA risk assessment evidence (164.308(a)(1)(ii)(A)).
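Steps (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration, not a discovery tool: the log line layout (`timestamp client_ip queried_domain`), the `BAA_REGISTRY` dictionary, and the function names are assumptions made for the example, and real firewall or DNS exports will need their own parser.

```python
from collections import Counter

# Domains named in the discovery step above; extend with your own watchlist.
AI_DOMAINS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}

# Hypothetical BAA registry: AI domain -> True when a signed BAA is on file.
BAA_REGISTRY = {"api.openai.com": True}


def find_ai_connections(dns_log_lines):
    """Count queries to watched AI domains.

    Assumes each line is 'timestamp client_ip queried_domain'.
    """
    hits = Counter()
    for line in dns_log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines
        domain = parts[2].lower().rstrip(".")
        if domain in AI_DOMAINS:
            hits[domain] += 1
    return hits


def tools_without_baa(hits, registry):
    """Return observed AI domains with no signed BAA on file."""
    return sorted(d for d in hits if not registry.get(d, False))


sample_log = [
    "2026-03-01T09:14:02Z 10.0.4.17 api.openai.com",
    "2026-03-01T09:15:40Z 10.0.4.22 api.anthropic.com",
    "2026-03-01T09:16:11Z 10.0.4.22 api.anthropic.com",
    "2026-03-01T09:17:55Z 10.0.4.30 example.org",
]

hits = find_ai_connections(sample_log)
gaps = tools_without_baa(hits, BAA_REGISTRY)
print(gaps)  # domains to upgrade to a BAA tier or block
```

Exact domain matching is a deliberate simplification; browser-based tools and embedded vendor features use other hostnames, so the employee survey in step (1) remains necessary.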

Violation 2: The Re-Identification Risk

Employees believe removing patient names from data makes it safe for AI processing. HIPAA’s de-identification standard requires removing all 18 identifiers under the Safe Harbor method (164.514(b)(2)), not one or two obvious ones. Research has demonstrated high re-identification accuracy rates on datasets de-identified under Safe Harbor when those datasets are combined with auxiliary public data. This is the mosaic effect described below.

The 18 Identifiers Most Teams Overlook

HIPAA’s Safe Harbor method mandates removal of 18 specific data elements: names, geographic subdivisions smaller than a state, dates (except year) related to an individual, phone numbers, fax numbers, email addresses, Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers, certificate/license numbers, vehicle identifiers and serial numbers, device identifiers and serial numbers, web URLs, IP addresses, biometric identifiers, full-face photographs, and any other unique identifying number (164.514(b)(2)).

Most clinical staff remove the obvious identifiers: patient name and date of birth. They leave medical record numbers, device IDs, zip codes, and dates of service. An AI tool combining zip code, age range, and diagnosis code against public datasets reconstructs patient identity through the mosaic effect: individually harmless data points combining to uniquely identify an individual.

When “De-Identified” Data Becomes PHI Again

Data Input to AI Tool	Employee Assumption	Actual HIPAA Risk
“Patient 12345” (name removed)	Safe: name stripped	Violation: medical record numbers are PHI (164.514(b)(2)(i)(H))
Zip code + age + diagnosis	Safe: general data only	Violation: the mosaic effect enables re-identification, with high accuracy rates documented in research
Clinical notes (names redacted)	Safe: direct identifiers removed	Violation: contextual clues (rare conditions, specific procedures, dates) enable re-identification
Device serial number	Safe: not patient data	Violation: device identifiers are one of the 18 Safe Harbor elements
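
One way to make the mosaic effect concrete is to count how many records share each combination of quasi-identifiers. The sketch below uses fabricated rows and hypothetical function names. A combination that appears only once points at a single person, which is the condition an attacker needs before linking to outside data. It illustrates the idea and is not a substitute for Expert Determination.

```python
from collections import Counter

# Fabricated example rows: (zip_code, age_range, diagnosis_code).
records = [
    ("30301", "40-49", "E11.9"),
    ("30301", "40-49", "E11.9"),
    ("30301", "40-49", "E11.9"),
    ("30305", "70-79", "G12.21"),
    ("30309", "20-29", "E11.9"),
]


def unique_combinations(rows):
    """Return quasi-identifier combinations held by exactly one record."""
    counts = Counter(rows)
    return [combo for combo, n in counts.items() if n == 1]


def smallest_group(rows):
    """Size of the smallest group sharing one combination (the 'k' in k-anonymity)."""
    return min(Counter(rows).values())


print(unique_combinations(records))
print(smallest_group(records))
```

Here two of the five rows are unique on zip code, age range, and diagnosis alone, even though no name or record number is present.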

The audit fix. Implement a de-identification validation checkpoint for all AI data inputs. (1) Create a pre-submission checklist mapping all 18 Safe Harbor identifiers. (2) Train clinical and operations staff on identifiers beyond names and dates: medical record numbers, device IDs, IP addresses, and biometric data. (3) For any AI use case requiring patient data, engage a qualified expert for Expert Determination under 164.514(b)(1), documenting statistical and scientific methods confirming re-identification risk is “very small.” (4) Block copy-paste of unvalidated clinical data into any AI tool at the endpoint level.
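
The pre-submission checklist in step (1) can be backed by an automated scan. The following is a rough sketch under stated assumptions: the regular expressions cover only a handful of pattern-shaped identifiers, the `MRN` format is a hypothetical local convention, and the function names are invented for the example. A clean scan does not establish Safe Harbor compliance, because names, rare conditions, and other contextual clues are invisible to pattern matching.

```python
import re

# Pattern-shaped identifiers only; this is a subset of the 18 Safe Harbor elements.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url": re.compile(r"https?://\S+"),
    # Hypothetical local convention: 'MRN' followed by 6 to 10 digits.
    "mrn": re.compile(r"\bMRN[:#\s]*\d{6,10}\b", re.IGNORECASE),
    # Numeric dates such as 03/14/2025; written-out dates are not caught.
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def find_identifiers(text):
    """Return {label: [matches]} for every pattern that fires on the text."""
    found = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


def is_clear_to_submit(text):
    """True only when no pattern fires; a human check is still required."""
    return not find_identifiers(text)


note = "Pt seen 03/14/2025, MRN: 00482913, callback 404-555-0137."
print(find_identifiers(note))
print(is_clear_to_submit(note))
```

Treating any match as a hard stop, rather than silently redacting, keeps the reviewer in the decision and produces a record of what was caught.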

Violation 3: Data Integrity Failures from AI Hallucinations

HIPAA requires covered entities to implement controls protecting the integrity of ePHI (164.312(c)(1)). AI scribe tools, clinical summarization engines, and diagnostic assistants introduce an integrity threat: hallucinated content entering permanent medical records. A 2025 clinical study documented a 1.47% hallucination rate and a 3.45% omission rate in AI-generated clinical notes (npj Digital Medicine 2025, doi:10.1038/s41746-025-01670-7). ECRI has flagged misuse of AI chatbots as a top health technology hazard for 2026.

How Hallucinations Enter the Medical Record

An AI scribe tool listens to a patient encounter and generates a clinical note. The note includes a medication the physician never prescribed, a diagnosis code the patient does not have, or a procedure recommendation the clinician did not make. If the note enters the EHR without clinician review, the hallucinated content becomes part of the permanent record.

The liability falls on the provider, not the AI vendor. The covered entity is responsible for the integrity of ePHI under HIPAA, regardless of the tool generating it (164.312(c)(1)). A hallucinated medication entry creating an adverse drug interaction becomes both a patient safety event and a HIPAA integrity violation.

The “Human in the Loop” Requirement

No AI-generated clinical content should enter an EHR without a timestamped review by a licensed clinician. The review creates two audit artifacts: evidence of human oversight for malpractice defense, and evidence of integrity controls for HIPAA compliance. Document the review policy, the clinician’s attestation timestamp, and the edit history showing pre-review and post-review versions of the AI-generated note.
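
As a sketch of how that gate might be modeled in code, the example below keeps the original AI output separate from the clinician's final text and refuses to commit a note that lacks an attestation. The `AINote` class, field names, and `commit_to_record` function are hypothetical; a real EHR integration would enforce this inside the vendor's workflow engine rather than in application code like this.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AINote:
    note_id: str
    ai_output: str                      # original AI text, never overwritten
    final_text: Optional[str] = None    # clinician-approved version
    reviewed_by: Optional[str] = None
    reviewed_at: Optional[datetime] = None

    @property
    def status(self) -> str:
        return "attested" if self.reviewed_at else "pending clinician review"


def attest(note: AINote, clinician_id: str, final_text: str) -> AINote:
    """Record the clinician's review with a UTC timestamp."""
    note.final_text = final_text
    note.reviewed_by = clinician_id
    note.reviewed_at = datetime.now(timezone.utc)
    return note


def commit_to_record(note: AINote) -> dict:
    """Release a note to the permanent record only after attestation."""
    if note.reviewed_at is None:
        raise PermissionError(f"note {note.note_id} is {note.status}")
    return {
        "note_id": note.note_id,
        "text": note.final_text,
        "original_ai_output": note.ai_output,   # pre-review version
        "reviewed_by": note.reviewed_by,
        "reviewed_at": note.reviewed_at.isoformat(),
    }


draft = AINote(note_id="n-1", ai_output="Draft note text.")
print(draft.status)
attest(draft, clinician_id="dr-204", final_text="Corrected note text.")
print(commit_to_record(draft)["reviewed_by"])
```

Storing both versions in the committed entry is what yields the pre-review and post-review edit history described above.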

The audit fix. Enforce a mandatory human-in-the-loop policy for all AI-generated clinical content. (1) Configure EHR workflows to flag AI-generated notes as “pending clinician review” with a visual indicator. (2) Require a timestamped attestation from a licensed clinician before the note becomes part of the permanent record. (3) Maintain an edit log showing the original AI output and all clinician modifications. (4) Audit a random 5% sample of AI-generated notes monthly for hallucination detection. (5) Document the policy as evidence of HIPAA integrity controls under 164.312(c)(1).
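
The monthly 5% sample in step (4) is straightforward to draw reproducibly. This sketch assumes note identifiers are available as a list; the function name and the choice to round up and always pull at least one note are illustrative decisions, not requirements from the rule.

```python
import math
import random


def monthly_audit_sample(note_ids, percent=5, seed=None):
    """Pick a random sample of note IDs for hallucination review.

    Rounds up so small months still get reviewed; a recorded seed lets
    an auditor reproduce the same selection later.
    """
    note_ids = list(note_ids)
    if not note_ids:
        return []
    size = max(1, math.ceil(len(note_ids) * percent / 100))
    rng = random.Random(seed)
    return sorted(rng.sample(note_ids, size))


march_notes = [f"note-{i:04d}" for i in range(200)]
sample = monthly_audit_sample(march_notes, seed=202603)
print(len(sample))
```

Recording the seed alongside the review results gives the auditor evidence that the sample was not hand-picked.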

Violation 4: Broken Subcontractor BAA Chains

A covered entity signs a BAA with an AI scribe vendor. The vendor processes PHI through the covered entity’s authorized workflow. Behind the scenes, the vendor routes data through OpenAI’s API, an AWS inference endpoint, and a third-party speech-to-text service. If none of these sub-processors has a BAA with the vendor, the entire chain breaks.

HIPAA requires business associates to obtain “satisfactory assurances” from their own subcontractors through written agreements containing the same protections as the primary BAA (164.308(b)(2), 164.314(a)(2)(i)). A single missing link in the subcontractor chain exposes the covered entity to liability for the entire data flow. The hardest part of this conversation, in my experience, is that covered entities sign the primary BAA and believe the chain is sealed. The sub-processor question never comes up until an auditor asks for evidence.

The Three Questions Before Signing Any AI Vendor

“Name every sub-processor handling our data.” Require the vendor to disclose every third-party service, API, and infrastructure provider processing PHI. Amazon Web Services, Microsoft Azure, Google Cloud, OpenAI API, and Anthropic API are common in the AI vendor stack.
“Provide a copy of your BAA with each sub-processor.” A verbal assurance is not sufficient. Request the executed BAA or the relevant sections confirming PHI protections, data handling, breach notification, and return/destruction obligations.
“Confirm your model does not train on our data.” Model training on PHI is permitted only to the extent the BAA explicitly authorizes the use (164.504(e)(2)(i)(A)). Most BAAs do not authorize it. The answer for HIPAA compliance must be confirmed in the BAA text.

The audit fix. Map every AI vendor’s subcontractor chain within 30 days. (1) Send a formal sub-processor disclosure request to every AI vendor with a signed BAA. (2) Request executed BAA evidence for each disclosed sub-processor. (3) Verify model training exclusions: confirm in writing the vendor does not use PHI for model training, fine-tuning, or performance improvement. (4) Add sub-processor disclosure and BAA verification to your annual AI governance review cycle. (5) Document the chain as evidence for HIPAA compliance audits under 164.308(b)(2).
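
The chain mapping can be tracked as a simple tree and checked for gaps. The sketch below is illustrative only: the `Processor` class, its fields, and the vendor names are hypothetical placeholders, and the boolean flags stand in for documents you would actually collect in steps (2) and (3).

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    name: str
    baa_on_file: bool          # executed BAA evidence received
    training_excluded: bool    # written confirmation of no training on PHI
    sub_processors: list = field(default_factory=list)


def chain_gaps(processor, path=()):
    """Walk the chain depth-first and list every missing assurance."""
    current = path + (processor.name,)
    label = " -> ".join(current)
    gaps = []
    if not processor.baa_on_file:
        gaps.append((label, "no executed BAA evidence"))
    if not processor.training_excluded:
        gaps.append((label, "no written model-training exclusion"))
    for sub in processor.sub_processors:
        gaps.extend(chain_gaps(sub, current))
    return gaps


# Hypothetical vendor with three disclosed sub-processors.
vendor = Processor(
    name="Scribe vendor",
    baa_on_file=True,
    training_excluded=True,
    sub_processors=[
        Processor("LLM API provider", baa_on_file=True, training_excluded=False),
        Processor("Cloud inference host", baa_on_file=True, training_excluded=True),
        Processor("Speech-to-text service", baa_on_file=False, training_excluded=True),
    ],
)

for where, problem in chain_gaps(vendor):
    print(f"{where}: {problem}")
```

Because the walk is recursive, a sub-processor's own subcontractors can be added at any depth as the vendor discloses them.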

Violation 5: Missing Audit Logs for AI Interactions

HIPAA’s audit control standard under 164.312(b) requires mechanisms that record and examine activity in information systems containing or using ePHI. When an employee uses an AI tool to process patient data, every interaction (input, output, and data access event) requires a corresponding audit trail. Most organizations deploying AI tools in healthcare have no logging infrastructure for these interactions.

What Auditors Request (And What CTOs Lack)

A HIPAA auditor examining AI tool usage asks for three artifacts: (1) access logs showing which users interacted with the AI tool and when, (2) input/output logs documenting the data sent to and received from the AI platform, and (3) configuration evidence confirming the tool’s security settings (encryption, access controls, data retention). Without these logs, the organization cannot demonstrate compliance with the audit control standard.
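
A single structured record can carry all three artifacts. The sketch below is one possible shape, not a prescribed schema: the field names, the `build_audit_event` function, and the choice to log SHA-256 digests rather than raw text (on the assumption that full prompts and responses are kept in a separate protected store) are all design assumptions for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_audit_event(user_id, tool, prompt, response, config):
    """Assemble one audit record covering the three artifacts."""
    return {
        # (1) access: who used which tool, and when (UTC)
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "tool": tool,
        # (2) input/output: sizes and digests that tie this record to the
        # full text, which is assumed to live in a separate protected store
        "input_chars": len(prompt),
        "output_chars": len(response),
        "input_sha256": sha256_hex(prompt),
        "output_sha256": sha256_hex(response),
        # (3) configuration in force at the time of the interaction
        "config": config,
    }


event = build_audit_event(
    user_id="u-1042",
    tool="approved-ai-scribe",
    prompt="Summarize today's visit note.",
    response="Visit summary text.",
    config={"encryption_in_transit": True, "retention_years": 6},
)
log_line = json.dumps(event, sort_keys=True)  # one JSON line per event
print(log_line)
```

Keeping digests in the log and the text elsewhere avoids copying PHI into the SIEM while still letting an auditor match a record to its content.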

Most enterprise AI platforms include logging capabilities at higher tiers. ChatGPT Enterprise provides admin-level conversation logs. Claude Enterprise includes audit logging. Google Workspace AI features integrate with Google Workspace’s audit infrastructure. The logging exists. Organizations fail to enable, configure, and retain it.

Building the AI Audit Trail

Centralize AI interaction logs in your existing SIEM or log management platform. Treat AI tool access with the same logging rigor applied to EHR access. Set retention periods matching your HIPAA Security Rule documentation retention requirement: minimum six years under 164.316(b)(2)(i). Monitor for anomalies: bulk data inputs, unusual access times, and interactions from unauthorized accounts.

The audit fix. Enable and centralize audit logging for every approved AI tool within 14 days. (1) For each AI platform with a BAA, enable the highest available logging tier (admin logs, conversation logs, API call logs). (2) Configure log forwarding to your SIEM or centralized log platform. (3) Set retention to match your HIPAA Security Rule documentation retention requirement (minimum six years per 164.316(b)(2)(i)). (4) Establish a monthly review cadence for AI access anomalies: bulk data inputs, off-hours access, and interactions from non-authorized users. (5) Document logging configuration as HIPAA audit control evidence under 164.312(b).
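
The monthly review in step (4) and the retention date in step (3) can both be expressed as small checks. This is a sketch under assumptions: events are dictionaries with `timestamp`, `user_id`, and `input_chars` keys, timestamps are already in local time, and the 20,000-character bulk threshold and 07:00 to 19:00 working window are placeholder values to tune, not figures from the rule.

```python
from datetime import date, datetime


def flag_anomalies(events, authorized_users, bulk_chars=20000, work_hours=(7, 19)):
    """Return (event, reasons) pairs for interactions that need review."""
    flagged = []
    for event in events:
        reasons = []
        hour = datetime.fromisoformat(event["timestamp"]).hour
        if event["input_chars"] >= bulk_chars:
            reasons.append("bulk data input")
        if not work_hours[0] <= hour < work_hours[1]:
            reasons.append("off-hours access")
        if event["user_id"] not in authorized_users:
            reasons.append("non-authorized user")
        if reasons:
            flagged.append((event, reasons))
    return flagged


def retention_end(created, years=6):
    """Earliest date a log created on `created` may be purged."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:  # Feb 29 with a non-leap target year
        return created.replace(year=created.year + years, day=28)


events = [
    {"timestamp": "2026-03-02T10:15:00", "user_id": "u-1042", "input_chars": 1200},
    {"timestamp": "2026-03-02T02:40:00", "user_id": "u-1042", "input_chars": 85000},
    {"timestamp": "2026-03-02T11:05:00", "user_id": "u-9999", "input_chars": 300},
]

for event, reasons in flag_anomalies(events, authorized_users={"u-1042"}):
    print(event["user_id"], event["timestamp"], reasons)
print(retention_end(date(2026, 3, 2)))
```

In practice these rules would live as SIEM detections; the value of writing them out is agreeing on thresholds before the first review.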

Every one of these violations follows the same trajectory I have seen play out in healthcare audits: the AI tool arrives first, the governance question arrives later. Organizations adopted AI tools faster than they built governance infrastructure. The remedy is not blocking AI. It is governed adoption: BAAs for every tool processing PHI, de-identification validation before data enters any AI platform, human-in-the-loop controls for clinical AI output, verified subcontractor chains, and audit logging enabled from day one. Risk assessment failures are a recurring finding across HHS OCR enforcement actions. Build the AI governance layer now or explain its absence to the auditor later.

Frequently Asked Questions

What are the most common HIPAA AI violations?

The five most common HIPAA AI violations are: missing Business Associate Agreements with shadow AI tools (impermissible disclosure under 164.502(e); BA contracts standard at 164.308(b)(1)); improper de-identification exposing re-identification risk (164.514(b)); data integrity failures from AI hallucinations in clinical records (164.312(c)(1)); broken subcontractor BAA chains where the vendor’s AI provider lacks a BAA (164.308(b)(2)); and absent audit logging for AI interactions (164.312(b)).

Does ChatGPT require a HIPAA BAA?

Every AI tool processing protected health information requires a signed Business Associate Agreement before any PHI enters the platform. OpenAI offers BAAs for eligible ChatGPT Enterprise and API customers. Free and Plus tiers do not include BAAs, and OpenAI uses input data from these tiers for model training. Using ChatGPT Free or Plus with patient data creates an impermissible disclosure under 164.502(e). See our full ChatGPT HIPAA compliance analysis for tier-by-tier guidance.

What is the penalty for HIPAA AI violations?

HIPAA civil monetary penalties follow a four-tier structure with 2026 inflation-adjusted figures (effective January 28, 2026, per HHS’ 2019 Notification of Enforcement Discretion, 84 FR 18151): Tier 1 (lack of knowledge) $145 to $36,505.50 per violation (annual cap $36,505.50 per violation category); Tier 2 (reasonable cause) $1,461 to $73,011 per violation (annual cap $146,053); Tier 3 (willful neglect, corrected within 30 days) $14,602 to $73,011 per violation (annual cap $365,052); Tier 4 (willful neglect, not corrected) $73,011 to $2,190,294 per violation (annual cap $2,190,294). HHS OCR settlement amounts from recent Resolution Agreements range from under $100,000 to several million dollars depending on scope and willfulness.

How does AI create re-identification risk under HIPAA?

Machine learning models combine individually non-identifying data points (zip code, age range, diagnosis code, admission date) to uniquely identify individuals through the mosaic effect. Research demonstrates high re-identification accuracy rates on datasets meeting HIPAA Safe Harbor de-identification standards when combined with publicly available data. HIPAA 164.514(b) requires removing all 18 identifiers, not selectively stripping obvious ones.

What is the “human in the loop” requirement for AI in healthcare?

The HIPAA integrity standard under 164.312(c)(1) requires covered entities to implement controls protecting the accuracy of ePHI. For AI-generated clinical content, this means mandatory clinician review before AI output enters the permanent medical record. The clinician must attest with a timestamp, the review must be documented, and edit logs must preserve the original AI output alongside the clinician’s modifications.

Which AI vendors sign HIPAA BAAs?

OpenAI (Enterprise/API), Anthropic (Claude for Healthcare, launched January 2026, and enterprise deployments), Google (Workspace Business Standard/Enterprise), and Microsoft (Azure OpenAI Service) offer BAAs at enterprise tiers. Consumer and standard tiers from these vendors do not include BAAs. Verify BAA availability for every AI vendor before authorizing the tool for PHI processing.

What audit logs does HIPAA require for AI tools?

HIPAA 164.312(b) requires audit controls recording and examining access to ePHI. For AI tools, this includes: user access logs (who used the tool and when), input/output logs (data sent to and received from the AI platform), and configuration evidence (encryption status, access controls, data retention settings). Retain logs for a minimum of six years per the Security Rule documentation retention requirement at 164.316(b)(2)(i).

How do subcontractor BAA chains work with AI vendors?

HIPAA 164.308(b)(2) requires business associates to obtain written assurances from their subcontractors. When an AI scribe vendor routes data through OpenAI’s API, AWS infrastructure, or a third-party speech-to-text service, each sub-processor must have a BAA with the vendor containing the same PHI protections. Request sub-processor disclosure and BAA evidence from every AI vendor in your supply chain.
