Redacting Medical Records: HIPAA, GDPR, and Practical Steps
A GP surgery responding to a records request from a patient's solicitor needs to remove third-party personal data and clinician notes that aren't part of the request. A US healthcare provider sharing records between facilities must strip identifiers under HIPAA Safe Harbor rules. Different regulations, similar practical challenge - and the same consequences if something is missed.
By RedactProof Editorial Team · 18 Feb 2026
Two frameworks, overlapping obligations
Medical record redaction sits at the intersection of health data regulation and general data protection law. The specifics depend on jurisdiction.
In the United States, HIPAA (Health Insurance Portability and Accountability Act) governs the use and disclosure of protected health information (PHI). The HIPAA Privacy Rule applies to covered entities (healthcare providers, health plans, and healthcare clearinghouses) and to their business associates. It requires them to remove or protect individually identifiable health information before sharing it for purposes that don't qualify as treatment, payment, or healthcare operations.
In the UK and EU, medical records fall under GDPR as special category data (Article 9), which carries stricter processing requirements than ordinary personal data. Health data includes anything relating to a person's physical or mental health, including records of healthcare provision. The UK Data Protection Act 2018 adds further conditions for processing this data.
For organisations that operate across jurisdictions, or handle records from international patients, both frameworks may apply simultaneously. A UK-based clinical trial involving US participants, for instance, might need to satisfy both HIPAA de-identification standards and GDPR's data minimisation principles.
What HIPAA requires for de-identification
HIPAA provides two methods for de-identifying protected health information under 45 CFR 164.514.
The Safe Harbor method requires removing 18 specific identifier types: names, geographic data smaller than a state, all elements of dates (except year) directly related to an individual together with ages over 89, phone numbers, fax numbers, email addresses, Social Security numbers, medical record numbers, health plan beneficiary numbers, account numbers, certificate/licence numbers, vehicle identifiers, device identifiers, web URLs, IP addresses, biometric identifiers, full-face photographs, and any other unique identifying number, characteristic, or code. After removing these 18 categories, the covered entity must also have no actual knowledge that the remaining information could identify an individual.
The Expert Determination method uses a qualified statistical expert to certify that the risk of identifying any individual from the data is "very small." This method is more flexible but requires specialist expertise and documentation.
For most day-to-day redaction work in healthcare settings, Safe Harbor's 18-category checklist is the practical approach. It's prescriptive - you know exactly what to look for.
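Because several Safe Harbor categories have a predictable format, a first automated pass can be sketched with regular expressions. The example below is a minimal illustration only: the pattern set, the category labels, and the `flag_identifiers` function are assumptions made for this sketch, covering four of the 18 categories. It is not a complete or validated detector, and names, free-text locations, and photographs need other techniques plus manual review.

```python
import re

# Illustrative patterns for a few structured Safe Harbor identifiers.
# These patterns are assumptions for this sketch, not an exhaustive set.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_phone": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def flag_identifiers(text):
    """Return (category, matched_text, start, end) tuples sorted by position."""
    hits = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((category, match.group(), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    sample = "Contact jane.doe@example.org or (555) 010-1234. SSN 123-45-6789."
    for hit in flag_identifiers(sample):
        print(hit)
```

The function only flags candidate spans for a reviewer. Deciding what to redact, and applying the redaction, remain separate steps.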
What GDPR requires for health data
GDPR doesn't provide a numbered checklist equivalent to HIPAA Safe Harbor. Instead, it applies principles.
When disclosing medical records - whether in response to a subject access request (SAR), for insurance purposes, or as part of legal proceedings - organisations must apply data minimisation (Article 5(1)(c)): only share the personal data that's necessary for the stated purpose. Third-party personal data in the records (other patients mentioned in notes, family members, referring clinicians) generally needs redacting unless there's a lawful basis for including it.
For a GP surgery responding to a solicitor's request for a patient's records, this means: provide the patient's own records, redact other patients' names or details that appear in those records, and consider whether clinician identities need redacting (this depends on the specific context and any relevant professional guidance).
The practical redaction process for medical records
Medical records present specific challenges that general office documents don't.
Handwritten notes are common in healthcare settings, particularly in older records. These need scanning and OCR before any automated detection can work. OCR accuracy on handwriting varies - printed forms with handwritten entries tend to process better than free-form clinical notes. Manual review is especially important for handwritten content.
Clinical terminology mixed with personal data complicates detection. A sentence like "Patient was referred to Dr Martinez at the Royal Free Hospital on 15 March 2024 for assessment following road traffic accident" contains a clinician name, a facility name, a date, and clinical information. Which parts need redacting depends on the specific purpose and legal basis for the disclosure.
Structured medical records (electronic health records exported as PDFs) tend to have personal data in predictable locations - header fields, patient demographics sections, referral details. Unstructured records (scanned letters, clinical notes, discharge summaries) scatter personal data unpredictably throughout the text.
A practical workflow for medical record redaction:
- Consolidate all responsive records into a digital review set (scan paper records)
- Run OCR on any scanned/image-based documents
- Use automated PII detection to flag the standard categories
- Manually review for clinical-context identifiers that automated tools may miss
- Apply pixel-burn redaction
- Verify the output by checking for recoverable text
- Strip all document metadata
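The verification step in the workflow above can be partly automated. The sketch below assumes you have already extracted the text layer from the redacted output with whatever PDF or OCR tool you use (that extraction is outside this example), and that you kept a list of the terms you intended to redact. The function names `normalise` and `find_recoverable_terms` are hypothetical, chosen for illustration.

```python
import re


def normalise(text):
    """Lower-case and collapse whitespace so line breaks don't hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_recoverable_terms(extracted_text, redacted_terms):
    """Return the redacted terms that still appear in the output's text layer."""
    haystack = normalise(extracted_text)
    leaks = []
    for term in redacted_terms:
        needle = normalise(term)
        if needle and needle in haystack:
            leaks.append(term)
    return leaks


if __name__ == "__main__":
    # Text as it might come back from the redacted file's text layer.
    extracted = "Patient was referred to Dr\nMartinez at the [REDACTED] on [REDACTED]"
    terms = ["Dr Martinez", "Royal Free Hospital", "15 March 2024"]
    print(find_recoverable_terms(extracted, terms))
```

An empty result only shows that the listed terms are absent from the extracted text. It does not prove that nothing identifying remains, so it supplements manual review and does not replace it.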
Common gaps in healthcare redaction
Dates are the most frequently under-redacted element in medical records. HIPAA Safe Harbor requires removing all elements of dates (except year) that relate directly to an individual, including admission dates, discharge dates, date of birth, and date of death. Under GDPR, whether a specific date needs redacting depends on context - but appointment dates, referral dates, and treatment dates are often identifying in combination with other information.
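For text-based de-identification under Safe Harbor, one way to handle dates is to reduce each one to its year and to group ages over 89 into a single "90 or older" category. The sketch below illustrates that idea for three assumed formats only ("15 March 2024", "15/03/2024", and "aged 93"). The patterns and the `generalise_dates` function are hypothetical. The sketch does not handle abbreviated months or two-digit years, and it does not remove years that would themselves reveal an age over 89.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December"
)

# "15 March 2024" style dates; the year is captured so it can be kept.
TEXT_DATE = re.compile(rf"\b\d{{1,2}}\s+(?:{MONTHS})\s+(\d{{4}})\b")
# "15/03/2024" or "03-15-2024" style dates.
NUMERIC_DATE = re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-](\d{4})\b")
# "age 93" or "aged 93".
AGE = re.compile(r"\b(aged?)\s+(\d{2,3})\b", re.IGNORECASE)


def generalise_dates(text):
    """Reduce dates to the year and group ages over 89 as '90 or older'."""
    text = TEXT_DATE.sub(r"\1", text)
    text = NUMERIC_DATE.sub(r"\1", text)

    def cap_age(match):
        if int(match.group(2)) > 89:
            return f"{match.group(1)} 90 or older"
        return match.group(0)

    return AGE.sub(cap_age, text)


if __name__ == "__main__":
    print(generalise_dates("Admitted 15 March 2024, discharged 02/04/2024, aged 93."))
```

Replacing text this way suits exports and data extracts. For scanned or PDF records, the same patterns can instead be used to flag regions for pixel-burn redaction.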
Patient reference numbers and NHS numbers appear in headers, footers, and cross-references throughout records. They're easy to catch on page 1 and easy to miss on page 47. Automated detection that scans the entire document catches these consistently.
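NHS numbers are ten digits long, and the last digit is a Modulus 11 check digit, which lets a scan separate likely NHS numbers from other ten-digit strings on every page. The sketch below is a minimal illustration. The `pages` input (a list of page texts already extracted from the document) and the function names are assumptions for this example, and local patient reference numbers with other formats would need their own patterns.

```python
import re

# Ten digits, optionally grouped 3-3-4 with spaces or hyphens.
NHS_CANDIDATE = re.compile(r"(?<!\d)\d{3}[ -]?\d{3}[ -]?\d{4}(?!\d)")


def is_valid_nhs_number(candidate):
    """Check the Modulus 11 check digit of a ten-digit NHS number."""
    digits = re.sub(r"\D", "", candidate)
    if len(digits) != 10:
        return False
    # Weight the first nine digits 10 down to 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check digit of 10 means the number is invalid.
    return check != 10 and check == int(digits[9])


def find_nhs_numbers(pages):
    """Return (page_number, matched_text) for checksum-valid candidates."""
    hits = []
    for page_number, text in enumerate(pages, start=1):
        for match in NHS_CANDIDATE.finditer(text):
            if is_valid_nhs_number(match.group()):
                hits.append((page_number, match.group()))
    return hits


if __name__ == "__main__":
    pages = ["NHS No: 943 476 5919", "no identifiers here", "ref 9434765919 cont."]
    print(find_nhs_numbers(pages))
```

Reporting the page number with each hit makes it easier to confirm that a number caught in the header on page 1 was also caught in the footer on page 47.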
Third-party information is routinely overlooked. A discharge summary might mention the patient's next of kin by name, include the referring GP's personal mobile number, or reference another patient's case in a clinical note. Each of these is third-party personal data requiring separate consideration.
Frequently Asked Questions
Do I need to redact the clinician's name from medical records?
It depends on the context. Under HIPAA, healthcare provider names are not among the 18 Safe Harbor identifiers, so they generally don't require removal for de-identification of patient data. Under GDPR, a clinician's name is personal data, but there may be a lawful basis for including it - particularly if the records are being shared for treatment-related purposes. When disclosing records externally for non-treatment purposes, consider whether the clinician's identity is necessary for the recipient's purpose. Consult your organisation's data protection officer for specific guidance.
Can automated tools handle HIPAA Safe Harbor redaction?
Automated PII detection tools can identify most of the 18 Safe Harbor categories - names, dates, phone numbers, email addresses, Social Security numbers, and similar structured identifiers. They're less reliable with free-text geographic identifiers (descriptions of locations rather than formatted addresses) and with photographs. Tools like RedactProof that detect 40+ PII types cover the standard HIPAA categories, but a manual review remains necessary to satisfy the Safe Harbor requirement that the entity has "no actual knowledge" that remaining information could identify someone.
What's the penalty for improper medical record redaction?
Under HIPAA, penalties for impermissible disclosure of PHI range from $137 to $68,928 per violation, with an annual cap of $2,067,813 per identical provision (figures adjusted for inflation as of 2024). The HHS Office for Civil Rights has imposed multi-million dollar settlements in cases involving systematic failures. Under GDPR, data breaches involving special category data can attract fines up to 4% of global annual turnover or EUR 20 million, whichever is greater. The ICO has issued fines to NHS trusts and healthcare organisations for data breaches involving patient records.