Redaction is the process of removing or obscuring specific content from a document before producing it — protecting privileged material in partially-privileged documents, removing PII before regulatory production, redacting confidential business terms when producing contracts, or applying court-ordered redactions to sealed material. Done correctly, the redaction is permanent and the redacted content is unrecoverable. Done incorrectly, it’s a high-profile data exposure.
What gets redacted
Common redaction categories:
- Privileged attorney-client communications within otherwise-responsive documents
- PII (personally identifiable information): Social Security numbers, account numbers, dates of birth, medical record numbers, home addresses
- PHI (protected health information) under HIPAA
- Trade secrets and proprietary information in regulatory or court filings
- Third-party confidential information subject to NDAs or protective orders
- Identifying information of minors, victims, or witnesses under court order
The categorization matters because each category has its own production standard. PII redaction in a regulatory production must be exhaustive (a single missed SSN can amount to a breach); privilege redaction in litigation can be more targeted (over-redacting is its own problem).
The two ways redaction goes wrong
Cosmetic redaction (the worst)
The redaction looks black on screen, but the text underneath is still present and selectable. Copy-paste recovers the redacted content, PDF text extraction recovers it, and OCR can recover it from a printed copy when the covering mark is not fully opaque. Multiple high-profile court filings and DOJ productions have exposed redacted content this way.
The fix: every redaction must remove the text from the document, not just visually obscure it. Redaction tools that “burn in” the redaction (replacing text with literal black bars in the underlying document) are required.
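One way to check for cosmetic redaction is to run a text extractor over the produced file and confirm that none of the strings marked for redaction survive. The following is a minimal Python sketch, assuming the text has already been extracted by whatever tool is in use; `find_leaks` and its inputs are hypothetical names for illustration, not part of any redaction product.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and drop whitespace, hyphens, and dots so that
    reflowed or reformatted text still matches."""
    return re.sub(r"[\s\-.]+", "", text).lower()


def find_leaks(extracted_text: str, redacted_terms: list[str]) -> list[str]:
    """Return every redacted term that still appears in the extracted text.

    `extracted_text` is assumed to come from running a text extractor
    over the *produced* (post-redaction) file.
    """
    haystack = _normalize(extracted_text)
    leaks = []
    for term in redacted_terms:
        needle = _normalize(term)
        # Skip empty needles: "" is a substring of everything.
        if needle and needle in haystack:
            leaks.append(term)
    return leaks
```

An empty result only shows that this particular extractor found nothing. The aggressive normalization errs toward false positives, which is usually the safer direction for this kind of check.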
Metadata leakage (the second worst)
The visible content is redacted but the metadata isn’t. File names, document properties, comments, tracked changes, and revision history can all carry content that wasn’t redacted. Producing a Word document with a redacted body but a full revision history is a content leak.
The fix: scrub metadata as part of the redaction workflow. Most redaction tools have a metadata-removal step; verify it’s enabled.
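As an illustration of what a metadata check looks for, a .docx file is a ZIP archive of XML parts, so a few common leaks can be spotted with the standard library alone. This is a rough sketch, not a substitute for a real scrubbing tool: `docx_metadata_findings` is a hypothetical helper, it assumes the part names Word conventionally uses, and it covers only comments, author properties, and tracked changes.

```python
import re
import zipfile

# Tracked insertions/deletions appear as <w:ins ...> / <w:del ...> elements.
# The trailing character class avoids matching <w:instrText> or <w:delText>.
_TRACKED_CHANGE = re.compile(rb"<w:(?:ins|del)[\s>/]")


def docx_metadata_findings(path_or_file) -> list[str]:
    """List metadata risks found in a .docx file (path or file-like object)."""
    findings = []
    with zipfile.ZipFile(path_or_file) as docx:
        names = set(docx.namelist())

        # Reviewer comments live in their own part.
        if "word/comments.xml" in names:
            findings.append("comments part present")

        # Core properties hold author and last-modified-by names.
        if "docProps/core.xml" in names:
            core = docx.read("docProps/core.xml")
            if re.search(rb"<dc:creator>[^<]+</dc:creator>", core):
                findings.append("author name in document properties")
            if re.search(rb"<cp:lastModifiedBy>[^<]+</cp:lastModifiedBy>", core):
                findings.append("last-modified-by name in document properties")

        # Unaccepted tracked changes stay in the main document body.
        if "word/document.xml" in names:
            if _TRACKED_CHANGE.search(docx.read("word/document.xml")):
                findings.append("tracked changes in body")
    return findings
```

A non-empty list means the file should go back through the metadata-removal step before production. An empty list does not prove the file is clean, since embedded objects and custom properties are not inspected here.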
How AI changes redaction
Pre-AI, redaction was line-by-line attorney work. Now:
- Auto-detection of categories. AI flags every SSN, account number, DOB, and email address in a document set in seconds. Manual review focuses on judgment calls (is this name a third-party confidential reference, or is it the named party?).
- Bulk PII redaction. Productions involving thousands of documents with PII concerns (employment records, healthcare records, regulatory inquiries) become tractable when AI handles the mechanical detection.
- Privilege-redaction assistance. When a document is partially privileged, AI suggests the specific spans to redact based on the privilege determination, rather than the attorney marking each by hand.
Relativity, Everlaw, and DISCO all ship AI-assisted redaction; dedicated redaction tools (Adobe Acrobat Pro, Foxit, the native tools in eDiscovery platforms) apply the redaction itself once the spans are identified.
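The mechanical part of auto-detection can be pictured with plain pattern matching. The sketch below is a simplified stand-in, not how any of the platforms above implement detection: the `PATTERNS` table, the `Candidate` type, and `flag_candidates` are all hypothetical, and the patterns cover only US-style SSNs, simple email addresses, and MM/DD/YYYY dates.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; real productions need broader coverage.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "date": re.compile(
        r"\b(?:0?[1-9]|1[0-2])/(?:0?[1-9]|[12]\d|3[01])/(?:19|20)\d{2}\b"
    ),
}


@dataclass(frozen=True)
class Candidate:
    category: str
    start: int  # character offset where the match begins
    end: int    # character offset just past the match
    text: str


def flag_candidates(text: str) -> list[Candidate]:
    """Return candidate redaction spans in document order."""
    hits = [
        Candidate(category, m.start(), m.end(), m.group())
        for category, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(hits, key=lambda c: c.start)
```

The output is a list of candidates, not decisions. Whether a flagged date is actually a date of birth, for example, is still a call for the reviewer.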
How to operationalize
- Define the redaction policy per matter. Which categories, which standard (err toward over-redaction, err toward under-redaction, or balanced), and which verification process. Document the policy before review starts.
- Use real redaction tools, not visual cover-up. Adobe Acrobat Pro, Foxit, native eDiscovery platform redaction. Never “draw a black box in PowerPoint” or equivalent.
- Verify the redaction. After redaction, attempt to extract text from the document and search it. Confirm that none of the redacted content can be extracted or found. Sample 10% of redacted documents per production.
- Scrub metadata. As a separate step in the workflow, verify metadata is clean — file properties, revision history, comments, tracked changes, embedded objects.
- AI flags, attorney decides. AI surfaces candidate redactions with confidence scores; attorney reviews borderline calls and approves the final marking. Always.
- Audit log every redaction. Who approved what redaction in what document, on what basis. Critical for defending the production against post-hoc challenges.
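The sampling and audit-log steps above can be sketched as follows. This is a minimal illustration under stated assumptions: `sample_for_verification` and `AuditEntry` are hypothetical names, the 10% rate comes from the guidance above, and the audit fields are one plausible reading of "who approved what redaction in what document, on what basis".

```python
import math
import random
from dataclasses import dataclass, field
from datetime import datetime, timezone


def sample_for_verification(
    doc_ids: list[str], rate: float = 0.10, seed: int = 0
) -> list[str]:
    """Pick a reproducible random sample of redacted documents to verify.

    Rounds up so that small productions still get at least one check.
    """
    if not doc_ids:
        return []
    # round() guards against float noise such as 30 * 0.1 == 3.0000000000000004
    size = math.ceil(round(len(doc_ids) * rate, 9))
    size = min(len(doc_ids), max(1, size))
    return sorted(random.Random(seed).sample(doc_ids, size))


@dataclass(frozen=True)
class AuditEntry:
    doc_id: str
    span: tuple[int, int]   # character offsets of the redacted span
    basis: str              # e.g. "PII: SSN" or "attorney-client privilege"
    approved_by: str
    approved_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
```

Fixing the seed per production makes the sample reproducible, which helps if the verification itself is later questioned.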
Common pitfalls
- Cosmetic redaction reaching production. The most expensive error in the entire workflow.
- Metadata leakage. Particularly common when producing native files (Word, Excel) instead of converted images.
- Over-redaction. Redacting more than required draws challenges and motion practice; opposing counsel argues you’re hiding more than the privilege calls warrant.
- Inconsistent redaction across the production. The same email redacted in one custodian’s set but not another’s discloses the content anyway and looks like an inadvertent production.
- Forgetting third-party confidentiality. Documents containing confidential information about non-parties may need redaction even if no privilege or PII concern applies.
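The inconsistency pitfall in particular lends itself to an automated check: group copies of the same document and flag any group whose copies were redacted differently. The sketch below assumes each record carries a `doc_id`, the pre-redaction `original_text`, and a list of `redacted_spans`; those field names and the `inconsistent_duplicates` helper are hypothetical. Exact hashing only catches identical copies, so near-duplicates would need the platform's own deduplication or threading.

```python
import hashlib
from collections import defaultdict


def inconsistent_duplicates(records: list[dict]) -> dict[str, list[str]]:
    """Map content hash -> doc IDs for duplicate groups redacted inconsistently."""
    groups = defaultdict(list)
    for rec in records:
        digest = hashlib.sha256(rec["original_text"].encode("utf-8")).hexdigest()
        groups[digest].append(rec)

    flagged = {}
    for digest, recs in groups.items():
        # Each copy's redactions as an order-independent set of (start, end) pairs.
        treatments = {frozenset(map(tuple, r["redacted_spans"])) for r in recs}
        if len(treatments) > 1:
            flagged[digest] = sorted(r["doc_id"] for r in recs)
    return flagged
```

Any flagged group should be resolved before production, typically by applying the same approved redactions to every copy.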
Related
- Privilege review — the upstream process that identifies what to redact
- eDiscovery — the broader workflow redaction sits inside
- Privilege log format — how to log documents that are withheld in full rather than produced with redactions
- Relativity — most-deployed platform with native redaction tooling