ooligo
claude-skill

Clause extraction for any contract with Claude

Difficulty: beginner
Setup time: 20 min
For: legal-ops · in-house-counsel · paralegal · contract-manager
Legal Ops

A Claude Skill that takes any contract in .docx or .pdf format and produces a structured JSON file of its key clauses: governing law, liability cap, indemnification, term, auto-renewal, termination triggers, payment terms, IP ownership, confidentiality term, and any custom fields you define. The output is useful for populating CLM metadata, building clause libraries from inherited contracts, and analyzing a contract portfolio.

What you’ll need

  • Claude Code or Claude.ai with custom Skills enabled
  • The contract file as .docx or .pdf
  • Optional: a CSV or JSON list of custom clauses to extract beyond the defaults

Setup

  1. Drop the Skill. Place clause-extraction.skill in your Claude Code skills directory (~/.claude/skills/) or upload it to a Claude.ai project. The Skill exposes one callable function: extract_clauses.
  2. Optionally configure custom clauses. Edit clauses_to_extract.json in the Skill directory to add fields beyond the defaults (e.g., data_residency_clause, change_of_control_clause, most_favored_customer_clause).
  3. Test on a known contract. Run the Skill on a contract whose clause values you already know, and verify that the extracted JSON matches your manual extraction.
  4. Run at scale. Point the Skill at a folder of contracts; it processes them in batch and writes one JSON file per contract.
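
The check in step 3 can be scripted once you have both the Skill's output and your own values. The following is a minimal sketch, not part of the Skill: the helper name compare_clauses and the expected values are hypothetical, and it assumes the output follows the shape shown under "Output format".

```python
import json


def compare_clauses(extracted: dict, expected: dict) -> dict:
    """Return {clause_name: (expected_value, extracted_value)} for every mismatch."""
    mismatches = {}
    clauses = extracted.get("clauses", {})
    for name, want in expected.items():
        # A clause missing from the output compares as None.
        got = clauses.get(name, {}).get("value")
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


# Hypothetical Skill output, trimmed to two clauses.
extracted = json.loads("""
{
  "contract_file": "vendor_msa_2026.docx",
  "clauses": {
    "governing_law": {"value": "Delaware", "source_text": "..."},
    "term_length_months": {"value": 36, "source_text": "..."}
  }
}
""")

# Values a reviewer pulled from the same contract by hand (illustrative only).
expected = {"governing_law": "Delaware", "term_length_months": 24, "auto_renewal": True}

for name, (want, got) in compare_clauses(extracted, expected).items():
    print(f"{name}: expected {want!r}, extracted {got!r}")
```

Any line printed is a clause to re-read in the contract before trusting a batch run.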

How it works

The Skill takes a contract file and:

  1. Parses the document. Extracts text from .docx or .pdf, normalizes formatting, identifies section boundaries.
  2. Locates each target clause. Uses the contract structure (heading conventions, common clause language) to find each clause section.
  3. Extracts the value. For numeric clauses (liability cap, payment terms, term length), extracts the dollar, percentage, days, or months value; a cap stated relative to fees may come back as a phrase, as in the “12 months fees” value under Output format. For categorical clauses (governing law), extracts the named jurisdiction. For boolean clauses (auto-renewal), extracts true or false plus the relevant clause text.
  4. Outputs structured JSON. One key per clause, with both the extracted value and the source clause text for verification.
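
To make the three value types in step 3 concrete, here is a rough illustration using regular expressions. It is an assumption-laden sketch: the Skill's internals are not documented here, and every function name and pattern below is hypothetical. It only shows what "numeric", "categorical", and "boolean" extraction mean for a located clause.

```python
import re


def extract_days(text: str):
    """Numeric: first 'N days' value as an int, tolerating 'ninety (90) days'."""
    match = re.search(r"\(?(\d+)\)?\s*days", text, re.IGNORECASE)
    return int(match.group(1)) if match else None


def extract_governing_law(text: str):
    """Categorical: the capitalized jurisdiction name following 'laws of'."""
    match = re.search(
        r"laws of (?:the )?(?:State of |Commonwealth of )?"
        r"([A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*)",
        text,
    )
    return match.group(1) if match else None


def is_auto_renewal(text: str) -> bool:
    """Boolean: does the clause contain auto-renewal language?"""
    return bool(re.search(r"automatically renew|auto-?renew", text, re.IGNORECASE))


def clause_record(value, source_text: str) -> dict:
    """Pair the value with its source text, as in step 4."""
    return {"value": value, "source_text": source_text}


law_text = "This Agreement shall be governed by the laws of the State of Delaware."
renewal_text = (
    "This Agreement shall automatically renew unless either party "
    "gives ninety (90) days written notice."
)

print(clause_record(extract_governing_law(law_text), law_text))
print(clause_record(is_auto_renewal(renewal_text), renewal_text))
print(extract_days(renewal_text))
```

Fixed patterns like these miss unusual drafting, which is why the source text is kept next to every value for verification.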

Output format

{
  "contract_file": "vendor_msa_2026.docx",
  "extracted_at": "2026-05-03T14:22:00Z",
  "clauses": {
    "governing_law": {
      "value": "Delaware",
      "source_text": "This Agreement shall be governed by..."
    },
    "liability_cap": {
      "value": "12 months fees",
      "source_text": "In no event shall either party's liability..."
    },
    "term_length_months": {
      "value": 36,
      "source_text": "The initial term of this Agreement..."
    },
    "auto_renewal": {
      "value": true,
      "renewal_term_months": 12,
      "notice_period_days": 90,
      "source_text": "This Agreement shall automatically renew..."
    }
  }
}
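
For CLM import, the nested output usually has to be flattened into one row per contract. A minimal sketch of that step follows; the flatten helper and the dotted column names (such as auto_renewal.notice_period_days) are assumptions for illustration, not a format any particular CLM requires.

```python
import csv
import io
import json

# Trimmed copy of the sample output above.
SAMPLE = json.loads("""
{
  "contract_file": "vendor_msa_2026.docx",
  "clauses": {
    "governing_law": {"value": "Delaware", "source_text": "..."},
    "term_length_months": {"value": 36, "source_text": "..."},
    "auto_renewal": {
      "value": true,
      "renewal_term_months": 12,
      "notice_period_days": 90,
      "source_text": "..."
    }
  }
}
""")


def flatten(result: dict) -> dict:
    """Turn one extraction result into a flat row, dropping source_text."""
    row = {"contract_file": result["contract_file"]}
    for name, clause in result["clauses"].items():
        for key, val in clause.items():
            if key == "source_text":
                continue
            # "value" becomes the clause column; other keys get a dotted suffix.
            column = name if key == "value" else f"{name}.{key}"
            row[column] = val
    return row


row = flatten(SAMPLE)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

Dropping source_text keeps the row compact; keep the original JSON files alongside the CSV so reviewers can still trace each value back to its clause.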

Where it fits

Three primary use cases:

  • CLM data backfill. Pull historical contract clauses into Ironclad or another CLM when migrating from a flat-file repository.
  • Clause library construction. Extract every “liability cap” clause across your contract portfolio to compare your team’s actual negotiating range with the playbook’s stated positions.
  • Acquisition due diligence. Run on the target’s contract repository to surface change-of-control, assignment, and most-favored-customer clauses at scale before a deal closes.
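
As an illustration of the clause-library use case, the per-contract JSON files can be tallied to show how often each value of a clause appears across a portfolio. This is a sketch under assumptions: the helper names, the folder name in the usage note, and the sample values are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_results(folder: str) -> list:
    """Read every *.json file in a folder of Skill outputs."""
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(folder).glob("*.json"))
    ]


def clause_distribution(results: list, clause_name: str) -> Counter:
    """Count how often each extracted value of one clause appears."""
    counts = Counter()
    for result in results:
        clause = result.get("clauses", {}).get(clause_name)
        counts[clause["value"] if clause else "(not found)"] += 1
    return counts


# Inline stand-ins for four output files (values are illustrative only).
results = [
    {"contract_file": "a.docx", "clauses": {"liability_cap": {"value": "12 months fees"}}},
    {"contract_file": "b.docx", "clauses": {"liability_cap": {"value": "12 months fees"}}},
    {"contract_file": "c.pdf", "clauses": {"liability_cap": {"value": "24 months fees"}}},
    {"contract_file": "d.pdf", "clauses": {}},
]

for value, count in clause_distribution(results, "liability_cap").most_common():
    print(f"{count:>3}  {value}")
```

With real data, replace the inline list with load_results("extracted/") or whichever folder holds the batch output.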

Watch-outs

  • Heading-light contracts. Contracts without clear section headings (some older or short-form contracts) extract less reliably. Verify outputs on a sample.
  • Defined-term and cross-reference resolution. When a clause points to a defined term or to another part of the contract (“as set forth in Schedule A”), the Skill returns the reference, not the resolved value. Cross-referenced clauses still require manual review.
  • Not legal advice. Extraction is mechanical pattern-matching; it doesn’t interpret the legal effect of unusual clause language. Pair with attorney review for material decisions.
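
One way to route the cross-reference cases above to a reviewer is to scan each extracted clause for reference language. The sketch below is a heuristic under assumptions, not a feature of the Skill: the pattern and the clauses_needing_review helper are hypothetical, and the pattern will miss references phrased differently.

```python
import re

# Capitalized "Schedule A"-style references, or "as defined in" / "as set forth in".
CROSS_REFERENCE = re.compile(
    r"\b(?:Schedule|Exhibit|Appendix|Annex)\s+[A-Z0-9]+\b"
    r"|(?i:\bas (?:defined|set forth) in\b)"
)


def clauses_needing_review(result: dict) -> list:
    """Return names of clauses whose value or source text contains a cross-reference."""
    flagged = []
    for name, clause in result.get("clauses", {}).items():
        text = f'{clause.get("value", "")} {clause.get("source_text", "")}'
        if CROSS_REFERENCE.search(text):
            flagged.append(name)
    return flagged


# Illustrative extraction result, not real Skill output.
result = {
    "contract_file": "vendor_msa_2026.docx",
    "clauses": {
        "governing_law": {
            "value": "Delaware",
            "source_text": "This Agreement shall be governed by the laws of the State of Delaware.",
        },
        "liability_cap": {
            "value": "as set forth in Schedule A",
            "source_text": "Liability shall not exceed the amount set forth in Schedule A.",
        },
    },
}

print(clauses_needing_review(result))
```

A flagged clause is only a prompt to open the contract; an unflagged one is not thereby verified.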