
Diversity slate auditor with Claude

Difficulty: advanced
Setup time: 45 min
For: recruiter · sourcer · talent-acquisition · dei-leader
Category: Recruiting & TA
A Claude Skill that audits a candidate slate (the recruiter’s intended interview lineup, the full sourced pool, or the application pool) against the role’s relevant labor-market reference pool, surfaces composition gaps, and emits a structured audit record. It runs no statistical inference on individual candidates and recommends no candidates to add or drop. The output is decision support for the recruiter and the DEI lead, not an automated decision system.

When to use

  • You’re cutting a slate from a sourced pool to send to the hiring manager and want to know whether the slate’s composition reflects the role’s relevant labor-market pool before you send it.
  • You’re closing a quarter and need an aggregated audit across roles for the DEI program review.
  • You’re preparing a NYC Local Law 144 bias-audit submission and need an internal pre-check of slate composition before the formal independent audit.

When NOT to use

  • Identifying individual candidates’ protected-class membership. The skill processes aggregated, self-reported demographic data only. It refuses to infer demographics from name, photo, school, or any candidate-level signal.
  • Auto-rejecting candidates to “rebalance” a slate. Rejecting a candidate to hit a composition number is reverse discrimination and triggers the same legal exposure as the original imbalance. The skill surfaces the gap; the fix is upstream (sourcing channels, search query, JD language), not at the slate-cut step.
  • Processing composition data the candidates didn’t consent to share. Self-ID data has its own consent flow, captured as candidate authorization in the firm’s ATS (Ashby, Greenhouse, and Lever all expose this). The skill processes only the data the candidate agreed to share, in aggregate.
  • Single-role slates of fewer than 5 candidates. The smaller the slate, the weaker the audit signal. The skill warns at sizes below 5 and refuses to compute composition stats below 3.

Setup

  1. Drop the bundle. Place apps/web/public/artifacts/diversity-slate-auditor-skill/SKILL.md into your Claude Code skills directory.
  2. Configure the reference pool source. The skill needs a reference pool for comparison — usually BLS occupational employment statistics (free, public), augmented with industry-specific data where available. The reference-pool selector in references/1-reference-pools.md documents which BLS table maps to which role family.
  3. Wire the ATS export. Ashby and Greenhouse both expose self-ID exports via their APIs (Ashby /candidate.list with self-id columns; Greenhouse applications endpoint with EEOC fields). The skill reads the export; it does not call the ATS directly. This separation means the data minimization happens at export time and the skill never sees raw candidate records.
  4. Set the slate-size guardrails. Default: warn at sizes below 5, refuse at sizes below 3. Tune per role family if your team’s typical slate sizes differ.
  5. Dry-run on a closed slate. Audit the slate from a role you closed last quarter. Compare the skill’s gap analysis to your DEI lead’s read of the same slate. The skill surfaces composition deltas; whether those deltas matter is a judgment call the skill does not make.
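The ATS-export wiring in step 3 can be sketched as a single aggregation pass over the exported file. Everything below is an assumption for illustration (the function name, the CSV column names, the "undisclosed" fallback); the real Ashby/Greenhouse export fields differ. The point it demonstrates is data minimization: per-candidate rows are collapsed into counts before anything else touches them.

```python
# Hypothetical aggregation pass over an ATS self-ID export.
# Column names ("gender", "race_ethnicity") are assumptions, not
# real Ashby/Greenhouse field names. Per-candidate rows never leave
# this function; only counts do.
import csv
from collections import Counter

def aggregate_self_id(path, dimensions=("gender", "race_ethnicity")):
    counts = {dim: Counter() for dim in dimensions}
    total = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += 1
            for dim in dimensions:
                # Blank or missing self-ID is counted, not inferred.
                counts[dim][row.get(dim) or "undisclosed"] += 1
    return {"slate_size": total,
            "counts": {d: dict(c) for d, c in counts.items()}}
```

Anything downstream of this function sees only `slate_size` and the per-dimension counts, which is what makes the "the skill never sees raw candidate records" separation enforceable.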

What the skill actually does

Six steps. The skill is structured to keep the inference at the aggregate level — never at the candidate level — and to surface gaps without recommending interventions, because the right intervention varies by gap source and is not the slate-cut step.

  1. Load the slate (the candidates you intend to interview, or the sourced pool, or the application pool — depending on what the recruiter wants to audit). The skill expects an aggregate-level export: per-candidate self-ID is read but only used to compute aggregates; no per-candidate analysis is emitted.
  2. Load the reference pool for the role family. BLS occupational employment statistics are the default; the mapping from role family to BLS table lives in references/1-reference-pools.md. Industry-specific reference pools (e.g. Stack Overflow Developer Survey for software engineering) can be substituted by the recruiter.
  3. Compute composition deltas at the slate vs. reference-pool level. For each demographic dimension the slate has self-ID data on (gender, race/ethnicity per EEOC categories, veteran status, disability status — only the dimensions the firm collects), compute the slate’s percentage and the reference pool’s percentage. Compute the absolute delta.
  4. Surface gaps per dimension with a confidence band. A delta of 5pp on a slate of 50 means more than the same delta on a slate of 8. The confidence band reflects the slate size and the reference pool’s specificity.
  5. Surface upstream gap candidates. For each surfaced delta, list 3-5 likely upstream causes the recruiter can investigate: sourcing channel mix, search query language (the Boolean search builder’s fairness pre-flight catches some of these), JD language, hiring-manager language in the screen. Do NOT rank or recommend; list the candidate causes for the recruiter and DEI lead to investigate.
  6. Emit audit record. A signed JSONL line with slate composition, reference pool used, computed deltas, and the skill’s version. No PII. The audit record is what makes a NYC LL 144 submission or an internal DEI review defensible.
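Steps 3 and 4 above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function and threshold names are made up, and the confidence band here is a plain normal-approximation half-width on the slate proportion, standing in for whatever band the skill actually computes. It also shows the slate-size guardrails from Setup step 4.

```python
# Sketch of steps 3-4: per-category deltas vs. the reference pool,
# with a normal-approximation confidence half-width that widens as the
# slate shrinks, plus the warn-below-5 / refuse-below-3 guardrails.
# Names and the band formula are illustrative assumptions.
import math

WARN_BELOW, REFUSE_BELOW = 5, 3

def composition_deltas(slate_counts, slate_size, reference_pcts, z=1.96):
    if slate_size < REFUSE_BELOW:
        raise ValueError(f"slate of {slate_size}: too small to audit")
    gaps = []
    for category, ref_pct in reference_pcts.items():
        slate_pct = slate_counts.get(category, 0) / slate_size
        # Half-width of a 95% normal-approximation interval on the
        # slate proportion: a 5pp delta on n=50 is tighter than on n=8.
        half_width = z * math.sqrt(slate_pct * (1 - slate_pct) / slate_size)
        gaps.append({
            "category": category,
            "slate_pct": round(slate_pct, 3),
            "reference_pct": ref_pct,
            "delta_pp": round((slate_pct - ref_pct) * 100, 1),
            "confidence_band_pp": round(half_width * 100, 1),
        })
    return {"small_slate_warning": slate_size < WARN_BELOW, "gaps": gaps}
```

Whether a given `delta_pp` outside its band matters is still the judgment call the skill leaves to the recruiter and DEI lead.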

Cost reality

Per slate audit, on Claude Sonnet 4.6:

  • LLM tokens — 5-10k input (slate aggregates + reference-pool table + skill instructions) and 2-3k output (per-dimension gap analysis + upstream candidates). Roughly $0.05-0.10 per audit.
  • Reference-pool data — BLS data is free. Stack Overflow Developer Survey is free. Industry-specific datasets vary; the BLS-only path costs $0.
  • Recruiter / DEI-lead time — the win. Composition audits are usually skipped because they’re tedious; the skill makes the audit a default step rather than an extra one. Expect 5-10 minutes per slate to read the audit, plus 20-40 minutes per quarter to investigate the surfaced upstream gap candidates.
  • Setup time — 45 minutes once for the reference-pool mapping and ATS export wiring.

Success metric

Track three things, monthly, not per-slate:

  • Composition delta drift over time — does the slate-vs-reference-pool gap narrow on tracked roles? If it doesn’t, the upstream interventions aren’t working.
  • Sourcing-channel mix shift — when the audit surfaces a sourcing-channel gap candidate, does the channel mix actually shift in the next quarter? If sourcing keeps recommending the same channels, the surfaced upstream causes aren’t reaching the sourcing team.
  • NYC LL 144 / internal DEI audit gap — when the formal annual bias audit happens, do its findings match what the slate-by-slate audits surfaced through the year? If the formal audit surfaces gaps the slate audits missed, the reference-pool mapping or the dimensions tracked are incomplete.

vs alternatives

  • vs ATS-native diversity dashboards (Greenhouse Inclusion, Ashby’s diversity reporting). ATS-native dashboards show composition; they don’t compute reference-pool deltas or surface upstream candidates. Pick ATS-native if you only need reporting. Pick the skill if you need decision support per slate.
  • vs Crosschq Diversity / SeekOut DEI / Eightfold’s diversity layer. Those are deeper products with their own reference pools and analysis layers. Pick them if budget supports the platform play and you want a managed product. Pick the skill if you want the audit logic in your repo, a reference-pool mapping you control, and a portable audit record.
  • vs hand-computed composition stats. Hand-computed is fine for the once-a-year DEI review but slips at slate-cadence; nobody hand-computes per slate. The skill makes the audit cheap enough to run on every slate.
  • vs no audit at all. Doing nothing is the default, and it carries legal exposure under NYC LL 144, which requires an annual bias audit for AI tools used in NYC hiring. The skill is the cheapest defensible posture.

Watch-outs

  • Reverse discrimination from “rebalancing.” Guard: the skill never recommends adding or dropping individual candidates. Adjusting a slate by removing candidates to hit composition numbers is reverse discrimination and creates the same legal exposure as the original imbalance. The audit surfaces; the fix is upstream.
  • Inferring demographics from candidate signals. Guard: the skill processes only self-ID data the candidate consented to share. It refuses to infer race/ethnicity from name, gender from pronouns, age from graduation year, or any candidate-level inference. The reference pools used for comparison are aggregate statistics, not candidate-level features.
  • Small-slate noise. Guard: slate sizes below 5 produce a warning header on the audit; below 3 the skill refuses to compute composition stats.
  • Stale reference pools. Guard: the reference-pool mapping in references/1-reference-pools.md carries a last_verified date per source. Sources older than 18 months trigger a warning to refresh the mapping.
  • Audit-trail tampering. Guard: audit records are append-only JSONL with the skill version embedded; modifying any record breaks the file’s signing chain. Retain audit records at least as long as the firm’s hiring records (typically 2-7 years).
  • DEI-data exfiltration risk. Guard: the audit record contains aggregates and deltas, not per-candidate fields. The skill refuses to write per-candidate self-ID data into the audit record.
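The append-only audit trail above can be sketched as a hash chain, where each JSONL line embeds the hash of the previous line, so editing any earlier record invalidates everything after it. This is a simplified stand-in for the skill's signing chain, with assumed field names; the literal record format lives in references/2-audit-record-format.md.

```python
# Sketch of an append-only, hash-chained JSONL audit log. Field names
# are assumptions. Each record embeds the SHA-256 of the previous line,
# so modifying any record breaks verification of all later records.
# Records carry aggregates and deltas only; no PII.
import hashlib
import json

GENESIS = "0" * 64  # prev_hash for the first record

def append_record(log_path, record, skill_version="0.1.0"):
    prev_hash = GENESIS
    try:
        with open(log_path, "rb") as f:
            lines = f.read().splitlines()
        if lines:
            prev_hash = hashlib.sha256(lines[-1]).hexdigest()
    except FileNotFoundError:
        pass  # first record in a new log
    line = json.dumps({**record, "skill_version": skill_version,
                       "prev_hash": prev_hash}, sort_keys=True)
    with open(log_path, "a") as f:
        f.write(line + "\n")

def verify_chain(log_path):
    prev_hash = GENESIS
    with open(log_path, "rb") as f:
        for raw in f.read().splitlines():
            if json.loads(raw)["prev_hash"] != prev_hash:
                return False  # a record upstream was altered
            prev_hash = hashlib.sha256(raw).hexdigest()
    return True
```

A retention job can run `verify_chain` on each stored log; any in-place edit to a record makes the next record's `prev_hash` mismatch.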

Stack

The skill bundle lives at apps/web/public/artifacts/diversity-slate-auditor-skill/ and contains:

  • SKILL.md — the skill definition
  • references/1-reference-pools.md — the role-family-to-reference-pool mapping (BLS, Stack Overflow Developer Survey, etc.)
  • references/2-audit-record-format.md — the literal output format for the JSONL audit record

Tools the workflow assumes you use: Claude (the model), Ashby or Greenhouse (the ATS, for the self-ID export). For the parallel sourcing-channel audit, see the Boolean search builder — its fairness pre-flight catches some upstream gap causes.

Related concepts: diversity recruiting, AI screening bias, structured interviewing.
