claude-skill

Audit an ABM list against an ICP rubric with Claude

Difficulty

intermediate

Setup time

30-60 min

For

RevOps

A Claude Skill that takes an ABM target list and an ICP rubric and returns a per-account defect report. Every account that fails gets one or more defect codes from a defined taxonomy (wrong-size, wrong-industry, wrong-geo, wrong-funding, tech-mismatch, stale-data, low-intent, missing-field) and a quality tier (Q1 through Q4); the report also includes a list-level quality score and a ranked remediation queue. The bundle ships at apps/web/public/artifacts/abm-list-quality-audit-skill/ and contains SKILL.md plus three reference templates the user adapts before first run.

It answers the question that most ABM campaigns skip before launch: “Of the 300 accounts in this list, how many actually meet our ICP, and what exactly is wrong with the ones that don’t?” Without that answer, ABM platform spend — 6sense, Demandbase, LinkedIn matched audiences — goes toward accounts you would never convert, and the campaign’s disappointing results get attributed to message or channel rather than list quality.

When to use

Use this skill before loading any ABM list into a paid-media platform, before assigning named accounts to AEs, and before any campaign launch where the list was assembled more than 90 days ago. ABM lists degrade faster than most RevOps teams realize: headcount data goes stale, funding stages change, companies get acquired, and the ICP rubric itself sometimes shifts without the list being re-evaluated.

The skill is also the right tool for quarterly list hygiene. Run it over your entire ABM universe — not just campaign lists — to find accounts that were added when your ICP looked different and have not been re-evaluated since. The defect-frequency table tells you which enrichment gaps are most common across your universe, which is actionable for whoever owns the Clay enrichment workflow.

Invoke from:

- A Clay table where each row is an account, triggered manually before a campaign launch or on a quarterly cron. The skill writes quality_tier and defect_codes back to two Clay columns; downstream automation can filter on these to suppress Q3/Q4 accounts from campaign uploads.
- A CSV pre-flight check before import into 6sense or any ABM advertising platform. Running the audit removes accounts you would otherwise pay to target: at typical ABM CPM rates ($20-40 per 1,000 impressions), removing 50 out-of-ICP accounts from a 500-account list trims about 10% of impression spend, assuming spend is spread evenly across accounts.
- A Salesforce report-based trigger over named accounts in a segment, writing ABM_Quality_Tier__c and ABM_Defect_Codes__c back to the account record.
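
The CSV pre-flight path can be sketched in a few lines of Python. This is a minimal illustration, assuming the audit has already written a `quality_tier` column into the export; the function name `suppress_low_tiers`, the column name as a CSV header, and the set of suppressed tiers are assumptions for the example, not part of the skill bundle.

```python
import csv
import io

# Tiers held back from a paid-media upload (an assumption; adjust to your policy).
SUPPRESSED_TIERS = {"Q3", "Q4", "disqualified"}

def suppress_low_tiers(csv_text: str, tier_column: str = "quality_tier") -> tuple[str, int]:
    """Return (filtered CSV text, number of rows removed)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    kept = [row for row in rows if row[tier_column] not in SUPPRESSED_TIERS]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return out.getvalue(), len(rows) - len(kept)
```

The removed-row count is returned alongside the filtered text so the pre-flight step can report how much of the list was held back before upload.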

When NOT to use

Skip this skill when:

- You want to score inbound MQLs. The audit is designed for outbound named-account lists. For inbound lead triage, the lead-scoring-icp-rubric skill is the right tool: it handles the single-lead flow and the borderline escalation logic that matters for inbound.
- Your ICP rubric does not exist yet. The skill audits against a rubric you provide. If you have not had the ICP argument (which industries, headcount bands, and geographies you actually win in), that conversation must happen first. Running an audit against a placeholder rubric produces a false sense of rigor.
- The list needs deduplication, not auditing. If the goal is to remove current customers, competitors, churned accounts, or GDPR-suppressed contacts, that is a filter operation, not an ICP audit. Run those exclusions before the audit, or the skill will spend tokens scoring companies you already know you want to exclude.
- You need to generate the list, not audit it. The skill takes an existing list as input. It does not run TAM discovery or generate net-new accounts. Use a dedicated list-building workflow (Clay plus ICP criteria) to produce the raw list first.
- The list has fewer than 20 accounts. Below that size, an experienced RevOps analyst or AE can manually review every account in under an hour. The setup cost of the skill (rubric configuration, defect taxonomy customization) is not worth it.

Setup

Setup takes 30-60 minutes assuming the ICP rubric exists. The rubric argument — aligning RevOps, GTM leadership, and an AE or two on what an A-tier industry and headcount band actually means — takes longer and happens before setup.

1. Install the Skill. Copy apps/web/public/artifacts/abm-list-quality-audit-skill/SKILL.md and the references/ folder into your .claude/skills/abm-audit/ directory, or upload as a Skill in claude.ai. The frontmatter name and description are what trigger the skill on relevant prompts.
2. Configure the ICP rubric. Open references/1-icp-rubric-template.md. If your team already uses the lead-scoring-icp-rubric skill, you can reference the same rubric file; the structure is identical. Replace placeholder rows with actual criteria, weights (1-5), and tier values (A / B / C). Fill the hard disqualifiers section. Update “Last edited”; the SHA-256 the skill records in every report footer lets stakeholders tell when the rubric moved.
3. Configure the defect taxonomy. Open references/2-defect-taxonomy.md. The defect codes themselves are fixed; do not rename them, because downstream parsers key on the code strings. Edit the “Remediation action” column to match your team’s actual process: which Clay column provides headcount re-enrichment, who owns the ZoomInfo subscription, which segment owns the enterprise overflow accounts.
4. Prepare intent scores (optional but high-value). If you use 6sense or Bombora, export a domain → intent_score map for your account universe and pass it as the intent_scores input. This adds low-intent and intent-spike annotations on top of the rubric scores. The intent-spike flag is particularly valuable for Q2 accounts that are in-ICP but borderline, because it surfaces them for prioritization even before re-enrichment.
5. Set the enrichment staleness threshold. Update enrichment_staleness_days to match how aggressively your enrichment layer recycles. Clay + ZoomInfo typically refreshes on a 90-day schedule; if you run monthly enrichment, you may set 45 days. This drives the stale-data defect code.
6. Test on a known list. Run the skill over 20-30 accounts you know well (a mix of current customers, churned accounts, and prospects of varying quality). Verify that the quality tiers match your team’s intuition. If Q1 accounts are showing defect codes, the rubric is miscalibrated. If obvious out-of-ICP accounts are scoring Q2, the hard disqualifiers or weights need tightening.

What the skill actually does

The skill runs four steps in a fixed order.

Step 1 — hard disqualifier sweep. Before any LLM call, each account is checked against the rubric’s hard disqualifiers: sanctioned country, disqualified industry, headcount below the absolute floor, accounts on the explicit exclusion list (competitors, current customers). Hits receive defect code hd:{reason} and a quality tier of disqualified. This step is deterministic and runs on every account in milliseconds. Why run this first: on a 500-account list, it is common for 5-15% of accounts to be immediate disqualifications — running LLM scoring on those accounts wastes tokens and adds latency without adding information.
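
As a rough illustration of Step 1, the sweep can be a plain function with no model call. The sketch below assumes accounts are dictionaries with the field names used by the skill's inputs; the placeholder rubric values and every reason string other than the `hd:` prefix are assumptions for the example.

```python
from typing import Optional

# Placeholder rubric values (assumptions); load these from your own rubric file.
BLOCKED_COUNTRIES = {"XX"}
DISQUALIFIED_INDUSTRIES = {"gambling"}
HEADCOUNT_FLOOR = 25
EXCLUDED_DOMAINS = {"competitor.example"}

def hard_disqualifier(account: dict) -> Optional[str]:
    """Return an hd:{reason} code for the first rule that matches, else None."""
    if account.get("country") in BLOCKED_COUNTRIES:
        return "hd:sanctioned_country"
    if account.get("industry") in DISQUALIFIED_INDUSTRIES:
        return "hd:disqualified_industry"
    headcount = account.get("headcount")
    if headcount is not None and headcount < HEADCOUNT_FLOOR:
        return "hd:headcount_below_floor"
    if account.get("company_domain") in EXCLUDED_DOMAINS:
        return "hd:excluded_domain"
    return None

def sweep(accounts: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split accounts into (disqualified, cleared) without any model call."""
    disqualified, cleared = [], []
    for account in accounts:
        code = hard_disqualifier(account)
        if code:
            disqualified.append(
                {**account, "quality_tier": "disqualified", "defect_codes": [code]}
            )
        else:
            cleared.append(account)
    return disqualified, cleared
```

Only the `cleared` list would go on to rubric scoring; a missing headcount is deliberately not treated as a disqualifier here, since that case belongs to the missing-field check later.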

Step 2 — per-account ICP rubric scoring. Accounts that cleared the hard disqualifier sweep are scored against each criterion in the rubric. For each criterion, the model emits a tier (A / B / C), a weight (from the rubric), and a one-sentence rationale citing the rubric row. The weighted sum maps to a quality tier: Q1 (score ≥ 8.0), Q2 (6.0-7.99), Q3 (4.0-5.99), Q4 (< 4.0). Failing criteria generate the corresponding defect codes — a headcount score of C on an account below the B-tier floor generates wrong-size:too-small.

Why per-criterion rather than a single blended score: the defect codes that drive the remediation queue require knowing which specific criterion failed, not just that the overall score was low. A Q3 account with missing-field:tech_stack is a different remediation task from a Q3 account with wrong-industry — the first needs enrichment, the second needs removal.

Step 3 — supplemental defect detection. After rubric scoring, the skill checks for defects not covered by the rubric: stale-data (enrichment older than threshold), missing-field:{field} (criteria that could not be scored), low-intent and intent-spike from the provided intent scores. The intent-spike flag can appear even on Q2 accounts — it surfaces accounts where in-market behavior should override the borderline rubric score and trigger direct AE outreach anyway.

Step 4 — list-level aggregation. After per-account scoring, the skill computes the list quality score (Q1% + Q2% - Q3% - 2×Q4%, scaled to 100), the defect frequency table, and the remediation queue. The remediation queue is sorted by estimated re-audit lift: accounts most likely to become Q1 after re-enrichment appear first. A list quality score below 30 is the skill’s go/no-go signal — the recommendation section will say “Do not launch until Q3/Q4 accounts are remediated or removed.”

Cost reality

Per-account token cost depends on rubric size and how much account data is provided. For a typical 6-criterion rubric with structured per-criterion output and one account record at 300-500 tokens of data, expect roughly 1,200-2,000 input tokens and 300-500 output tokens per account. At Claude Sonnet 4.x pricing (approximately $3 per million input and $15 per million output as of early 2026), that is $0.008-0.015 per account.

A 500-account pre-campaign audit costs $4-8 in Claude tokens. A quarterly hygiene pass over a 2,000-account ABM universe costs $16-30. These are smaller than the cost of one misrouted AE sequence. The non-token cost is larger: configuring the rubric and defect taxonomy correctly is a 60-90 minute session; plan for it.
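
The arithmetic behind these figures can be checked with a short calculation. This is a sketch using the approximate prices quoted above; the function name is illustrative, and the estimate ignores prompt caching and any retries.

```python
# Approximate prices quoted in the text (USD per million tokens).
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0

def audit_cost_usd(accounts: int, input_tokens: int, output_tokens: int) -> float:
    """Estimated token cost for auditing `accounts` accounts at the given per-account token counts."""
    per_account = (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000
    return accounts * per_account

# Low and high ends of the per-account token ranges given above.
low_500 = audit_cost_usd(500, input_tokens=1_200, output_tokens=300)
high_500 = audit_cost_usd(500, input_tokens=2_000, output_tokens=500)
```

With these inputs the 500-account estimate works out to roughly $4.05 at the low end and $6.75 at the high end, which sits inside the $4-8 range quoted above.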

The token cost per account is lower than the lead-scoring skill because ABM accounts typically have richer structured data (fewer missing fields) and the defect codes are more compact than a full per-criterion rationale. If your accounts have many missing fields, more of the processing falls to the supplemental defect step, which is deterministic and free.

Prompt caching of the rubric and defect taxonomy files pays off meaningfully at scale — on a 500-account audit the rubric is loaded once and cached across the full batch. On a 5-account spot-check it does not matter.

Success metric

The primary metric for the audit is list quality score trend: run the audit on the same ABM universe every quarter and track whether the list quality score rises. A rising score means your enrichment cadence is working, your rubric is stable, and your list-building process has tightened. A falling score — or a score that stays flat despite remediation effort — means either the rubric has shifted or the enrichment source is unreliable.

Secondary metric: ABM campaign conversion rate by quality tier. After 90 days of running campaigns against audited lists, compare the conversion-to-opportunity rate for Q1 accounts vs Q2 accounts vs accounts that were remediated from Q3 before being included. Q1 should convert at a higher rate than Q2, and accounts remediated from Q3 should convert at a higher rate than unaudited Q3 accounts. If there is no conversion difference between tiers, the rubric is not predictive and needs to be re-argued.

Failure modes

Defect codes that indict the rubric, not the list. If 35% of your list receives wrong-size:too-small, the problem is often the headcount floor in the rubric, not the list. The rubric may have been set when your motion was pure enterprise and has not been updated since you opened an SMB segment. Acting on those defect codes by removing 35% of the list is the wrong move; re-examining the rubric is the right one. Guard: after every audit, check whether any single defect code applies to more than 25% of accounts. If so, review the rubric criterion that generates that code before remediating the list. The audit output’s defect frequency table makes this check easy — the most-common code is always row one of the table.
Stale enrichment producing false negatives on good accounts. An account with a last_enrichment_date of 14 months ago may have tripled headcount, raised a Series B, and added Salesforce to their tech stack since that data was collected. The skill’s Q4 verdict on that account is not a verdict on the company — it is a verdict on your enrichment cadence. Removing or de-prioritizing those accounts before re-enriching them loses real pipeline. Guard: the skill adds stale-data to any account where enrichment exceeds the staleness threshold and notes “scored on potentially stale data” in the rationale. The remediation queue places stale-data + high rubric-score-potential accounts at the top. The standing rule: never remove an account from the list solely because of stale-data; always re-enrich first.
Intent score inflation from single-user behavior. A company in a 6sense “high-intent” segment may be there because one junior analyst at the company read three blog posts. Surfacing that company as intent-spike and routing it to direct AE outreach based on that signal is a false positive that burns AE time. Guard: when intent_scores are provided, the skill displays the raw intent score and the source alongside the intent-spike flag. The standing guidance in the skill output: before acting on any intent-spike signal, verify with 6sense or your ABM platform that the intent activity originates from buying-committee personas — director level and above in relevant functional areas — rather than from a single low-authority user.
Rubric drift invalidating historical audit comparisons. If the rubric changes between the Q2 audit and the Q3 audit, the list quality scores are not comparable — a rising score might just reflect a looser rubric, not actual list improvement. Guard: the skill records the rubric’s SHA-256 in every audit footer. When comparing quarter-over-quarter list quality scores, confirm the rubric SHA-256 is identical. If the rubric changed, re-run the prior quarter’s list against the new rubric before making comparisons. The “Last edited” date in the rubric file and the quarterly calendar reminder to review the rubric work together to make drift visible before it distorts the trend.
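
The 25% guard from the first failure mode is simple to automate. The following is a minimal sketch, assuming the per-account defect codes are available as a mapping from domain to a list of code strings; the function name and input shape are illustrative, not part of the skill output contract.

```python
from collections import Counter

def overrepresented_codes(
    defects_by_account: dict[str, list[str]], threshold: float = 0.25
) -> list[tuple[str, float]]:
    """Return (code, share of accounts) for codes on more than `threshold` of accounts."""
    total = len(defects_by_account)
    if total == 0:
        return []
    counts: Counter = Counter()
    for codes in defects_by_account.values():
        counts.update(set(codes))  # count each code at most once per account
    flagged = [
        (code, n / total) for code, n in counts.items() if n / total > threshold
    ]
    # Most widespread code first; ties broken alphabetically for stable output.
    return sorted(flagged, key=lambda item: (-item[1], item[0]))
```

Any code this returns is a prompt to review the rubric criterion behind it before touching the list.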

vs alternatives

vs manual RevOps review. For a list under 50 accounts, an experienced RevOps analyst with the ICP rubric open can manually review every account in 2-3 hours and produce a better-calibrated result than the skill — humans catch edge cases, like “this company has a weird SIC code but their actual product is clearly in our ICP,” that the skill will miss. Above 150 accounts, manual review becomes inconsistent: the analyst’s ICP intuition drifts between the first account and the 130th. The skill applies the rubric consistently at any list size.

vs 6sense’s built-in account grading. 6sense provides an account fit score based on its proprietary ICP model, trained on companies in your CRM with positive engagement history. It is useful once you have enough CRM history for 6sense to learn from (typically 50-100 closed-won accounts). For teams below that bar, 6sense’s fit model is under-trained and noisy. This skill works from day one because the rubric is hand-authored. The trade-off: 6sense’s model picks up patterns you did not explicitly write down; this skill only knows what you told it. For teams at 50+ closed-won, run both — use 6sense’s score for “what surprises me” and this skill’s defect codes for “what specifically is wrong with the Q3 accounts.”

vs a spreadsheet ICP scoring matrix. Many RevOps teams have a spreadsheet where they rate each account against ICP criteria manually. The spreadsheet approach breaks down at scale (consistency drops above 50 accounts), does not produce a defect taxonomy (it tells you the score, not why it is wrong), and becomes stale the moment the rubric changes because no one updates all the previously scored rows. This skill applies the rubric consistently, names the specific defect, and the SHA-256 mechanism ensures you know when the rubric moved. The spreadsheet is the right tool for the first 20 accounts; the skill is the right tool after that.

Files in this artifact

---
name: abm-list-quality-audit
description: Audit an ABM target list against an explicit ICP rubric and return a defect report for every account that fails. Produces a per-account defect taxonomy (wrong-size, wrong-industry, wrong-geo, wrong-funding, tech-mismatch, stale-data, low-intent, missing-field), a list-level quality score, and a prioritized remediation queue. Use before any ABM campaign goes live — not as a substitute for ICP strategy work.
---

# ABM list quality audit

## When to invoke

Invoke before launching any ABM campaign, before loading a list into a paid-media ABM platform, or before assigning named accounts to AEs. The skill takes a structured account list and your ICP rubric and returns a per-account defect report plus a list-level quality score.

The skill is also useful for quarterly list hygiene: run it over your existing ABM universe to find accounts that were added months ago and no longer match the current ICP, or accounts where enrichment has gone stale.

Invoke from:

- A **Clay table** where each row is an account, triggered manually or on a quarterly schedule. The skill writes defect codes and a quality tier back to two columns.
- A **CSV pre-flight check** before import into 6sense, Demandbase, or any ABM advertising platform that charges per account or per impression — running the audit first removes accounts you would pay to target and never convert.
- A **Salesforce report-based trigger** over named accounts in a specified segment, via a custom-code action that calls the skill and writes `ABM_Quality_Tier__c` and `ABM_Defect_Codes__c` back to the account record.

Do NOT invoke this skill for:

- **Scoring individual inbound leads.** The audit is designed for outbound named-account lists, not for triage of inbound MQLs. For inbound scoring, use the lead-scoring-icp-rubric skill.
- **Replacing the ICP strategy session.** The skill audits against a rubric you provide. If the rubric is a proxy for last year's customers, the audit will reproduce last year's biases. Have the ICP argument with your RevOps and GTM leadership before running the audit.
- **Generating net-new accounts.** The skill audits an existing list. It does not generate new accounts or run discovery on the TAM. Use a dedicated list-building workflow (Clay + ICP criteria) to generate the raw list first.
- **Suppression list management.** If the goal is to remove churned customers, competitors, or current customers from the list, that is deduplication, not auditing. Run those exclusion checks before invoking the skill.

## Inputs

Required:

- `account_list` — a structured list of account records. Minimum fields per account: `company_name`, `company_domain`. Strongly preferred: `industry`, `headcount`, `country`, `revenue_band`, `tech_stack` (array), `funding_stage`, `last_enrichment_date`.
- `rubric` — path to or inline contents of the ICP rubric markdown (see `references/1-icp-rubric-template.md`). Must contain explicit criterion + weight + tier-value rows. If the rubric has no weights, the skill refuses to run.

Optional:

- `intent_scores` — a map of `company_domain → intent_score` from 6sense, Bombora, or your ABM platform. When provided, the skill adds a `low-intent` defect code for accounts below your defined intent floor, and an `intent-spike` positive flag for accounts above your hot-intent threshold.
- `enrichment_staleness_days` — integer, default 90. Accounts where `last_enrichment_date` is older than this value receive a `stale-data` defect code. Adjust to match how aggressively your enrichment layer (Clay, ZoomInfo, Apollo) recycles data.
- `list_name` — string. Used to label the audit report. If omitted, defaults to `"Unnamed list — {run_date}"`.
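
A caller may want to validate `account_list` before invoking the skill. The sketch below is one way to do that, assuming accounts are passed as dictionaries keyed by the field names above; the function name and the choice to raise on a missing required field are assumptions for illustration.

```python
REQUIRED_FIELDS = ("company_name", "company_domain")
PREFERRED_FIELDS = (
    "industry", "headcount", "country", "revenue_band",
    "tech_stack", "funding_stage", "last_enrichment_date",
)

def validate_account_list(account_list: list[dict]) -> dict[str, int]:
    """Raise if a required field is absent; return how many accounts lack each preferred field."""
    gaps = {field: 0 for field in PREFERRED_FIELDS}
    for index, account in enumerate(account_list):
        missing = [f for f in REQUIRED_FIELDS if not account.get(f)]
        if missing:
            raise ValueError(
                f"account #{index} is missing required field(s): {', '.join(missing)}"
            )
        for field in PREFERRED_FIELDS:
            if account.get(field) in (None, "", []):
                gaps[field] += 1
    return gaps
```

The returned gap counts give an early read on how many `missing-field` codes the audit is likely to produce.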

## Reference files

Always load these before running the audit:

- `references/1-icp-rubric-template.md` — the ICP rubric. Same structure as the lead-scoring skill's rubric; shared between the two skills if your team uses both. Weights and tier values must be explicit.
- `references/2-defect-taxonomy.md` — the full defect code vocabulary with definitions, severity levels (P1 / P2 / P3), and the remediation action for each code. Edit this once with your RevOps lead before first use; the codes in the audit output are only as useful as the definitions in this file.
- `references/3-sample-audit-output.md` — a literal example of the full audit report for a 5-account list. Use when wiring downstream parsers or building the CRM writeback.

## Method

The skill runs four steps in order.

### 1. Hard disqualifier sweep (no LLM)

Before any LLM call, check each account against the rubric's hard disqualifiers: sanctioned country, disqualified industry, headcount below floor, or a domain on the explicit exclusion list. Accounts that hit a hard disqualifier receive defect code `hd:{reason}` (e.g. `hd:sanctioned_country`) and a quality tier of `disqualified`. These checks are deterministic and cheap; they run first so the LLM does not burn tokens on them.

Why deterministic first: same reason as lead scoring — speed and reliability. A hard disqualifier check on 500 accounts takes milliseconds and never hallucinates.

### 2. Per-account ICP rubric scoring

For each account that cleared the hard disqualifier sweep, score against the ICP rubric using the same per-criterion method as the lead-scoring skill (explicit tier + weight + rationale per criterion). The weighted sum maps to a quality tier:

- **Q1** — score ≥ 8.0: in-ICP, meets criteria. No defect codes from rubric scoring.
- **Q2** — score 6.0-7.99: in-ICP with gaps. Defect codes name the specific failing criteria.
- **Q3** — score 4.0-5.99: borderline. Multiple defect codes; recommend enrichment and re-audit before including.
- **Q4** — score < 4.0: out-of-ICP. Recommend removal from the active list; flag for archive.

Why explicit tier thresholds rather than "let the model decide": same reason as lead scoring — the rubric is the source of truth, and the model's job is to apply it, not to re-weight it.
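
The threshold mapping can be expressed directly in code. In the sketch below the tier cut-offs come from the list above, but the conversion from tier letters to points is an assumption: this file does not state how A / B / C translate into the 0-10 score, so `TIER_POINTS` is a placeholder to replace with your own scheme.

```python
# Assumed point values per tier letter (not specified by the skill).
TIER_POINTS = {"A": 10.0, "B": 6.0, "C": 2.0}

def weighted_score(criteria: list[tuple[str, int]]) -> float:
    """Weighted average on a 0-10 scale from (tier letter, weight) pairs."""
    total_weight = sum(weight for _, weight in criteria)
    if total_weight == 0:
        raise ValueError("rubric has no weights")
    return sum(TIER_POINTS[tier] * weight for tier, weight in criteria) / total_weight

def quality_tier(score: float) -> str:
    """Map a weighted score to Q1-Q4 using the documented thresholds."""
    if score >= 8.0:
        return "Q1"
    if score >= 6.0:
        return "Q2"
    if score >= 4.0:
        return "Q3"
    return "Q4"
```

Keeping the thresholds in one small function means the model never decides the tier; it only supplies the per-criterion letters.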

### 3. Supplemental defect detection

After rubric scoring, run supplemental checks that are not covered by the rubric criteria:

- **`stale-data`**: `last_enrichment_date` is older than `enrichment_staleness_days`. The account's rubric score is suspect because the underlying data may be wrong.
- **`missing-field`**: one or more rubric criteria could not be scored because the field was missing from the account record. List the missing field names.
- **`low-intent`**: `intent_scores[domain]` is below the floor defined in the rubric or passed as input. Applied on top of rubric score — a Q1 account with low intent is still in-ICP but is not hot right now.
- **`intent-spike`**: `intent_scores[domain]` is above the hot-intent threshold. A positive flag, not a defect; surfaced to help prioritize outreach even if the rubric score is only Q2.
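
The first two checks are deterministic and can be sketched without a model call. This example assumes `last_enrichment_date` has already been parsed into a `datetime.date` and that the caller supplies the list of fields the rubric needs; the function name and signature are illustrative.

```python
from datetime import date

def supplemental_defects(
    account: dict,
    rubric_fields: list[str],
    today: date,
    staleness_days: int = 90,
) -> list[str]:
    """Return stale-data and missing-field:{field} codes for one account."""
    defects = []
    enriched = account.get("last_enrichment_date")
    # "Older than" is read as strictly more than the threshold in days.
    if enriched is not None and (today - enriched).days > staleness_days:
        defects.append("stale-data")
    for field in rubric_fields:
        if account.get(field) in (None, "", []):
            defects.append(f"missing-field:{field}")
    return defects
```

An account with no `last_enrichment_date` at all is not marked stale in this sketch; whether that should instead count as a missing field is a policy choice for your team.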

### 4. List-level quality report and remediation queue

After per-account scoring, aggregate:

- **List quality score**: Q1% + Q2% - Q3% - 2×Q4%. This is a synthetic score intended to give a single number for "how good is this list" at a glance. A score above 60 means the list is predominantly in-ICP; below 30 means the list needs significant remediation before use.
- **Defect frequency table**: counts of each defect code across the list. The most common defect code tells you the single most valuable enrichment or segmentation fix.
- **Remediation queue**: the Q2 and Q3 accounts with `missing-field` or `stale-data` codes, ordered by estimated re-audit lift (accounts most likely to become Q1 after re-enrichment). This is the queue to hand to whoever owns enrichment.

Why a list-level score: individual account scores are useful for routing; the list-level score is useful for the ABM campaign go/no-go decision. If the list score is below 30, the campaign should not launch — the target list is too weak to justify the ABM platform spend.
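
A minimal sketch of the score formula follows. It computes the raw value of Q1% + Q2% - Q3% - 2×Q4% in percentage points; how negative results should be presented is not specified here, so the sketch returns the raw number. Counting every audited account (including `disqualified` ones) in the denominator is an assumption.

```python
from collections import Counter

def list_quality_score(tiers: list[str]) -> float:
    """Raw Q1% + Q2% - Q3% - 2*Q4%, in percentage points of all audited accounts."""
    if not tiers:
        raise ValueError("cannot score an empty list")
    counts = Counter(tiers)
    total = len(tiers)

    def pct(tier: str) -> float:
        return 100.0 * counts[tier] / total

    return pct("Q1") + pct("Q2") - pct("Q3") - 2 * pct("Q4")
```

For a 10-account list with six Q1, two Q2, one Q3, and one Q4, the formula gives 60 + 20 - 10 - 20 = 50. The raw value ranges from -200 (all Q4) to 100 (all Q1 or Q2).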

## Output format

Literal markdown the skill emits for a 5-account list:

```markdown
# ABM list audit — Q3 2026 DACH expansion (run 2026-05-23)

**List quality score:** 52 / 100
**Accounts audited:** 5
**Breakdown:** Q1: 1 · Q2: 2 · Q3: 1 · Q4: 1

## Recommendation

List is marginal (score 52). Do not launch until Q3/Q4 accounts are remediated or removed.
Priority: re-enrich 1 Q2 account with missing headcount data; remove 1 Q4 account.

## Per-account results

| Domain | Quality tier | Score | Defect codes |
|---|---|---|---|
| northwind.com | Q1 | 8.6 | none |
| tailspin.io | Q2 | 7.1 | missing-field:headcount, stale-data |
| fabrikam.de | Q2 | 6.3 | wrong-size:too-small, wrong-funding, low-intent |
| contoso.com | Q3 | 5.0 | wrong-industry, tech-mismatch, missing-field:tech_stack |
| adventure-works.com | Q4 | 3.2 | wrong-size:too-large, wrong-geo, missing-field:revenue |

## Defect frequency table

| Defect code | Count | Action |
|---|---|---|
| missing-field:headcount | 1 | Re-enrich via Clay ZoomInfo column |
| stale-data | 1 | Re-run enrichment on accounts with last_enrichment_date > 90 days |
| wrong-size | 2 | Review headcount band in rubric — may be over-restricted |
| wrong-industry | 1 | Confirm industry mapping — SIC code may be miscategorized |
| wrong-geo | 1 | Remove if DACH-only campaign; keep for global list |
| wrong-funding | 1 | Move to pre-series A nurture vs. growth-stage ABM segment |
| tech-mismatch | 1 | Re-enrich tech stack via BuiltWith or Clay; remove if confirmed miss |
| low-intent | 1 | Move to nurture; re-activate when intent signal appears |
| missing-field:tech_stack | 1 | Re-enrich via BuiltWith or Clay tech-stack column |
| missing-field:revenue | 1 | Re-enrich the specific field; re-audit after enrichment |

## Remediation queue (by re-audit lift)

1. tailspin.io — add headcount; re-enrich; likely Q1 after fix.
2. fabrikam.de — low-intent flag only; already in-ICP. Activate when intent spikes.
3. contoso.com — re-enrich tech_stack; confirm industry; may move to Q2.

---
_Rubric SHA-256: 4f9c...a812 | Last edited 2026-05-01 by RevOps_
```
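
When wiring a downstream parser against this format, the per-account table is usually the part to extract. The sketch below is one possible parser, assuming the table header begins with `| Domain |` exactly as in the example above; the function name and the returned dictionary keys are illustrative.

```python
def parse_account_rows(report_md: str) -> list[dict]:
    """Extract rows from the 'Per-account results' table of an audit report."""
    rows = []
    in_table = False
    for line in report_md.splitlines():
        if line.startswith("| Domain |"):
            in_table = True
            continue
        if not in_table:
            continue
        if not line.startswith("|"):
            break  # first non-table line ends the table
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if set(cells[0]) <= {"-"}:
            continue  # skip the |---|---| separator row
        domain, tier, score, codes = cells
        rows.append({
            "domain": domain,
            "quality_tier": tier,
            "score": float(score),
            "defect_codes": [] if codes == "none" else [c.strip() for c in codes.split(",")],
        })
    return rows
```

Because the defect code strings are fixed, the resulting `defect_codes` lists can be written straight to a CRM field or Clay column without further mapping.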

## Watch-outs

- **Defect codes that indict the rubric, not the account.** If 40% of the list has `wrong-size` codes, the problem is often not the list — it is a headcount floor in the rubric that was set when the company was targeting larger enterprises and was never updated after the SMB segment was opened. **Guard:** after every audit, check whether any single defect code applies to more than 25% of accounts. If so, review the rubric criterion that generates that code before remediating the list. The list might be right and the rubric wrong.
- **Stale enrichment masking real ICP fit.** An account's `last_enrichment_date` of 14 months ago means its headcount, funding stage, and tech stack data may all be wrong. A Q4 score on stale data is not a verdict on the account — it is a verdict on your enrichment cadence. **Guard:** the skill adds `stale-data` to any account where enrichment is older than the `enrichment_staleness_days` threshold, and the per-account rationale notes "scored on potentially stale data" for any such account. Do not remove Q4 + `stale-data` accounts; re-enrich them first and re-audit.
- **Intent score inflation from brand-aware accounts.** An account in a 6sense high-intent segment may be there because of one analyst at the company who reads your blog weekly — not because the buying committee is in-market. **Guard:** when `intent_scores` are provided, the skill shows the raw intent score alongside the `intent-spike` flag and names the intent source. Before acting on an `intent-spike` account, verify the intent signal is from buying-committee personas, not from a single low-authority user.

# ICP rubric — TEMPLATE (ABM audit)

> Replace this template's contents with your team's actual ICP rubric.
> The ABM list audit skill scores each account against this rubric.
> Vague rows (no weights, no tier values) cause the skill to refuse the run.
>
> This file can be shared with the lead-scoring-icp-rubric skill — the
> rubric structure is identical. If your team uses both skills, maintain
> one rubric file and reference it from both.

## How the skill reads this file

- Each row in "Criteria" must have an explicit `weight` (1-5) and three tier values
  (A / B / C). Malformed rows cause the skill to return an error.
- "Hard disqualifiers" run as deterministic checks before any LLM call. A single
  hit drops the account to `disqualified` regardless of other criteria.
- "Intent thresholds" are optional — only used when `intent_scores` is passed
  as input. Set these to match your ABM platform's scoring bands.
- The "Last edited" line is hashed into the SHA-256 recorded in the audit footer.
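
To make the hashing concrete, here is a minimal sketch using Python's standard `hashlib`. It assumes the hash covers the full rubric text (which includes the "Last edited" line); the exact bytes the skill hashes are not specified in this file, so treat the function below as illustrative.

```python
import hashlib

def rubric_sha256(rubric_text: str) -> str:
    """Hex SHA-256 digest of the rubric text, encoded as UTF-8."""
    return hashlib.sha256(rubric_text.encode("utf-8")).hexdigest()

def audits_comparable(previous_hash: str, current_hash: str) -> bool:
    """Two audits are comparable only when they were scored against the same rubric."""
    return previous_hash == current_hash
```

Any edit to the rubric, including a change to the "Last edited" line alone, produces a different digest, which is what makes rubric drift visible in the footer.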

## Criteria

| Criterion | Weight | A (best fit) | B (stretch) | C (poor fit) |
|---|---|---|---|---|
| Industry | 5 | {industries you win in, e.g. Vertical SaaS, FinTech} | {adjacent industries} | {everything else} |
| Headcount | 4 | {core range, e.g. 200-2000} | {stretch range, e.g. 50-200 or 2000-5000} | {below/above stretch} |
| Geo | 3 | {primary regions, e.g. US, UK, DACH} | {secondary regions} | {unsupported regions} |
| Tech stack | 4 | {signals of fit, e.g. Salesforce + HubSpot present} | {one fit signal present} | {no fit signals or competing system} |
| Funding stage | 2 | {preferred stages, e.g. Series B-D, public mid-cap} | {adjacent stages} | {unfit, e.g. pre-seed or mature enterprise} |
| Revenue band | 3 | {ARR or revenue band that matches your ACV, e.g. $10M-$100M ARR} | {adjacent band} | {below minimum or above ceiling} |

## Hard disqualifiers

Single signals that drop an account to `disqualified` regardless of other criteria.
Run as deterministic checks before LLM scoring.

- `country in [{sanctioned or unsupported regions}]`
- `industry in [{disqualified industries — e.g. adult content, gambling if you do not serve them}]`
- `headcount < {absolute floor, e.g. 25}` (if you have one)
- `company_domain in [{explicit exclusion list — competitors, current customers, churned accounts}]`

## Intent thresholds (optional — only used when intent_scores provided)

Used to assign `low-intent` or `intent-spike` flags on top of the rubric score.

| 6sense / Bombora intent score | Flag applied |
|---|---|
| ≥ {hot threshold, e.g. 75} | `intent-spike` |
| {floor, e.g. 35} — {hot threshold - 1} | no flag (normal) |
| < {floor, e.g. 35} | `low-intent` |
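
The table above maps to a small function. This sketch uses the example values from the table (floor 35, hot threshold 75) as defaults; the function name is an assumption, and real thresholds should come from your rubric.

```python
from typing import Optional

def intent_flag(score: float, floor: float = 35, hot: float = 75) -> Optional[str]:
    """Return 'intent-spike', 'low-intent', or None for a normal score."""
    if score >= hot:
        return "intent-spike"
    if score < floor:
        return "low-intent"
    return None
```

A score exactly at the floor gets no flag, and a score exactly at the hot threshold is treated as a spike, matching the boundaries in the table.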

## Quality tier thresholds

| Weighted score | Quality tier |
|---|---|
| 8.0 - 10.0 | Q1 (in-ICP, no rubric defects) |
| 6.0 - 7.99 | Q2 (in-ICP with gaps) |
| 4.0 - 5.99 | Q3 (borderline — remediate before use) |
| < 4.0 | Q4 (out-of-ICP — recommend removal) |

## Last edited

{YYYY-MM-DD} — by {RevOps owner name}

# Defect taxonomy — TEMPLATE

> This file defines every defect code the ABM list audit skill can assign.
> Edit the "Remediation action" column to match your team's actual processes
> before first use. The codes themselves are fixed — do not rename them;
> downstream parsers (CRM writeback, Clay columns) key on the code strings.

## How the skill reads this file

- Each defect code has a `severity` (P1 / P2 / P3). P1 defects are show-stoppers
  that mean the account should be removed or quarantined from the campaign until
  fixed. P2 defects are remediable. P3 defects are informational — the account
  can proceed, but the ABM or AE team should be aware.
- The skill emits defect codes in the per-account row and the defect-frequency
  table. It does not emit the full definition — that lives here for the human
  reviewer.

## Defect codes

### Rubric-sourced defects (from ICP scoring)

| Code | Severity | Definition | Remediation action |
|---|---|---|---|
| `wrong-industry` | P1 | Account's industry is in the C-tier or disqualified row of the rubric. | Remove from active list. Archive with `out-of-icp` tag. |
| `wrong-size:too-small` | P1 | Headcount is below the rubric's B-tier floor. | Remove unless a specific exemption applies (e.g. fast-growing startup with known expansion intent). |
| `wrong-size:too-large` | P2 | Headcount exceeds the rubric's B-tier ceiling. | Flag for enterprise segment or remove from SMB/mid-market campaign. |
| `wrong-geo` | P1 | Account's HQ region is not in the rubric's supported geo tiers. | Remove from geo-targeted campaign; keep in global campaigns if you have capacity to serve. |
| `wrong-funding` | P2 | Funding stage is in the C-tier row. | Move to a different campaign segment (pre-series A nurture vs. growth-stage ABM). |
| `tech-mismatch` | P2 | Tech stack has no fit signals from the rubric's tech-stack criterion. | Re-enrich tech stack; confirm via BuiltWith or Clay. If confirmed miss, remove. |

### Supplemental defects (not from rubric scoring)

| Code | Severity | Definition | Remediation action |
|---|---|---|---|
| `stale-data` | P2 | `last_enrichment_date` is older than the `enrichment_staleness_days` threshold. Rubric score is unreliable. | Re-run enrichment on this account before acting on its quality tier. Do not remove solely because of this code. |
| `missing-field:{field}` | P2 | The named field was absent from the account record. The criterion that uses it was scored as C (worst case) by default. | Re-enrich the specific field. Re-audit after enrichment. |
| `low-intent` | P3 | Intent score from the provided `intent_scores` input is below the floor threshold. | Move to nurture or lower-frequency sequence. Do not assign to AE until intent rises. |
| `hd:{reason}` | P1 | Hard disqualifier triggered. `{reason}` is the specific rubric row that matched (e.g. `hd:sanctioned_country`, `hd:competitor`). | Remove immediately. Archive with `disqualified` tag and the `hd:{reason}` code for audit trail. |

### Positive flags (not defects — appear in the per-account row for awareness)

| Code | Definition | Action |
|---|---|---|
| `intent-spike` | Intent score is at or above the hot-intent threshold. Account is signaling active in-market behavior. | Prioritize for direct AE outreach regardless of rubric tier. Even a Q2 account with `intent-spike` warrants a personalized touch. |

## Severity definitions

- **P1 — Remove:** the account should not be in the active ABM list. Keeping it wastes budget and drags down campaign performance metrics.
- **P2 — Remediate:** the account may be a valid target but needs data work or segmentation before it can be activated. Hold from campaign activation until the defect is resolved.
- **P3 — Informational:** the account can proceed, but the campaign team should calibrate expectations. No blocking action required.
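
For downstream automation, one way to use these severities is to look up each emitted code and act on the most severe one. The sketch below copies the severities from the tables in this file; the helper names and the rule that the worst severity governs the account are assumptions, not something the skill itself defines.

```python
from typing import List, Optional

# Severities copied from the defect tables in this file.
SEVERITY = {
    "wrong-industry": "P1",
    "wrong-size:too-small": "P1",
    "wrong-size:too-large": "P2",
    "wrong-geo": "P1",
    "wrong-funding": "P2",
    "tech-mismatch": "P2",
    "stale-data": "P2",
    "low-intent": "P3",
}
# Parameterized codes are matched on their fixed prefix.
PREFIX_SEVERITY = {"missing-field:": "P2", "hd:": "P1"}


def severity_of(code: str) -> str:
    """Return P1/P2/P3 for a defect code, or raise on an unknown code."""
    if code in SEVERITY:
        return SEVERITY[code]
    for prefix, severity in PREFIX_SEVERITY.items():
        if code.startswith(prefix):
            return severity
    raise ValueError(f"unknown defect code: {code!r}")


def worst_severity(defect_codes: List[str]) -> Optional[str]:
    """Most severe level present, or None for an account with no defects."""
    if not defect_codes:
        return None
    # 'P1' < 'P2' < 'P3' as strings, so min() picks the most severe.
    return min(severity_of(code) for code in defect_codes)
```

Raising on an unknown code is a deliberate choice in this sketch: because the code strings are fixed, an unrecognized one more likely signals a parsing bug than a new defect type.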

## Last edited

{YYYY-MM-DD} — by {RevOps owner name}

# Sample audit output — for parser wiring

> A literal example of what the skill emits for a 5-account list. Use it
> when wiring a downstream parser: Clay AI column → property mapping,
> Salesforce custom-code action → property writeback, or CSV post-processor.
> The skill commits to the schema below; the values are illustrative.

## Full audit report

```markdown
# ABM list audit — Q3 2026 DACH expansion (run 2026-05-23)

**List quality score:** 52 / 100
**Accounts audited:** 5
**Breakdown:** Q1: 1 · Q2: 2 · Q3: 1 · Q4: 1

## Recommendation

List is marginal (score 52). Do not launch until Q3/Q4 accounts are remediated or removed.
Priority: re-enrich the 1 Q2 account with missing headcount data; remove 1 Q4 account.

## Per-account results

| Domain | Quality tier | Score | Defect codes |
|---|---|---|---|
| northwind.com | Q1 | 8.6 | none |
| tailspin.io | Q2 | 7.1 | missing-field:headcount, stale-data |
| fabrikam.de | Q2 | 6.3 | wrong-size:too-small, wrong-funding, low-intent |
| contoso.com | Q3 | 5.0 | wrong-industry, tech-mismatch, missing-field:tech_stack |
| adventure-works.com | Q4 | 3.2 | wrong-size:too-large, wrong-geo, missing-field:revenue |

## Defect frequency table

| Defect code | Count | Action |
|---|---|---|
| missing-field:headcount | 1 | Re-enrich via Clay ZoomInfo column |
| stale-data | 1 | Re-run enrichment (last_enrichment_date > 90 days) |
| wrong-size:too-small | 1 | Review headcount band in rubric; may be over-restricted |
| wrong-size:too-large | 1 | Flag for enterprise segment or remove from mid-market campaign |
| wrong-industry | 1 | Confirm industry mapping — SIC code may be miscategorized |
| wrong-geo | 1 | Remove if DACH-only campaign; keep for global list |
| wrong-funding | 1 | Move to pre-series A nurture vs. growth-stage ABM segment |
| tech-mismatch | 1 | Re-enrich tech stack via BuiltWith or Clay; remove if confirmed miss |
| low-intent | 1 | Move to nurture; re-activate when intent signal appears |
| missing-field:tech_stack | 1 | Re-enrich via BuiltWith or Clay tech-stack column |
| missing-field:revenue | 1 | Re-enrich the revenue field; re-audit after enrichment |

## Remediation queue (by re-audit lift)

1. tailspin.io — add headcount; re-enrich; likely Q1 after fix.
2. fabrikam.de — confirm headcount and funding stage; hold in nurture until intent rises.
3. contoso.com — re-enrich tech_stack; confirm industry; may move to Q2.

---
_Rubric SHA-256: 4f9c...a812 | Last edited 2026-05-01 by Sam Patel_
```

## Field contract for parsers

If you build a parser instead of consuming the markdown, these are the stable fields:

### List-level fields

- `list_name` — string
- `run_date` — ISO date string (YYYY-MM-DD)
- `list_quality_score` — integer, 0-100
- `total_accounts` — integer
- `q1_count`, `q2_count`, `q3_count`, `q4_count` — integers
- `recommendation` — string, one paragraph
- `defect_frequency[]` — array of `{defect_code, count, action}`
- `remediation_queue[]` — array of `{domain, rationale, estimated_tier_after_fix}`
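
If you hold the per-account records in memory, `defect_frequency[]` can be recomputed from them as a consistency check against what the skill reported. This is a minimal sketch: `defect_frequency` as a function name and the optional `actions` lookup are hypothetical, and each record is assumed to carry the `defect_codes` array described under the per-account fields.

```python
from collections import Counter
from typing import Dict, List, Optional


def defect_frequency(accounts: List[dict],
                     actions: Optional[Dict[str, str]] = None) -> List[dict]:
    """Count each defect code across accounts, most frequent first."""
    actions = actions or {}
    counts = Counter(
        code for account in accounts for code in account.get("defect_codes", [])
    )
    # most_common() sorts by count descending; ties keep first-seen order.
    return [
        {"defect_code": code, "count": n, "action": actions.get(code, "")}
        for code, n in counts.most_common()
    ]
```

Comparing this recomputed table with the reported one is a cheap way to catch a report whose counts do not match its per-account rows.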

### Per-account fields

- `domain` — string, lowercased
- `quality_tier` — enum: `Q1` / `Q2` / `Q3` / `Q4` / `disqualified`
- `score` — float, 0.0 to 10.0
- `defect_codes[]` — array of strings (defect code vocabulary from `references/2-defect-taxonomy.md`)
- `positive_flags[]` — array of strings (e.g. `intent-spike`)
- `rationale[]` — array of `{criterion, weight, tier, reason}` (same structure as lead-scoring skill)
- `data_notes` — string, e.g. "scored on potentially stale data (last_enrichment_date: 2025-02-14)"
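
A parser can check each record against this contract before writing anything back. The validator below is a sketch under the assumption that records arrive as plain dictionaries keyed by the field names above; `validate_account` is a hypothetical helper, and it checks only the fields with unambiguous types.

```python
from typing import List

VALID_TIERS = {"Q1", "Q2", "Q3", "Q4", "disqualified"}


def validate_account(record: dict) -> List[str]:
    """Return a list of contract violations; an empty list means the record passes."""
    problems = []

    domain = record.get("domain")
    if not isinstance(domain, str) or domain != domain.lower():
        problems.append("domain must be a lowercased string")

    if record.get("quality_tier") not in VALID_TIERS:
        problems.append("quality_tier must be Q1, Q2, Q3, Q4, or disqualified")

    score = record.get("score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 10.0:
        problems.append("score must be a number between 0.0 and 10.0")

    for key in ("defect_codes", "positive_flags"):
        value = record.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be an array of strings")

    return problems
```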

### Salesforce CRM writeback mapping

| Audit field | Salesforce field | Field type |
|---|---|---|
| quality_tier | `ABM_Quality_Tier__c` | Picklist (Q1/Q2/Q3/Q4/disqualified) |
| defect_codes[] joined by `, ` | `ABM_Defect_Codes__c` | Text (255) |
| score | `ABM_ICP_Score__c` | Number (decimal, 1 place) |
| run_date | `ABM_Last_Audited__c` | Date |
| positive_flags[] joined by `, ` | `ABM_Intent_Flags__c` | Text (255) |
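
The mapping above can be sketched as a function that turns one per-account record into a field dictionary for whatever Salesforce client you use. `to_salesforce_fields` is a hypothetical helper. Raising when the joined codes exceed the 255-character text limit, instead of truncating, is an assumption made here so that a cut-off code string never reaches downstream parsers.

```python
from typing import List


def to_salesforce_fields(record: dict, run_date: str, max_len: int = 255) -> dict:
    """Build the custom-field payload for one audited account."""

    def join(values: List[str]) -> str:
        text = ", ".join(values)
        if len(text) > max_len:
            # Truncating could split a defect code mid-string, so fail loudly.
            raise ValueError(f"joined value exceeds {max_len} characters")
        return text

    return {
        "ABM_Quality_Tier__c": record["quality_tier"],
        "ABM_Defect_Codes__c": join(record.get("defect_codes", [])),
        "ABM_ICP_Score__c": round(record["score"], 1),  # Number, 1 decimal place
        "ABM_Last_Audited__c": run_date,                # ISO date, YYYY-MM-DD
        "ABM_Intent_Flags__c": join(record.get("positive_flags", [])),
    }
```

If long code lists are common in your data, a Long Text Area field may be a better target than Text (255); that is a schema decision for your Salesforce admin.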