claude-skill

Audit an ABM list against an ICP rubric with Claude

Difficulty

Intermediate

Setup time

30-60 min

For

RevOps

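A Claude skill that takes an ABM target list and an ICP rubric and returns a per-account defect report. Each account that fails the rubric receives defect codes from a defined taxonomy (wrong-size, wrong-industry, wrong-geo, wrong-funding, tech-mismatch, stale-data, low-intent, missing-field). The skill also produces a quality tier (Q1 to Q4) per account, a list-level quality score, and a prioritized remediation queue. The bundle lives at apps/web/public/artifacts/abm-list-quality-audit-skill/ and contains SKILL.md plus three reference templates that you edit as needed before the first run.

It answers the question most ABM campaigns skip before launch: "Of the 300 companies on this list, how many actually match the ICP, and for the ones that do not, what specifically is wrong?" Without that answer, budget on ABM platforms (6sense, Demandbase, LinkedIn Matched Audiences) keeps flowing to accounts that will never convert, and disappointing campaign results get blamed on messaging or channel when the real cause is list quality.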

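When to use

Use it before loading an ABM list into a paid-media platform, before assigning named accounts to AEs, and before launching a campaign whose list was built more than 90 days ago. ABM lists decay faster than most RevOps teams realize: headcount data goes stale, funding stages change, companies get acquired, and the ICP rubric itself can shift without the list ever being re-evaluated.

The skill is also a good fit for quarterly list hygiene. Run it over the entire ABM universe, not only campaign lists, to find accounts that were added under a different ICP and have never been re-evaluated. The defect frequency table shows which kinds of enrichment gaps are most common across the universe, which is actionable information for whoever owns the Clay enrichment workflows.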

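Invoke from:

A Clay table where each row is an account, run manually before a campaign launch or on a quarterly cron. The skill writes quality_tier and defect_codes back to two Clay columns, which downstream automation can filter on to exclude Q3/Q4 accounts from campaign uploads.
A CSV pre-flight check before import into 6sense or another ABM advertising platform. Running the audit removes accounts you would otherwise pay to target. At typical ABM CPM rates ($20 to $40 per 1,000 impressions), removing 50 out-of-ICP accounts from a 500-account list drops 10% of the targeted accounts, and roughly 10% of the spend if impressions are spread evenly across accounts.
A Salesforce report-based trigger over named accounts in a segment, which writes ABM_Quality_Tier__c and ABM_Defect_Codes__c back to the account record.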
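
As a rough illustration of the downstream filtering step, the sketch below keeps only Q1/Q2 rows from an audited CSV before upload. The column names follow the `quality_tier` and `defect_codes` writeback described above; the sample rows, the `UPLOAD_TIERS` set, and the `filter_for_upload` helper are hypothetical.

```python
import csv
import io

# Hypothetical audited export: made-up rows, columns named after the
# skill's quality_tier / defect_codes writeback.
AUDITED_CSV = """company_domain,quality_tier,defect_codes
a.example,Q1,
b.example,Q2,"missing-field:headcount, stale-data"
c.example,Q3,"wrong-industry, tech-mismatch"
d.example,Q4,"wrong-size:too-large, wrong-geo"
"""

# Assumption: only Q1 and Q2 accounts are uploaded to the ad platform.
UPLOAD_TIERS = {"Q1", "Q2"}


def filter_for_upload(csv_text: str) -> list[dict[str, str]]:
    """Return only the rows whose quality_tier is allowed for upload."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["quality_tier"] in UPLOAD_TIERS]


kept = filter_for_upload(AUDITED_CSV)
print([row["company_domain"] for row in kept])
```

The same predicate could be expressed as a Clay filter or a Salesforce report condition; the point is that the tier column, not the raw score, is the field downstream automation keys on.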

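When not to use

Skip this skill when:

You want to score inbound MQLs. The audit is designed for outbound named-account lists. For inbound lead triage, the lead-scoring-icp-rubric skill is the right tool: it handles the single-lead flow and the borderline escalation logic that matters for inbound.
The ICP rubric does not exist yet. The skill audits against the rubric you provide. If you have not yet had the ICP argument (which industries, headcount ranges, and regions you can actually win in), have that conversation first. Running an audit against a placeholder rubric only produces the appearance of rigor.
The list needs deduplication, not an audit. If the goal is to exclude current customers, competitors, churned accounts, or GDPR-suppressed contacts, that is a filter operation, not an ICP audit. Run those exclusions before the audit; otherwise the skill spends tokens scoring companies you already know you want to exclude.
You need to generate a list, not audit one. The skill takes an existing list as input. It does not run TAM discovery or generate new accounts. Build the raw list first with a dedicated list-building workflow using Clay and your ICP criteria.
The list has fewer than 20 accounts. Below that size, an experienced RevOps analyst or AE can manually review every account in under an hour. The skill's setup cost (configuring the rubric, customizing the defect taxonomy) is not worth it.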

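Setup

If an ICP rubric already exists, setup takes 30 to 60 minutes. The rubric argument, in which RevOps, GTM leadership, and one or two AEs agree on what the A-tier industries and headcount ranges actually mean, takes longer and happens before setup.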

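Install the skill. Copy apps/web/public/artifacts/abm-list-quality-audit-skill/SKILL.md and the references/ folder into a .claude/skills/abm-audit/ directory, or upload them as a skill in claude.ai. The name and description in the frontmatter determine which prompts trigger the skill.
Configure the ICP rubric. Open references/1-icp-rubric-template.md. If your team already uses the lead-scoring-icp-rubric skill, you can point at the same rubric file; the structure is identical. Replace the placeholder rows with your real criteria, weights (1 to 5), and tier values (A / B / C). Fill in the hard disqualifiers section. Update "Last edited": the SHA-256 the skill records in each report footer lets stakeholders see when the rubric changed.
Configure the defect taxonomy. Open references/2-defect-taxonomy.md. The defect codes themselves are fixed, so do not rename them; downstream parsers key on the code strings. Edit the "Remediation action" column to match your team's actual process: which Clay column provides headcount re-enrichment, who manages the ZoomInfo subscription, and which segment handles enterprise overflow accounts.
Prepare intent scores (optional but high value). If you use 6sense or Bombora, export a domain → intent score map for your account universe and pass it as intent_scores. This adds low-intent and intent-spike annotations on top of the rubric score. The intent-spike flag is especially valuable for in-ICP but borderline Q2 accounts, because it surfaces them for prioritization even before re-enrichment.
Set the enrichment staleness threshold. Update enrichment_staleness_days to match how aggressively your enrichment layer refreshes data. Clay + ZoomInfo typically refresh on a 90-day cycle. If you run monthly enrichment, you can set it to 45 days. This value drives the stale-data defect code.
Test on a known list. Run the skill manually on 20 to 30 accounts you know well: a mix of current customers, churned accounts, and prospects of varying quality. Check that the quality tiers match your team's intuition. If Q1 accounts show rubric-sourced defect codes, the rubric is misconfigured. If clearly out-of-ICP accounts score as Q2, the hard disqualifiers or weights need adjusting.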

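What the skill actually does

The skill runs four steps in a fixed order.

Step 1: hard disqualifier sweep. Before any LLM call, each account is checked against the rubric's hard disqualifiers: sanctioned countries, disqualified industries, headcount below an absolute minimum, and accounts on an explicit exclusion list (competitors, current customers). Matching accounts receive the defect code hd:{reason} and the quality tier disqualified. This step is deterministic and runs in milliseconds per account. It runs first because on a 500-account list it is common for 5 to 15% of accounts to be disqualified outright, and running LLM scoring on them wastes tokens and adds latency without adding information.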
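
A minimal sketch of what such a deterministic sweep could look like, assuming accounts arrive as plain dictionaries. The rule values (`XX`, `gambling`, `25`, `rival.example`) and all reason strings other than `sanctioned_country` and `competitor` are hypothetical; the real values come from the rubric's hard disqualifiers section.

```python
from typing import Optional

# Hypothetical rule values; replace with your rubric's hard disqualifiers.
BLOCKED_COUNTRIES = {"XX"}
BLOCKED_INDUSTRIES = {"gambling"}
HEADCOUNT_FLOOR = 25
EXCLUDED_DOMAINS = {"rival.example": "competitor"}


def hard_disqualifier(account: dict) -> Optional[str]:
    """Return an hd:{reason} code for the first matching rule, else None."""
    if account.get("country") in BLOCKED_COUNTRIES:
        return "hd:sanctioned_country"
    if account.get("industry") in BLOCKED_INDUSTRIES:
        return "hd:disqualified_industry"  # hypothetical reason string
    headcount = account.get("headcount")
    if headcount is not None and headcount < HEADCOUNT_FLOOR:
        return "hd:headcount_below_floor"  # hypothetical reason string
    reason = EXCLUDED_DOMAINS.get(account.get("company_domain", ""))
    if reason:
        return f"hd:{reason}"
    return None


def sweep(accounts: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split accounts into (cleared for LLM scoring, disqualified)."""
    cleared, disqualified = [], []
    for account in accounts:
        code = hard_disqualifier(account)
        if code is None:
            cleared.append(account)
        else:
            disqualified.append(
                {**account, "quality_tier": "disqualified", "defect_codes": [code]}
            )
    return cleared, disqualified
```

Only the `cleared` list would go on to LLM scoring, which is where the token savings come from.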

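Step 2: per-account ICP rubric scoring. Accounts that clear the hard disqualifier sweep are scored against each rubric criterion. For each criterion the model outputs a tier (A / B / C), the weight (from the rubric), and a one-sentence rationale citing the rubric row. The weighted sum maps to a quality tier: Q1 (score ≥ 8.0), Q2 (6.0 to 7.99), Q3 (4.0 to 5.99), Q4 (< 4.0). Failed criteria produce the corresponding defect codes: for example, a C on the headcount criterion for an account below the B-tier minimum produces wrong-size:too-small.

Why per criterion rather than one overall score: the defect codes that drive the remediation queue need to know which specific criterion failed, not just that the overall score was low. A Q3 account with missing-field:tech_stack is a different remediation task from a Q3 account with wrong-industry. The former needs enrichment; the latter needs removal.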
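
The mapping from per-criterion tiers to a quality tier can be sketched as follows. The tier thresholds are the ones stated above. The point values in `TIER_POINTS` and the normalization to a weighted average are assumptions made so that scores span 0.0 to 10.0; the source does not say how tiers convert to points.

```python
# Assumption: tier-to-points mapping chosen so scores span 0.0 to 10.0.
TIER_POINTS = {"A": 10.0, "B": 5.0, "C": 0.0}


def weighted_score(criteria: list[dict]) -> float:
    """Weighted average of tier points, rounded to two decimals."""
    total_weight = sum(c["weight"] for c in criteria)
    total_points = sum(c["weight"] * TIER_POINTS[c["tier"]] for c in criteria)
    return round(total_points / total_weight, 2)


def quality_tier(score: float) -> str:
    """Map a 0-10 score to Q1..Q4 using the documented thresholds."""
    if score >= 8.0:
        return "Q1"
    if score >= 6.0:
        return "Q2"
    if score >= 4.0:
        return "Q3"
    return "Q4"


def failed_criteria(criteria: list[dict]) -> list[str]:
    """Criteria scored C are the ones that would generate defect codes."""
    return [c["criterion"] for c in criteria if c["tier"] == "C"]


# Hypothetical per-criterion output for one account.
scored = [
    {"criterion": "industry", "weight": 5, "tier": "A"},
    {"criterion": "headcount", "weight": 4, "tier": "C"},
    {"criterion": "geo", "weight": 3, "tier": "A"},
    {"criterion": "tech_stack", "weight": 4, "tier": "B"},
    {"criterion": "funding_stage", "weight": 2, "tier": "B"},
    {"criterion": "revenue_band", "weight": 3, "tier": "A"},
]
score = weighted_score(scored)
print(score, quality_tier(score), failed_criteria(scored))
```

Keeping the per-criterion rows around, rather than only the final score, is what lets the failed criterion (here `headcount`) be named in a defect code.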

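Step 3: supplemental defect detection. After rubric scoring, the skill checks for defects the rubric does not cover: stale-data (enrichment older than the threshold), missing-field:{field} (a criterion that could not be scored), and low-intent and intent-spike from the supplied intent scores. The intent-spike flag also appears on Q2 accounts: in-market behavior outweighs a borderline rubric score and surfaces accounts that should trigger direct AE contact regardless.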
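
The stale-data and missing-field checks are simple enough to sketch deterministically. The version below assumes `last_enrichment_date` is an ISO date string and treats empty strings and empty lists as missing; both the helper names and that definition of "missing" are assumptions, not the skill's documented behavior.

```python
from datetime import date


def is_stale(last_enrichment_date: str, run_date: date, staleness_days: int = 90) -> bool:
    """True when enrichment is strictly older than the threshold."""
    age = run_date - date.fromisoformat(last_enrichment_date)
    return age.days > staleness_days


def supplemental_codes(
    account: dict,
    run_date: date,
    required_fields: list[str],
    staleness_days: int = 90,
) -> list[str]:
    """Return missing-field:{field} and stale-data codes for one account."""
    # Assumption: None, empty string, and empty list all count as missing.
    codes = [
        f"missing-field:{name}"
        for name in required_fields
        if account.get(name) in (None, "", [])
    ]
    last = account.get("last_enrichment_date")
    if last and is_stale(last, run_date, staleness_days):
        codes.append("stale-data")
    return codes


account = {
    "company_domain": "b.example",
    "industry": "fintech",
    "tech_stack": [],
    "last_enrichment_date": "2025-02-14",
}
print(supplemental_codes(account, date(2026, 5, 23), ["industry", "headcount", "tech_stack"]))
```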

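Step 4: list-level aggregation. After per-account scoring, the skill computes the list quality score (Q1% + Q2% - Q3% - 2×Q4%, scaled to 100), the defect frequency table, and the remediation queue. The remediation queue is sorted by estimated re-audit lift: accounts most likely to become Q1 after re-enrichment appear first. A list quality score below 30 is the skill's no-go signal: the recommendation section reads "Do not launch until Q3/Q4 accounts are remediated or removed."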
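
For intuition, the formula can be evaluated directly on tier counts. This sketch computes the raw expression in percentage points, which ranges from -200 (all Q4) to 100 (no Q3 or Q4). How the skill rescales that onto a 0 to 100 scale is not specified here, so no rescaling is applied, and leaving disqualified accounts out of the denominator is also an assumption.

```python
TIERS = ("Q1", "Q2", "Q3", "Q4")
NO_GO_THRESHOLD = 30.0  # below this, the skill recommends not launching


def list_quality_score(counts: dict[str, int]) -> float:
    """Raw Q1% + Q2% - Q3% - 2*Q4% in percentage points (no rescaling)."""
    total = sum(counts.get(tier, 0) for tier in TIERS)
    if total == 0:
        raise ValueError("no scored accounts")
    pct = {tier: 100.0 * counts.get(tier, 0) / total for tier in TIERS}
    return pct["Q1"] + pct["Q2"] - pct["Q3"] - 2 * pct["Q4"]


def go_no_go(score: float) -> str:
    return "no-go" if score < NO_GO_THRESHOLD else "go"


# Hypothetical 500-account lists.
healthy = {"Q1": 250, "Q2": 150, "Q3": 75, "Q4": 25}
weak = {"Q1": 100, "Q2": 100, "Q3": 150, "Q4": 150}
for counts in (healthy, weak):
    score = list_quality_score(counts)
    print(score, go_no_go(score))
```

Because Q4 is weighted twice, a list where a quarter of the accounts are Q4 needs the rest to be overwhelmingly Q1/Q2 to stay above the no-go line.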

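Cost in practice

Token cost per account depends on the size of the rubric and how much account data you supply. For an account record with 300 to 500 tokens of data and a typical six-criterion rubric with structured per-criterion output, expect roughly 1,200 to 2,000 input tokens and 300 to 500 output tokens per account. At Claude Sonnet 4.x pricing (as of early 2026, about $3 per million input tokens and about $15 per million output tokens), that comes to roughly $0.008 to $0.015 per account.

A 500-account pre-campaign audit costs $4 to $8 in Claude tokens. Quarterly hygiene over a 2,000-account ABM universe costs $16 to $30. Both are less than the cost of a single misrouted AE sequence. The non-token cost is larger: configuring the rubric and defect taxonomy properly takes a 60 to 90 minute session. Plan for it.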
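
The arithmetic behind those figures can be reproduced from the stated token counts and prices. The sketch ignores prompt caching and any batch discounts, so it is an upper-level estimate under the quoted list prices only; the results fall inside the ranges above, which are rounded up slightly.

```python
INPUT_USD_PER_MTOK = 3.0    # quoted price per million input tokens
OUTPUT_USD_PER_MTOK = 15.0  # quoted price per million output tokens


def cost_per_account(input_tokens: int, output_tokens: int) -> float:
    """USD cost of scoring one account at the quoted prices."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


low = cost_per_account(1_200, 300)    # about $0.0081
high = cost_per_account(2_000, 500)   # about $0.0135

for accounts in (500, 2_000):
    print(accounts, round(accounts * low, 2), round(accounts * high, 2))
```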

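Token cost per account is lower than for the lead-scoring skill. ABM accounts usually carry richer structured data (fewer missing fields), and defect codes are more compact than a full per-criterion rationale. When an account has many missing fields, more of the work falls to the supplemental defect step, which is deterministic and free.

Prompt caching of the rubric and defect taxonomy files pays off at scale: in a 500-account audit, the rubric is loaded once and cached across the whole batch. For a five-account spot check it makes no difference.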

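Success metrics

Primary metric: the list quality score trend. Run the audit on the same ABM universe every quarter and track whether the list quality score rises. A rising score means the enrichment cadence is working, the rubric is stable, and the list-building process is improving. A falling score, or one that stays flat despite remediation effort, means the rubric has shifted or the enrichment source is unreliable.

Secondary metric: ABM campaign conversion rate by quality tier. After running a campaign against an audited list for 90 days, compare account-to-opportunity conversion rates for Q1 accounts, Q2 accounts, and accounts that were remediated up from Q3 before being added. Q1 should convert at a higher rate than Q2, and accounts remediated to Q2 should convert at a higher rate than Q3 accounts that went unremediated. If there is no conversion difference between tiers, the rubric is not predictive and needs to be re-argued.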

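Failure modes

Defect codes that indict the rubric, not the list. When 35% of the list receives wrong-size:too-small, the problem is often the rubric's headcount floor, not the list. The rubric may have been set when the company targeted enterprise only and never updated after the SMB segment opened. Removing 35% of the list on the strength of those defect codes is the wrong action; revisiting the rubric is the right one. Guard: after each audit, check whether any single defect code applies to more than 25% of accounts. If it does, review the rubric criterion that generates that code before remediating the list. The defect frequency table in the audit output makes this check easy: the most common code is always the first row of the table.
Stale enrichment producing false negatives on good accounts. An account whose last_enrichment_date is 14 months old may have tripled its headcount, raised a Series B, and added Salesforce to its tech stack since the data was collected. The skill's Q4 rating for that account is not a verdict on the company; it is a verdict on your enrichment cadence. Removing or deprioritizing those accounts before re-enrichment loses real pipeline. Guard: the skill adds stale-data to accounts whose enrichment is older than the staleness threshold and notes "scored on potentially stale data" in the rationale. The remediation queue places accounts with stale-data and high rubric-score potential at the top. The rule: never remove an account from the list because of stale-data alone; always re-enrich first.
Intent scores inflated by a single user's behavior. A company may sit in 6sense's "high intent" segment because one junior analyst there read three blog posts. Flagging the company as intent-spike on that signal and routing it to direct AE contact is a false positive that costs AE time. Guard: when intent_scores is provided, the skill shows the raw intent score and its source alongside the intent-spike flag. The skill output also carries this guidance: before acting on an intent-spike signal, confirm in 6sense or your ABM platform that the intent activity comes from buying-committee personas (director level or above in a relevant function) and not from a single low-authority user.
Rubric drift invalidating historical comparisons. If the rubric changes between the second-quarter and third-quarter audits, the list quality scores are not comparable: a rising score may reflect a looser rubric rather than a better list. Guard: the skill records the rubric's SHA-256 in each audit footer. When comparing list quality scores across quarters, confirm the rubric SHA-256 values are identical. If the rubric changed, re-run the previous quarter's list against the new rubric before comparing. The "Last edited" date in the rubric file and a quarterly calendar reminder to review the rubric work together to make drift visible before it distorts the trend.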
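
The rubric-drift guard amounts to comparing two hashes before comparing two scores. A minimal sketch, assuming the hash is taken over the full rubric file contents encoded as UTF-8 (the exact input the skill hashes is not spelled out here), with hypothetical helper names:

```python
import hashlib


def rubric_sha256(rubric_text: str) -> str:
    """Hex SHA-256 of the rubric contents (assumed: whole file, UTF-8)."""
    return hashlib.sha256(rubric_text.encode("utf-8")).hexdigest()


def scores_comparable(previous_sha: str, current_sha: str) -> bool:
    """List quality scores are only comparable under an identical rubric."""
    return previous_sha == current_sha


# Hypothetical rubric snapshots from two quarterly audits.
last_quarter = "| Headcount | 4 | 200-2000 | 50-200 | below 50 |\nLast edited 2026-02-01"
this_quarter = "| Headcount | 4 | 100-2000 | 25-100 | below 25 |\nLast edited 2026-05-01"

if not scores_comparable(rubric_sha256(last_quarter), rubric_sha256(this_quarter)):
    print("Rubric changed: re-run last quarter's list before comparing scores.")
```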

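Compared with alternatives

vs manual RevOps review. For lists under 50 accounts, an experienced RevOps analyst with the ICP rubric open can manually review every account in two to three hours and produce better-calibrated results than the skill. A human catches edge cases such as "this company has an odd SIC code, but its actual product is clearly within the ICP", which the skill misses. Beyond 150 accounts, manual review loses consistency: the analyst's ICP intuition drifts between the first account and the 130th. The skill applies the rubric consistently at any list size.

vs 6sense's built-in account grading. 6sense provides an account fit score based on a proprietary ICP model trained on the companies in your CRM with positive engagement history. It is useful when you have enough CRM history for 6sense to learn from (typically 50 to 100 closed-won accounts). For teams below that threshold, the 6sense fit model is undertrained and noisy. This skill works from day one because the rubric is written by hand. The trade-off: the 6sense model captures patterns you never wrote down explicitly, whereas this skill knows only what you taught it. For teams with 50 or more closed-won deals, we recommend using both: the 6sense score for "what is surprising" and this skill's defect codes for "what specifically is wrong with the Q3 accounts".

vs a spreadsheet ICP scoring matrix. Many RevOps teams keep a spreadsheet in which each account is rated manually against the ICP criteria. The spreadsheet approach breaks down at scale (consistency degrades past 50 accounts), produces no defect taxonomy (it tells you the score but not why it is bad), and goes stale the moment the rubric changes, because nobody updates every previously scored row. This skill applies the rubric consistently, names the specific defects, and uses the SHA-256 mechanism to show when the rubric changed. The spreadsheet is the right tool for the first 20 accounts; after that, the skill is.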

Files in this artifact

---
name: abm-list-quality-audit
description: Audit an ABM target list against an explicit ICP rubric and return a defect report for every account that fails. Produces a per-account defect taxonomy (wrong-size, wrong-industry, wrong-geo, wrong-funding, tech-mismatch, stale-data, low-intent, missing-field), a list-level quality score, and a prioritized remediation queue. Use before any ABM campaign goes live — not as a substitute for ICP strategy work.
---

# ABM list quality audit

## When to invoke

Invoke before launching any ABM campaign, before loading a list into a paid-media ABM platform, or before assigning named accounts to AEs. The skill takes a structured account list and your ICP rubric and returns a per-account defect report plus a list-level quality score.

The skill is also useful for quarterly list hygiene: run it over your existing ABM universe to find accounts that were added months ago and no longer match the current ICP, or accounts where enrichment has gone stale.

Invoke from:

- A **Clay table** where each row is an account, triggered manually or on a quarterly schedule. The skill writes defect codes and a quality tier back to two columns.
- A **CSV pre-flight check** before import into 6sense, Demandbase, or any ABM advertising platform that charges per account or per impression — running the audit first removes accounts you would pay to target and never convert.
- A **Salesforce report-based trigger** over named accounts in a specified segment, via a custom-code action that calls the skill and writes `ABM_Quality_Tier__c` and `ABM_Defect_Codes__c` back to the account record.

Do NOT invoke this skill for:

- **Scoring individual inbound leads.** The audit is designed for outbound named-account lists, not for triage of inbound MQLs. For inbound scoring, use the lead-scoring-icp-rubric skill.
- **Replacing the ICP strategy session.** The skill audits against a rubric you provide. If the rubric is a proxy for last year's customers, the audit will reproduce last year's biases. Have the ICP argument with your RevOps and GTM leadership before running the audit.
- **Generating net-new accounts.** The skill audits an existing list. It does not generate new accounts or run discovery on the TAM. Use a dedicated list-building workflow (Clay + ICP criteria) to generate the raw list first.
- **Suppression list management.** If the goal is to remove churned customers, competitors, or current customers from the list, that is deduplication, not auditing. Run those exclusion checks before invoking the skill.

## Inputs

Required:

- `account_list` — a structured list of account records. Minimum fields per account: `company_name`, `company_domain`. Strongly preferred: `industry`, `headcount`, `country`, `revenue_band`, `tech_stack` (array), `funding_stage`, `last_enrichment_date`.
- `rubric` — path to or inline contents of the ICP rubric markdown (see `references/1-icp-rubric-template.md`). Must contain explicit criterion + weight + tier-value rows. If the rubric has no weights, the skill refuses to run.

Optional:

- `intent_scores` — a map of `company_domain → intent_score` from 6sense, Bombora, or your ABM platform. When provided, the skill adds a `low-intent` defect code for accounts below your defined intent floor, and an `intent-spike` positive flag for accounts above your hot-intent threshold.
- `enrichment_staleness_days` — integer, default 90. Accounts where `last_enrichment_date` is older than this value receive a `stale-data` defect code. Adjust to match how aggressively your enrichment layer (Clay, ZoomInfo, Apollo) recycles data.
- `list_name` — string. Used to label the audit report. If omitted, defaults to `"Unnamed list — {run_date}"`.

## Reference files

Always load these before running the audit:

- `references/1-icp-rubric-template.md` — the ICP rubric. Same structure as the lead-scoring skill's rubric; shared between the two skills if your team uses both. Weights and tier values must be explicit.
- `references/2-defect-taxonomy.md` — the full defect code vocabulary with definitions, severity levels (P1 / P2 / P3), and the remediation action for each code. Edit this once with your RevOps lead before first use; the codes in the audit output are only as useful as the definitions in this file.
- `references/3-sample-audit-output.md` — a literal example of the full audit report for a 5-account list. Use when wiring downstream parsers or building the CRM writeback.

## Method

The skill runs four steps in order.

### 1. Hard disqualifier sweep (no LLM)

Before any LLM call, check each account against the rubric's hard disqualifiers: sanctioned country, disqualified industry, headcount below floor, or a domain on the rubric's explicit exclusion list. Accounts that hit a hard disqualifier receive defect code `hd:{reason}` (e.g. `hd:sanctioned_country`, `hd:competitor`) and a quality tier of `disqualified`. These are deterministic and cheap; they run first so the LLM does not burn tokens on them.

Why deterministic first: same reason as lead scoring — speed and reliability. A hard disqualifier check on 500 accounts takes milliseconds and never hallucinates.

### 2. Per-account ICP rubric scoring

For each account that cleared the hard disqualifier sweep, score against the ICP rubric using the same per-criterion method as the lead-scoring skill (explicit tier + weight + rationale per criterion). The weighted sum maps to a quality tier:

- **Q1** — score ≥ 8.0: in-ICP, meets criteria. No defect codes from rubric scoring.
- **Q2** — score 6.0-7.99: in-ICP with gaps. Defect codes name the specific failing criteria.
- **Q3** — score 4.0-5.99: borderline. Multiple defect codes; recommend enrichment and re-audit before including.
- **Q4** — score < 4.0: out-of-ICP. Recommend removal from the active list; flag for archive.

Why explicit tier thresholds rather than "let the model decide": same reason as lead scoring — the rubric is the source of truth, and the model's job is to apply it, not to re-weight it.

### 3. Supplemental defect detection

After rubric scoring, run supplemental checks that are not covered by the rubric criteria:

- **`stale-data`**: `last_enrichment_date` is older than `enrichment_staleness_days`. The account's rubric score is suspect because the underlying data may be wrong.
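- **`missing-field`**: one or more rubric criteria could not be scored from data because the field was missing from the account record. The affected criterion defaults to tier C, as defined in `references/2-defect-taxonomy.md`. List the missing field names.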
- **`low-intent`**: `intent_scores[domain]` is below the floor defined in the rubric or passed as input. Applied on top of rubric score — a Q1 account with low intent is still in-ICP but is not hot right now.
- **`intent-spike`**: `intent_scores[domain]` is above the hot-intent threshold. A positive flag, not a defect; surfaced to help prioritize outreach even if the rubric score is only Q2.

### 4. List-level quality report and remediation queue

After per-account scoring, aggregate:

- **List quality score**: Q1% + Q2% - Q3% - 2×Q4%. This is a synthetic score intended to give a single number for "how good is this list" at a glance. A score above 60 means the list is predominantly in-ICP; below 30 means the list needs significant remediation before use.
- **Defect frequency table**: counts of each defect code across the list. The most common defect code tells you the single most valuable enrichment or segmentation fix.
- **Remediation queue**: the Q2 and Q3 accounts with `missing-field` or `stale-data` codes, ordered by estimated re-audit lift (accounts most likely to become Q1 after re-enrichment). This is the queue to hand to whoever owns enrichment.

Why a list-level score: individual account scores are useful for routing; the list-level score is useful for the ABM campaign go/no-go decision. If the list score is below 30, the campaign should not launch — the target list is too weak to justify the ABM platform spend.
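
A rough sketch of how the remediation queue could be assembled from per-account results. The selection rule (Q2 and Q3 accounts carrying `missing-field` or `stale-data`) follows the description above. Sorting by current score, highest first, is only a stand-in for "estimated re-audit lift", which this document does not define precisely; the sample accounts are hypothetical.

```python
def is_fixable(defect_codes: list[str]) -> bool:
    """True when at least one code can be resolved by re-enrichment."""
    return any(
        code == "stale-data" or code.startswith("missing-field:")
        for code in defect_codes
    )


def remediation_queue(accounts: list[dict]) -> list[dict]:
    """Q2/Q3 accounts with enrichment-fixable codes, best candidates first."""
    candidates = [
        a for a in accounts
        if a["quality_tier"] in {"Q2", "Q3"} and is_fixable(a["defect_codes"])
    ]
    # Assumption: a higher current score means a shorter distance to Q1.
    return sorted(candidates, key=lambda a: a["score"], reverse=True)


accounts = [
    {"domain": "a.example", "quality_tier": "Q1", "score": 8.6, "defect_codes": []},
    {"domain": "b.example", "quality_tier": "Q3", "score": 5.0, "defect_codes": ["missing-field:tech_stack"]},
    {"domain": "c.example", "quality_tier": "Q2", "score": 7.1, "defect_codes": ["stale-data"]},
    {"domain": "d.example", "quality_tier": "Q2", "score": 6.3, "defect_codes": ["wrong-funding"]},
    {"domain": "e.example", "quality_tier": "Q4", "score": 3.2, "defect_codes": ["stale-data", "wrong-geo"]},
]
print([a["domain"] for a in remediation_queue(accounts)])
```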

## Output format

Literal markdown the skill emits for a 5-account list:

```markdown
# ABM list audit — Q3 2026 DACH expansion (run 2026-05-23)

**List quality score:** 52 / 100
**Accounts audited:** 5
**Breakdown:** Q1: 1 · Q2: 2 · Q3: 1 · Q4: 1

## Recommendation

List is marginal (score 52). Do not launch until Q3/Q4 accounts are remediated or removed.
Priority: re-enrich 2 Q2 accounts with missing headcount data; remove 1 Q4 account.

## Per-account results

| Domain | Quality tier | Score | Defect codes |
|---|---|---|---|
| northwind.com | Q1 | 8.6 | none |
| tailspin.io | Q2 | 7.1 | missing-field:headcount, stale-data |
| fabrikam.de | Q2 | 6.3 | wrong-size:too-small, wrong-funding, low-intent |
| contoso.com | Q3 | 5.0 | wrong-industry, tech-mismatch, missing-field:tech_stack |
| adventure-works.com | Q4 | 3.2 | wrong-size:too-large, wrong-geo, missing-field:revenue_band |

## Defect frequency table

| Defect code | Count | Action |
|---|---|---|
| missing-field:headcount | 2 | Re-enrich via Clay ZoomInfo column |
| stale-data | 2 | Re-run enrichment on accounts with last_enrichment_date > 90 days |
| wrong-size | 2 | Review headcount band in rubric — may be over-restricted |
| wrong-industry | 1 | Confirm industry mapping — SIC code may be miscategorized |
| wrong-geo | 1 | Remove if DACH-only campaign; keep for global list |
| wrong-funding | 1 | Move to pre-series A nurture vs. growth-stage ABM segment |
| tech-mismatch | 1 | Re-enrich tech stack via BuiltWith or Clay; remove if confirmed miss |
| low-intent | 1 | Move to nurture; re-activate when intent signal appears |
| missing-field:tech_stack | 1 | Re-enrich via BuiltWith or Clay tech-stack column |

## Remediation queue (by re-audit lift)

1. tailspin.io — add headcount; re-enrich; likely Q1 after fix.
2. fabrikam.de — low-intent flag only; already in-ICP. Activate when intent spikes.
3. contoso.com — re-enrich tech_stack; confirm industry; may move to Q2.

---
_Rubric SHA-256: 4f9c...a812 | Last edited 2026-05-01 by RevOps_
```

## Watch-outs

- **Defect codes that indict the rubric, not the account.** If 40% of the list has `wrong-size` codes, the problem is often not the list — it is a headcount floor in the rubric that was set when the company was targeting larger enterprises and was never updated after the SMB segment was opened. **Guard:** after every audit, check whether any single defect code applies to more than 25% of accounts. If so, review the rubric criterion that generates that code before remediating the list. The list might be right and the rubric wrong.
- **Stale enrichment masking real ICP fit.** An account's `last_enrichment_date` of 14 months ago means its headcount, funding stage, and tech stack data may all be wrong. A Q4 score on stale data is not a verdict on the account — it is a verdict on your enrichment cadence. **Guard:** the skill adds `stale-data` to any account where enrichment is older than the `enrichment_staleness_days` threshold, and the per-account rationale notes "scored on potentially stale data" for any such account. Do not remove Q4 + `stale-data` accounts; re-enrich them first and re-audit.
- **Intent score inflation from brand-aware accounts.** An account in a 6sense high-intent segment may be there because of one analyst at the company who reads your blog weekly — not because the buying committee is in-market. **Guard:** when `intent_scores` are provided, the skill shows the raw intent score alongside the `intent-spike` flag and names the intent source. Before acting on an `intent-spike` account, verify the intent signal is from buying-committee personas, not from a single low-authority user.
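
The 25% guard from the first watch-out is easy to automate once per-account defect codes are available. A sketch under the assumption that each account contributes at most once per code and that codes are compared as exact strings; the function name and sample data are hypothetical.

```python
from collections import Counter


def overrepresented_codes(
    per_account_codes: list[list[str]], threshold: float = 0.25
) -> dict[str, float]:
    """Return {code: share of accounts} for codes above the threshold."""
    total = len(per_account_codes)
    if total == 0:
        return {}
    # set() so a code is counted once per account, not once per mention.
    counts = Counter(code for codes in per_account_codes for code in set(codes))
    return {
        code: n / total
        for code, n in counts.most_common()
        if n / total > threshold
    }


# Hypothetical 10-account audit: 4 accounts too small, 2 with stale data.
audit = (
    [["wrong-size:too-small"]] * 3
    + [["wrong-size:too-small", "stale-data"]]
    + [["stale-data"]]
    + [[]] * 5
)
print(overrepresented_codes(audit))
```

Any code this returns is a prompt to review the rubric criterion behind it before removing accounts.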

# ICP rubric — TEMPLATE (ABM audit)

> Replace this template's contents with your team's actual ICP rubric.
> The ABM list audit skill scores each account against this rubric.
> Vague rows (no weights, no tier values) cause the skill to refuse the run.
>
> This file can be shared with the lead-scoring-icp-rubric skill — the
> rubric structure is identical. If your team uses both skills, maintain
> one rubric file and reference it from both.

## How the skill reads this file

- Each row in "Criteria" must have an explicit `weight` (1-5) and three tier values
  (A / B / C). Malformed rows cause the skill to return an error.
- "Hard disqualifiers" run as deterministic checks before any LLM call. A single
  hit drops the account to `disqualified` regardless of other criteria.
- "Intent thresholds" are optional — only used when `intent_scores` is passed
  as input. Set these to match your ABM platform's scoring bands.
- The "Last edited" line is hashed into the SHA-256 recorded in the audit footer.
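
To make the row requirements concrete, here is a sketch of a validator for a single "Criteria" table row. It is not the skill's own parser; `parse_criterion_row` is a hypothetical helper, and it assumes cell text never contains a literal `|`.

```python
def parse_criterion_row(row: str) -> dict:
    """Parse '| name | weight | A | B | C |' and enforce the row rules."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}: {row!r}")
    name, weight_text, tier_a, tier_b, tier_c = cells
    if not weight_text.isdigit() or not 1 <= int(weight_text) <= 5:
        raise ValueError(f"weight must be an integer 1-5: {weight_text!r}")
    if not all((name, tier_a, tier_b, tier_c)):
        raise ValueError(f"criterion name and all three tiers are required: {row!r}")
    return {
        "criterion": name,
        "weight": int(weight_text),
        "tiers": {"A": tier_a, "B": tier_b, "C": tier_c},
    }


print(parse_criterion_row("| Headcount | 4 | 200-2000 | 50-200 or 2000-5000 | below/above stretch |"))
```

Running a check like this over every row before an audit surfaces malformed rubric rows early, instead of as a refused run.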

## Criteria

| Criterion | Weight | A (best fit) | B (stretch) | C (poor fit) |
|---|---|---|---|---|
| Industry | 5 | {industries you win in, e.g. Vertical SaaS, FinTech} | {adjacent industries} | {everything else} |
| Headcount | 4 | {core range, e.g. 200-2000} | {stretch range, e.g. 50-200 or 2000-5000} | {below/above stretch} |
| Geo | 3 | {primary regions, e.g. US, UK, DACH} | {secondary regions} | {unsupported regions} |
| Tech stack | 4 | {signals of fit, e.g. Salesforce + HubSpot present} | {one fit signal present} | {no fit signals or competing system} |
| Funding stage | 2 | {preferred stages, e.g. Series B-D, public mid-cap} | {adjacent stages} | {unfit, e.g. pre-seed or mature enterprise} |
| Revenue band | 3 | {ARR or revenue band that matches your ACV, e.g. $10M-$100M ARR} | {adjacent band} | {below minimum or above ceiling} |

## Hard disqualifiers

Single signals that drop an account to `disqualified` regardless of other criteria.
Run as deterministic checks before LLM scoring.

- `country in [{sanctioned or unsupported regions}]`
- `industry in [{disqualified industries — e.g. adult content, gambling if you do not serve them}]`
- `headcount < {absolute floor, e.g. 25}` (if you have one)
- `company_domain in [{explicit exclusion list — competitors, current customers, churned accounts}]`

## Intent thresholds (optional — only used when intent_scores provided)

Used to assign `low-intent` or `intent-spike` flags on top of the rubric score.

| 6sense / Bombora intent score | Flag applied |
|---|---|
| ≥ {hot threshold, e.g. 75} | `intent-spike` |
| {floor, e.g. 35} — {hot threshold - 1} | no flag (normal) |
| < {floor, e.g. 35} | `low-intent` |
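
The threshold table translates into a small lookup. The defaults below are the template's example values (floor 35, hot threshold 75), not recommendations, and `intent_flag` is a hypothetical helper name.

```python
from typing import Optional


def intent_flag(
    score: Optional[float], floor: float = 35.0, hot: float = 75.0
) -> Optional[str]:
    """Map an intent score to 'intent-spike', 'low-intent', or no flag."""
    if score is None:
        return None  # no intent data supplied for this domain
    if score >= hot:
        return "intent-spike"
    if score < floor:
        return "low-intent"
    return None


# Hypothetical intent_scores input keyed by company_domain.
intent_scores = {"a.example": 82, "b.example": 50, "c.example": 12}
print({domain: intent_flag(score) for domain, score in intent_scores.items()})
```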

## Quality tier thresholds

| Weighted score | Quality tier |
|---|---|
| 8.0 - 10.0 | Q1 (in-ICP, no rubric defects) |
| 6.0 - 7.99 | Q2 (in-ICP with gaps) |
| 4.0 - 5.99 | Q3 (borderline — remediate before use) |
| < 4.0 | Q4 (out-of-ICP — recommend removal) |

## Last edited

{YYYY-MM-DD} — by {RevOps owner name}

# Defect taxonomy — TEMPLATE

> This file defines every defect code the ABM list audit skill can assign.
> Edit the "Remediation action" column to match your team's actual processes
> before first use. The codes themselves are fixed — do not rename them;
> downstream parsers (CRM writeback, Clay columns) key on the code strings.

## How the skill reads this file

- Each defect code has a `severity` (P1 / P2 / P3). P1 defects are show-stoppers
  that mean the account should be removed or quarantined from the campaign until
  fixed. P2 defects are remediable. P3 defects are informational — the account
  can proceed, but the ABM or AE team should be aware.
- The skill emits defect codes in the per-account row and the defect-frequency
  table. It does not emit the full definition — that lives here for the human
  reviewer.

## Defect codes

### Rubric-sourced defects (from ICP scoring)

| Code | Severity | Definition | Remediation action |
|---|---|---|---|
| `wrong-industry` | P1 | Account's industry is in the C-tier or disqualified row of the rubric. | Remove from active list. Archive with `out-of-icp` tag. |
| `wrong-size:too-small` | P1 | Headcount is below the rubric's B-tier floor. | Remove unless a specific exemption applies (e.g. fast-growing startup with known expansion intent). |
| `wrong-size:too-large` | P2 | Headcount exceeds the rubric's B-tier ceiling. | Flag for enterprise segment or remove from SMB/mid-market campaign. |
| `wrong-geo` | P1 | Account's HQ region is not in the rubric's supported geo tiers. | Remove from geo-targeted campaign; keep in global campaigns if you have capacity to serve. |
| `wrong-funding` | P2 | Funding stage is in the C-tier row. | Move to a different campaign segment (pre-series A nurture vs. growth-stage ABM). |
| `tech-mismatch` | P2 | Tech stack has no fit signals from the rubric's tech-stack criterion. | Re-enrich tech stack; confirm via BuiltWith or Clay. If confirmed miss, remove. |

### Supplemental defects (not from rubric scoring)

| Code | Severity | Definition | Remediation action |
|---|---|---|---|
| `stale-data` | P2 | `last_enrichment_date` is older than the `enrichment_staleness_days` threshold. Rubric score is unreliable. | Re-run enrichment on this account before acting on its quality tier. Do not remove solely because of this code. |
| `missing-field:{field}` | P2 | The named field was absent from the account record. The criterion that uses it was scored as C (worst case) by default. | Re-enrich the specific field. Re-audit after enrichment. |
| `low-intent` | P3 | Intent score from the provided `intent_scores` input is below the floor threshold. | Move to nurture or lower-frequency sequence. Do not assign to AE until intent rises. |
| `hd:{reason}` | P1 | Hard disqualifier triggered. `{reason}` is the specific rubric row that matched (e.g. `hd:sanctioned_country`, `hd:competitor`). | Remove immediately. Archive with `disqualified` tag and the `hd:{reason}` code for audit trail. |

### Positive flags (not defects — appear in the per-account row for awareness)

| Code | Definition | Action |
|---|---|---|
| `intent-spike` | Intent score is above the hot-intent threshold. Account is signaling active in-market behavior. | Prioritize for direct AE outreach regardless of rubric tier. Even a Q2 account with `intent-spike` warrants a personalized touch. |

## Severity definitions

- **P1 — Remove:** the account should not be in the active ABM list. Keeping it wastes budget and suppresses campaign performance metrics.
- **P2 — Remediate:** the account may be a valid target but needs data work or segmentation before it can be activated. Hold from campaign activation until the defect is resolved.
- **P3 — Informational:** the account can proceed, but the campaign team should calibrate expectations. No blocking action required.
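
For downstream automation, the severity tables above can be collapsed into a lookup. The severities are copied from this file; the prefix handling for parameterized codes and the `worst_severity` helper are illustrative assumptions.

```python
from typing import Optional

SEVERITY = {
    "wrong-industry": "P1",
    "wrong-size:too-small": "P1",
    "wrong-size:too-large": "P2",
    "wrong-geo": "P1",
    "wrong-funding": "P2",
    "tech-mismatch": "P2",
    "stale-data": "P2",
    "low-intent": "P3",
}
# Parameterized codes are matched by prefix.
PREFIX_SEVERITY = {"missing-field:": "P2", "hd:": "P1"}


def severity(code: str) -> str:
    """Look up the severity of one defect code; raise on unknown codes."""
    if code in SEVERITY:
        return SEVERITY[code]
    for prefix, level in PREFIX_SEVERITY.items():
        if code.startswith(prefix):
            return level
    raise KeyError(f"unknown defect code: {code}")


def worst_severity(codes: list[str]) -> Optional[str]:
    """Most severe level across codes ('P1' sorts before 'P2' and 'P3')."""
    return min((severity(code) for code in codes), default=None)


print(worst_severity(["stale-data", "low-intent"]))
```

Raising on unknown codes, rather than defaulting to a severity, keeps a renamed or mistyped code from silently passing through a campaign filter.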

## Last edited

{YYYY-MM-DD} — by {RevOps owner name}

# Sample audit output — for parser wiring

> A literal example of what the skill emits for a 5-account list. Use
> when wiring the downstream parser: Clay AI column → property mapping,
> Salesforce custom-code action → property writeback, CSV post-processor.
> The schema below is what the skill commits to; the values are illustrative.

## Full audit report

```markdown
# ABM list audit — Q3 2026 DACH expansion (run 2026-05-23)

**List quality score:** 52 / 100
**Accounts audited:** 5
**Breakdown:** Q1: 1 · Q2: 2 · Q3: 1 · Q4: 1

## Recommendation

List is marginal (score 52). Do not launch until Q3/Q4 accounts are remediated or removed.
Priority: re-enrich 2 Q2 accounts with missing headcount data; remove 1 Q4 account.

## Per-account results

| Domain | Quality tier | Score | Defect codes |
|---|---|---|---|
| northwind.com | Q1 | 8.6 | none |
| tailspin.io | Q2 | 7.1 | missing-field:headcount, stale-data |
| fabrikam.de | Q2 | 6.3 | wrong-size:too-small, wrong-funding, low-intent |
| contoso.com | Q3 | 5.0 | wrong-industry, tech-mismatch, missing-field:tech_stack |
| adventure-works.com | Q4 | 3.2 | wrong-size:too-large, wrong-geo, missing-field:revenue_band |

## Defect frequency table

| Defect code | Count | Action |
|---|---|---|
| missing-field:headcount | 2 | Re-enrich via Clay ZoomInfo column |
| stale-data | 2 | Re-run enrichment — last_enrichment_date > 90 days |
| wrong-size | 2 | Review headcount band in rubric — may be over-restricted |
| wrong-industry | 1 | Confirm industry mapping — SIC code may be miscategorized |
| wrong-geo | 1 | Remove if DACH-only campaign; keep for global list |
| wrong-funding | 1 | Move to pre-series A nurture vs. growth-stage ABM segment |
| tech-mismatch | 1 | Re-enrich tech stack via BuiltWith or Clay; remove if confirmed miss |
| low-intent | 1 | Move to nurture; re-activate when intent signal appears |
| missing-field:tech_stack | 1 | Re-enrich via BuiltWith or Clay tech-stack column |

## Remediation queue (by re-audit lift)

1. tailspin.io — add headcount; re-enrich; likely Q1 after fix.
2. fabrikam.de — low-intent flag only; already in-ICP. Activate when intent spikes.
3. contoso.com — re-enrich tech_stack; confirm industry; may move to Q2.

---
_Rubric SHA-256: 4f9c...a812 | Last edited 2026-05-01 by Sam Patel_
```

## Field contract for parsers

If you build a parser instead of consuming the markdown, these are the stable fields:

### List-level fields

- `list_name` — string
- `run_date` — ISO date string (YYYY-MM-DD)
- `list_quality_score` — integer, 0-100
- `total_accounts` — integer
- `q1_count`, `q2_count`, `q3_count`, `q4_count` — integers
- `recommendation` — string, one paragraph
- `defect_frequency[]` — array of `{defect_code, count, action}`
- `remediation_queue[]` — array of `{domain, rationale, estimated_tier_after_fix}`

### Per-account fields

- `domain` — string, lowercased
- `quality_tier` — enum: `Q1` / `Q2` / `Q3` / `Q4` / `disqualified`
- `score` — float, 0.0 to 10.0
- `defect_codes[]` — array of strings (defect code vocabulary from `references/2-defect-taxonomy.md`)
- `positive_flags[]` — array of strings (e.g. `intent-spike`)
- `rationale[]` — array of `{criterion, weight, tier, reason}` (same structure as lead-scoring skill)
- `data_notes` — string, e.g. "scored on potentially stale data (last_enrichment_date: 2025-02-14)"
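
One way a parser might represent the per-account contract is a small dataclass with validation on construction. The field names and allowed values come from the list above; the class name and the decision to raise on invalid input are assumptions.

```python
from dataclasses import dataclass, field

QUALITY_TIERS = {"Q1", "Q2", "Q3", "Q4", "disqualified"}


@dataclass
class AccountResult:
    """Per-account audit result, mirroring the field contract."""
    domain: str
    quality_tier: str
    score: float
    defect_codes: list[str] = field(default_factory=list)
    positive_flags: list[str] = field(default_factory=list)
    rationale: list[dict] = field(default_factory=list)
    data_notes: str = ""

    def __post_init__(self) -> None:
        self.domain = self.domain.lower()  # contract: domain is lowercased
        if self.quality_tier not in QUALITY_TIERS:
            raise ValueError(f"invalid quality_tier: {self.quality_tier!r}")
        if not 0.0 <= self.score <= 10.0:
            raise ValueError(f"score out of range: {self.score!r}")


result = AccountResult(
    domain="B.Example",
    quality_tier="Q2",
    score=7.1,
    defect_codes=["missing-field:headcount", "stale-data"],
)
print(result.domain, result.quality_tier, result.defect_codes)
```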

### Salesforce CRM writeback mapping

| Audit field | Salesforce field | Field type |
|---|---|---|
| quality_tier | `ABM_Quality_Tier__c` | Picklist (Q1/Q2/Q3/Q4/disqualified) |
| defect_codes[] joined by `, ` | `ABM_Defect_Codes__c` | Text (255) |
| score | `ABM_ICP_Score__c` | Number (decimal, 1 place) |
| run_date | `ABM_Last_Audited__c` | Date |
| positive_flags[] joined by `, ` | `ABM_Intent_Flags__c` | Text (255) |
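
The mapping table can be applied with a plain dictionary builder, leaving the actual Salesforce API call to whatever integration you use. This sketch only shapes the payload; `salesforce_fields` is a hypothetical helper, and raising when a joined value exceeds the 255-character text limit (instead of truncating) is an assumption.

```python
TEXT_FIELD_LIMIT = 255  # Text (255) fields in the mapping above


def join_for_text_field(values: list[str]) -> str:
    """Join values with ', ' and refuse to silently truncate."""
    text = ", ".join(values)
    if len(text) > TEXT_FIELD_LIMIT:
        raise ValueError(f"joined value is {len(text)} chars, limit is {TEXT_FIELD_LIMIT}")
    return text


def salesforce_fields(result: dict, run_date: str) -> dict:
    """Build the account field payload from one per-account audit result."""
    return {
        "ABM_Quality_Tier__c": result["quality_tier"],
        "ABM_Defect_Codes__c": join_for_text_field(result["defect_codes"]),
        "ABM_ICP_Score__c": round(result["score"], 1),
        "ABM_Last_Audited__c": run_date,  # ISO date string, YYYY-MM-DD
        "ABM_Intent_Flags__c": join_for_text_field(result.get("positive_flags", [])),
    }


payload = salesforce_fields(
    {
        "quality_tier": "Q2",
        "score": 7.1,
        "defect_codes": ["missing-field:headcount", "stale-data"],
        "positive_flags": ["intent-spike"],
    },
    run_date="2026-05-23",
)
print(payload)
```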