claude-skill

Lead scoring with Claude against an ICP rubric

Difficulty

Intermediate

Setup time

30min

For

RevOps

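A Claude skill that takes any lead row, runs it against your team's ICP rubric, and returns a 0-10 score, a per-criterion rationale that cites the rubric, a recommended next action by tier, and an escalation flag for borderline cases. It is designed to drop into a Clay AI column, a HubSpot custom-code action, or a standalone CLI run over a CSV. It replaces the spreadsheet scoring matrix nobody has updated since last year, but it does not pretend to do intent or behavioral scoring, because it cannot.

The bundle lives at apps/web/public/artifacts/lead-scoring-icp-rubric-skill/ and contains SKILL.md plus three reference templates that you adapt before the first run.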

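When to use it

Use this skill when inbound MQLs are piling up faster than your SDR team can triage them and existing scoring is either absent ("everything is a lead") or stale ("the HubSpot scoring matrix was last calibrated in 2023 and nobody trusts it"). It is also useful for outbound: score an enriched cold list before assigning it, and SDRs do not spend time on companies that look fine on the surface but sit outside the ICP.

This skill does fit scoring, not intent scoring. It answers "is this the right kind of company for us?", not "are they in-market this week?". The distinction matters. Fit scoring alone puts great-fit accounts with no current need into sequences while ignoring poor-fit accounts that are actively trying to buy. For accurate routing, pair this skill with whatever signals in-market behavior: Bombora, 6sense, your own product-usage events, pricing-page hits.

Concretely, call it from:

A Clay AI column that fires on every new row in the lead table and writes the score and rationale back to two columns.
A HubSpot custom-code action inside a workflow triggered by Lifecycle stage = MQL. It calls the skill and writes both the score and the rationale back to lead properties.
A standalone CLI over a CSV export, handy for one-off list scoring before a campaign launch.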

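When not to use it

Skip this skill if:

You want to auto-reject leads with no human in the loop. The output is a recommendation. The skill explicitly tags borderline cases with escalate: needs_human_review, but if you wire it to delete leads at C-tier or below, it will silently destroy pipeline every time the rubric goes stale. Always keep an SDR review path, at least for C-tier.
Your "rubric" runs on vibes. The skill refuses to score against a rubric that lacks explicit weights and tier values. If your team has not yet had the argument about what an A-tier industry actually is, have that argument first. The skill cannot defend a rubric whose source is not explicit.
You need behavioral or intent scoring. This is fit scoring only. If you try to encode "engagement score" or "last website visit" into the rubric, it will need constant updating. Use a dedicated intent tool for time-varying signals and keep this skill on static fit signals.
You operate in a regulated domain that requires explainability beyond a per-criterion rationale. The per-criterion output is auditable, but it is not the same as a model card you can defend to a regulator. If you need that, invest in a proper scoring service rather than a Claude skill.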

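Setup

Setup takes about 30 minutes once the rubric is written. The rubric itself takes longer: usually a 60-minute working session in which an SDR manager, an AE, and someone from RevOps argue about the weights.

1. Install the skill. Place apps/web/public/artifacts/lead-scoring-icp-rubric-skill/SKILL.md and the references/ folder in your .claude/skills/lead-scoring/ directory (or upload them as a skill on claude.ai). The name and description in the frontmatter are what trigger the skill on relevant prompts.
2. Replace the rubric template. Open references/1-icp-rubric-template.md and replace the placeholder rows under "Criteria" with your actual criteria, weights (1-5), and tier values (A / B / C). Fill in the "Hard disqualifiers" section; these run as deterministic checks before any LLM call. Update "Last edited" so that the SHA-256 the skill prints in every output footer reflects the current version and its owner.
3. Replace the tier-to-action matrix. Open references/2-tier-to-action-matrix.md and replace the example rows with what your team actually does for each (tier, source_of_lead) combination. The defaults are reasonable but not specific to your team.
4. Wire the input source. In Clay, point an AI column at the skill and pass the enriched lead row as lead, the rubric file as rubric, and the source column as source_of_lead. In HubSpot, wrap the skill in a custom-code action, read contact and company properties into the lead object, and write the structured output back. In a script, glob the CSVs, post each row, and write the score and rationale into two new columns.
5. Set the destination. Pass both the score and the rationale to the lead record. Put the score in a number property (for routing logic) and the rationale in a long-text property (for the SDR to read before a call). Wire the escalate field to a separate boolean or enum property so the SDR manager can filter for review.
6. Calibrate. Before turning it on, run the skill against 20 closed-won and 20 closed-lost deals from the past six months. The score distribution should clearly separate the two cohorts. If it does not, the problem is the rubric, not the skill: go back to step 2 and re-argue the weights.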
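
One way to make the calibration check concrete is to compare the two cohorts' score distributions numerically. The following is a minimal sketch, assuming the scores have already been collected into two lists; `cohort_separation` is a hypothetical helper, not part of the skill, and the scores are made up for illustration.

```python
from statistics import mean, median

def cohort_separation(won_scores, lost_scores):
    """Summarize how well scores separate closed-won from closed-lost deals."""
    # Share of (won, lost) pairs in which the won deal scored higher.
    # Ties count as half. 0.5 means no separation, 1.0 means perfect separation.
    pairs = len(won_scores) * len(lost_scores)
    wins = sum(
        1.0 if w > l else 0.5 if w == l else 0.0
        for w in won_scores
        for l in lost_scores
    )
    return {
        "won_median": median(won_scores),
        "lost_median": median(lost_scores),
        "mean_gap": mean(won_scores) - mean(lost_scores),
        "pairwise_win_rate": wins / pairs,
    }

# Illustrative scores only, not real calibration data.
won = [8.2, 7.9, 9.1, 6.8, 8.5]
lost = [5.1, 6.9, 4.4, 7.0, 3.8]
report = cohort_separation(won, lost)
print(report)
```

A pairwise win rate near 0.5 would suggest the rubric is not separating the cohorts; what counts as "clearly separated" is a judgment call for the team.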

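What the skill actually does

The skill runs four steps in a fixed order. Earlier steps gate later steps; do not parallelize them.

Step 1: deterministic firmographic checks. Before any LLM call, plain code runs the rubric's hard disqualifiers (sanctioned country, disqualified industry, headcount below the floor, free-mail domain) and the required-field check (email and company_domain must both be present). A hit returns immediately with a citation: disqualified, or escalate: insufficient_data if a required field is missing. Why deterministic first: these checks are free, fast, and never hallucinate. Spending tokens to confirm that a 3-person hair salon is not in your enterprise-SaaS ICP is wasteful.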
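
The deterministic step could look roughly like the sketch below. It assumes the hard disqualifiers have already been parsed out of the rubric into a plain dict; `precheck`, `RULES`, and the key names are hypothetical, and the country code `XX` is a placeholder. The required-field check runs first here only because the free-mail check needs an email to inspect.

```python
# Hypothetical, already-parsed hard disqualifiers (placeholder values).
RULES = {
    "countries": {"XX"},
    "industries": {"gambling"},
    "headcount_floor": 10,
    "free_mail_domains": {"gmail.com", "yahoo.com"},
}

def _disqualified(reason, value):
    """Build the early-exit result for a hard-disqualifier hit."""
    return {
        "tier": "disqualified",
        "escalate": f"hard_disqualifier:{reason}",
        "citation": f"{reason}={value!r} matches a hard-disqualifier row",
    }

def precheck(lead, rules):
    """Return an early result, or None if the lead should go on to LLM scoring."""
    # Both required fields must be present and non-empty.
    missing = [f for f in ("email", "company_domain") if not lead.get(f)]
    if missing:
        return {"tier": None, "escalate": "insufficient_data", "missing": missing}

    if lead.get("country") in rules["countries"]:
        return _disqualified("country", lead["country"])
    if lead.get("industry") in rules["industries"]:
        return _disqualified("industry", lead["industry"])

    floor = rules.get("headcount_floor")
    headcount = lead.get("headcount")
    if floor is not None and headcount is not None and headcount < floor:
        return _disqualified("headcount", headcount)

    email_domain = lead["email"].rsplit("@", 1)[-1].lower()
    if email_domain in rules["free_mail_domains"]:
        return _disqualified("free_email", email_domain)
    return None

print(precheck({"email": "j.smith@gmail.com", "company_domain": "gmail.com"}, RULES))
```

A `None` result means no deterministic rule fired, so only those leads would consume tokens in the next step.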

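Step 2: per-criterion LLM scoring with explicit weights. For each remaining criterion, the model outputs a tier (A / B / C) and a one-sentence rationale that cites the rubric row. The skill multiplies the tier value (A=3, B=2, C=1) by the criterion's weight and sums the results. Why per criterion rather than one holistic prompt: holistic output blends criteria silently, and you lose the ability to debug why a lead got an 8 instead of a 5. Why explicit weighting rather than letting the model balance the criteria: stated weights are the only way to keep the rubric the source of truth. If the model picks its own balance, rubric reviews become theatre.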
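
The weighted sum can be sketched in a few lines. The tier values and the multiply-and-sum rule come from the text above; how the sum is mapped onto the 0-10 scale is not specified in the source, so the min-max rescaling below (all-C maps to 0, all-A maps to 10) is an assumption, and `weighted_score` is a hypothetical name.

```python
TIER_POINTS = {"A": 3, "B": 2, "C": 1}

def weighted_score(rows):
    """rows: list of (criterion, weight, tier). Returns a score from 0.0 to 10.0."""
    total = sum(weight * TIER_POINTS[tier] for _, weight, tier in rows)
    weight_sum = sum(weight for _, weight, _ in rows)
    lowest = weight_sum * TIER_POINTS["C"]   # every criterion at C
    highest = weight_sum * TIER_POINTS["A"]  # every criterion at A
    # ASSUMPTION: min-max rescaling onto 0-10; the source does not define this step.
    return round(10 * (total - lowest) / (highest - lowest), 1)

# Illustrative three-criterion lead.
rows = [("Industry", 5, "A"), ("Headcount", 4, "B"), ("Geo", 3, "C")]
score = weighted_score(rows)
print(score)
```

Because each criterion contributes a visible `weight * tier value` term, a surprising score can be traced back to the row that produced it.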

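Step 3: borderline-case fallback to human review. If the final score is within 0.5 of a tier boundary, or if more than 3 criteria were scored on missing or inferred data, the skill sets escalate: needs_human_review and names the missing fields. The most expensive scoring failure is not a wrong tier on a confident lead; it is a wrong tier on a lead that was always borderline.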
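
A minimal sketch of the borderline rule follows. It assumes the default tier boundaries from the rubric template (8.0, 6.0, 4.0), treats "within 0.5" as inclusive, and follows the SKILL.md wording of "more than 3" criteria with missing or inferred data; `escalation` is a hypothetical helper name.

```python
TIER_BOUNDARIES = (8.0, 6.0, 4.0)  # default thresholds from the rubric template

def escalation(score, gap_fields, margin=0.5, max_gaps=3):
    """Decide whether a scored lead should be held for human review."""
    # ASSUMPTION: a score exactly `margin` away from a boundary counts as borderline.
    near_boundary = any(abs(score - b) <= margin for b in TIER_BOUNDARIES)
    thin_data = len(gap_fields) > max_gaps
    flag = "needs_human_review" if (near_boundary or thin_data) else "no"
    return {
        "escalate": flag,
        "near_boundary": near_boundary,
        "thin_data": thin_data,
        "missing": list(gap_fields),  # named so the reviewer knows what to look up
    }

print(escalation(7.4, ["revenue"]))
print(escalation(7.6, []))
```

Under these assumptions a 7.4 with one data gap is not escalated, while a 7.6 is, because it sits 0.4 below the A boundary.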

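Step 4: output assembly. The skill emits the markdown described in references/3-sample-output.md: the headline score and tier, the recommended next action joined from the tier-to-action matrix, a per-criterion table with reasons, the disqualifier check, a data-gaps list, and a footer with the rubric's SHA-256 and last-edited date.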

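Cost reality

Per-lead token cost depends on rubric size, but for a typical six-criterion rubric with structured per-criterion output, expect roughly 1,500-2,500 input tokens and 400-700 output tokens per lead. At late-2026 Claude Sonnet 4.x pricing (about $3 per million input tokens and about $15 per million output tokens), that is about $0.01-0.02 per scored lead.

A team processing 5,000 inbound MQLs a month spends about $50-100 a month on Claude tokens. A team processing 50,000 enriched outbound leads a month spends $500-1,000; at that scale, batching, prompt caching of the rubric, and pre-filtering with the deterministic step matter. By default the skill uses a single structured prompt per lead (rather than 6-10 small prompts) to keep token usage down.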
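
The arithmetic behind these figures is easy to reproduce. The sketch below uses only the token ranges and approximate prices quoted above; `token_cost` is a hypothetical helper, and real prices should be checked against current pricing before budgeting.

```python
def token_cost(input_tokens, output_tokens, input_price=3.0, output_price=15.0):
    """Dollar cost of one scored lead; prices are dollars per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

low_per_lead = token_cost(1_500, 400)    # light end of the quoted range
high_per_lead = token_cost(2_500, 700)   # heavy end of the quoted range

for monthly_leads in (5_000, 50_000):
    low = monthly_leads * low_per_lead
    high = monthly_leads * high_per_lead
    print(f"{monthly_leads:>6} leads/month: ${low:,.2f} to ${high:,.2f}")
```

With these inputs the per-lead cost comes out at roughly $0.0105 to $0.018, which is where the rounded $0.01-0.02 and the monthly ranges come from.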

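The non-token costs are larger. Building the rubric is a 60-minute working session, done once and repeated every quarter. Calibrating against 20 closed-won and 20 closed-lost deals takes another hour. Wiring the Clay or HubSpot integration takes half a day. After that, the skill is hands-off until the rubric drifts.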

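Success metrics

The metric to watch is score-to-conversion correlation: of the leads scored A over the past 90 days, what share converted to an opportunity? What about those scored B? And C? If the curve is monotonic (A converts better than B, and B better than C), the rubric is working. If C and B convert at about the same rate, the rubric is not separating fit from non-fit and needs to be re-argued.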
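
The monotonicity check can be computed from a simple export of scored leads. This is a sketch under the assumption that each lead is available as a `(tier, converted)` pair; the function names and the sample data are hypothetical.

```python
from collections import Counter

def conversion_by_tier(leads):
    """leads: iterable of (tier, converted). Returns {tier: conversion rate}."""
    totals, wins = Counter(), Counter()
    for tier, converted in leads:
        totals[tier] += 1
        wins[tier] += bool(converted)
    return {tier: wins[tier] / totals[tier] for tier in totals}

def is_monotonic(rates, order=("A", "B", "C")):
    """True when each tier converts strictly better than the next one down."""
    present = [rates[t] for t in order if t in rates]
    return all(higher > lower for higher, lower in zip(present, present[1:]))

# Made-up 90-day sample: (tier, converted to opportunity?)
sample = (
    [("A", True)] * 2 + [("A", False)] * 2
    + [("B", True)] * 1 + [("B", False)] * 3
    + [("C", False)] * 4
)
rates = conversion_by_tier(sample)
print(rates, is_monotonic(rates))
```

With small cohorts the rates are noisy, so a near-tie between B and C is worth a second look before re-arguing the rubric.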

Secondary metric: SDR time to first touch on A-tier leads. A working scoring system compresses this to under an hour for inbound. If A-tier leads sit in a queue for 24 hours, the bottleneck is routing, not scoring.

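Compared with the alternatives

vs HubSpot predictive lead scoring. HubSpot's built-in predictive score is a black box trained on your historical conversion data. It works when you have enough closed-won volume (HubSpot recommends a minimum of roughly 500 closed-won deals). Below that threshold the model has nothing to learn from and the score is noise. This skill works from day one because the rubric is authored by hand. The trade-off: HubSpot's model picks up patterns the rubric's authors miss, whereas this skill knows only what you wrote down. If you have the volume, run both: use the HubSpot score for "what is surprising" and this skill's per-criterion rationale for "why is this ranked here".

vs Marketo behavioral scoring. Marketo (or HubSpot behavioral scoring) tracks engagement signals such as email opens, page views, and form submissions, and adds points. That is intent scoring, not fit scoring, and the two answer different questions. A great-fit account that has not opened an email is still a great-fit account. A poor-fit account that binge-read your blog is still a poor-fit account. Use behavioral scoring alongside this skill, as a complement rather than a replacement: high fit + high intent → AE direct; high fit + low intent → nurture; low fit + high intent → SDR runs a fit call before connecting an AE.

vs manual SDR review. Below about 50 inbound leads a week, manual review by an SDR manager is genuinely competitive: a human catches nuance such as "this company just acquired one of our customers, bump the priority", which the skill misses. Above about 200 a week, manual review becomes the bottleneck and consistency drops. The skill scales linearly with the token budget; humans do not.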

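Watch-outs

Rubric drift. Someone edits the markdown rubric and ships it, and the SDRs reading the new scores never see the diff. Six weeks later the team discovers that the headcount weight was accidentally changed from 4 to 2 and 200 stretch-tier accounts were silently downgraded to C. Guard: the skill records the rubric's SHA-256 in every output footer and prepends a "Rubric updated YYYY-MM-DD" banner whenever the hash changes between runs. A quarterly calendar reminder forces a review even when nothing has been edited.
Source-bias amplification. A rubric built from your closed-won set encodes who you have already sold to. Scoring against it makes you blind to adjacent ICPs, and over time the pipeline narrows to lookalikes of last year's customers. Guard: every quarter, sample 20 leads the skill scored as C-tier and have an AE manually review whether they actually fit. If more than 3 were misclassified, add a "stretch ICP" row to the rubric and recalibrate.
False confidence on thin data. When enrichment is missing for 4 of 6 criteria, a 7.4 score is mostly noise but reads as authoritative. SDRs treat it as a confident score and skip call prep. Guard: the skill sets escalate: needs_human_review whenever more than 3 criteria are scored on missing or inferred data, and adds a "Data gaps" section listing the missing fields. SDRs are trained to read the gaps section before the headline number.
Protected-class proxies. Even with good intentions, a rubric that weights "geography" can turn into a proxy for nationality, and "industry" into a proxy for company demographics your legal team will not like. Guard: the skill refuses fields it recognizes as protected-class proxies (name-derived gender, photo, age signals). Review the rubric annually with a human who can spot the less obvious proxies.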
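
The rubric-drift guard above amounts to hashing the rubric file and comparing against the previous run. Here is a minimal sketch using Python's standard `hashlib`; where the previous hash is stored, and the function names, are assumptions rather than the skill's actual implementation.

```python
import hashlib

def rubric_fingerprint(rubric_text):
    """SHA-256 hex digest of the rubric's full text."""
    return hashlib.sha256(rubric_text.encode("utf-8")).hexdigest()

def drift_banner(rubric_text, previous_hash, edited_on):
    """Return (current_hash, banner); banner is None when the rubric is unchanged."""
    current = rubric_fingerprint(rubric_text)
    if previous_hash is not None and current != previous_hash:
        return current, f"Rubric updated {edited_on}"
    return current, None

v1 = "| Headcount | 4 | 500-2000 | 200-500 | other |"
v2 = "| Headcount | 2 | 500-2000 | 200-500 | other |"  # weight changed from 4 to 2

first_hash, _ = drift_banner(v1, None, "2025-12-15")
second_hash, banner = drift_banner(v2, first_hash, "2026-01-20")
print(banner)
```

Even a one-character weight change produces a different digest, so the banner fires on exactly the kind of silent edit described in the rubric-drift item.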

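Stack

Claude: the scoring engine and rationale generator. Sonnet 4.x is the sweet spot of cost versus reasoning quality for this task. Haiku produces weaker rationales in the LLM step but works for a deterministic-only path.
Clay: the preferred lead source and enrichment layer for outbound and cold-list scoring. The AI column is a clean integration point.
HubSpot: the CRM destination for score, rationale, escalation flag, and source. The custom-code action is the integration point for inbound MQL scoring.
Markdown editor and calendar: the unglamorous parts. The rubric lives in markdown, the quarterly review lives on someone's calendar, and both matter more than model choice.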

Files in this artifact

---
name: lead-scoring-icp-rubric
description: Score a single lead or a batch of leads against an explicit ICP rubric. Returns a 0-10 score per lead, a per-criterion rationale citing the rubric, a recommended next action by tier, and an escalation flag for borderline cases. Use when triaging inbound or routing enriched outbound leads — not as a substitute for behavioral or intent-based scoring.
---

# Lead scoring (ICP rubric)

## When to invoke

Invoke whenever you need to score a single lead — or a CSV/JSON batch of leads — against your team's ICP rubric. Typical entry points: a Clay table column, a HubSpot custom-code action firing on a new MQL, a standalone CLI run over a marketing-list export, or a manual paste during deal-desk triage.

The skill takes structured lead data plus the rubric and returns a 0-10 score, per-criterion rationale, a recommended next action by tier, and an escalation flag when the data is too thin to score confidently.

Do NOT invoke this skill for:

- **Auto-rejecting leads.** The output is a recommendation. Disqualifying a lead from outreach without an SDR seeing the rationale silently destroys pipeline when the rubric is wrong (and the rubric is sometimes wrong).
- **Scoring on protected-class proxies.** Do not pass fields like name-derived gender, photo, age, country-of-origin signals. Even if your rubric weights "geography" legitimately for support-hours fit, never collapse that into ethnicity or nationality. The skill refuses fields it recognizes as protected-class proxies.
- **Replacing intent-based or behavioral scoring entirely.** This is fit scoring, not intent. A great-fit account that has not visited your pricing page in 90 days is still a great fit but not a hot lead. Pair this skill with whatever signals "they are in-market right now" — Bombora, 6sense, your own product-usage events.

## Inputs

Required:

- `lead` — a structured lead record. Minimum fields: `email`, `company_domain`. Strongly preferred: `headcount`, `industry`, `country`, `job_title`, `tech_stack` (array), `funding_stage`. Pass whatever your enrichment layer (Clay, Apollo, ZoomInfo, Clearbit) returns.
- `rubric` — path to or inline contents of the ICP rubric markdown (see `references/1-icp-rubric-template.md`). Must contain explicit criterion + weight + tier-value rows. The skill refuses to score against a rubric that has no weights — vibes are not a rubric.

Optional:

- `source_of_lead` — free-text or enum: `inbound_demo`, `inbound_content`, `outbound_sequence`, `partner_referral`, `event`, `cold_list`. Used to bias the recommended-next-action mapping (a partner referral with a B-tier score still gets a human reach-out; a cold-list lead at the same tier does not).
- `batch_size_hint` — when scoring more than one lead, the caller can pass an integer so the skill paces token usage and returns progress markers. Default: process serially, no progress markers.

## Reference files

Always load these from `references/` before scoring. They are the leverage point — a tight rubric makes a defensible score, a vague rubric makes a vibes score that an AE will (correctly) ignore.

- `references/1-icp-rubric-template.md` — the rubric template. Replace placeholder rows with the actual criteria, weights, and tier values your team has agreed on.
- `references/2-tier-to-action-matrix.md` — maps the four tiers (A / B / C / disqualified) and the `source_of_lead` enum to a recommended next action. Edit this once with your team's routing reality, not the defaults.
- `references/3-sample-output.md` — a literal example of the markdown the skill produces, for one fictional lead. Use as the reference when wiring downstream parsers.

## Method

The skill runs these steps in order. Earlier steps gate later steps — do not parallelize.

### 1. Deterministic firmographic checks (no LLM)

Before any LLM call, run plain code over the lead record:

- Hard disqualifiers from the rubric (e.g. `country in ["{sanctioned-country}"]`, `industry in {disqualified-industries}`, `headcount < 10` if the rubric sets that floor) → return tier `disqualified` with the citation, no LLM call.
- Required-field check: if `email` or `company_domain` is missing, return `escalate: insufficient_data`.

Why: deterministic checks are free, fast, and never hallucinate. Burning tokens to confirm that a 3-person hairdresser is not in your enterprise-SaaS ICP is wasteful and slightly embarrassing.

### 2. Per-criterion LLM scoring with explicit rubric weighting

For each remaining criterion in the rubric, prompt the model to produce a tier value (A / B / C) and a one-sentence rationale that cites the rubric row. The skill multiplies the tier-value (A=3, B=2, C=1) by the criterion's weight and sums.

Why per-criterion rather than one holistic prompt: holistic scoring blends criteria silently and you lose the ability to debug why a lead got an 8 instead of a 5. Per-criterion outputs make the score auditable. The cost is roughly 6-10 short prompts per lead (or a single prompt that emits a structured per-criterion response — both work; the skill defaults to a single structured prompt with explicit per-criterion fields to keep tokens down).

Why explicit weighting rather than "let the model balance them": stated weights are the only way the rubric stays the source of truth. If the model invents its own balance, the rubric stops being authoritative and rubric reviews become theatre.

### 3. Borderline case fallback to human review

If the final score is within `+/- 0.5` of a tier boundary, OR if more than 3 criteria were scored on missing or inferred data, set `escalate: needs_human_review` with a note naming the missing fields.

Why: the most expensive scoring failure is not a wrong tier on a confident lead — it is a wrong tier on a lead that was always borderline. Surfacing those for human review preserves trust in the confident scores.

### 4. Output assembly

Render the markdown described in "Output format" below. Score is the headline number. Rationale is the per-criterion table. Next action comes from the tier-to-action matrix, joined with `source_of_lead` if provided. Escalation flag is surfaced at the top when set.

## Output format

Literal markdown the skill emits for a single lead:

```markdown
# Lead score — jane.doe@acme.com (acme.com)

**Score:** 7.4 / 10 — Tier B
**Source:** inbound_content
**Escalate:** no

## Recommended next action

Tier B + inbound_content → SDR personalized email within 24h, no auto-sequence. Reference content piece they engaged with.

## Rationale (per criterion)

| Criterion | Weight | Tier | Reason |
|---|---|---|---|
| Industry | 5 | A | "Vertical SaaS / RevOps" matches in-ICP row in rubric. |
| Headcount | 4 | B | 240 employees — in stretch range (200-500), not core (500-2000). |
| Geo | 3 | A | HQ US-east, in supported region. |
| Tech stack | 4 | B | Salesforce + Marketo present (fit signals); no data warehouse cited. |
| Funding stage | 2 | C | Bootstrapped — out of preferred Series B-D band. |
| Job title | 4 | A | "Director, RevOps" matches champion-target pattern. |

## Disqualifier check

None triggered.

## Data gaps

- `revenue` field not provided by enrichment.

---

_Rubric SHA-256: 4f9c...a812 | Last edited 2025-12-15 by Sam Patel_
```

For batch input, the skill prepends a top-level summary table (`email | tier | score | escalate`) and then emits one such block per lead, separated by `\n---\n`.

## Watch-outs

- **Rubric drift.** The rubric is a markdown file that someone edits. Edits are silent: no diff is shown to the SDRs reading scores. **Guard:** the skill records the rubric's SHA-256 in every output footer and prepends a "Rubric updated {date}, last verified by {name}" line if the hash differs from the previous run's. A scheduled job (or a calendar reminder, if you are not that fancy) opens a PR-style review of the rubric every quarter.
- **Source-bias amplification.** If the rubric was built from your closed-won set, it encodes who you have already sold to. Repeatedly scoring against it narrows your pipeline to lookalikes and makes you blind to adjacent ICP. **Guard:** every quarter, sample 20 leads the skill scored as C-tier and have an AE review whether any are actually fit. If more than 3 are misclassified, the rubric is over-fit and needs a "stretch ICP" row added.
- **False confidence on thin data.** When enrichment is missing 4 of the 6 criteria fields, a 7.4 score is mostly noise. **Guard:** the skill sets `escalate: needs_human_review` whenever more than 3 criteria are scored on missing/inferred data, and adds a "Data gaps" section listing the absent fields. SDRs are trained to read the gaps section before the headline number.

# ICP rubric — TEMPLATE

> Replace this template's contents with your team's actual ICP rubric.
> The lead-scoring skill scores each criterion against this rubric. Vague
> rows (no weights, no tier values) cause the skill to refuse the run.

## How the skill reads this file

- Each row in "Criteria" must have an explicit `weight` (1-5) and three tier values (A / B / C). Anything else is treated as malformed and the skill returns an error rather than guessing.
- Rows in "Hard disqualifiers" run as deterministic checks before any LLM call. Keep them tight; one wrong row here silently kills good pipeline.
- The "Last edited" line is hashed into the SHA-256 the skill records in every output footer. Update it when you make material changes so SDRs reading scores can see the rubric moved.

## Criteria

| Criterion | Weight | A (best fit) | B (stretch) | C (poor fit) |
|---|---|---|---|---|
| Industry | 5 | {industries you win in} | {adjacent industries} | {everything else} |
| Headcount | 4 | {core range, e.g. 500-2000} | {stretch range, e.g. 200-500 or 2000-5000} | {below/above stretch} |
| Geo | 3 | {primary regions} | {secondary regions} | {regions you do not support} |
| Tech stack | 4 | {tools that signal fit, e.g. Salesforce + Marketo} | {one of the fit tools present} | {competing system of record} |
| Funding stage | 2 | {preferred stages, e.g. Series B-D} | {adjacent stages} | {unfit, e.g. pre-seed or post-IPO} |
| Job title | 4 | {champion-target patterns} | {adjacent titles} | {non-buying-committee titles} |

## Hard disqualifiers

Single signals that drop a lead to `disqualified` regardless of other criteria. Run as deterministic checks before LLM scoring.

- `country in [{sanctioned-or-unsupported-list}]`
- `industry in [{disqualified-industries — e.g. adult, gambling if you do not serve them}]`
- `headcount < {floor — e.g. 10}` (if you have a floor)
- `email_domain in [{free-mail providers if your motion blocks them}]`

## Tier thresholds

The skill maps the weighted sum to a tier. Defaults shown — adjust to your team's calibration run.

| Score | Tier |
|---|---|
| 8.0 - 10.0 | A |
| 6.0 - 7.99 | B |
| 4.0 - 5.99 | C |
| < 4.0 | disqualified |
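
The threshold table above reduces to a short lookup. This sketch hard-codes the default floors shown in the table; `tier_for` is a hypothetical name, and teams that adjust the thresholds would pass their own.

```python
DEFAULT_THRESHOLDS = ((8.0, "A"), (6.0, "B"), (4.0, "C"))

def tier_for(score, thresholds=DEFAULT_THRESHOLDS):
    """Return the tier whose floor the score meets, else 'disqualified'."""
    for floor, tier in thresholds:  # ordered from highest floor to lowest
        if score >= floor:
            return tier
    return "disqualified"

print(tier_for(7.4))
```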

## Last edited

{YYYY-MM-DD} — by {name}

# Tier-to-action matrix — TEMPLATE

> Replace this template's contents with your team's actual routing reality.
> The lead-scoring skill joins the score's tier with the lead's
> `source_of_lead` to pick a recommended next action. Edit once with your
> SDR/AE manager so the recommendations match what your reps actually do.

## How the skill reads this file

- Rows are `(tier, source_of_lead) → action`. The skill picks the row whose tier matches the score and whose source matches the input. If the source is missing or unrecognized, it falls back to the row marked `*` (any source).
- An action is one short imperative sentence. The skill emits this verbatim under "Recommended next action" — keep it copy-pasteable.

## Matrix

| Tier | Source | Action |
|---|---|---|
| A | inbound_demo | Round-robin to AE within 5 minutes; book meeting in same business day. |
| A | inbound_content | SDR call within 1 hour; reference content piece. Auto-sequence as backup if no answer in 24h. |
| A | outbound_sequence | Move to high-touch sequence; SDR adds 2 personalized steps. |
| A | partner_referral | AE handles directly. Loop in partner manager for warm intro. |
| A | event | SDR call within 24h referencing the event session/booth conversation. |
| A | cold_list | Treat as outbound: enrich further, hand to SDR for personalized first touch. |
| A | * | SDR personalized outreach within 24h. |
| B | inbound_demo | SDR qualification call within 4 hours before AE handoff. |
| B | inbound_content | SDR personalized email within 24h, no auto-sequence. Reference content piece. |
| B | outbound_sequence | Standard outbound sequence, no escalation. |
| B | partner_referral | SDR call within 48h; loop in partner if no response. |
| B | event | SDR email + follow-up call within 48h. |
| B | cold_list | Standard outbound sequence. |
| B | * | SDR email within 48h. |
| C | inbound_demo | SDR fit-call within 24h; many will self-disqualify on the call. |
| C | inbound_content | Add to nurture; no SDR touch unless engagement signals appear. |
| C | outbound_sequence | Pause sequence; do not waste SDR cycles. |
| C | partner_referral | SDR courtesy call within 1 week (relationship cost of ignoring). |
| C | event | Add to nurture only. |
| C | cold_list | Drop. |
| C | * | Nurture only. |
| disqualified | * | Mark `Disqualified — out of ICP` with rubric citation. Do not auto-delete; archive for audit. |

## Escalation overrides

When the skill emits `escalate: needs_human_review`, the action above is replaced with:

> Hold for SDR manager review. Lead is borderline (within 0.5 of tier boundary) or scored on thin data. See "Data gaps" section.

When the skill emits `escalate: insufficient_data`, the action is:

> Re-enrich lead and re-score. Required fields missing: {list}.
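
The lookup and override rules in this file can be sketched as a dictionary keyed by `(tier, source)` with a `*` fallback. The matrix excerpt below copies four rows from the table above; `next_action` and its parameters are hypothetical names, not the skill's API.

```python
MATRIX = {
    ("B", "inbound_content"): "SDR personalized email within 24h, no auto-sequence. Reference content piece.",
    ("B", "*"): "SDR email within 48h.",
    ("C", "cold_list"): "Drop.",
    ("C", "*"): "Nurture only.",
}

HOLD = ('Hold for SDR manager review. Lead is borderline (within 0.5 of tier '
        'boundary) or scored on thin data. See "Data gaps" section.')

def next_action(tier, source, matrix, escalate="no", missing=()):
    """Pick the recommended action for a scored lead."""
    # Escalation overrides replace the matrix action entirely.
    if escalate == "needs_human_review":
        return HOLD
    if escalate == "insufficient_data":
        return f"Re-enrich lead and re-score. Required fields missing: {', '.join(missing)}."
    # Exact (tier, source) row first, then the tier's wildcard row.
    action = matrix.get((tier, source))
    if action is None:
        action = matrix[(tier, "*")]
    return action

print(next_action("B", "inbound_content", MATRIX))
print(next_action("C", "webinar", MATRIX))  # unrecognized source falls back to "*"
```

Keeping the wildcard row per tier means a missing or misspelled source still yields an action instead of an error.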

## Last edited

{YYYY-MM-DD} — by {SDR manager name}

# Sample output — for parser wiring

> A literal example of what the skill emits for one fictional lead. Use
> this when wiring the downstream parser (Clay AI column → property
> mapping, HubSpot custom-code action → property writeback, CSV
> post-processor). The schema below is what the skill commits to; the
> values are illustrative.

## Single-lead output

```markdown
# Lead score — jane.doe@northwind.com (northwind.com)

**Score:** 7.4 / 10 — Tier B
**Source:** inbound_content
**Escalate:** no

## Recommended next action

Tier B + inbound_content → SDR personalized email within 24h, no auto-sequence. Reference content piece they engaged with.

## Rationale (per criterion)

| Criterion | Weight | Tier | Reason |
|---|---|---|---|
| Industry | 5 | A | "Vertical SaaS / RevOps" matches in-ICP row in rubric. |
| Headcount | 4 | B | 240 employees — in stretch range (200-500), not core (500-2000). |
| Geo | 3 | A | HQ US-east, in supported region. |
| Tech stack | 4 | B | Salesforce + Marketo present (fit signals); no data warehouse cited. |
| Funding stage | 2 | C | Bootstrapped — out of preferred Series B-D band. |
| Job title | 4 | A | "Director, RevOps" matches champion-target pattern. |

## Disqualifier check

None triggered.

## Data gaps

- `revenue` field not provided by enrichment.

---

_Rubric SHA-256: 4f9c...a812 | Last edited 2025-12-15 by Sam Patel_
```

## Batch output

For a batch of N leads, the skill prepends a summary table and emits one block per lead separated by `\n---\n`:

```markdown
# Batch summary (12 leads)

| Email | Tier | Score | Escalate |
|---|---|---|---|
| jane.doe@northwind.com | B | 7.4 | no |
| ahmed@tailspintoys.io | A | 8.9 | no |
| j.smith@gmail.com | disqualified | 0 | hard_disqualifier:free_email |
| ... | ... | ... | ... |

---

# Lead score — jane.doe@northwind.com (northwind.com)
...
---
# Lead score — ahmed@tailspintoys.io (tailspintoys.io)
...
```

## Field contract for parsers

If you write a parser instead of consuming the markdown, these are the stable fields:

- `email` — string, lowercased
- `domain` — string, lowercased
- `score` — float, 0.0 to 10.0, one decimal
- `tier` — enum: `A` / `B` / `C` / `disqualified`
- `source` — pass-through of the input `source_of_lead`, or `unknown`
- `escalate` — enum: `no` / `needs_human_review` / `insufficient_data` / `hard_disqualifier:{reason}`
- `next_action` — string, single sentence
- `rationale[]` — list of `{criterion, weight, tier, reason}`
- `data_gaps[]` — list of strings (field names)
- `rubric_sha256` — string, abbreviated in the markdown footer (first and last 4 characters, e.g. `4f9c...a812`); full hash available via the skill's structured-output mode
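
For teams that do parse the markdown, a small regex-based reader is one option. The sketch below covers only the headline fields and the rationale table, and it assumes the block layout shown in the single-lead example; `parse_lead_block` is a hypothetical helper, and reasons containing a `|` character would need extra handling.

```python
import re

DASH = "\u2014"  # the dash character used in the skill's headline lines

HEADER = re.compile(r"^# Lead score " + DASH + r" (?P<email>\S+) \((?P<domain>[^)]+)\)$", re.M)
SCORE = re.compile(r"^\*\*Score:\*\* (?P<score>\d+(?:\.\d+)?) / 10 " + DASH + r" Tier (?P<tier>\w+)$", re.M)
ESCALATE = re.compile(r"^\*\*Escalate:\*\* (?P<escalate>\S+)$", re.M)
ROW = re.compile(r"^\| (?P<criterion>[^|]+?) \| (?P<weight>\d+) \| (?P<tier>[ABC]) \| (?P<reason>.+?) \|$", re.M)

def parse_lead_block(md):
    """Extract headline fields and rationale rows from one lead block."""
    head, score, esc = HEADER.search(md), SCORE.search(md), ESCALATE.search(md)
    if not (head and score and esc):
        raise ValueError("not a lead-score block")
    return {
        "email": head["email"].lower(),
        "domain": head["domain"].lower(),
        "score": float(score["score"]),
        "tier": score["tier"],
        "escalate": esc["escalate"],
        "rationale": [
            {"criterion": m["criterion"], "weight": int(m["weight"]),
             "tier": m["tier"], "reason": m["reason"]}
            for m in ROW.finditer(md)  # header and separator rows do not match
        ],
    }

SAMPLE = f"""# Lead score {DASH} jane.doe@northwind.com (northwind.com)

**Score:** 7.4 / 10 {DASH} Tier B
**Source:** inbound_content
**Escalate:** no

| Criterion | Weight | Tier | Reason |
|---|---|---|---|
| Industry | 5 | A | Matches in-ICP row in rubric. |
| Headcount | 4 | B | 240 employees, in stretch range. |
"""
parsed = parse_lead_block(SAMPLE)
print(parsed["email"], parsed["score"], parsed["tier"], len(parsed["rationale"]))
```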