Lists with no documented lawful basis for processing. EU or UK contacts without prior consent, Quebec leads under Law 25, California consumer data with no CCPA pathway: the skill will enrich whatever it is pointed at, which is exactly why an operator has to check first. The "Do NOT invoke" block in SKILL.md covers this; do not comment it out.
vs. Apollo's native sequence personalization. Apollo generates openers from its own enrichment data. It is faster to stand up, but the openers are transparently templated, scored against Apollo's ICP heuristics (not yours), and there is no audit trail of which facts drove a given draft. This skill takes longer to set up and costs more per row, but each opener cites a dated URL a rep can verify in one click, and the rubric is a file you control yourself.
vs. Clearbit + Outreach Smart Variables. Smart Variables fed by Clearbit produce factual mail merges ("your industry is ${X}"), not openers; reps still have to write real sentences around the variables. If your reps already write those sentences, that setup is cheaper than this skill; if they do not, rep time outweighs token cost and it ends up more expensive overall.
vs. manual opener writing. A senior SDR takes 4-7 minutes to write a high-quality cold opener for one account. At a fully loaded $60/hour, that is roughly $5 of rep cost per opener; the skill tops out around $0.025 per opener. But a rep also does account-level thinking while writing, and the skill does not. For most teams the right split is to use the skill for top-of-funnel volume (everything below the tier-1 account list) and keep rep-written openers for named accounts.
vs. status quo (no enrichment, generic openers). Generic openers see reply rates of roughly 1-2% in public benchmarks; lightly personalized openers tied to a recent signal see 4-8%. The skill targets the latter. It is only worth running if you are willing to maintain the rubric and style guide; without them, its output is not materially better than the status quo.
---
name: lead-enrichment
description: Enrich a Clay table row with company summary, ICP fit score, and a personalized cold-email opener. Use this skill from Clay AI columns or from a Claude Code session driving the Anthropic Batch API over an exported lead list. Returns structured fields plus a draft opener for rep review — never auto-sends.
---
# Lead enrichment
## When to invoke
Whenever a Clay table or exported CSV of leads needs three derived fields in one pass: a 2-sentence company summary, an ICP fit score with one-line reasoning, and a sub-50-word cold-email opener that cites a real signal from the company's public footprint. Take a domain (and optionally a LinkedIn URL or a free-text ICP rubric) as input and produce a structured JSON-shaped Markdown record per row.
Do NOT invoke this skill for:
- Auto-sending email. The opener is a draft for rep review. Wiring the output directly into a sequence-send step bypasses the guardrail this skill exists to enforce.
- Enriching leads that have not consented to processing. If the source list was scraped from a region with prior-consent rules (EU/UK GDPR, Quebec Law 25, etc.) without a documented lawful basis, stop and surface the concern to the operator. This skill will not silently process unconsented data.
- Account-level research briefs for discovery calls — use the `account-research` skill instead. This one optimizes for batch volume and cost-per-row, not depth.
- Public-company financial scoring or any decision that needs licensed data (S&P, Pitchbook). This skill reads public web only.
## Inputs
- Required: `domain` — the company's primary domain (e.g. `acme.com`). Must resolve and serve HTML; parked or dead domains are returned with `status: unreachable` and skipped.
- Required at config time: a Clay table reference (table ID + column map) OR a CSV path if running outside Clay. The skill assumes the table has at least `domain`; `first_name`, `title`, and `linkedin_url` improve opener quality if present.
- Required at config time: `references/icp-rubric.md` populated with your real rubric. Without it, `icp_fit_score` is omitted (not guessed).
- Optional: `references/opener-style-guide.md` — your team's voice, banned phrases, max length override.
- Optional: `references/source-quality-matrix.md` — vendor preference order when multiple enrichment columns upstream of this skill disagree.
- Optional: `recent_news` — pre-fetched by an upstream Clay column. When present, the opener step uses it instead of re-fetching.
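As an illustration only (the skill does not document a config schema, so every name below is hypothetical), a CSV-mode run configuration wiring these inputs together might look like:

```python
# Hypothetical run config for a CSV-mode batch. Only the domain column is
# a hard per-row requirement; the rest improve quality when present.
run_config = {
    "csv_path": "leads.csv",  # or a Clay table ID + column map instead
    "column_map": {
        "domain": "company_domain",   # required
        "first_name": "first",        # optional, improves opener quality
        "title": "job_title",         # optional
        "linkedin_url": "linkedin",   # optional
    },
    "references": {
        "icp_rubric": "references/icp-rubric.md",                  # required
        "opener_style_guide": "references/opener-style-guide.md",  # optional
        "source_quality_matrix": "references/source-quality-matrix.md",
    },
    "opener_threshold": 6,  # default from the Method section; configurable per run
}
```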
## Reference files
Always read the following from `references/` before generating any row. They encode the operator's positioning and quality bar — without them output is generic.
- `references/icp-rubric.md` — the rubric the score uses. Replace template content with your actual ICP definition before first run.
- `references/opener-style-guide.md` — voice, length cap, banned phrases (e.g. "I noticed", "love what you're doing", any superlative).
- `references/source-quality-matrix.md` — preference order for resolving conflicts between upstream enrichment vendors (Apollo vs. Clearbit vs. ZoomInfo vs. Clay's native People/Company API).
## Method
Run these four sub-tasks in order per row. Steps 1-3 produce the structured fields; step 4 only runs when the ICP score clears the configured threshold.
### 1. Resolve and fetch
Hit `https://{domain}` with a 10-second timeout and one redirect hop. On non-2xx, parked-domain heuristics, or empty body, mark `status: unreachable` and skip steps 2-4. Do not retry indefinitely — parked domains are a meaningful share of any scraped list and the skill should fail fast on them rather than burn API spend.
If the homepage resolves, additionally fetch `/about`, `/company`, and `/customers` (best-effort, ignore 404s). Concatenate cleaned text up to ~8K tokens; truncate from the end of the longest section, not the front, to preserve the homepage hero copy.
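The parked-domain check and the truncation rule can be sketched as follows. Both helpers are assumptions about implementation, not a spec: the parked-marker list is illustrative, and ~8K tokens is approximated as ~32K characters.

```python
def looks_parked(html: str) -> bool:
    """Crude parked-domain heuristic: near-empty body or registrar
    boilerplate. The marker list is illustrative, not exhaustive."""
    markers = ("domain is for sale", "parked free", "buy this domain")
    text = html.lower()
    return len(text.strip()) < 200 or any(m in text for m in markers)


def concat_and_truncate(sections: dict[str, str], budget_chars: int = 32000) -> str:
    """Concatenate cleaned page sections, trimming from the END of the
    longest section until the budget fits, so the homepage hero copy at
    the front is preserved. ~8K tokens ≈ ~32K chars is an assumption."""
    total = sum(len(t) for t in sections.values())
    while total > budget_chars:
        longest = max(sections, key=lambda k: len(sections[k]))
        overshoot = total - budget_chars
        keep = max(0, len(sections[longest]) - overshoot)
        sections[longest] = sections[longest][:keep]
        total = sum(len(t) for t in sections.values())
    return "\n\n".join(sections.values())
```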
### 2. Extract structured snapshot
In one Claude call, extract:
- `industry` — single noun phrase, derived from copy not guessed
- `size_signal` — one of `solo | smb | mid | ent`, justified by a cited copy fragment (employee count claim, customer-logo density, funding mention) or `unknown`
- `value_prop` — one sentence in the company's own framing
- `recent_signal` — at most one verifiable fact (a launch, hire, funding round, customer win) with a URL anchor. If none found in the fetched pages, return `null` rather than inventing one.
Engineering choice: extraction is a single structured Claude call, not three. Each additional round-trip multiplies cost-per-row at 100K-row scale; the extraction prompt is small enough to stay reliable in one pass.
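A sketch of the single-call shape (the prompt wording and schema dict are assumptions; the actual prompt ships with the skill). The point is that all four fields ride one prompt, so the builder is the only per-row variable cost:

```python
import json

# Field descriptions mirror the extraction spec above; the exact wording
# of the shipped prompt is an assumption here.
SNAPSHOT_SCHEMA = {
    "industry": "single noun phrase, derived from copy, not guessed",
    "size_signal": "one of solo | smb | mid | ent | unknown, with a cited fragment",
    "value_prop": "one sentence in the company's own framing",
    "recent_signal": "at most one verifiable dated fact with a URL, or null",
}


def build_extraction_prompt(page_text: str) -> str:
    """One prompt, one Claude call: extracting all four fields together
    keeps round-trips (and cost-per-row at 100K-row scale) flat."""
    return (
        "Extract the following fields from the page text as JSON. "
        "Return null for any field not evidenced in the text.\n"
        f"Fields: {json.dumps(SNAPSHOT_SCHEMA, indent=2)}\n\n"
        f"Page text:\n{page_text}"
    )
```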
### 3. Score against ICP rubric
Load `references/icp-rubric.md`. Score the snapshot 1-10 with a one-line justification that names which rubric dimension drove the score. If the rubric file still contains template placeholders (`{...}` literals), return `score: null` and `reason: "rubric not configured"` rather than scoring against nothing.
Engineering choice: ICP scoring runs before opener generation, not after. Rationale: opener generation is the most expensive step (longer output, more careful prompt). Skipping it for sub-threshold rows is where the cost savings live. A common misconfiguration is to generate openers for every row "in case the rep wants to override" — that defeats the filter.
### 4. Generate opener (only if score >= threshold)
Default threshold: 6/10. Configurable per run.
Inputs to the opener step: `value_prop`, `recent_signal` (if non-null), the lead's `first_name` and `title` if present, and the style guide.
Hard rules baked into the opener prompt:
- Max 50 words.
- Must reference exactly one specific thing from `recent_signal` or `value_prop`. If both are null, return `opener: null` rather than writing flattery.
- No superlatives ("amazing", "incredible", "love").
- No invented company claims. The opener may only reference facts present in the snapshot. The prompt includes the snapshot inline and explicitly forbids facts not in it.
- No question close that pretends to know an internal pain ("how are you handling X?" without evidence X is happening).
The opener is always returned as a draft with a `confidence` field (0-1) reflecting how strong the cited signal was. Reps see the confidence; low-confidence openers are surfaced for rewrite, not auto-sent.
## Output format
One record per row, emitted as a fenced JSON block inside Markdown so Clay's AI column can parse it and a human reading the log can scan it.
````markdown
### acme.com
```json
{
  "domain": "acme.com",
  "status": "ok",
  "industry": "Workforce management software",
  "size_signal": "mid",
  "size_signal_evidence": "About page cites '350+ employees across 4 offices'",
  "value_prop": "Schedules and pays hourly workforces for restaurants and retail.",
  "recent_signal": {
    "summary": "Launched a payroll module in March 2026.",
    "url": "https://acme.com/blog/payroll-launch"
  },
  "icp_fit_score": 8,
  "icp_fit_reason": "Mid-market, hourly-workforce vertical — direct match on rubric line 2.",
  "opener_draft": "Saw the March payroll launch — folding pay into the same surface as scheduling is the move most ops leaders ask us about. Worth a 20-min on whether the multi-state tax piece is hitting the same wall others have?",
  "opener_confidence": 0.78,
  "enriched_at": "2026-03-28T00:00:00Z",
  "sources": [
    "https://acme.com/",
    "https://acme.com/about",
    "https://acme.com/blog/payroll-launch"
  ]
}
```
````
For unreachable rows the record collapses to `{ "domain": "...", "status": "unreachable" }` with no other fields — no placeholder strings.
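A sketch of the record emitter (field names follow the example above; the heading-plus-fence framing is an assumption about what Clay's AI column parses cleanly):

```python
import json


def emit_record(row: dict) -> str:
    """Render one row as a fenced JSON block under a domain heading.
    Unreachable rows collapse to the two-field form: no placeholders,
    no stale partial fields."""
    if row.get("status") == "unreachable":
        payload = {"domain": row["domain"], "status": "unreachable"}
    else:
        payload = row
    body = json.dumps(payload, indent=2, ensure_ascii=False)
    return f"### {row['domain']}\n```json\n{body}\n```"
```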
## Watch-outs
- **Source-quality drift across vendors.** When a Clay table has both Apollo and Clearbit enrichment columns upstream and they disagree on headcount or industry, this skill's snapshot can flip-flop run to run. Guard: `references/source-quality-matrix.md` declares a preference order; the snapshot step cites which vendor (or homepage-derived value) it used per field, so drift is auditable.
- **Opener inventing claims that aren't in the data.** Without strict prompting, openers drift toward confident-sounding fabrications ("congrats on your Series C" when there was no Series C). Guard: the opener prompt receives the snapshot inline and has an explicit "facts not in the snapshot are forbidden" rule; `recent_signal` carries a URL so a reviewer can verify in one click; openers with `opener_confidence` under 0.5 are flagged for rewrite, not sent.
- **Cost-per-row escalation if the ICP filter is loose.** A rubric that scores most rows 7+ defeats the threshold gate; opener generation runs on every row and per-row cost rises 3-4x. Guard: the skill logs a `score_distribution` summary at the end of each batch (`{ "1-3": N, "4-6": N, "7-10": N }`); if more than 60% of rows land 7+ across a 1K-row sample, the skill prints a warning to tighten the rubric before the next batch.
- **Rate limits on homepage fetches.** Hitting one root domain per row is fine; hitting `/about` and `/customers` triples the request count and some hosts throttle. Guard: per-host concurrency capped at 2, per-host minimum delay 250ms, and 429s are honored with a single back-off retry before the row is marked `status: unreachable`.
- **Stale enrichment.** A row enriched 90 days ago is not the same signal as one enriched today; reps treating old `recent_signal` values as fresh write embarrassing openers. Guard: every record carries an `enriched_at` timestamp, and the Clay column is configured to re-run when `enriched_at` is older than 30 days.
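The batch-end `score_distribution` summary from the cost watch-out can be sketched directly (bucket boundaries follow the log format above; the warning text is illustrative):

```python
from collections import Counter


def score_distribution(scores: list[int]) -> dict:
    """Bucket ICP scores into the batch-end summary and flag a loose
    rubric: more than 60% of rows landing 7+ defeats the threshold gate
    that keeps opener generation (the expensive step) off most rows."""
    buckets = Counter()
    for s in scores:
        if s <= 3:
            buckets["1-3"] += 1
        elif s <= 6:
            buckets["4-6"] += 1
        else:
            buckets["7-10"] += 1
    summary = {"1-3": buckets["1-3"], "4-6": buckets["4-6"], "7-10": buckets["7-10"]}
    if scores and summary["7-10"] / len(scores) > 0.60:
        summary["warning"] = "rubric too loose: tighten before the next batch"
    return summary
```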
# ICP rubric — TEMPLATE
> Replace every `{...}` placeholder with your team's real values before
> the first run. The lead-enrichment skill checks for unsubstituted
> placeholders and refuses to score against a template — it returns
> `score: null, reason: "rubric not configured"` instead of guessing.
## Firmographics
| Dimension | In-ICP | Stretch | Out |
|---|---|---|---|
| Industry | {list, e.g. "B2B SaaS, fintech, vertical SaaS"} | {adjacent industries} | {hard out, e.g. "consumer, agencies, edu"} |
| Headcount | {range, e.g. "200-2000"} | {range, e.g. "100-200, 2000-5000"} | {below/above, e.g. "<100, >5000"} |
| Revenue | {range, e.g. "$25M-$500M ARR"} | {range} | {below/above} |
| Geo | {regions, e.g. "US, Canada, UK"} | {regions, e.g. "EU, ANZ"} | {regions, e.g. "China, Russia, sanctioned"} |
| Funding stage | {stages, e.g. "Series B-D, public"} | {stages} | {stages, e.g. "pre-seed, bootstrapped <$5M"} |
## Technographics
Tools whose presence on the prospect's stack signals fit (we win when they have these because the integration story is short or the pain is acute):
- {Tool 1 — e.g. "Salesforce + Outreach"}
- {Tool 2 — e.g. "dbt + Snowflake"}
- {Tool 3 — e.g. "Segment"}
Tools whose presence signals misfit (we lose to them, or their presence implies a build-not-buy culture):
- {Tool A — e.g. "Hightouch (we lose to)"}
- {Tool B — e.g. "in-house reverse-ETL (build-not-buy)"}
## Pain ranking
When the skill produces an ICP score, it weights against this priority order. Higher = more important to surface first.
1. {Pain category — e.g. "outbound data quality blocking pipeline"} — weight 5
2. {Pain category — e.g. "rep ramp time eating quota"} — weight 4
3. {Pain category — e.g. "RevOps reporting backlog"} — weight 3
4. {Pain category — e.g. "tooling sprawl / consolidation push"} — weight 2
5. {Pain category — e.g. "compliance / data-residency"} — weight 1
## Score interpretation
The skill returns `1-10`. Suggested interpretation (override per team):
- **9-10** — direct ICP, multiple positive signals, no disqualifiers. Send.
- **7-8** — strong fit on firmographics, signal is plausible. Send after rep glance.
- **5-6** — adjacent. Default threshold cuts here. Opener may not generate.
- **3-4** — stretch fit, weak signal. Skip unless rep has specific reason.
- **1-2** — out of ICP. Suppress.
## Disqualifiers
Single signals that drop a row out regardless of other fit. The skill flags these with `disqualifier: "..."` in the output and forces score to 1.
- {Disqualifier 1 — e.g. "uses {Competitor} as system of record"}
- {Disqualifier 2 — e.g. "regulated industry without our SOC 2 + HIPAA"}
- {Disqualifier 3 — e.g. "headcount under 50 — wrong motion"}
## Last edited
{YYYY-MM-DD} — bump on every material change. The skill prepends a "rubric may be stale" note when this date is more than 90 days old.
# Opener style guide — TEMPLATE
> Replace the placeholders with your team's actual voice rules. The
> lead-enrichment skill loads this file on every opener generation.
> Treat it as the single source of truth — do not duplicate rules into
> the prompt itself, where they will drift.
## Voice
One sentence on register: {e.g. "direct, plain, no jargon, no exclamation marks, no questions in the close unless the question is specific"}.
One sentence on stance: {e.g. "we're peers solving the same problem, not vendors selling at"}.
## Length
- Hard cap: 50 words. The skill enforces this and truncates with an ellipsis warning if the model overshoots.
- Soft target: 30-40 words. Three sentences max.
- No greeting line ("Hi {name},"). The sequence step adds the greeting; baking one into the opener double-greets.
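The cap enforcement described above could be as simple as the following sketch (word-splitting on whitespace is an assumption; the skill's actual tokenization may differ):

```python
def enforce_word_cap(opener: str, cap: int = 50) -> tuple[str, bool]:
    """Truncate an overshooting opener at the word cap and flag it.
    The ellipsis marks the cut so a rep sees the draft was trimmed,
    not written that way."""
    words = opener.split()
    if len(words) <= cap:
        return opener, False
    return " ".join(words[:cap]) + "…", True
```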
## What every opener must contain
1. A specific reference to one thing from the snapshot — preferably `recent_signal`. If there is no recent signal, the opener references the `value_prop` and labels itself low-confidence.
2. A reason the reference matters to the operator's persona. Not "I noticed X" — "X is the move ops leaders ask us about".
3. A single, specific close. Either a 20-min ask or a question that would be answerable only by someone inside the company.
## Banned phrases
The prompt rejects any opener containing these. Add team-specific additions over time.
- "I hope this email finds you well"
- "I noticed"
- "I came across your"
- "love what you're doing"
- "quick question"
- "circling back" (in a first touch)
- Any superlative — "amazing", "incredible", "best-in-class", "leading"
- Any em-dash trio that reads as an LLM tic — "X — Y — Z"
- Compliments on growth/funding without a cited source
## Banned structures
- The "I was just on your website" opener. Always reads false.
- The "we work with companies like {logo}" name-drop in sentence one.
- The "are you the right person?" close. Wastes the reply.
## Confidence signal
The skill returns `opener_confidence: 0.0-1.0`:
- **0.8+** — opener references a dated `recent_signal` with a URL and the rep's persona is in the `references/icp-rubric.md` buying committee.
- **0.5-0.8** — opener references `value_prop` only, or `recent_signal` is older than 60 days.
- **<0.5** — opener is generic; rep should rewrite or skip the row.
Threshold for auto-queueing into a sequence: configure per-team. Default recommendation: **never auto-queue**. The opener is a draft.
## Worked example
Snapshot: industry "scheduling for restaurants", recent_signal "March payroll launch", value_prop "schedules and pays hourly workforces".
Opener (35 words, confidence 0.78):
> Saw the March payroll launch — folding pay into the same surface as
> scheduling is the move most ops leaders ask us about. Worth a 20-min
> on whether the multi-state tax piece is hitting the same wall others
> have?
Why it scores 0.78 not 1.0: the close assumes a pain ("multi-state tax piece is hitting the same wall") that the snapshot does not directly evidence. A 1.0 opener would tie the close to a fact in the snapshot.
## Last edited
{YYYY-MM-DD}
# Source quality matrix — TEMPLATE
> Replace placeholders with the vendors actually wired into your Clay
> table. The lead-enrichment skill consults this matrix when two
> upstream sources disagree on the same field, and cites which source
> won in the per-row output for audit.
## Why this exists
Clay tables routinely stack 2-3 enrichment vendors per field — Apollo on industry, Clearbit on headcount, ZoomInfo on funding. They disagree often: Apollo will say a 200-person company is 50, Clearbit will assign the wrong industry, ZoomInfo will lag a recent acquisition by a quarter.
Without a declared preference order, the skill's snapshot step flip-flops between vendors run to run, ICP scores oscillate, and operators cannot reproduce the same row twice.
## Field-by-field preference order
Per field, list vendors top-to-bottom in trust order. The skill takes the highest-ranked vendor that returned a non-empty value. The homepage-derived value (extracted by the skill itself in step 1) is always the tiebreaker if all vendors disagree with each other.
### Industry / sector
1. {e.g. "Homepage About-page extraction (skill step 1)"}
2. {e.g. "Clearbit Reveal"}
3. {e.g. "Apollo Company"}
4. {e.g. "ZoomInfo Industry"}
### Headcount
1. {e.g. "LinkedIn employee count via Clay's LinkedIn Profile column"}
2. {e.g. "Apollo employee_count"}
3. {e.g. "Clearbit metrics.employees"}
4. Skill homepage extraction (last resort — copy claims often inflated)
### Revenue / ARR
1. {e.g. "ZoomInfo revenue (paid only)"}
2. {e.g. "Clearbit metrics.annualRevenue (estimated, low confidence)"}
3. Omit (do not extract from homepage; vanity metrics are not revenue)
### Funding stage / last round
1. {e.g. "Crunchbase via Clay's native column"}
2. {e.g. "Press-release fetch in skill step 1, only if dated within 90d"}
3. Omit
### Recent signal (launches, hires, news)
1. Skill homepage / blog fetch in step 1 — must carry a dated URL
2. {e.g. "Clay's Built-With column for tech-stack changes"}
3. {e.g. "Google News column, capped to last 60 days"}
Important: a vendor returning "we don't know" is treated as missing, not as a contradicting value. The next vendor in the list is consulted.
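The resolution rule reduces to a ranked first-non-missing lookup. A sketch (the `MISSING` sentinel set is an assumption about how vendors encode "we don't know"; the homepage tiebreaker described above is omitted for brevity):

```python
# Values a vendor can return that count as "missing", not as a
# contradicting value. Sentinel strings here are assumptions.
MISSING = {None, "", "we don't know", "unknown"}


def resolve_field(preference: list[str], values: dict):
    """Take the highest-ranked vendor that returned a real value.
    Vendors answering 'we don't know' are skipped and the next vendor
    in the trust order is consulted. Returns (vendor, value), or
    (None, None) if every vendor came up empty."""
    for vendor in preference:
        v = values.get(vendor)
        if v not in MISSING:
            return vendor, v
    return None, None
```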
## Conflict logging
When two vendors return different values for the same field, the skill adds a `conflicts` block to the per-row output:
```json
"conflicts": [
{ "field": "headcount", "values": { "linkedin": 380, "apollo": 120 }, "chose": "linkedin" }
]
```
Operators should sample 10-20 conflict rows weekly. If one vendor is consistently wrong, demote it in this matrix. The matrix is the control surface, not the prompt.
## Vendor cost ceiling
Each vendor has a per-row cost (Clay credits or external API spend). Set a soft cap here — if a row's combined enrichment cost exceeds the cap, skip the lowest-ranked vendor and accept lower confidence.
- Soft cap per row: {e.g. "12 Clay credits"}
- Hard cap per row: {e.g. "20 Clay credits — abort row"}
The skill respects these caps and emits `cost_ceiling_hit: true` when the hard cap fires, so the batch summary surfaces the count.
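One plausible shape for the cap logic, sketched under the assumption that vendors are walked in trust order and each carries a known per-row cost (the exact skip/abort semantics the skill implements may differ):

```python
def plan_vendors(preference: list[str], costs: dict[str, float],
                 soft_cap: float, hard_cap: float) -> dict:
    """Accumulate per-row enrichment cost across vendors in trust order.
    Past the soft cap, lower-ranked vendors are skipped (lower confidence
    accepted); a vendor that would bust the hard cap aborts the row and
    sets cost_ceiling_hit so the batch summary can count it."""
    chosen, total = [], 0.0
    for vendor in preference:
        c = costs.get(vendor, 0.0)
        if total + c > hard_cap:
            return {"vendors": chosen, "cost": total, "cost_ceiling_hit": True}
        if total + c > soft_cap:
            continue  # skip this lower-ranked vendor
        chosen.append(vendor)
        total += c
    return {"vendors": chosen, "cost": total, "cost_ceiling_hit": False}
```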
## Last edited
{YYYY-MM-DD}